PHP Developers Network

A community of PHP developers offering assistance, advice, discussion, and friendship.
 

All times are UTC - 5 hours




 Post subject: Excerpt Function
Posted: Tue Sep 28, 2010 12:16 pm
DevNet Resident

Joined: Wed Apr 01, 2009 1:31 pm
Posts: 1532
This function is a response to another topic.
<?php
/**
 * Returns an excerpt from the beginning of a string. An attempt is made to
 * return whole words only. However, if the beginning of the string up to the
 * maximum length consists of all word characters, the string is truncated to
 * the maximum length.
 *
 * @param string $string
 *     The input string. Multi-line strings are supported.
 *
 * @param integer $maxLength
 *     The maximum length of the excerpt, not including the suffix. The actual
 *     length of the returned string may be shorter (to avoid cutting a word) or
 *     longer (because of the suffix).
 *
 * @param integer $minLength
 *     The minimum length of the excerpt. Word boundaries occurring before this
 *     character position are ignored. Defaults to 0.
 *
 * @param string $suffix
 *     What to append to the excerpt if the input string is longer than the
 *     maximum length or if forced. By default, an ellipsis ("...").
 *
 * @param boolean $forceSuffix
 *     If true, the suffix is appended even if the input string is shorter than
 *     the maximum length. False by default.
 *
 * @return string
 *     An excerpt of the input string, or null if pattern matching failed.
 *     Exceptions are thrown if the function arguments are incongruent.
 */

function excerpt ($string, $maxLength, $minLength = 0, $suffix = '...', $forceSuffix = false) {
    if ($maxLength < 0) {
        throw new Exception('Required: $maxLength >= 0');
    }
    if ($minLength < 0) {
        throw new Exception('Required: $minLength >= 0');
    }
    if ($maxLength < $minLength) {
        throw new Exception('Required: $minLength <= $maxLength');
    }
    $strlen = strlen($string);
    if ($strlen <= $maxLength) {
        return $string . ($forceSuffix ? $suffix : '');
    }
    $pattern = sprintf('/\A(.{%1$u,%2$u}(?!\w)|.{0,%2$u})/s', $minLength, $maxLength);
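    // preg_match() returns 1 on a match, 0 on no match, and false on error.
    if (preg_match($pattern, $string, $matches) !== 1) {
        return null;
    }
    // The input is known to be longer than $maxLength here, so the suffix always applies.
    return $matches[1] . $suffix;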
}

// Usage example
$text = 'The quick brown fox jumped over the lazy dog.';
try {
    var_dump(excerpt($text, 20, 10)); // string(22) "The quick brown fox..."
} catch (Exception $e) {
    echo $e->getMessage();
}

I hope I explained everything well enough in the comments. Suggestions for improvements are welcome. Would adding offset and prefix parameters be excessive? Should I trigger warnings and return null instead of throwing exceptions? Should I consider using mb_strlen()?
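On the mb_strlen() question, here is a minimal sketch of what a multibyte-aware variant might look like. It assumes UTF-8 input and the mbstring extension; the name excerptUtf8() is hypothetical and the $forceSuffix parameter is left out for brevity. The two changes are counting with mb_strlen() and adding the u modifier so the dot consumes whole characters rather than bytes.

```php
<?php
// Hypothetical multibyte-aware variant (not part of the original function).
// Assumes UTF-8 input and that the mbstring extension is loaded.
function excerptUtf8($string, $maxLength, $minLength = 0, $suffix = '...')
{
    if ($minLength < 0 || $maxLength < $minLength) {
        throw new InvalidArgumentException('Required: 0 <= $minLength <= $maxLength');
    }
    // Count characters, not bytes.
    if (mb_strlen($string, 'UTF-8') <= $maxLength) {
        return $string;
    }
    // The u modifier makes the quantifiers count UTF-8 characters instead of bytes.
    $pattern = sprintf('/\A(.{%1$u,%2$u}(?!\w)|.{0,%2$u})/su', $minLength, $maxLength);
    if (preg_match($pattern, $string, $matches) !== 1) {
        return null; // for example, the subject is not valid UTF-8
    }
    return $matches[1] . $suffix;
}

echo excerptUtf8('Ünïcödé wörds äre cöunted by chäracter', 13), "\n";
// prints: Ünïcödé wörds...
```

With the byte-based version, the same call would count each accented letter as two bytes and could cut a character in half.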

Edit: Amended third exception message, added "=".


Last edited by McInfo on Tue Sep 28, 2010 1:52 pm, edited 1 time in total.

 
 Post subject: Re: Excerpt Function
Posted: Tue Sep 28, 2010 12:45 pm
DevNet Master

Joined: Thu Mar 15, 2007 6:28 pm
Posts: 2765
Location: Redding, California
8O
That's pretty slick. That regex statement is a tough one... I didn't even think of what would happen if the max length is bigger than the size of the input string.

Overall, I think throwing exceptions is better than a more "graceful" failure because it forces the developer to debug.


 
 Post subject: Re: Excerpt Function
Posted: Tue Sep 28, 2010 1:41 pm
DevNet Resident

Joined: Wed Apr 01, 2009 1:31 pm
Posts: 1532
After the sprintf() substitution has taken place, the regular expression is a little less intimidating. Here is an explanation:
'/\A(.{10,20}(?!\w)|.{0,20})/s' regex string
'                             ' string bounds
 /                          /   regex bounds
                             s  dotall modifier (so dot matches newline)
  \A                            start of subject
    (                      )    subpattern
                   |            "or" branch in subpattern
     .{10,20}(?!\w)             min 10, max 20 of any character, not followed by a word character
     .                          any character (including newline because of s modifier)
      {10,20}                   quantifier, minimum 10, maximum 20
             (?!  )             negative lookahead assertion
                \w              any word character
                    .{0,20}     min 0, max 20 of any character
                    .           any character (including newline because of s modifier)
                     {0,20}     quantifier, minimum 0, maximum 20
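
To see the two branches in action, the substituted pattern can be run against a couple of inputs. This is only an illustrative sketch; the sample strings and the $results variable are made up for the demonstration.

```php
<?php
// The pattern after sprintf() substitution with minLength = 10, maxLength = 20.
$pattern = '/\A(.{10,20}(?!\w)|.{0,20})/s';

$samples = array(
    // First branch: backtracks from 20 characters to the nearest word boundary.
    'The quick brown fox jumped over the lazy dog.',
    // Second branch: no boundary between positions 10 and 20, so a hard cut at 20.
    'Supercalifragilisticexpialidocious is long.',
);

$results = array();
foreach ($samples as $subject) {
    preg_match($pattern, $subject, $matches);
    $results[] = $matches[1];
    printf("%d chars: \"%s\"\n", strlen($matches[1]), $matches[1]);
}
// prints:
// 19 chars: "The quick brown fox"
// 20 chars: "Supercalifragilistic"
```

Because the quantifier is greedy, the first branch tries 20 characters first and gives characters back one at a time until the lookahead succeeds, which is why the match ends at the last word boundary inside the window.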


 
 Post subject: Re: Excerpt Function
Posted: Tue Sep 28, 2010 2:32 pm
Site Admin

Joined: Tue Dec 23, 2003 3:10 am
Posts: 11470
Location: Toronto
I've always used this regex, written by feyd way back when (currently set to 60 chars).

#^\s*(.{60,}?)\s+.*$#s


... which grabs at least the first 60 characters, then continues until it finds whitespace (meaning no chopped words).
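
For reference, one way this pattern might be applied is with preg_replace(), keeping only the captured group. This is a sketch under that assumption; the post does not say how the regex was actually used, and the sample text is made up.

```php
<?php
$pattern = '#^\s*(.{60,}?)\s+.*$#s';

$text = 'The quick brown fox jumped over the lazy dog and then kept running through the forest.';

// Replace the whole subject with the captured excerpt.
$excerpt = preg_replace($pattern, '$1', $text);
echo strlen($excerpt), ': ', $excerpt, "\n";
// prints: 66: The quick brown fox jumped over the lazy dog and then kept running

// A subject with fewer than 60 characters does not match, so it comes back unchanged.
$short = preg_replace($pattern, '$1', 'Short text.');
echo $short, "\n";
// prints: Short text.
```

Note that the excerpt here is 66 characters: the lazy quantifier starts at 60 and extends forward to the end of the current word, so the result can exceed the nominal length.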


 
 Post subject: Re: Excerpt Function
Posted: Tue Sep 28, 2010 3:07 pm
DevNet Master

Joined: Thu Mar 15, 2007 6:28 pm
Posts: 2765
Location: Redding, California
The advantage of McInfo's solution is that it goes backward instead of forward, so you can be sure the excerpt will be no longer than the given number. That regex is pretty neat though. Perhaps it could be modified to go backward?
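
One possible way to make feyd's regex go backward, offered as a rough sketch rather than a tested replacement: use a greedy quantifier capped at the limit, followed by a lookahead that requires whitespace or the end of the string. The function name excerptBackward() and the hard-cut fallback are assumptions added for illustration.

```php
<?php
// Hypothetical backward-searching variant of the 60-character regex.
function excerptBackward($text, $limit = 60)
{
    // Greedy: take up to $limit characters, then give characters back until
    // the next character is whitespace or the end of the string.
    $pattern = sprintf('#^\s*(.{1,%u})(?!\S)#s', $limit);
    if (preg_match($pattern, $text, $matches) === 1) {
        return rtrim($matches[1]);
    }
    // No word boundary within the limit: fall back to a hard cut.
    return (string) substr(ltrim($text), 0, $limit);
}

$text = 'The quick brown fox jumped over the lazy dog and then kept running through the forest.';
echo excerptBackward($text), "\n";
// prints: The quick brown fox jumped over the lazy dog and then kept
```

The fallback plays the same role as the second branch of McInfo's pattern: without it, a first word longer than the limit would produce no match at all.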

