wordwrap like regex

s.dot
Tranquility In Moderation
Posts: 5001
Joined: Sun Feb 06, 2005 7:18 pm
Location: Indiana

wordwrap like regex

Post by s.dot »

I would like a wordwrap()-like function for displaying comments, to break up really long words that somebody puts in, like "loooooooool" (except way longer). The only problem is that I don't want to break up the links in the comments at the same time, so:

Code: Select all

$comment = wordwrap($comment, 75, '<br />', true);
will not work, since it also cuts long URLs inside the links.
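
For illustration, here is a minimal sketch of the failure. The comment text and URL are made up; only the wordwrap() call comes from the post. Because the fourth argument forces a cut in any "word" longer than 75 characters, a break ends up inside the href value:

```php
<?php
// Hypothetical sample data: a link whose URL is well over 75 characters.
$url     = 'http://example.com/' . str_repeat('a', 100);
$comment = 'See <a href="' . $url . '">this link</a>';

// Same call as in the post: with cut = true, no output line may exceed 75 characters.
$wrapped = wordwrap($comment, 75, '<br />', true);

// The URL is in one piece before wrapping, but not afterwards.
var_dump(strpos($comment, $url) !== false); // URL intact in the original
var_dump(strpos($wrapped, $url) !== false); // URL no longer intact once wrapped
```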

I need a regex that will only break up runs of non-whitespace characters 75 or more in length that are NOT inside <a href="here"> or <a href="#">here</a>.

I am terrible at regexes.

My start would be matching any run of 75 or more non-whitespace characters:

Code: Select all

$pieces = preg_split('/[^\s]{75,}/', $comment);
Help?
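
One note on that starting point, as a sketch with made-up input: preg_split() treats every match as a delimiter, so the long runs are discarded rather than returned. Running preg_match_all() with the same pattern is one way to see what the pattern actually catches.

```php
<?php
// Hypothetical comment: one 82-character "word" followed by normal text.
$long    = str_repeat('o', 80);
$comment = "l{$long}l that was funny";

// The pattern from the post. The long run is the delimiter here, so it is dropped
// and only the text around it comes back.
$pieces = preg_split('/[^\s]{75,}/', $comment);

// The same pattern with preg_match_all() collects the long runs themselves.
preg_match_all('/[^\s]{75,}/', $comment, $found);

var_dump($pieces);               // the text around the long run
var_dump(strlen($found[0][0]));  // length of the run that was matched
```

So the pattern does find the long runs, but the split would need to keep them (or become a replace) before it can insert any breaks.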
ridgerunner
Forum Contributor
Posts: 214
Joined: Sun Jul 05, 2009 10:39 pm
Location: SLC, UT

Re: wordwrap like regex

Post by ridgerunner »

Interesting challenge. Here's how I'd do it using a callback function:

Code: Select all

<?php
// regex - match the contents grouping into link and non-link chunks
$re = '%
([^<]++(?:(?!<a\b[^>]*+>.*?</a>)<[^<]*+)*+)  # grab all non <a ...>...</a> text into group 1
|                                            # or...
(<a\b[^>]*+>.*?</a>)                         # grab each whole <a ...>...</a> tag into group 2
%six';
 
$contents = 'BEGIN TEST DATA 
this_is_a_really_long_word_outside_a_link_this_is_a_really_long_word_outside_a_link_this_is_a_really_long_word_outside_a_link_this_is_a_really_long_word_outside_a_link_this_is_a_really_long_word_outside_a_link_this_is_a_really_long_word_outside_a_link
 and <a href="http://example.com">This is a link with 
a_really_long_word_inside_a_link_this_is_a_really_long_word_inside_a_link_this_is_a_really_long_word_inside_a_link_this_is_a_really_long_word_inside_a_link end of test link</a>';
 
// walk through the content chunk by chunk, replacing long words inside non-link chunks only
$contents = preg_replace_callback($re, 'callback_func', $contents);
 
function callback_func($matches) { // here's the callback function
    if ($matches[1] !== '') { // case 1: a non-link chunk. Split up any long "words"
        // (strict test: a plain if ($matches[1]) would treat the chunk "0" as false)
        return preg_replace('/(\S{75})(?=\S)/', "$1<br />\n", $matches[1]);
    } elseif ($matches[2]) {  // case 2: this is a link.
        return $matches[2];   //  Return link unmodified
    }
    exit("Error!");
}
echo ($contents);
?>
Hope this helps! :)
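
For comparison, the same idea can be sketched without a callback by splitting on the links and keeping them as delimiters. This is an alternative sketch, not ridgerunner's code: wrap_long_words() and its parameters are hypothetical names, and the link pattern is a simplified version of the one above. Like the original, it counts bytes rather than characters (no /u flag) and does not protect long attribute values in tags other than <a>.

```php
<?php
// Hypothetical helper: the name, parameters, and defaults are assumptions.
function wrap_long_words(string $html, int $width = 75, string $break = "<br />\n"): string
{
    // Split on whole <a ...>...</a> elements. The capture group plus
    // PREG_SPLIT_DELIM_CAPTURE keeps each link in the result, so the array
    // alternates: text (even index), link (odd index), text, ...
    $chunks = preg_split('%(<a\b[^>]*>.*?</a>)%is', $html, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($chunks as $i => $chunk) {
        if ($i % 2 === 0) {
            // Text outside a link: break after every $width non-whitespace
            // characters that are followed by at least one more.
            $chunks[$i] = preg_replace('/(\S{' . $width . '})(?=\S)/', '$1' . $break, $chunk);
        }
        // Odd indexes are links and are left untouched.
    }

    return implode('', $chunks);
}

// Usage with made-up data: one long word outside a link, long runs inside one.
$long = str_repeat('x', 100);
$link = '<a href="http://example.com/' . str_repeat('y', 100) . '">' . str_repeat('z', 100) . '</a>';
$out  = wrap_long_words($long . ' ' . $link);
echo $out;
```

Only the 100-character run outside the link gets a break; the link, including its long URL and long link text, passes through unchanged.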
s.dot
Tranquility In Moderation
Posts: 5001
Joined: Sun Feb 06, 2005 7:18 pm
Location: Indiana

Re: wordwrap like regex

Post by s.dot »

Wow, looks intense. 8O
I will try it out and post back with the results.