
PHP Regex word boundary pattern ignores / in urls

Posted: Wed Jun 17, 2009 4:24 pm
by SchweppesAle
Hi, I'd like to correct the following regex pattern so that it no longer affects URLs or text preceded by a forward slash /.

For example, the following preg_replace will take the string

reviews/payroll-relief-from-accountantsworld.html

payroll

and assuming payroll is the needle, it will return

reviews/<a href = "blah">payroll</a>-relief-from-accountantsworld.html <a href = "blah">payroll</a>

How can I keep the word boundary from ignoring the / when searching for substrings? Thanks.

Code:

// Wrap the search word in word boundaries.
$words[$i] = '/\b'.$words[$i].'\b/';
// Replace each matched word with its link.
$content = preg_replace($words, $Links, $content);

Re: PHP Regex word boundary pattern ignores / in urls

Posted: Wed Jun 17, 2009 9:52 pm
by jgadrow
I believe you'd have to build a recursive subpattern into it to do this. Basically, you can't use \b on its own, because by definition it matches at any position where a word character (\w) sits next to a non-word character (\W) or the start or end of the string. The / is a \W character, so \b matches between the slash and the word.

So, if the current character is a \w character, the boundary falls at the first \W character next to it. You would need to match recursively on (\w|\/), so that you're matching any 'word' character plus the forward slash, both following a \W character and before a \W character.
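
To see the \b behaviour described above, here is a minimal sketch. The helper name boundaryOffsets is made up for illustration and is not from the thread; it only reports where \bword\b matches in a string.

```php
<?php
// Hypothetical helper (not from the thread): returns the byte offsets at
// which \bword\b matches inside $subject.
function boundaryOffsets(string $word, string $subject): array
{
    // preg_quote escapes regex metacharacters and the '/' delimiter.
    $pattern = '/\b' . preg_quote($word, '/') . '\b/';
    preg_match_all($pattern, $subject, $matches, PREG_OFFSET_CAPTURE);
    // Each entry of $matches[0] is [matched text, offset]; keep the offsets.
    return array_column($matches[0], 1);
}

// '/' and '-' are both non-word characters, so the word right after the
// slash is matched as well as the standalone one.
print_r(boundaryOffsets('payroll', 'reviews/payroll-relief payroll'));
// offsets 8 and 23
```

Both occurrences are reported, which is why the pattern in the question also rewrites the word inside the URL.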

Re: PHP Regex word boundary pattern ignores / in urls

Posted: Thu Jun 18, 2009 12:52 am
by prometheuzz
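A negative lookbehind can be used in this case. (?<!/) asserts that the character immediately before the match is not a forward slash:

Code:

$text = 'reviews/payroll-relief-from-accountantsworld.html,payroller,payroll';
// Only the last, standalone "payroll" is replaced.
echo preg_replace('#\b(?<!/)payroll\b#i', 'FOO', $text);
// prints: reviews/payroll-relief-from-accountantsworld.html,payroller,FOO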
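
Applied to the list of words from the original question, the same lookbehind could look roughly like the sketch below. The function name linkWords and the word => URL array shape are assumptions for illustration only; the question's own $words and $Links arrays are not shown in full, so this is not a drop-in replacement.

```php
<?php
// Hypothetical helper (not from the thread): wraps each listed word in a
// link unless it is directly preceded by a forward slash.
// $links is assumed to map search word => URL.
function linkWords(string $content, array $links): string
{
    foreach ($links as $word => $url) {
        // preg_quote escapes regex metacharacters and the '#' delimiter.
        $pattern = '#\b(?<!/)' . preg_quote((string) $word, '#') . '\b#i';
        // A callback avoids '$' or '\' in the URL being read as backreferences.
        $content = preg_replace_callback(
            $pattern,
            function (array $m) use ($url): string {
                return '<a href="' . htmlspecialchars($url) . '">' . $m[0] . '</a>';
            },
            $content
        );
    }
    return $content;
}

$content = "reviews/payroll-relief-from-accountantsworld.html\n\npayroll";
echo linkWords($content, ['payroll' => 'blah']);
// The URL is left alone; only the standalone word becomes
// <a href="blah">payroll</a>
```

One caveat: the words are processed one after another, so a later word can still match inside markup inserted for an earlier one (for example inside an href value). If that matters, the words would need to be combined into a single alternation and replaced in one pass.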