PHP Regex word boundary pattern ignores / in urls


Post by SchweppesAle »

Hi, I'd like to correct the following regex pattern so that it no longer affects URLs or text preceded by a forward slash /

For example, the preg_replace below will take the strings

reviews/payroll-relief-from-accountantsworld.html

payroll

and assuming payroll is the needle, it will return

reviews/<a href = "blah">payroll</a>-relief-from-accountantsworld.html <a href = "blah">payroll</a>

How can I stop the word boundary from matching directly after the / slash when searching for substrings? Thanks.

Code:

$words[$i] = '/\b' . $words[$i] . '\b/';
$content = preg_replace($words, $Links, $content);

Post by jgadrow »

I believe you can't do this with \b alone. \b is a zero-width assertion: it matches at any position where the character on one side is a word character (\w) and the character on the other side is not (\W, or the start or end of the string). A forward slash is a \W character, so the position between / and payroll counts as a word boundary.

To change that, you would need your own boundary test that treats the forward slash as part of the word, i.e. one that checks against [\w/] instead of \w. The needle must not directly follow a character from [\w/], and must not be directly followed by one.
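
One way to sketch the idea of counting the forward slash as a 'word' character is with a negative lookbehind and a negative lookahead over the class [\w/]. This is only an illustration: the needle, the sample text, and the link markup are assumptions based on the question, and preg_quote() is added so the needle is treated as literal text.

```php
<?php
// Sketch: replace \b with lookarounds over [\w/], so that "/" counts as
// part of a word. Needle, text, and link markup are illustrative only.
$needle  = 'payroll';
$content = 'reviews/payroll-relief-from-accountantsworld.html payroll';

// (?<![\w/]) : the match must not be preceded by a word character or "/"
// (?![\w/])  : the match must not be followed by a word character or "/"
$pattern = '#(?<![\w/])' . preg_quote($needle, '#') . '(?![\w/])#';

// $0 in the replacement stands for the whole matched text.
$result = preg_replace($pattern, '<a href="blah">$0</a>', $content);
echo $result, "\n";
```

With this pattern the payroll inside the URL is left alone, because it directly follows a slash, while the standalone payroll is linked. Note that, unlike \b, this also refuses a match that is directly followed by a slash.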

Post by prometheuzz »

A negative lookbehind can be used in this case. (?<!/) rejects any match that is immediately preceded by a forward slash:

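Code:

$text = 'reviews/payroll-relief-from-accountantsworld.html,payroller,payroll';
// Only the last, standalone "payroll" is replaced:
// reviews/payroll-relief-from-accountantsworld.html,payroller,FOO
echo preg_replace('#\b(?<!/)payroll\b#i', 'FOO', $text);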
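
Applied to the array form from the question, the same lookbehind could look roughly like the sketch below. The needles, link markup, and sample text are hypothetical, since the original post does not show them, and preg_quote() is an addition to keep special characters in a needle from being read as regex syntax.

```php
<?php
// Hypothetical needles, link markup, and text; the original post does not show them.
$words   = ['payroll', 'accounting'];
$links   = ['<a href="blah">payroll</a>', '<a href="blah">accounting</a>'];
$content = 'reviews/payroll-relief.html covers payroll and accounting';

$patterns = [];
foreach ($words as $word) {
    // preg_quote() escapes regex metacharacters and the "#" delimiter in the needle.
    // (?<!/) skips any occurrence that directly follows a forward slash.
    $patterns[] = '#\b(?<!/)' . preg_quote($word, '#') . '\b#i';
}

// Each pattern is paired with the replacement at the same array index.
$content = preg_replace($patterns, $links, $content);
echo $content, "\n";
```

The lookbehind only inspects the single character before the needle, so a needle that sits deeper in a path, such as relief in payroll-relief.html, would still be linked.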