word boundary question

Any questions involving matching text strings to patterns - the pattern is called a "regular expression."

neophyte
DevNet Resident
Posts: 1537
Joined: Tue Jan 20, 2004 4:58 pm
Location: Minnesota

Post by neophyte »

Question: the following regex pattern seems to match "online", "greek" and "courses" in any order. However, it will also match "asdfagreek fafefonline afaefcourseswer". How do I add word boundaries? I've tried \b\s and can't seem to make it work.

Code: Select all

// The variable passed to preg_match() must be the one that holds the text
$subject = 'greek online courses';
$pattern = '/(course?[s]+\s)?|(online?[s]+\s)?|(greek?[s]+\s)/i';
if (preg_match($pattern, $subject)) {
	echo "correct";
} else {
	echo "wrong";
}
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

\b is a boundary metacharacter on its own, no need for a \s.

side note: your usage of ? may be faulty.. unless you want "course?" to match "course" and "cours" ;)
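
A minimal sketch of the point about ?, not taken from the thread: the quantifier applies only to the single token before it, so "course?" makes the "e" optional, while "courses?" makes the trailing "s" optional. The anchors ^ and $ and the test words are assumptions added for illustration.

```php
<?php
// "?" binds to the single token before it.
$words = ['cours', 'course', 'courses'];
$results = [];
foreach ($words as $word) {
    $results[$word] = [
        'course?'  => preg_match('/^course?$/i', $word),  // "cours" plus optional "e"
        'courses?' => preg_match('/^courses?$/i', $word), // "course" plus optional "s"
    ];
    printf(
        "%-8s course?: %d  courses?: %d\n",
        $word,
        $results[$word]['course?'],
        $results[$word]['courses?']
    );
}
```

Run as written, "cours" should only match the first pattern and "courses" only the second, while "course" matches both.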

Post by neophyte »

Guess it's very faulty. I'm trying to get it to match each of the words with or without the s, with the words in any order and in upper or lower case.

Post by feyd »

Code: Select all

$pattern = "#\b(?:course|greek|online)s?\b#i";
That should do what you want.
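
A quick way to check the pattern above against the strings from the question (a sketch only; the second subject string and the use of preg_match_all() to count hits are assumptions added for illustration):

```php
<?php
$pattern = '#\b(?:course|greek|online)s?\b#i';
$subjects = [
    'greek online courses',
    'Online Greek course',
    'asdfagreek fafefonline afaefcourseswer',
];
$counts = [];
foreach ($subjects as $subject) {
    // preg_match_all() returns the number of full pattern matches it found
    $counts[$subject] = preg_match_all($pattern, $subject, $matches);
    echo $subject, ' => [', implode(', ', $matches[0]), "]\n";
}
```

The first two strings should give three matches each and the garbage string none, because \b only matches between a word character and a non-word character. Note that the pattern is an alternation, so preg_match() succeeds as soon as any one of the three words is present; requiring all three would need an extra check, such as counting the distinct words found.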

Post by neophyte »

Excellent! Thanks feyd. One more question.

Code: Select all

$pattern = "#\b(?:course|greek|online)s?\b#i";
I'm sure I got the pattern except for the ":". I've been searching for documentation on it. What does the colon represent?

Post by feyd »

It represents nothing on its own. In combination with ? (inside parentheses) it represents a grouping that isn't remembered for back-referencing. Technically, it's a mode change command. With no modifier letters between the ? and the :, it just groups the enclosed pattern, but doesn't have the engine remember what it matched.

Some examples:
(?i:blah) will use case insensitivity inside the grouping.
(?m:blah) will use multi-line mode inside the grouping.

The internal option letters (i, m, s, x and a few others) are allowed. You can combine them as well:
(?im:blah) will use case-insensitive and multi-line modes inside the grouping.
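
To see the difference in practice, here is a rough sketch (the sample strings and variable names are made up for illustration). It compares a capturing group with a non-capturing one, then scopes the i and m modifiers to a single group:

```php
<?php
// Capturing group: the matched word is also stored in $capturing[1].
preg_match('#\b(course|greek|online)s?\b#i', 'Greek courses', $capturing);

// Non-capturing group: only the full match in $nonCapturing[0] is stored.
preg_match('#\b(?:course|greek|online)s?\b#i', 'Greek courses', $nonCapturing);

// (?i:...) is case-insensitive inside the group only; " courses" stays case-sensitive.
$scopedHit  = preg_match('#(?i:greek) courses#', 'GREEK courses');
$scopedMiss = preg_match('#(?i:greek) courses#', 'GREEK COURSES');

// (?m:...) lets ^ and $ match at line breaks inside the group.
$text = "greek\nonline\ncourses";
$multiLine  = preg_match('#(?m:^online$)#', $text);
$singleLine = preg_match('#^online$#', $text);

var_dump(count($capturing), count($nonCapturing));
var_dump($scopedHit, $scopedMiss, $multiLine, $singleLine);
```

The capturing version should return two array entries (full match plus group 1) and the non-capturing version one.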
Chris Corbyn
Breakbeat Nuttzer
Posts: 13098
Joined: Wed Mar 24, 2004 7:57 am
Location: Melbourne, Australia

Post by Chris Corbyn »

Quick note to extend on feyd's comments:

Code: Select all

(?-i:pattern) == Turns off the "i" modifier inside the group if it's used throughout the rest of the pattern
(?-s:pattern) == Turns off the "s" modifier... you get the idea ;)
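
A small sketch of the switched-off modifiers, with sample patterns and strings invented for illustration:

```php
<?php
// The trailing i makes the whole pattern case-insensitive,
// but (?-i:...) turns that off again inside the group.
$pattern = '#\b(?-i:Greek) courses?\b#i';
$exactCase = preg_match($pattern, 'Greek COURSES'); // "Greek" has the exact case
$wrongCase = preg_match($pattern, 'greek courses'); // lowercase "greek" is rejected

// The trailing s lets "." match a newline; (?-s:.) goes back to the default.
$dotAll    = preg_match('#a.b#s', "a\nb");
$dotNoLine = preg_match('#a(?-s:.)b#s', "a\nb");

var_dump($exactCase, $wrongCase, $dotAll, $dotNoLine);
```

Only the text inside each group is affected; the rest of the pattern keeps whatever modifiers follow the closing delimiter.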

Post by neophyte »

Yup I'm following you. Thank you both for the brief tutorial! :D 8)