ole wrote:
Although the \s+ is greedy, and thus matches all the spaces after "foo", when it hits the "bar", it is backtracking one \s since a match is favoured.
How is it backtracking, and which of the three variants I posted are you referring to?
While matching the string, PHP's regex engine keeps track of all states that have been matched so far (as all backtracking, NFA-based engines do).
Whenever it reaches a point where the rest of the regex cannot match, it "backtracks" to the most recent state it saved and tries to match the remainder of the regex from there. For a greedy quantifier like \s+, that means giving back one character at a time.
This is what I mean:
Code:
regex = /^foo\s+(?!bar)/
text  = "foo  bar"    (two spaces between "foo" and "bar")

state   matched string
1       "f"
2       "fo"
3       "foo"
4       "foo "        (one space)
5       "foo  "       (two spaces)

Now, at this moment the next string in the text is "bar", but since (?!bar)
"forbids" this match, the regex engine backtracks to state 4. From there the
look-ahead sees the 2nd white space followed by "bar", which is not "bar",
so (?!bar) succeeds and the overall match is "foo " (with one space).
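To check the walk-through above in PHP itself, here is a minimal sketch. It assumes the subject has two spaces between "foo" and "bar". The capture group and the possessive \s++ variant are my additions for illustration only and were not among the patterns discussed in the thread.

```php
<?php
// Subject with TWO spaces between "foo" and "bar" (an assumption, as in the walk-through).
$text = "foo  bar";

// Greedy \s+ first takes both spaces, then gives one back so (?!bar) can succeed.
// The capture group is added only to show what \s+ ends up holding.
preg_match('/^foo(\s+)(?!bar)/', $text, $m);
$greedyMatch = $m[0];          // the whole match
$spacesKept  = strlen($m[1]);  // spaces still held by \s+ after backtracking

// Possessive \s++ never gives characters back, so the look-ahead fails for good.
$possessiveResult = preg_match('/^foo\s++(?!bar)/', $text);

// With a single space there is nothing to give back, so there is no match either.
$singleSpaceResult = preg_match('/^foo\s+(?!bar)/', "foo bar");

var_dump($greedyMatch, $spacesKept, $possessiveResult, $singleSpaceResult);
```

The greedy pattern should report "foo " with one space kept, while the possessive pattern and the single-space subject should both return 0. That difference is the backtracking step at work.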
ole wrote:
PHP's regex engine will always try to match the complete regex, so the second \s will be matched by the (?!bar).
Gah? \s cannot match b
It is a negative look-ahead, so as long as what follows is not "bar", it will match.
Run this:
Code:
if (preg_match('/(?!bar)/', ' ')) {
    echo "Match!\n";
} else {
    echo "No match...\n";
}
// output will be: "Match!"
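As a hedged follow-up sketch: asking preg_match for offsets shows that the look-ahead matches an empty string and consumes nothing, which is why the space itself never has to "match b". The PREG_OFFSET_CAPTURE flag and the second subject "bar" are my additions for illustration.

```php
<?php
// (?!bar) is zero-width: it only checks what follows the current position.
preg_match('/(?!bar)/', ' ', $m, PREG_OFFSET_CAPTURE);
$textOnSpace   = $m[0][0];  // the matched text
$offsetOnSpace = $m[0][1];  // where the match starts

// On "bar" the look-ahead fails at offset 0, so the engine moves on to offset 1,
// where the remaining text is "ar" and the look-ahead succeeds.
preg_match('/(?!bar)/', 'bar', $n, PREG_OFFSET_CAPTURE);
$textOnBar   = $n[0][0];
$offsetOnBar = $n[0][1];

var_dump($textOnSpace, $offsetOnSpace, $textOnBar, $offsetOnBar);
```

In both cases the matched text should be the empty string. Only the offset differs (0 for the space, 1 for "bar").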