matches unless ...
I'd like to match keywords within HTML text. These keywords may already be tagged within an HTML link reference -> <a href=""></a>. In that case they should not match.
Examples when looking for the word "money":
"I lost my money belt" -> match
"I lost my <a href='search_id'>blue money belt<\a>" -> no match
"I lost my money, but fortunately found a <a href='search_id'>bank<\a>" -> match
Finding the word money isn't the problem; the word boundary pattern /\bmoney\b/ handles that. But preventing it from being found within the link... I just cannot work it out.
Any suggestions?
- prometheuzz
- Forum Regular
- Posts: 779
- Joined: Fri Apr 04, 2008 5:51 am
Re: matches unless ...
You probably meant to close your anchor tags with </a> instead of <\a>.
If the tags are properly closed using a slash instead of a backslash, then this will work:
Code: Select all
if (preg_match('~\bmoney\b(?![^<]*</)~i', $text)) {
    echo 'match';
} else {
    echo 'no match';
}
Re: matches unless ...
Thanks! Just did a little test and it works! The <\a> was a typo indeed.
I tried (?:) however before. What's the difference with (?!) you are using here?
Re: matches unless ...
?: defines a non-capturing group, something used just to improve efficiency.
?! is a negative lookahead:
Code: Select all
x(?!abc)
means match 'x', but only if it isn't immediately followed by 'abc'.
- prometheuzz
Re: matches unless ...
javinto wrote: Thanks! Just did a little test and it works! The <\a> was a typo indeed.
I tried (?:) however before. What's the difference with (?!) you are using here?
(?:) is called a non-capturing group. When you put parentheses around some characters, like "a(bc)+d", the regex engine "remembers" (or groups) what is matched between those parentheses. This "remembering" costs the regex engine some extra time and memory while matching your text. So, when you don't use what is matched between the parentheses, you might as well tell the regex engine to "forget" it immediately, saving that overhead. You do that by making it a non-capturing group, like this: "a(?:bc)+d".
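To make the capturing versus non-capturing distinction concrete, here is a minimal sketch using the two patterns "a(bc)+d" and "a(?:bc)+d". The subject string 'abcbcd' and the variable names are made up for illustration; they are not from the thread.

```php
<?php
// Illustrative subject string (an assumption for this sketch).
$subject = 'abcbcd';

// Capturing group: besides the whole match, the engine stores what the
// group matched (for a repeated group, its last repetition) in index 1.
preg_match('~a(bc)+d~', $subject, $capturing);

// Non-capturing group: same overall match, but nothing is stored for the group.
preg_match('~a(?:bc)+d~', $subject, $nonCapturing);

print_r($capturing);    // whole match plus group 1
print_r($nonCapturing); // whole match only
```

Both patterns match the same text; the only difference is whether index 1 exists in the result array.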
(?!) is called negative lookahead. For example, if you write "a(?!b)", you will match an 'a' only if there is no 'b' immediately after it. You might think, 'well, what's the difference between "a(?!b)" and "a[^b]"'? With a lookaround (there are negative and positive variants of both lookbehind and lookahead, collectively called lookaround), the part inside the lookaround is not "consumed" by the regex engine. Here's an example: if you have a string "zzazz" and you match it against the pattern "a(?!b)", then only the 'a' is matched, while matching it against "a[^b]" matches the substring "az".
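The "zzazz" comparison can be checked with a short sketch like the one below. The variable names are hypothetical, and the second half (an 'a' at the very end of the string) is an extra case added here for illustration, not something from the thread.

```php
<?php
$subject = 'zzazz'; // the string used in the explanation

// Negative lookahead: inspects the next character without consuming it.
preg_match('~a(?!b)~', $subject, $lookahead);

// Negated character class: must consume one character that is not 'b'.
preg_match('~a[^b]~', $subject, $negatedClass);

echo $lookahead[0], PHP_EOL;    // a
echo $negatedClass[0], PHP_EOL; // az

// Extra case (assumption for illustration): 'a' is the last character.
// The lookahead still succeeds, but [^b] has no character left to consume.
$atEnd = 'zza';
$lookaheadAtEnd = preg_match('~a(?!b)~', $atEnd);
$negatedClassAtEnd = preg_match('~a[^b]~', $atEnd);
echo $lookaheadAtEnd, ' ', $negatedClassAtEnd, PHP_EOL; // 1 0
```

So the two patterns differ not only in what they return but also in whether they match at all when nothing follows the 'a'.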
Hope that clears things up.
Re: matches unless ...
Wow, thanks guys for the explanations. I did not realize how the regexp grouping actually works.
I suspect I will need those look-ahead functions more often.
Thanks
- prometheuzz
Re: matches unless ...
javinto wrote: Thanks! Just did a little test and it works! ...
Of course it works! ;)
In case you need it, a short explanation:
Code: Select all
\bmoney\b   // match 'money' surrounded by word boundaries
(?!         // start negative lookahead
  [^<]*     // match zero or more characters of any type except '<'
  </        // match the string '</'
)           // end lookahead
- prometheuzz
Re: matches unless ...
javinto wrote: Wow, thanks guys for the explanations. I did not realize how the regexp grouping actually works.
I suspect I will need those look-ahead functions more often.
Thanks
All about lookarounds: http://www.regular-expressions.info/lookaround.html
Note that the entire site is an excellent online resource!
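For anyone who wants to try the pattern posted earlier in this thread, here is a minimal sketch that runs it against the three examples from the original question, with the anchor tags closed as </a>. The helper name hasUnlinkedKeyword and the use of preg_quote() for the keyword are assumptions made for this sketch, and the fourth test string is an extra case that is not from the thread.

```php
<?php
// Hypothetical helper; the lookahead part is the pattern posted in the thread.
function hasUnlinkedKeyword(string $text, string $keyword): bool
{
    // preg_quote() escapes regex metacharacters in the keyword ('~' is the delimiter).
    $pattern = '~\b' . preg_quote($keyword, '~') . '\b(?![^<]*</)~i';
    return preg_match($pattern, $text) === 1;
}

$examples = [
    "I lost my money belt",
    "I lost my <a href='search_id'>blue money belt</a>",
    "I lost my money, but fortunately found a <a href='search_id'>bank</a>",
    // Extra case, not from the thread: the keyword inside a different tag.
    "I lost my <b>money</b> belt",
];

foreach ($examples as $text) {
    echo $text, ' -> ', hasUnlinkedKeyword($text, 'money') ? 'match' : 'no match', PHP_EOL;
}
```

The first and third strings report a match and the second does not, as the question asked. The fourth string also reports no match, because the lookahead rejects the keyword whenever the next tag after it is a closing tag of any kind, not only </a>. Whether that matters depends on the HTML being searched.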