Benjamin wrote: I'm currently searching Google for a regex that will match text that is NOT inside anchor tags. I'm not familiar with the negation operators. If anyone can post the expression for me, that would be great.
EDIT: This kind of works:
Code: Select all
<\s{0,2}[^a/].*?>(foo[\'s]{0,2})<\s{0,2}[^>].*?>|[^>]\s{0,2}(foo[\'s]{0,2})\s{0,2}[^<]\s{0,2}[^/]
The problem is that it matches (foo)'s instead of (foo's). It needs to match words with 's at the end as well.

It is unclear to me what you're trying to match. Can you give a couple of examples for clarity?

arborint wrote: I think you want to make your pattern a sub-pattern and then negate it using the (?!subpattern) syntax.

Possibly, though it may be possible without it.
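
For readers unfamiliar with the (?!subpattern) syntax mentioned above, here is a minimal sketch of a negative lookahead in PHP. The subject string and variable names are made up for illustration and are not from the thread.

```php
<?php
// Negative lookahead sketch: (?!bar) succeeds only when "bar" does NOT
// follow the current position, and it consumes no characters itself.
$subject = "foobar foobaz foo";

// Match "foo" only when it is not immediately followed by "bar".
preg_match_all('/foo(?!bar)/', $subject, $matches, PREG_OFFSET_CAPTURE);

// Each entry of $matches[0] is [matched text, byte offset]; keep the offsets.
$offsets = array_column($matches[0], 1);
print_r($offsets);
```

The "foo" in "foobar" is skipped, while the other two occurrences match. Because the lookahead consumes nothing, the matched text is always just "foo".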

Benjamin wrote: I would like to match:
<i>foo</i>
foo
<b>foo</b>
<i>foo's</i>
foo's
But NOT match:
<a href="">foo</a>
And it would be great if it went further and didn't match:
<a href=""><i>foo</i></a>
So, essentially, it needs to match anything that isn't in an anchor tag.

This regex matches all your examples, including strings like "<b><i>foo's</i></b>":
Code: Select all
(?:<[^a/][^>]*>)*foo(?:'s)?(</[^a]>)*(?!</)

Benjamin wrote: That's perfect. Thank you very much. Can you explain how it works? i.e. what do ?: and ?! do?

(?:...) is a non-capturing group. The regex engine will not store what it matches in $1 (or \1) or any other group variable, which makes your regex a bit faster. If your strings are not large, you can leave the ?: out in favour of readability. (?!...) is a negative lookahead: it succeeds only if the sub-pattern does not match at the current position, and it consumes no characters.
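
To make the difference between (...) and (?:...) concrete, here is a small sketch that runs the same pattern both ways. The shortened patterns and variable names are illustrative assumptions, not the exact regex from the thread.

```php
<?php
$subject = "<i>foo's</i>";

// Capturing groups: each (...) stores what it last matched in $m[1], $m[2], ...
preg_match("@(<[^a/][^>]*>)*foo('s)?@", $subject, $capturing);

// Non-capturing groups: (?:...) groups the sub-pattern without storing anything.
preg_match("@(?:<[^a/][^>]*>)*foo(?:'s)?@", $subject, $nonCapturing);

print_r($capturing);     // whole match plus one entry per capturing group
print_r($nonCapturing);  // whole match only
```

Both calls match the same text. Only the number of entries in the result array differs.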
Code: Select all
(?: // open non-capturing group 1
<[^a/][^>]*> // match any opening tag whose name does not start with "a" (so no opening anchor)
) // close non-capturing group 1
* // group 1, zero or more times
foo // match "foo"
(?: // open non-capturing group 2
's // match "'s"
) // close non-capturing group 2
? // group 2, zero or one time
( // open capturing group 3
</[^a]> // match a one-letter closing tag such as </i> or </b>, but not </a>
) // close capturing group 3
* // group 3, zero or more times
(?! // start negative look ahead
</ // match "</"
) // stop negative look ahead

Benjamin wrote: I'm not sure regex is the best solution for this.

I agree.
...
Code: Select all
(?:<[^a/][^>]*>)*\bfoo\b(?:'s)?(</[^a]>)*(?!</|[^<>]*>)

Benjamin wrote: That's doing the same thing it was doing for me, ...

It works fine as far as I can tell:
Code: Select all
$text = "<i>foo</i>
foo
<b>foo</b>
<i>foo's</i>
foo's
<b><i>foo's</i></b>
<a href=\"\">foo</a>
<a href=\"\"><i>foo</i></a>
<a href=\"http://www.domain.com/foo\">some text</a>
xfoox";
preg_match_all("@(?:<[^a/][^>]*>)*\bfoo\b(?:'s)?(?:</[^a]>)*(?!</|[^<>]*>)@", $text, $matches);
print_r($matches);
/* output:
Array
(
[0] => Array
(
[0] => <i>foo</i>
[1] => foo
[2] => <b>foo</b>
[3] => <i>foo's</i>
[4] => foo's
[5] => <b><i>foo's</i></b>
)
)
*/

Code: Select all
'#(?:<[^a/][^>]*>)*\bMATCH STRING\b(?:\'s)?(</[^a]>)*(?!</|[^<>]*>)#i'
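
One way to use the pattern above with an arbitrary search term is sketched below. The helper name buildOutsideAnchorPattern and the sample text are hypothetical. The sketch assumes the term starts and ends with a word character (otherwise the \b anchors will not behave as intended), and it keeps the thread's limitation that </[^a]> only recognises one-letter closing tags.

```php
<?php
// Hypothetical helper (not from the thread): builds the pattern for a given term.
// preg_quote() escapes regex metacharacters and the "#" delimiter so the
// term is matched literally instead of being interpreted as regex syntax.
function buildOutsideAnchorPattern(string $term): string
{
    $quoted = preg_quote($term, '#');
    return '#(?:<[^a/][^>]*>)*\b' . $quoted . '\b(?:\'s)?(?:</[^a]>)*(?!</|[^<>]*>)#i';
}

$text = '<b>node.js</b> and <a href="#">node.js</a> and nodexjs';

preg_match_all(buildOutsideAnchorPattern('node.js'), $text, $matches);
print_r($matches[0]);
```

Without preg_quote() the "." in the term would match any character, so "nodexjs" would also be found. With it, only the bold occurrence outside the anchor is returned.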