Regex Exclusion

Posted: Tue Aug 29, 2006 11:10 pm
by Benjamin
I have been mucking with this for several hours. Is it even possible to conditionally match text?

I want to match this..

#blah anything goes here#

But I don't want it to match this..

#blah anything | goes here#

Posted: Wed Aug 30, 2006 6:17 am
by feyd
Possible, yes. Can we give you a working pattern yet? I don't think so. I believe we'll need several more examples of matching and not matching subjects.

Posted: Wed Aug 30, 2006 9:01 am
by Oren
feyd wrote:I believe we'll need several more examples of matching and not matching subjects.
Or perhaps a definition for what you would like to allow and what is not allowed.

Posted: Wed Aug 30, 2006 1:43 pm
by nickvd
based on what you've already posted

/#(.*?)#/
should be all you need

Posted: Thu Aug 31, 2006 5:55 am
by Benjamin
nickvd wrote:based on what you've already posted

/#(.*?)#/
should be all you need
That won't work because it will match even if there is a | between the #'s.

I already solved this issue a different way, but I'm still not clear on how to do exclusions in regex.

For example, if I want to match everything encased in # signs, but only if it does not contain a pipe ( | ), how is this done? I've been reading regex documentation and haven't been able to figure it out.

Posted: Thu Aug 31, 2006 6:13 am
by Oren
Try this:

$regex = '@#([^\|]*)#@U';
But you really need to give us a better definition so we can help better.
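
A minimal sketch of how this pattern might be checked against the two subjects from the first post, assuming PHP's preg_match (the $subjects and $results names are only illustrative, not from the thread):

```php
<?php
// Oren's pattern: '#', then a run of non-pipe characters, then '#'.
$regex = '@#([^\|]*)#@U';

$subjects = [
    '#blah anything goes here#',   // expected to match
    '#blah anything | goes here#', // expected NOT to match
];

$results = [];
foreach ($subjects as $subject) {
    // preg_match returns 1 on a match and 0 when nothing matches.
    $matched = preg_match($regex, $subject, $m) === 1;
    // Store the captured text, or null when the subject is rejected.
    $results[$subject] = $matched ? $m[1] : null;
}

var_dump($results);
```

Note that the pipe is only excluded between one pair of # signs. A longer subject with more # signs after the pipe could still produce a match further along, so a fuller definition of the input would still be needed.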

Posted: Thu Aug 31, 2006 6:19 am
by Benjamin
Hmm, that works. How does it work? I tried /#.*?[^\|].*?#/ and it didn't work. What did I do wrong?

Posted: Thu Aug 31, 2006 7:33 am
by Oren
astions wrote:Hmm, that works. How does it work? I tried /#.*?[^\|].*?#/ and it didn't work. What did I do wrong?
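First, the two .*? are the problem: . matches any character, pipes included, so they let the | through. [^\|] means "any one character except a pipe", and with the * after it, "any run of characters except pipes".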
Second, I used the U (PCRE_UNGREEDY) modifier - here is what the manual says about the U modifier:

U (PCRE_UNGREEDY)

This modifier inverts the "greediness" of the quantifiers so that they are not greedy by default, but become greedy if followed by "?". It is not compatible with Perl. It can also be set by a (?U) modifier setting within the pattern or by a question mark behind a quantifier (e.g. .*?).


Assuming we work with this string: just a test #catch me# and #catch me too#, without the U modifier the regex would have caught this:

catch me# and #catch me too

With the U modifier it will catch these:

1. catch me
2. catch me too
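
The difference can be sketched with preg_match_all, assuming the same test string. The variable names are only illustrative, and the third pattern is an assumed alternative that is not from this thread:

```php
<?php
$subject = 'just a test #catch me# and #catch me too#';

// Greedy (no U): [^\|]* also matches '#', so it runs on to the last '#'.
preg_match_all('@#([^\|]*)#@', $subject, $greedy);

// Ungreedy (U): each match stops at the first closing '#'.
preg_match_all('@#([^\|]*)#@U', $subject, $ungreedy);

// Assumed alternative: also exclude '#' from the class, so no U is needed.
preg_match_all('@#([^|#]*)#@', $subject, $noFlag);

print_r($greedy[1]);   // one capture spanning both pairs
print_r($ungreedy[1]); // two separate captures
print_r($noFlag[1]);   // same two captures as the U version
```

Excluding # inside the character class seems the more portable choice, since the U modifier is not compatible with Perl.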

Hope that helps :wink:

Posted: Thu Aug 31, 2006 1:07 pm
by Benjamin
Ok so it was getting picked up by the .*?. I get it now. Thank you.
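
A quick check that seems to confirm this, assuming PHP's preg_match and the subject from the first post (the variable names are just for illustration):

```php
<?php
$subject = '#blah anything | goes here#';

// The earlier attempt: '.' matches any character, the pipe included, so the
// second .*? absorbs the '|' while [^\|] only consumes one non-pipe character.
$attempt = preg_match('/#.*?[^\|].*?#/', $subject, $m1);

// Oren's pattern: nothing between the two '#' signs may be a pipe.
$fixed = preg_match('@#([^\|]*)#@U', $subject, $m2);

var_dump($attempt); // 1: the unwanted match
var_dump($fixed);   // 0: the subject is rejected
```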