Make it stop!

Posted: Wed Jul 06, 2005 4:32 pm
by is_blank
An elementary question again--I've got text in this format:
Firstname Lastname (born Month, day, Year) was blah blah blah. Blah Blah blah blah, etc. etc.
I'd like to cut out the (born M D, Y), which may appear in several different formats: (b. M D, Y), (b. M D, Y - d. M D, Y), etc. I was matching with a simple pattern:

Code: Select all

/\(.*\)/
The text is never that long... they're just little biographical blurbs. I've run into this, though:
Firstname Lastname (born Month, day, Year) was blah blah blah. Blah Blah blah (Blah BLAH blah!) blah blah, etc. etc.
so now the above pattern matches (born Month, day, Year) was blah blah blah. Blah Blah blah (Blah BLAH blah!) instead of just the first bit.
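For anyone wanting to reproduce the problem, here is a minimal sketch in Python (an assumption; the thread never says which language the pattern runs in). The variable names are made up for illustration, and the sample text is the blurb quoted above.

```python
import re

# Sample blurb from the post above (placeholder text, not real data).
text = ("Firstname Lastname (born Month, day, Year) was blah blah blah. "
        "Blah Blah blah (Blah BLAH blah!) blah blah, etc. etc.")

# Greedy: .* runs to the end of the string, then backs up
# only as far as the LAST ")" it can find.
greedy_match = re.search(r"\(.*\)", text).group(0)

print(greedy_match)
# (born Month, day, Year) was blah blah blah. Blah Blah blah (Blah BLAH blah!)
```

The match starts at the first "(" as intended, but it ends at the final ")" in the text rather than the first one.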

What's the trick to getting the match to stop after the first ")", instead of going on?

Thanks!

Posted: Wed Jul 06, 2005 5:01 pm
by Chris Corbyn
It's the evil greedy dot star :P

We all pull hair out over this one at some point lol.

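It's greedy, i.e. the dot star grabs as much as it possibly can and only gives characters back when the rest of the pattern can't match, so it ends up stopping at the last ")" instead of the first. Stop it being greedy by combining the star * with a "?".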

Code: Select all

/\(.*?\)/
That same principle works with any of the quantifiers, i.e. "+", "*" and "{n,m}" ;)
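A rough sketch of the lazy version in action, again assuming Python (not confirmed by the thread). The leading `\s*` and the `count=1` argument are additions for this example only, so that the stray space is removed and only the first parenthetical is cut; the short "aaaa" checks just illustrate that "+" and "{n,m}" take the same "?" suffix.

```python
import re

# Sample blurb from the question (placeholder text).
text = ("Firstname Lastname (born Month, day, Year) was blah blah blah. "
        "Blah Blah blah (Blah BLAH blah!) blah blah, etc. etc.")

# Lazy: .*? takes as little as possible, so it stops at the FIRST ")".
first_paren = re.search(r"\(.*?\)", text).group(0)
print(first_paren)  # (born Month, day, Year)

# Remove only the first parenthetical, plus any whitespace just before it.
cleaned = re.sub(r"\s*\(.*?\)", "", text, count=1)
print(cleaned)

# The same "?" suffix makes the other quantifiers lazy as well.
lazy_plus = re.search(r"a+?", "aaaa").group(0)       # "a"
lazy_range = re.search(r"a{2,4}?", "aaaa").group(0)  # "aa"
```

The second parenthetical, (Blah BLAH blah!), is left alone because of `count=1`; drop that argument to strip every parenthetical instead.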

Posted: Wed Jul 06, 2005 5:02 pm
by Burrito
try using {1} after your pattern you should.

Posted: Wed Jul 06, 2005 11:17 pm
by is_blank
Cool. I think I get it. Thanks!
(Er, Cool it is? Get it, I do? :D)

Posted: Thu Jul 07, 2005 4:26 am
by Chris Corbyn
Hmmm how do I explain greediness simply? To be honest, it confuses everyone ;)

OK let's personify the regex pattern a bit and off we go....

Running against the string Hello World!.

Scenario 1 - /(.*)\s\w+!/
Hello, I'm a regex pattern and I like to eat anything you tell me. Let me introduce my fellow (meta)characters.

.* Meet dotstar: He can eat anything he likes any number of times
\s Meet whitespace: She just eats the things that don't use ink when printing
\w Meet wordchar: He can eat any letter, number or underscore.

OK I'm a hungry regex pattern so I'm gonna start eating this string now.

Off you go dotstar...
dotstar: Oh, a letter H I can eat that, and "e" I'll have that too, and "l", and "l" and "o" and this space - well I can eat ANYTHING ANY number of times so I'll have that too, and this lovely "W" and the "o" and the "r", then this "l" and the "d" and the exclamation mark I can eat that too. Doh! I've eaten it all :( I'm still hungry too, I could have just eaten and eaten and eaten til there was nothing left.

Your turn whitespace...

whitespace: :cry: Hmmm there's nothing for me to eat, that greedy piggy dotstar went and ate everything again!

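The same goes for wordchar too. (The match isn't lost, though: dotstar then has to hand characters back one at a time, starting from the end, until whitespace and wordchar can eat. That's why a greedy match ends as late as possible.)

Scenario 2 - /(.*?)\s\w+!/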
Off you go dotstar...

dotstar: Hey! Someone stuck this question mark on me, huh... that's not fair, I have to let someone else eat today. Well I'll see what I'm allowed anyway.
Yummy, a "H", oh hold on a minute, who's next in the queue? Oh that's OK, it's whitespace and she can't eat this anyway so I'll have it. Look, I can eat this "e" too cos whitespace can only eat things that don't use ink when printing. I'll have the "l", and the next "l", and the "o". Oh, what's this? A space.... ermm, whitespace, you hungry?

...
whitespace: Yes! Oh and look, there's a space for me to eat! Yummy. What's next? :cry: Just a "W", I can't eat that, it gives me a really bad stomach ache. wordchar, do you want it?
...
wordchar: Yippee.... It's my go at last! OK I'll eat this "W" and the "o" and the "r" and the "l" and the "d" but I can't eat this exclamation mark... I'm not allowed so I'm told.

The "!" in the pattern matches the "!" in the string and it's done.

Basically... stick a question mark on a quantifier and you tell that bit of the pattern to eat as little as it can: before taking each character, it checks whether the next part of the pattern could match there instead, and only keeps eating if it can't.
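The two scenarios can be tried out with a small sketch, assuming Python (the thread doesn't name a language). On Hello World! alone, both patterns end up capturing the same thing, because the greedy dotstar backtracks until the rest of the pattern fits. The doubled string below is a made-up variation that makes the difference visible.

```python
import re

greedy = re.compile(r"(.*)\s\w+!")
lazy = re.compile(r"(.*?)\s\w+!")

# With the original string there is only one space to stop at,
# so both versions capture "Hello".
single = "Hello World!"
greedy_single = greedy.search(single).group(1)
lazy_single = lazy.search(single).group(1)

# Hypothetical longer string: now there is a choice of where to stop.
double = "Hello World! Hello World!"
greedy_double = greedy.search(double).group(1)  # backs up only to the last space
lazy_double = lazy.search(double).group(1)      # stops at the first space

print(repr(greedy_double))  # 'Hello World! Hello'
print(repr(lazy_double))    # 'Hello'
```

So greedy versus lazy only changes the result when there is more than one place the rest of the pattern could match.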

Posted: Fri Jul 08, 2005 6:54 pm
by Burrito
d11 was a children's book writer in his past life...

Posted: Fri Jul 08, 2005 7:04 pm
by patrikG
and is feeding random quotes into a database to get it to write children books now.