Little question
I was reading this tutorial, and at some point I came to this regex:
--
"/A[A-Z]*?B/". In English, this means "match an A, followed by only as many capital letters as are needed to find a B."
--
I don't really understand the part that I bolded. Why does it do this? Is it because the '*' and the '?' come right after each other?
- feyd
- Neighborhood Spidermoddy
- Posts: 31559
- Joined: Mon Mar 29, 2004 3:24 pm
- Location: Bothell, Washington, USA
I'll do one better and explain the whole thing.
for pattern:
/A[A-Z]*?B/
/ has been selected as the pattern delimiter; this mark starts the pattern, which begins with the following character. The delimiter can be any character that is not alphanumeric, a backslash, or whitespace. /, @, and # are often used, mostly because they rarely appear as characters to match in patterns.
A simply matches a capital A.
[ is a metacharacter that starts a character class. The characters listed after it may appear in any order, and the class as a whole matches a single character unless otherwise modified.
A-Z matches any capital letter.
] ends the character class.
* is a match modifier. This metacharacter matches the previous object (character, character class, grouping) zero or more times, as many times as possible unless it is followed by a ?.
? is a match modifier. When not after a * or + modifier, it works against the previous object (character, character class, grouping) to find zero or one instance. When after a * or +, it tells that metacharacter to match the shortest possible set that satisfies the pattern. (Behaviour is reversed if the ungreedy pattern modifier is in effect.)
B matches a capital B.
/ is now the end of the pattern. Any characters after it are modifiers that apply to the entire pattern.
Putting it all in plain English: find a capital A followed by as few other capital letters as are needed to reach the closest capital B, anywhere in the string.
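To make the effect of the ? after the * concrete, here is a minimal sketch in Python, whose re module treats these quantifiers the same way as the pattern discussed above. The sample string is hypothetical (it is not from the tutorial), and Python takes the pattern without the / delimiters.

```python
import re

# Hypothetical sample text with two capital B's that could end the match.
sample = "xxAXYZBQQBzz"

# *? takes as few capital letters as possible before a B.
lazy = re.search(r"A[A-Z]*?B", sample)

# Plain * takes as many capital letters as possible before a B.
greedy = re.search(r"A[A-Z]*B", sample)

print(lazy.group(0))    # AXYZB
print(greedy.group(0))  # AXYZBQQB
```

Both patterns start at the same A; the only difference is which B ends the match.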
- Ambush Commander
- DevNet Master
- Posts: 3698
- Joined: Mon Oct 25, 2004 9:29 pm
- Location: New Jersey, US
By default, the regexp parser is "greedy", that is, it will try to take in as many tokens as it can before it has to stop. For instance:
/[A-Z]+B/
when used on
8fsajASDFASBIASDFKBKSB234e
will match
ASDFASBIASDFKBKSB
which is the last possible B it can get, not
ASDFASB
which is the first possible B it can get. When it's ungreedy, the regexp will return the latter.

Thanks guy, I think I understand it now.
feyd wrote: ? a match modifier. When not after a * or + modifier, it will work against the previous object (character, character class, grouping) to find zero or one instance. When after a * or + it will tell the metacharacter to match the shortest possible set that satisfies the pattern. (Behaviour is reversed if the ungreedy pattern modifier is in effect.)
You couldn't have explained it better, thanks man.
Just a little new question: how do you 'turn on' the ungreedy pattern modifier?
- Ambush Commander
- DevNet Master
- Posts: 3698
- Joined: Mon Oct 25, 2004 9:29 pm
- Location: New Jersey, US
It's a pattern modifier: http://us2.php.net/manual/en/reference. ... ifiers.php
/This is the pattern/U
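As a rough check of the greedy versus ungreedy results quoted earlier in the thread, the sketch below uses Python as a stand-in. This is an assumption of convenience: Python's re module has no ungreedy flag (its re.U means Unicode matching), so the ungreedy case is written with an explicit +? quantifier, which is the behaviour the U modifier gives a plain + in PCRE.

```python
import re

# Sample string taken from the earlier post in this thread.
text = "8fsajASDFASBIASDFKBKSB234e"

# Greedy (the default): + grabs as many capitals as it can, then backs off to a B.
greedy = re.search(r"[A-Z]+B", text).group(0)

# Lazy: +? grows one capital at a time and stops at the first B it reaches.
lazy = re.search(r"[A-Z]+?B", text).group(0)

print(greedy)  # ASDFASBIASDFKBKSB
print(lazy)    # ASDFASB
```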