combining two regexes
Posted: Tue Sep 18, 2007 2:59 pm
by John Cartwright
What I'm trying to do here is capture all the text within the brackets, as well as the text before the first bracket. An example would be
Code: Select all
preg_match('#^([a-zA-Z0-9_]+)\[#', $name, $outter);
Code: Select all
preg_match_all('#\[([a-zA-Z0-9_]+)\]#', $name, $inner);
But how on earth can I grab the initial text while using preg_match_all()? I'm simply stumped on this one

Posted: Tue Sep 18, 2007 3:35 pm
by mrkite
Code: Select all
$code="pre[one][two]";
preg_match('{^(\w*)\[(\w*)\]\[(\w*)\]}',$code,$matches);
//$matches[1] = pre
//$matches[2] = one
//$matches[3] = two
Posted: Tue Sep 18, 2007 3:39 pm
by feyd
Code: Select all
feyd:~ feyd$ cat foo2.php
<?php
preg_match_all('#[a-zA-Z0-9_]+(?:\[[a-zA-Z0-9_]+\])*#','foo1[bar1][bar2] foo2 foo3[bar3]', $match);
print_r($match);
feyd:~ feyd$ php -f foo2.php
Array
(
[0] => Array
(
[0] => foo1[bar1][bar2]
[1] => foo2
[2] => foo3[bar3]
)
)
Posted: Tue Sep 18, 2007 5:21 pm
by John Cartwright
Thanks for the replies
Maybe I'm misunderstanding or I didn't quite explain correctly, apologies.
feyd wrote:Code: Select all
#[a-zA-Z0-9_]+(?:\[[a-zA-Z0-9_]+\])*#
Looking at this, I can see that the (?: ) will group the bracket segment of the subject together (I had to read d11's regex crash course), however I'm trying to capture the value of all the text inside each bracket as well.
I took a shot at adapting feyd's regex with only limited success.
Code: Select all
#([a-zA-Z0-9_]+)(?:\[([a-zA-Z0-9_]+)\])*#
Using the subject "foobar2[foo][bar]" my results have been:
Code: Select all
Array
(
[0] => foobar2[foo][bar]
[1] => foobar2
[2] => bar
)
Posted: Tue Sep 18, 2007 5:29 pm
by feyd
It can only remember the last bracketed reference unless you add more of the subpattern. To accurately capture all of them without knowing how many there are requires two patterns. One to capture the entire variable reference, the second to capture the contents.
If it's only working with the fully captured variable, using preg_split() could work better.
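The two-pattern approach described above could be sketched roughly as follows. This is only an illustration, not code from the thread: the helper name parseReference() and the shape of the returned array are assumptions. The first pattern is feyd's and finds each whole reference; the second pulls the word-like pieces out of one reference, so the first piece is the name and the rest are the bracketed keys.

```php
<?php
// Hypothetical helper (not from the thread): returns one entry per reference.
function parseReference(string $subject): array
{
    $result = [];
    // Pattern 1 (feyd's): match each whole reference, e.g. foo1[bar1][bar2].
    preg_match_all('#[a-zA-Z0-9_]+(?:\[[a-zA-Z0-9_]+\])*#', $subject, $refs);
    foreach ($refs[0] as $ref) {
        // Pattern 2: collect every run of word characters inside that reference.
        preg_match_all('#[a-zA-Z0-9_]+#', $ref, $parts);
        // The first run is the text before the brackets; the rest are the keys.
        $name = array_shift($parts[0]);
        $result[] = ['name' => $name, 'keys' => $parts[0]];
    }
    return $result;
}

print_r(parseReference('foo1[bar1][bar2] foo2 foo3[bar3]'));
```

Because the first pattern requires at least one character inside each pair of brackets, this sketch does not recognise empty brackets such as foo[].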
Posted: Tue Sep 18, 2007 5:31 pm
by John Cartwright
What I needed to know. Thanks feyd.

Posted: Wed Sep 19, 2007 4:00 am
by stereofrog
Code: Select all
$re = '~\w+(?=\[)|(?<=\[)\w+(?=\])~';
$subj = "foo1[bar1][bar2] foo2 foo3[bar3]";
preg_match_all($re, $subj, $m);
print_r($m[0]);
outputs
Code: Select all
Array
(
[0] => foo1
[1] => bar1
[2] => bar2
[3] => foo3
[4] => bar3
)
Is this what you're looking for?
Posted: Wed Sep 19, 2007 4:52 am
by GeertDD
stereofrog wrote:Code: Select all
$re = '~\w+(?=\[)|(?<=\[)\w+(?=\])~';
Is this what you're looking for?
If it is, I think you're needlessly complicating that regex.
Try this pattern:
And if you don't want the spaces:
Posted: Wed Sep 19, 2007 10:52 am
by John Cartwright
Thanks for the follow ups stereofrog and GeertDD.
GeertDD wrote:stereofrog wrote:Code: Select all
$re = '~\w+(?=\[)|(?<=\[)[\w]+(?=\])~';
Is this what you're looking for?
If it is, I think you're needlessly complicating that regex.
Try this pattern:
And if you don't want the spaces:
I've tried the patterns above, however I am not quite getting my desired results. If I have an element with no name, such as foo[bar][], the empty brackets are ignored by the regex and it only returns foo and bar. Any idea how to modify '/[^\[\]]+/' for blank values?
Again thanks, my regex skills are merely mediocre
Posted: Wed Sep 19, 2007 10:55 am
by feyd
The patterns provided by ~GeertDD are intended for preg_split().
Posted: Wed Sep 19, 2007 12:22 pm
by John Cartwright
I ended up slightly modifying stereofrog's regex,
to allow for empty keys. However, I am still interested in pursuing the preg_split() option. So far I haven't been successful with it, since passing a string such as
foobar[f1][f2][] would be rendered to the following by preg_split()
Code: Select all
Array
(
[0] =>
[1] => [
[2] => ][
[3] => ][]
)
I'm feeling a little bit helpless here, unfortunately
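The modified pattern itself is not shown in the thread. One possible way to let stereofrog's regex accept empty keys, offered only as a guess at the kind of change meant, is to relax the \w+ inside the brackets to \w* so that [] produces an empty string. The function name extractParts() is hypothetical.

```php
<?php
// Hypothetical helper (not from the thread).
function extractParts(string $subject): array
{
    // Assumption: the only change from stereofrog's pattern is \w+ -> \w*
    // in the second alternative, so an empty [] still produces a match.
    $re = '~\w+(?=\[)|(?<=\[)\w*(?=\])~';
    preg_match_all($re, $subject, $m);
    return $m[0];
}

print_r(extractParts('foobar[f1][f2][]'));
```

As with the original pattern, a bare name that is not followed by a bracket is not matched at all.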

Posted: Wed Sep 19, 2007 12:39 pm
by feyd
would be more for preg_split.
Posted: Wed Sep 19, 2007 12:53 pm
by John Cartwright
feyd wrote:would be more for preg_split.
Okay, so this regex will split the input on any of ][, [ or ]. Nice.
Still one tiny issue though,
Code: Select all
$this->_inputIndices = preg_split('#\]\[|\[|\]#', 'foobar2f[f1][f][]');
echo '<pre>';
print_r($this->_inputIndices);
Returns:
Code: Select all
Array
(
[0] => foobar2f
[1] => f1
[2] => f
[3] =>
[4] =>
)
when there should only be 4 array elements.
I guess this is because it is splitting on the last bracket at the end of the string. Any ideas?
I'm thrilled that I (you guys) have gotten this process down to a single line of code though.
Thanks again.
Posted: Wed Sep 19, 2007 1:36 pm
by GeertDD
feyd wrote:The patterns provided by ~GeertDD are intended for preg_split().
Nope, they aren't. Have a closer look.
Selects every substring that is separated by square brackets or whitespace. It works fine except for when empty square brackets pop up. In that case I agree preg_split() will be a better solution.
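The behaviour described here can be illustrated with a negated character class passed to preg_match_all(). The exact pattern below is an assumption, built from the '/[^\[\]]+/' form quoted earlier in the thread with \s added to exclude whitespace; the function name extractTokens() is hypothetical.

```php
<?php
// Hypothetical helper (not from the thread).
function extractTokens(string $subject): array
{
    // Match every run of characters that are not '[', ']' or whitespace.
    preg_match_all('/[^\[\]\s]+/', $subject, $m);
    return $m[0];
}

print_r(extractTokens('foo1[bar1][bar2] foo2 foo3[bar3]'));
// Empty brackets contain no characters to match, so they are skipped:
print_r(extractTokens('foo[bar][]'));
```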
Jcart wrote:there should only be 4 array elements

I guess this is because it is splitting the last bracket at the end of the string.. Any ideas?
You'll always end up with one final empty element because the last char of your string is ']'. I tried to cook something up with a lookahead construction to prevent this, without success. I suggest just using a function like array_pop() to always chop the last element off the array.
Also feyd's split pattern can be optimized a bit.
Before:
After:
Posted: Wed Sep 19, 2007 1:57 pm
by John Cartwright
GeertDD wrote:No success however. I suggest to just use a function like array_pop() to always chop the last element off the array.
Yeah, I figured so, and that was exactly what I wanted to avoid, since not all strings supplied will have brackets at all and therefore will not have the empty element.
I think I'll stick to the preg_match_all then...
Thanks for all the input guys.
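For the concern about strings that have no brackets at all, one possible workaround (not proposed in the thread, so treat it as a sketch) is to strip a single trailing ']' before splitting, rather than popping the last element afterwards. The function name splitIndices() is hypothetical; the split pattern is the one from the preg_split() call above with the lone ']' alternative dropped.

```php
<?php
// Hypothetical helper (not from the thread).
function splitIndices(string $name): array
{
    // Remove one closing bracket at the very end, if there is one, so the
    // split does not leave an extra empty element after it.
    $trimmed = preg_replace('#\]\z#', '', $name);
    // Split on "][" between keys, or on the first "[" after the name.
    return preg_split('#\]\[|\[#', $trimmed);
}

print_r(splitIndices('foobar2f[f1][f][]'));
print_r(splitIndices('foobar'));
```

A string without brackets comes back as a single-element array, and an empty [] still yields one empty string, so no conditional array_pop() is needed.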