RegEx: Stopping parsing of embedded tags

Posted: Wed May 12, 2010 7:54 am
by lilalfyalien
Hi,

I need some help with the following regex in PHP 5:

[text]
<([\w]+)\W*([\w]+)\s*([\[\w\]]+).*>
(.*?)
<\/\1>
[/text]

I am trying to parse the following structure:

[text]
<operator modifier [value]>

<doSomething><operator modifier [value]>****...etc</operator></doSomething>

</operator>

<operator modifier [value]>

<doSomething></doSomething>

</operator>
[/text]
The problem is that the regex seems to be too greedy: it goes right into the <doSomething> tags, when I only want the outermost matches.

E.g. I want this:

[text]
<if set [monkey]>
<tree><leaf>
<if set [baby]>
<harness></harness>
</if>
</leaf></tree>
</if>

<if unset [monkey]>
<banana></banana>
</if>
[/text]

to return:

[text]
[0][0] => <if set [monkey]>
<tree><leaf>
<if set [baby]>
<harness></harness>
</if>
</leaf></tree>
</if>
[0][1] => <if unset [monkey]>
<banana></banana>
</if>
[1][0] => if
[1][1] => if
[2][0] => set
[2][1] => unset
[3][0] => [monkey]
[3][1] => [monkey]
[4][0] => <if set [baby]>
<harness></harness>
</if>
[4][1] => <banana></banana>
[/text]

but it returns:

[text]
[0][0] => <if set [baby]>
<harness></harness>
</if>
[0][1] => <if unset [monkey]>
<banana></banana>
</if>
[1][0] => if
[1][1] => if
[2][0] => set
[2][1] => unset
[3][0] => [baby]
[3][1] => [monkey]
[4][0] => <harness></harness>
[4][1] => <banana></banana>
[/text]

Can someone tell me how I can modify my regex so that it returns only the top-level nodes, so to speak?

Thanks in advance!

Lilalfyalien

Re: RegEx: Stopping parsing of embedded tags

Posted: Wed May 12, 2010 10:31 am
by phu
In an eerily similar situation, I believe I used the PCRE_UNGREEDY option (the U pattern modifier), which is documented here:
http://www.php.net/manual/en/reference. ... ifiers.php
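
For reference, here is a minimal sketch of what the U (PCRE_UNGREEDY) modifier does. The sample string and variable names are made up for illustration and are not the markup from the question. The modifier inverts the default greediness of quantifiers, so a plain .* becomes lazy and a .*? becomes greedy:

```php
<?php
// Made-up sample input, only to show the effect of the U modifier.
$html = '<b>one</b> and <b>two</b>';

// Default behaviour: .* is greedy and runs to the LAST </b>.
preg_match('/<b>.*<\/b>/', $html, $greedy);

// With U, the same .* is lazy and stops at the FIRST </b>.
preg_match('/<b>.*<\/b>/U', $html, $lazy);

// With U, an explicit .*? is flipped back to greedy.
preg_match('/<b>.*?<\/b>/U', $html, $flipped);

echo $greedy[0], "\n";
echo $lazy[0], "\n";
echo $flipped[0], "\n";
```

Note that the pattern in the question already uses (.*?), so adding U would make that group greedy unless the ? is dropped. Also, a lazy match stops at the first closing tag it meets, so on its own it does not pair an outer tag with its own closing tag when tags of the same name are nested.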

Re: RegEx: Stopping parsing of embedded tags

Posted: Thu May 13, 2010 5:50 am
by lilalfyalien
Thanks, I'll give that a go!
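
One possible way to get only the outermost blocks is a recursive PCRE pattern, sketched below under some assumptions. It assumes every opening tag of interest has exactly the shape <name modifier [value]>, which is stricter than the pattern in the question. The function name match_top_level_blocks is hypothetical. The s modifier is there because, without it, . does not match newlines, and (?R) lets a nested block be consumed whole so that the outer closing tag is the one that ends the match:

```php
<?php
// Hypothetical helper (not from the thread): returns only the outermost
// <name modifier [value]> ... </name> blocks. Nested blocks are consumed
// as a unit by the recursive call (?R).
function match_top_level_blocks($text)
{
    $pattern = '~
        <(\w+)\s+(\w+)\s+(\[\w+\])\s*>   # 1: tag name, 2: modifier, 3: [value]
        (                                # 4: body of the block
            (?:
                (?R)                     # a complete nested block, matched recursively
              |
                (?!</?\1\b) .            # or one character that does not begin <name or </name
            )*
        )
        </\1>                            # closing tag with the same name
    ~xs';

    preg_match_all($pattern, $text, $matches, PREG_PATTERN_ORDER);
    return $matches;
}

// The example markup from the first post.
$sample = implode("\n", array(
    '<if set [monkey]>',
    '<tree><leaf>',
    '<if set [baby]>',
    '<harness></harness>',
    '</if>',
    '</leaf></tree>',
    '</if>',
    '',
    '<if unset [monkey]>',
    '<banana></banana>',
    '</if>',
));

$m = match_top_level_blocks($sample);
print_r($m[0]); // full text of each top-level block
print_r($m[2]); // modifiers
print_r($m[3]); // bracketed values
```

Group 4 here holds the whole body of each top-level block, including the surrounding newlines, so it may need trim() before use. For large or deeply nested input this character-by-character approach could run into PCRE's backtracking or recursion limits, in which case a small hand-written parser may be the safer choice.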