RegEx: Stopping parsing of embedded tags
Posted: Wed May 12, 2010 7:54 am
Hi,
I need some help with the following regex in PHP 5:
[text]
<([\w]+)\W*([\w]+)\s*([\[\w\]]+).*>
(.*?)
<\/\1>
[/text]
I am trying to parse the following structure:
[text]
<operator modifier [value]>
<doSomething><operator modifier [value]>****...etc</operator></doSomething>
</operator>
<operator modifier [value]>
<doSomething></doSomething>
</operator>
[/text]
The problem is that the regex is too greedy: it goes right into the <doSomething> tags, when I only want the outermost matches.
For example, I want this input:
[text]
<if set [monkey]>
<tree><leaf>
<if set [baby]>
<harness></harness>
</if>
</leaf></tree>
</if>
<if unset [monkey]>
<banana></banana>
</if>
[/text]
to return:
[text]
[0][0] => <if set [monkey]>
<tree><leaf>
<if set [baby]>
<harness></harness>
</if>
</leaf></tree>
</if>
[0][1] => <if unset [monkey]>
<banana></banana>
</if>
[1][0] => if
[1][1] => if
[2][0] => set
[2][1] => unset
[3][0] => [monkey]
[3][1] => [monkey]
[4][0] => <if set [baby]>
<harness></harness>
</if>
[4][1] => <banana></banana>
[/text]
but it returns:
[text]
[0][0] => <if set [baby]>
<harness></harness>
</if>
[0][1] => <if unset [monkey]>
<banana></banana>
</if>
[1][0] => if
[1][1] => if
[2][0] => set
[2][1] => unset
[3][0] => [baby]
[3][1] => [monkey]
[4][0] => <harness></harness>
[4][1] => <banana></banana>
[/text]
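For anyone who wants to reproduce this, here is a minimal sketch. It rests on several assumptions that are not stated above: that the pattern is run through preg_match_all with its default PREG_PATTERN_ORDER ordering (which would give the [group][match] indexing shown), that the three pattern lines are joined by literal newlines, and that no modifiers such as s or x are set. The ~ delimiter and the variable names are illustrative only.

```php
<?php
// Assumption: the three lines of the pattern are joined with literal "\n"
// and no modifiers are applied. Single-quoted strings keep the backslashes as-is.
$pattern = '~<([\w]+)\W*([\w]+)\s*([\[\w\]]+).*>' . "\n"
         . '(.*?)' . "\n"
         . '<\/\1>~';

// The sample input from the post, one array entry per line.
$input = implode("\n", array(
    '<if set [monkey]>',
    '<tree><leaf>',
    '<if set [baby]>',
    '<harness></harness>',
    '</if>',
    '</leaf></tree>',
    '</if>',
    '<if unset [monkey]>',
    '<banana></banana>',
    '</if>',
));

// Assumption: default flags (PREG_PATTERN_ORDER), so $matches[group][match].
$count = preg_match_all($pattern, $input, $matches);

print_r($matches);
```

Under these assumptions the dot does not match a newline, so (.*?) can only cover a single line between the opening and closing tag. That may be part of why only the elements with a one-line body show up in the result.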
Can someone tell me how to modify my regex so that it returns only the top-level nodes?
Thanks in advance!
Lilalfyalien