Recursive/nesting regex for tags/templating system

Posted: Thu May 07, 2009 9:08 am
by Rasputin
I'm trying to create a simple templating system (in PHP) with some regex involved and I've come up against a bit of a sticking point.

In my template files I have custom tags in the format:

Code:

{if:$var}Var is true!{/if:$var}
I pull a few things from this sort of markup:
  • The entire statement itself (to replace later)
  • The name of the $var, in this case 'var' (to check for true/false)
  • The content between the two tags, in this case 'Var is true!'
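For a single, non-nested tag, a minimal sketch of that extraction could look like the following. The pattern and the variable names ($statement, $name, $content) are illustrative assumptions, not the code actually used in the template system.

```php
<?php
$template = '{if:$var}Var is true!{/if:$var}';

// \1 requires the closing tag to repeat the variable name captured in group 1.
$pattern = '#\{if:\$(\w+)\}(.*?)\{/if:\$\1\}#s';

if (preg_match($pattern, $template, $m)) {
    $statement = $m[0]; // the entire statement (to replace later)
    $name      = $m[1]; // the variable name without the "$"
    $content   = $m[2]; // the text between the two tags
}
```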
This is of course quite simple, but the problem lies where I have nested tags like so:

Code:

{if:$var_1}var 1 is true! {if:$var_2}var 2 is true!{/if:$var_2} {/if:$var_1}

Currently I'm using a convoluted PHP loop with multiple calls to preg_match to achieve this. What I'd like is a more efficient single regex statement that can pull everything I need out. With preg_match_all, the desired result would look something like:

Code:

Array
(
    [0] => Array
        (
            [0] => {if:$var_1}var 1 is true! {if:$var_2}var 2 is true!{/if:$var_2} {/if:$var_1}
            [1] => {if:$var_2}var 2 is true!{/if:$var_2}
        )
 
    [1] => Array
        (
            [0] => var_1
            [1] => var_2
        )
 
    [2] => Array
        (
            [0] => var 1 is true! {if:$var_2}var 2 is true!{/if:$var_2} 
            [1] => var 2 is true!
        )
)
 
My main question is whether this is even possible with regex. I understand there's a recursive operator, (?R), which I've been trying to get my head around, but most of the examples use it to match simple nested sets of brackets, '(1(2(3)4)5)' etc., so I'm not sure it can actually be applied to this more complex task. I've certainly been trying to plug it in, to no avail, but would it ever work if I got it right, or is this task just beyond regex?

I understand this might take a bit of effort to comprehend, so thanks in advance to anyone who takes the time to read it.

Re: Recursive/nesting regex for tags/templating system

Posted: Thu May 07, 2009 9:53 am
by pickle
What you want is recursion.

Once you've found a match with preg_match(), call your matching function again on the content:

Code:

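function findTags($content)
{
  // "/funkypattern/" is a placeholder for the real tag pattern
  if(preg_match("/funkypattern/",$content,$matches))
  {
    // Assuming $matches[1] would contain the content between the tags
    if(isset($matches[1]) && $matches[1] !== '')
    {
      // Content found: search it for further (nested) tags
      findTags($matches[1]);
    }
    else
    {
      // process the tag found in $matches[0]
    }
  }
}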
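
To make the idea concrete, here is a rough sketch of the same recursion with an actual pattern filled in. The function name findIfTags, the pattern, and the array keys are assumptions for illustration (PHP 7+ syntax). It pairs tags by variable name, so it would mis-pair a block nested inside another block that uses the same variable.

```php
<?php
// Hypothetical helper: collects every {if:$name}...{/if:$name} block, outer blocks first.
function findIfTags(string $content, array &$found = []): array
{
    // \1 forces the closing tag to carry the same variable name as the opening tag.
    $pattern = '#\{if:\$(\w+)\}(.*?)\{/if:\$\1\}#s';

    if (preg_match_all($pattern, $content, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $found[] = ['statement' => $m[0], 'var' => $m[1], 'content' => $m[2]];
            // Recurse into the content to pick up nested tags.
            findIfTags($m[2], $found);
        }
    }
    return $found;
}

$template = '{if:$var_1}var 1 is true! {if:$var_2}var 2 is true!{/if:$var_2} {/if:$var_1}';
$tags = findIfTags($template);
```

$tags then holds one entry per block, each with the entire statement, the variable name, and the content, which are the three things the original post asks for.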

Re: Recursive/nesting regex for tags/templating system

Posted: Thu May 07, 2009 10:46 am
by Rasputin
Ah, very good, thanks =)

Re: Recursive/nesting regex for tags/templating system

Posted: Thu May 07, 2009 10:56 am
by s.dot
I've actually been thinking about this a lot, and I think tokenizing the string would work well. You start at the { and read the string until the } (counting and matching all of the { and } in between, of course) and save the end result. I think this would be much more efficient than recursion and/or regex.

I'm eager to try it out on my own template system when I get the time!
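
A single-pass scanner along those lines might look roughly like this. It is only a sketch: the function name scanIfTags and the array keys are made up for illustration, it assumes well-formed templates, and it does not check that a closing tag names the same variable as the tag it closes.

```php
<?php
// Hypothetical scanner: walks the template once, using a stack of open tags.
function scanIfTags(string $template): array
{
    $pairs  = []; // completed blocks, innermost first
    $open   = []; // stack of currently open {if} tags
    $offset = 0;

    while (($start = strpos($template, '{', $offset)) !== false) {
        $end = strpos($template, '}', $start);
        if ($end === false) {
            break; // unterminated tag: stop scanning
        }
        $tag = substr($template, $start + 1, $end - $start - 1);

        if (strncmp($tag, 'if:$', 4) === 0) {
            // Opening tag: remember where its content begins and how deep it is.
            $open[] = ['var' => substr($tag, 4), 'contentStart' => $end + 1, 'depth' => count($open)];
        } elseif (strncmp($tag, '/if:$', 5) === 0 && $open) {
            // Closing tag: it closes the most recently opened block.
            $top = array_pop($open);
            $pairs[] = [
                'var'     => $top['var'],
                'depth'   => $top['depth'],
                'content' => substr($template, $top['contentStart'], $start - $top['contentStart']),
            ];
        }
        $offset = $end + 1;
    }
    return $pairs;
}

$template = '{if:$var_1}var 1 is true! {if:$var_2}var 2 is true!{/if:$var_2} {/if:$var_1}';
$pairs = scanIfTags($template);
```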

Re: Recursive/nesting regex for tags/templating system

Posted: Fri May 08, 2009 1:11 am
by Zyxist
Such recursion will be inefficient, especially with deeper nesting. Just look for the tags themselves, without their content:

Code:

1. {if:$var_1}
2. Some text...
3. {if:$var_2}
4. Some text...
5. {/if:$var_2}
6. Some text...
7. {/if:$var_1}
 
This still lets you process the template correctly (use a stack to remember the order in which the conditional variables were opened). Moreover, if you want even more power, you can build a tree from this data, as in XML. The advantages:

- Simple regular expressions.
- Can handle nesting of different tag types.
- No recursion!
- The ability to organize the data into a tree.
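
As a rough illustration of the stack and the tree, something like the following could work. The names buildTree and render, and the node layout (a 'var' plus a list of 'children'), are assumptions made up for this sketch, not part of any existing template engine.

```php
<?php
// Hypothetical helper: turns a template into a nested array, one node per {if} block.
function buildTree(string $template): array
{
    // Split on the tags only; PREG_SPLIT_DELIM_CAPTURE keeps the tags in the token list.
    $tokens = preg_split('#(\{/?if:\$\w+\})#', $template, -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

    $stack = [['var' => null, 'children' => []]]; // the root node sits at the bottom
    foreach ($tokens as $token) {
        if (preg_match('#^\{if:\$(\w+)\}$#', $token, $m)) {
            $stack[] = ['var' => $m[1], 'children' => []]; // open a new block
        } elseif (preg_match('#^\{/if:\$(\w+)\}$#', $token, $m)) {
            $node = array_pop($stack);
            if ($node['var'] !== $m[1]) {
                throw new RuntimeException("Unexpected closing tag for \${$m[1]}");
            }
            $stack[count($stack) - 1]['children'][] = $node; // attach to the parent
        } else {
            $stack[count($stack) - 1]['children'][] = $token; // plain text
        }
    }
    if (count($stack) !== 1) {
        throw new RuntimeException('Unclosed {if} tag');
    }
    return $stack[0];
}

// Hypothetical renderer: emits a block's children only when its variable is truthy.
function render(array $node, array $vars): string
{
    $out = '';
    foreach ($node['children'] as $child) {
        if (is_string($child)) {
            $out .= $child;
        } elseif (!empty($vars[$child['var']])) {
            $out .= render($child, $vars);
        }
    }
    return $out;
}

$template = '{if:$var_1}var 1 is true! {if:$var_2}var 2 is true!{/if:$var_2} {/if:$var_1}';
$tree = buildTree($template);
$html = render($tree, ['var_1' => true, 'var_2' => false]);
```

Recursion only shows up in the renderer, where it walks the finished tree; the parsing itself is a single loop over the tokens.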

Re: Recursive/nesting regex for tags/templating system

Posted: Fri May 08, 2009 6:10 am
by prometheuzz
NVM

Re: Recursive/nesting regex for tags/templating system

Posted: Fri May 08, 2009 7:31 am
by prometheuzz
Rasputin wrote:...

My main question would be if this is even possible with regex?
Most regex implementations will not let you define such a recursive pattern, but, as you mentioned, PHP does have such a feature. Whether to call it a feature is debatable: regular expressions are supposed to describe regular languages, and one could argue that once recursion is introduced they no longer do. Matching opening and closing tokens is more a job for a true recursive descent parser.
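
For reference, a recursive pattern for this case might look roughly like the sketch below. The pattern is an illustration under stated assumptions, not a tested production solution: it assumes the text between tags contains no other "{" characters, and it pairs tags purely by nesting, so it does not check that a closing tag names the same variable as its opening tag. The surrounding lookahead is there so that nested blocks are reported as well as the outer one.

```php
<?php
$text = '{if:$var_1}var 1 is true! {if:$var_2}var 2 is true!{/if:$var_2} {/if:$var_1}';

// Group 1: entire statement, group 2: variable name, group 3: content.
// (?1) recurses into group 1 wherever a nested {if} block may start.
$pattern = '#(?=(\{if:\$(\w+)\}((?:[^{]++|(?1))*)\{/if:\$\w+\}))#';

preg_match_all($pattern, $text, $matches);

// Each overall match is empty because everything sits inside the lookahead,
// so the useful data is in $matches[1], $matches[2] and $matches[3].
print_r(array_slice($matches, 1));
```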

But since your closing tokens are not all identical (each closing token shares its variable name with its opening token), you can use a bit of lookahead to get more or less the output you're looking for (not exactly the same!). Try this:

Code:

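$text = 'your text...';
// The opening tag is consumed; the content and the closing tag are only looked
// ahead at, so nested opening tags can still be matched afterwards.
// \1 ties each closing tag to the name captured from its opening tag.
if(preg_match_all('#{if:([^}]+)}(?=(.*?){/if:\1})#s', $text, $matches)) {
  print_r($matches);
}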
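
To show where the output differs from the requested one, here is a small sketch that runs the pattern on the nested example from the first post and glues the pieces back together. The $statements and $names variables are assumptions added for illustration.

```php
<?php
$text = '{if:$var_1}var 1 is true! {if:$var_2}var 2 is true!{/if:$var_2} {/if:$var_1}';
preg_match_all('#{if:([^}]+)}(?=(.*?){/if:\1})#s', $text, $matches);

// $matches[0] holds only the opening tags, $matches[1] the names including
// the "$", and $matches[2] the content between the tags.
$statements = [];
$names = [];
foreach ($matches[0] as $i => $openTag) {
    // Rebuild the entire statement from its three parts.
    $statements[] = $openTag . $matches[2][$i] . '{/if:' . $matches[1][$i] . '}';
    // Drop the leading "$" to get the bare variable name.
    $names[] = ltrim($matches[1][$i], '$');
}
```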

Re: Recursive/nesting regex for tags/templating system

Posted: Fri May 08, 2009 8:24 am
by Rasputin
Well, I managed to get all of the suggested solutions working in turn, but I ended up going for prometheuzz's because it fit best with what I was trying to do and it seemed simplest (to me, at least).

Thanks to everyone that contributed and helped out.

Re: Recursive/nesting regex for tags/templating system

Posted: Fri May 08, 2009 8:47 am
by prometheuzz
Rasputin wrote:Well, I managed to get all of the suggested solutions working in turn, but I ended up going for prometheuzz's because it fit best with what I was trying to do and it seemed simplest (to me, at least).

Thanks to everyone that contributed and helped out.
You're welcome.