Recursive/nesting regex for tags/templating system

Rasputin
Forum Newbie
Posts: 3
Joined: Thu May 07, 2009 8:40 am

Recursive/nesting regex for tags/templating system

Post by Rasputin »

I'm trying to create a simple templating system (in PHP) with some regex involved and I've come up against a bit of a sticking point.

In my template files I have custom tags in the format:

Code: Select all

{if:$var}Var is true!{/if:$var}
I pull a few things from this sort of markup:
  • The entire statement itself (to replace later)
  • The name of the $var, in this case 'var' (to check for true/false)
  • The content between the two tags, in this case 'Var is true!'
This is of course quite simple, but the problem lies where I have nested tags like so:

Code: Select all

 
{if:$var_1}var 1 is true! {if:$var_2}var 2 is true!{/if:$var_2} {/if:$var_1}
 

Currently I'm using a convoluted PHP loop with multiple calls to preg_match to achieve this. What I'd like is a more efficient single regex statement that can pull everything I need out. With preg_match_all, the desired result would look something like:

Code: Select all

 
Array
(
    [0] => Array
        (
            [0] => {if:$var_1}var 1 is true! {if:$var_2}var 2 is true!{/if:$var_2} {/if:$var_1}
            [1] => {if:$var_2}var 2 is true!{/if:$var_2}
        )
 
    [1] => Array
        (
            [0] => var_1
            [1] => var_2
        )
 
    [2] => Array
        (
            [0] => var 1 is true! {if:$var_2}var 2 is true!{/if:$var_2} 
            [1] => var 2 is true!
        )
)
 
My main question is whether this is even possible with regex. I understand there's a recursive operator, (?R), which I've been trying to get my head around, but most of the examples use it to match simple nested sets of brackets, '(1(2(3)4)5)' etc., so I'm not sure whether it can be applied to this more complex task. I've been trying to plug it in to no avail, but would it ever work if I got it right, or is this task just beyond regex?

I understand this might be a bit of an effort to comprehend, so thanks in advance to anyone who takes the time to read it.
pickle
Briney Mod
Posts: 6445
Joined: Mon Jan 19, 2004 6:11 pm
Location: 53.01N x 112.48W

Re: Recursive/nesting regex for tags/templating system

Post by pickle »

What you want is recursion.

Once you've found a match with preg_match(), call your matching function again on the content:

Code: Select all

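function findTags($content)
{
  // "/funkypattern/" is a placeholder for your own tag pattern; it is assumed
  // to capture the content between the tags in group 1.
  if (preg_match("/funkypattern/", $content, $matches))
  {
    if (isset($matches[1]) && $matches[1] !== '')
    {
      // There is content between the tags: recurse to find any nested tags.
      findTags($matches[1]);
    }
    else
    {
      // No inner content: process the tag found in $matches[0].
    }
  }
}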
Rasputin
Forum Newbie
Posts: 3
Joined: Thu May 07, 2009 8:40 am

Re: Recursive/nesting regex for tags/templating system

Post by Rasputin »

Ah, very good, thanks =)
s.dot
Tranquility In Moderation
Posts: 5001
Joined: Sun Feb 06, 2005 7:18 pm
Location: Indiana

Re: Recursive/nesting regex for tags/templating system

Post by s.dot »

I've actually been thinking about this a lot, and I think tokenizing the string would be great. You start at the { and read the string until the } (counting and matching all of the { and } in between, of course), saving the end result. I think this would be much more efficient than recursion and/or regex.

I'm eager to try it out on my own template system when I get the time!
Zyxist
Forum Contributor
Posts: 104
Joined: Sun Jan 14, 2007 10:44 am
Location: Cracow, Poland

Re: Recursive/nesting regex for tags/templating system

Post by Zyxist »

Such recursion will be inefficient, especially with deeper nesting. Just look for the tags, without their content:

Code: Select all

1. {if:$var_1}
2. Some text...
3. {if:$var_2}
4. Some text...
5. {/if:$var_2}
6. Some text...
7. {/if:$var_1}
 
This still lets you process the template correctly (use a stack to remember the order in which the conditional variables were opened). Moreover, if you want even more power, you can build a tree from this data, as in XML. The advantages:

- Simple regular expressions.
- Can handle nesting of different tag types.
- No recursion!
- The ability to organize the data into a tree.
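
A minimal sketch of the stack idea above, assuming tags always have the form {if:$name} ... {/if:$name} and that the template variables arrive as a plain array. The function name renderTemplate and the $vars parameter are made up for illustration; they are not from the thread. It splits the template into tag and text tokens with one simple regex, then walks the tokens once with a stack, without recursion:

```php
<?php
// Hypothetical helper: render a template containing nested {if:$name} blocks.
function renderTemplate(string $template, array $vars): string
{
    // Split into alternating text and tag tokens. The capture group keeps the
    // tags themselves in the result; empty pieces are dropped.
    $tokens = preg_split(
        '~(\{/?if:\$\w+\})~',
        $template,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );

    $open = [];   // stack of names of the currently open {if} tags
    $hidden = 0;  // how many of the open tags have a falsy variable
    $out = '';

    foreach ($tokens as $token) {
        if (preg_match('~^\{(/?)if:\$(\w+)\}$~', $token, $m)) {
            $name = $m[2];
            if ($m[1] === '') {
                // Opening tag: remember it, and note if it hides its content.
                $open[] = $name;
                if (empty($vars[$name])) {
                    $hidden++;
                }
            } else {
                // Closing tag: it must match the most recently opened tag.
                if (array_pop($open) !== $name) {
                    throw new RuntimeException('Unexpected closing tag: ' . $token);
                }
                if (empty($vars[$name])) {
                    $hidden--;
                }
            }
        } elseif ($hidden === 0) {
            // Plain text is emitted only when every enclosing condition is true.
            $out .= $token;
        }
    }

    if (count($open) > 0) {
        throw new RuntimeException('Unclosed {if} tag: ' . end($open));
    }
    return $out;
}
```

Called on the nested example from the first post with var_1 true and var_2 false, this keeps the outer text and drops the inner block. The same token list could instead be folded into a tree by pushing a node on each opening tag and attaching it to its parent on the matching closing tag.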
prometheuzz
Forum Regular
Posts: 779
Joined: Fri Apr 04, 2008 5:51 am

Re: Recursive/nesting regex for tags/templating system

Post by prometheuzz »

NVM
prometheuzz
Forum Regular
Posts: 779
Joined: Fri Apr 04, 2008 5:51 am

Re: Recursive/nesting regex for tags/templating system

Post by prometheuzz »

Rasputin wrote:...

My main question would be if this is even possible with regex?
Most regex implementations will not let you define such a recursive pattern. But, as you mentioned, PHP does have such a feature, if you can call it a feature: one could argue that by introducing such things into a regular language (regex is of course a regular language, or it "should" be), it ceases to be a regular language. Matching opening and closing tokens is more something a true recursive descent parser should do.

But since your closing tokens are not all identical (each closing token shares its variable name with the corresponding opening token), you can use a bit of lookahead to get more or less the output you're looking for (not exactly the same!). Try this:

Code: Select all

$text = 'your text...';
if(preg_match_all('#{if:([^}]+)}(?=(.*?){/if:\1})#s', $text, $matches)) {
  print_r($matches);
}
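
For comparison, here is a hedged sketch of the recursive route asked about in the original question. It assumes variable names consist of word characters only. The whole pattern is wrapped in a lookahead so that preg_match_all also reports the inner tags, which a plain recursive match would consume as part of the outer one. The helper name findIfBlocks is hypothetical. The groups are shifted by one compared to the desired output: group 1 holds the full statement, group 2 the variable name (without the $), and group 3 the content.

```php
<?php
// Hypothetical helper: find every {if:$name}...{/if:$name} block, nested or not.
function findIfBlocks(string $text): array
{
    // (?= ... )          lookahead, so matches may overlap (outer and inner blocks)
    // \{if:\$(\w+)\}     opening tag; group 2 captures the variable name
    // [^{]++             a run of text containing no opening brace
    // \{(?!/?if:)        a stray brace that does not start an if tag
    // (?1)               recursion into group 1, i.e. a complete nested block
    // \{/if:\$\2\}       closing tag carrying the same name as the opening tag
    $pattern = '~(?=(\{if:\$(\w+)\}((?:[^{]++|\{(?!/?if:)|(?1))*)\{/if:\$\2\}))~';

    preg_match_all($pattern, $text, $matches);
    return $matches;
}
```

Because everything sits inside the lookahead, $matches[0] contains only empty strings; the useful data is in groups 1 to 3. Blocks whose closing tag is missing or misnamed are simply not reported.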
Rasputin
Forum Newbie
Posts: 3
Joined: Thu May 07, 2009 8:40 am

Re: Recursive/nesting regex for tags/templating system

Post by Rasputin »

Well, I managed to get all of the suggested solutions working in turn, but I ended up going for prometheuzz's because it fit best with what I was trying to do and it seemed simplest (to me, at least).

Thanks to everyone that contributed and helped out.
prometheuzz
Forum Regular
Posts: 779
Joined: Fri Apr 04, 2008 5:51 am

Re: Recursive/nesting regex for tags/templating system

Post by prometheuzz »

Rasputin wrote:Well, I managed to get all of the suggested solutions working in turn, but I ended up going for prometheuzz's because it fit best with what I was trying to do and it seemed simplest (to me, at least).

Thanks to everyone that contributed and helped out.
You're welcome.