I'm trying to write a BBCode parser. I've had excellent success so far, except for one sticking point: Tags don't nest properly. Does anyone know why this is and how I can fix it?
I'm using preg_replace, with the modifiers "i" and "s".
Nested BBCode Tags
- prometheuzz
Re: Nested BBCode Tags
Regex is not well suited to building entire parsers.
Re: Nested BBCode Tags
There is an interesting PCRE feature that you don't often hear about: recursive patterns. To be honest, I have never really used it myself either. However, you can do some cool stuff with it if you can wrap your head around it.
See http://www.pcre.org/pcre.txt and scroll down to the "recursive patterns" heading.
You can have a look at it, but I guess that it won't build an entire parser for you. While regular expressions are an awesome help in creating a parser, you will need more than just that. I agree with prometheuzz.
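For anyone wondering what a recursive pattern looks like in practice, here is a minimal sketch using the classic balanced-parentheses case rather than BBCode. The sample text and variable names are made up for illustration; the only PCRE feature relied on is (?R), which re-enters the whole pattern at that point.

```php
<?php
// Sketch only: match balanced parenthesised groups with a recursive pattern.
// \(            an opening parenthesis
// (?: ... )*    any number of:
//   [^()]++       a run of non-parenthesis characters (possessive), or
//   (?R)          a nested balanced group (recursion into the whole pattern)
// \)            the matching closing parenthesis
$pattern = '/\((?:[^()]++|(?R))*\)/';

// Hypothetical sample input.
$text = 'f(a(b)c) and (d)';

preg_match_all($pattern, $text, $matches);

// $matches[0] holds the outermost balanced groups found in $text.
print_r($matches[0]);
```

The same idea carries over to tags: replace the parentheses with an opening and a closing tag and let (?R) handle whatever is nested in between.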
Re: Nested BBCode Tags
Thank you all for your help. However, I recently decided it would be easier to write a parser from scratch than figure out the complexities of PCRE. Prometheuzz made a good point: Regular expressions aren't suited for this sort of thing.
Re: Nested BBCode Tags
Well, regexes really won't build a full parser for you,
BUT for the specific issue of parsing nested BBCodes, you can use the template below.
To parse nested BBCodes, use it with preg_replace or preg_replace_callback.
You can also extract BBCode arguments with it.
Code:
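/**
 * Template for the recursive tag-matcher regex.
 * Generates the pattern for a given tag, opening bracket and closing bracket.
 * $O and $C must already be escaped for use in a regex, including any
 * # characters (# is the pattern delimiter).
 *
 * @param string $tag Tag to be parsed recursively
 * @param string $O   Opening bracket of the tag
 * @param string $C   Closing bracket of the tag
 * @return string The complete pattern, with delimiters and the "is" modifiers
 */
public function Recursive_RE_Generator($tag, $O, $C)
{
    // Group 1: the tag name plus any arguments.
    // Group 2: the inner content, which may contain nested occurrences
    // of the same tag (matched through the (?R) recursion).
    $re = "#{$O}({$tag}.*?){$C}((?>{$O}(?!/?{$tag}[^{$O}]*?{$C})|[^{$O}]|(?R))*){$O}/{$tag}{$C}#is";
    return $re;
}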
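Here is a rough sketch of how the generated pattern could be used with preg_replace_callback. It is not part of the original post. The names recursive_re and render_bold are hypothetical, the generator is copied into a standalone function so the snippet runs outside a class, and the brackets are passed in already escaped as \[ and \].

```php
<?php
// Standalone copy of the generator above (hypothetical function name).
function recursive_re(string $tag, string $O, string $C): string
{
    return "#{$O}({$tag}.*?){$C}((?>{$O}(?!/?{$tag}[^{$O}]*?{$C})|[^{$O}]|(?R))*){$O}/{$tag}{$C}#is";
}

// Hypothetical helper: turn [b]...[/b] into <b>...</b>, including nested pairs.
function render_bold(string $text): string
{
    // Single-quoted '\\[' is the two-character string \[ (an escaped bracket).
    $re = recursive_re('b', '\\[', '\\]');

    $result = preg_replace_callback($re, function (array $m): string {
        // $m[1] is the tag with its arguments, $m[2] is the inner content.
        // The outermost pair is matched first, so run the inner content
        // through the same function to convert any nested pairs.
        return '<b>' . render_bold($m[2]) . '</b>';
    }, $text);

    // preg_replace_callback returns null on error; fall back to the input.
    return $result ?? $text;
}

echo render_bold('[b]outer [b]inner[/b] text[/b]');
```

The pattern matches the outermost pair as a whole, which is why the callback calls itself on the inner content. A plain preg_replace with a replacement string would only convert the outer pair in one pass.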
Re: Nested BBCode Tags
Awesome man, thanks. However, I decided to build a parser with no regex whatsoever. It seems to be working fine.
Re: Nested BBCode Tags
Tokenising is the only way to go when parsing anything that has nesting.
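To make the tokenising idea concrete, here is a minimal sketch, assuming simple argument-less tags like [b], [i] and [u]. The function name bbcode_to_html and the allowed-tag list are hypothetical, and this is not the parser anyone in this thread actually wrote. A regex is used only to split the input into tokens; the nesting is tracked with a stack.

```php
<?php
// Sketch only: tokenise the input, then track open tags on a stack.
function bbcode_to_html(string $input, array $allowed = ['b', 'i', 'u']): string
{
    // Split into tag tokens ([b], [/b], ...) and text tokens, keeping the tags.
    $tokens = preg_split(
        '#(\[/?[a-z]+\])#i',
        $input,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );

    $stack = [];  // currently open tags, innermost last
    $out = '';

    foreach ($tokens as $tok) {
        if (preg_match('#^\[(/?)([a-z]+)\]$#i', $tok, $m)
            && in_array(strtolower($m[2]), $allowed, true)) {
            $tag = strtolower($m[2]);
            if ($m[1] === '') {
                // Opening tag: remember it and emit the HTML equivalent.
                $stack[] = $tag;
                $out .= "<{$tag}>";
                continue;
            }
            if (end($stack) === $tag) {
                // Closing tag that matches the innermost open tag.
                array_pop($stack);
                $out .= "</{$tag}>";
                continue;
            }
            // Mismatched closing tag: fall through and emit it as plain text.
        }
        $out .= htmlspecialchars($tok);
    }

    // Close anything left open so the output stays well-formed.
    while ($stack) {
        $out .= '</' . array_pop($stack) . '>';
    }
    return $out;
}

echo bbcode_to_html('[b]bold [i]both[/i][/b] plain');
```

Because the stack always knows which tag is innermost, nesting depth does not matter, and mismatched or unclosed tags can be handled explicitly instead of silently breaking the output.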