parsing custom tags

Posted: Fri Sep 28, 2007 4:27 pm
by GeXus
How would you guys suggest parsing custom tags such as:

<tag:name small="SomeName" large="ThisIsSomeName"/>

Basically, I want to get the small and large attribute values for tag:name.

I was thinking of using XPath, but just wanted to see what you guys would suggest...

Posted: Sat Sep 29, 2007 8:13 am
by volka
I'd use XSL (including XPath).
http://www.w3schools.com/xsl/xsl_intro.asp

Posted: Sun Sep 30, 2007 10:28 pm
by GeXus
For the way I'm trying to implement this, I don't think XSL will work, because I won't actually have an XML document. I'm also finding, as I begin to play with DomDocument, that it doesn't seem to accept foreign tags that are not standard HTML.

Basically here is what I'm attempting to do...

$dom = new DomDocument();
$dom->loadHTML("<html><body><tag:date format=\"Y d, M\"></body></html>");

Then I would use XPath to parse through that...

What do you think? Thanks!

Posted: Mon Oct 01, 2007 12:11 am
by alex.barylski
Maybe use loadXML() instead if you need to load custom tags. I think you'll need to tinker with the namespace functions too if you plan on using namespaced tags.
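
A minimal sketch of the loadXML() route, assuming the template is well-formed XML and declares the tag prefix on its root element. The namespace URI http://example.com/ns/tag is a made-up placeholder, not something from this thread; DOMXPath::registerNamespace() binds the same prefix so the query can find the element.

```php
<?php
// Assumption: the template is well-formed XML and declares the "tag" prefix.
// The namespace URI is a hypothetical placeholder.
$xml = '<html xmlns:tag="http://example.com/ns/tag"><body>'
     . '<tag:name small="SomeName" large="ThisIsSomeName"/>'
     . '</body></html>';

$dom = new DOMDocument();
$dom->loadXML($xml);

$xpath = new DOMXPath($dom);
// Bind the prefix used in the query to the same URI the document declares.
$xpath->registerNamespace('tag', 'http://example.com/ns/tag');

// Collect the small and large attribute values of every <tag:name> element.
$values = [];
foreach ($xpath->query('//tag:name') as $node) {
    $values[] = [
        'small' => $node->getAttribute('small'),
        'large' => $node->getAttribute('large'),
    ];
}

print_r($values);
```

If the prefix is left undeclared, libxml tends to report a namespace error and the //tag:name query will likely match nothing, which is probably where the namespace tinkering comes in.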

Posted: Mon Oct 01, 2007 12:41 am
by volka
GeXus wrote:because I won't actually have an xml document...
a) Why not? <html><body><tag:date format="Y d, M"/></body></html> is a well-formed XML document (fragment), as long as the custom tag is closed.
b) The result of loadHTML() is a "normal" DOM object, just like the result of load(), loadXML(), ...

Posted: Mon Oct 01, 2007 9:31 am
by GeXus
volka wrote:
GeXus wrote:because I won't actually have an xml document...
a) Why not? <html><body><tag:date format="Y d, M"/></body></html> is a well-formed XML document (fragment), as long as the custom tag is closed.
b) The result of loadHTML() is a "normal" DOM object, just like the result of load(), loadXML(), ...

What if the code is very messy? This is for what should be a very simple template system that lets third parties create customized templates, so it's possible the HTML could be missing closing tags, etc.

Posted: Mon Oct 01, 2007 9:38 am
by volka
Since you're developing a new template system there are no relic templates to consider.
Why not force the users to do things right? Why spend so much time fixing errors others haven't made yet?

Posted: Mon Oct 01, 2007 9:40 am
by GeXus
volka wrote:Since you're developing a new template system there are no relic templates to consider.
Why not force the users to do things right? Why spend so much time fixing errors others haven't made yet?
So you think I should just check whether loadXML() works and, if it doesn't, show an error message? I could do that... but there will also be inline CSS, and I'm not sure how that will work; I'll have to test it. Also, the type of people using this aren't exactly "HTML gurus", so chances are there will be errors, and it would be nice to keep it more streamlined than displaying error messages.
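
One way to sketch the "check whether loadXML() works" idea without PHP warnings leaking into the page: libxml_use_internal_errors() collects parse errors so they can be turned into a friendlier message. The function name templateErrors() and the two sample templates are hypothetical.

```php
<?php
// Hypothetical helper: returns a list of parse error messages,
// or an empty array when the template is well-formed XML.
function templateErrors($template)
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $dom = new DOMDocument();
    $dom->loadXML($template);

    $messages = [];
    foreach (libxml_get_errors() as $error) {
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }

    // Restore the previous error handling mode.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $messages;
}

$good = '<html><body><p>Hello</p></body></html>';
$bad  = '<html><body><p>Hello</body></html>'; // missing </p>

var_dump(templateErrors($good));
var_dump(templateErrors($bad));
```

The messages come straight from libxml, so for non-technical template authors they would probably need rewording before being shown.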


I like how Smarty templates work, in that it doesn't matter how anything else is formatted: it only looks for the tags in the content, and nothing else matters as long as the tags themselves are formatted properly. However, I don't want to use Smarty for this.

Posted: Thu Oct 25, 2007 6:17 pm
by GeXus
I wanted to follow up on this. It got put on hold, but now I'm back to looking into options...

What about using preg_match or preg_replace for this? Any ideas how I would do that for a tag like

<tag:date format="d m y"/>

There could be multiple tags with different formats in the HTML... Each one would obviously be replaced with a formatted date.

Thanks!

Posted: Thu Oct 25, 2007 6:25 pm
by feyd
Likely, preg_replace_callback(). You would probably want to use a generalized pattern to match any custom tag then parse it further internally in the callback.
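
A rough sketch of that approach, assuming tags look like <tag:name attr="value"/> with double-quoted attribute values. The pattern, the renderTag() function, and the sample HTML are illustrative assumptions; only the date tag is handled here, and unknown tags are left untouched.

```php
<?php
// Hypothetical callback: receives one matched custom tag and returns its replacement.
function renderTag(array $m)
{
    $name = $m[1];

    // Second pass: pull name="value" pairs out of the attribute string.
    preg_match_all('/([\w-]+)\s*=\s*"([^"]*)"/', $m[2], $pairs, PREG_SET_ORDER);
    $attrs = [];
    foreach ($pairs as $pair) {
        $attrs[$pair[1]] = $pair[2];
    }

    if ($name === 'date' && isset($attrs['format'])) {
        return date($attrs['format']);
    }

    return $m[0]; // unknown tag: leave it as it was
}

// Generalized pattern: any <tag:NAME ...> or <tag:NAME .../>.
$pattern = '/<tag:([\w-]+)((?:\s+[\w-]+\s*=\s*"[^"]*")*)\s*\/?>/';

$html = '<p>Today: <tag:date format="d m y"/>, year <tag:date format="Y"></p>'
      . '<tag:other x="1"/>';

$out = preg_replace_callback($pattern, 'renderTag', $html);
echo $out;
```

Because the pattern only looks at the custom tags, the surrounding HTML can be as messy as it likes, much like the Smarty behaviour described earlier in the thread.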

Posted: Thu Oct 25, 2007 9:47 pm
by GeXus
Ah yes, perfect... that looks good! thanks :)