XHTML 1.1 Modularization for HTML 4.01 parsing

Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

For a while now, I've been informing my parser about HTML's syntax using a PHP file that simulates the XHTML 1.0 DTD. The approach works, but it's not very extensible: a user who would like to add a custom element is totally out of luck, and even officially sanctioned flex points like Strict vs. Transitional require a spaghetti-mess of conditionals splattered all over the place. There is simply no way to give the user fine-grained control over the elements.

So I was ruminating on how to fix this problem, and I realized that W3C had already done this in XHTML 1.1, which is built on the Modularization of XHTML. Every element and its related attributes/content-sets are neatly packaged into modules, and you can then select which modules you'd like to allow. With this, I'd be able to factor out a lot of the spaghetti code and simply ask the user which modules they want (oh, I want to support text and lists, but nothing else). Users, if they desperately needed certain types of functionality, would be able to implement it themselves and not have to go mucking around in the actual code.
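To make the idea concrete, here is a minimal sketch of what module selection could look like. Everything in it is hypothetical: the `$modules` layout, the `buildDefinition()` helper and the trimmed-down attribute lists are illustrative assumptions, not the actual parser code or the full XHTML module contents.

```php
<?php
// Hypothetical sketch: none of these names come from an existing library.

// Each module maps element names to the attributes it permits.
// The attribute lists are deliberately abbreviated for illustration.
$modules = [
    'Text' => [
        'p'      => ['class', 'id', 'title'],
        'em'     => ['class', 'id', 'title'],
        'strong' => ['class', 'id', 'title'],
    ],
    'List' => [
        'ul' => ['class', 'id', 'title'],
        'ol' => ['class', 'id', 'title'],
        'li' => ['class', 'id', 'title'],
    ],
    'Hypertext' => [
        'a' => ['class', 'id', 'title', 'href'],
    ],
];

// Merge the element definitions of only the modules the user asked for.
function buildDefinition(array $modules, array $enabled): array
{
    $elements = [];
    foreach ($enabled as $name) {
        if (!isset($modules[$name])) {
            throw new InvalidArgumentException("Unknown module: $name");
        }
        $elements = array_merge($elements, $modules[$name]);
    }
    return $elements;
}

// "I want to support text and lists, but nothing else."
$definition = buildDefinition($modules, ['Text', 'List']);
```

With a layout like this, a user who needs a custom element would add their own entry to `$modules` and enable it by name, without touching the parser itself.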

However, I am slightly concerned at allegations that XHTML 1.1 breaks backwards-compatibility with the earlier HTML 4.01 and XHTML 1.0 specifications. According to this page, the changes aren't too bad from Strict, and it appears that the Legacy module should enable me to support Transitional elements too, but I am still a little leery. While I suppose my suspicions will only be dispelled once I actually try it out, has anyone had experiences with XHTML 1.1? Is there anything the Legacy module doesn't cover?

(It also strikes me that, if W3C thought the modularization through well enough, all one would have to do is disable the Structure, Applet, Forms (all of them), Object, Frames, Target, Iframe, Metainformation, Scripting, Link and Base modules in an XHTML 1.1 compliant implementation, and you'd have "safe" HTML.)
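A rough sketch of that "subtract the unsafe modules" idea, assuming modules are tracked by name. The `$available` list is only an illustrative subset, and reading "Forms (all of them)" as Basic Forms plus Forms is an assumption on top of the list above.

```php
<?php
// Hypothetical sketch: module names only, no element definitions.

// Illustrative subset of modules an implementation might ship with.
$available = [
    'Structure', 'Text', 'Hypertext', 'List', 'Forms',
    'Tables', 'Image', 'Object', 'Scripting', 'Link',
];

// Modules to disable, following the list in the post above.
// "Forms (all of them)" is assumed to mean Basic Forms and Forms.
$unsafe = [
    'Structure', 'Applet', 'Basic Forms', 'Forms', 'Object', 'Frames',
    'Target', 'Iframe', 'Metainformation', 'Scripting', 'Link', 'Base',
];

// Whatever is left after removing the unsafe modules is the "safe" set.
// array_values() reindexes the result, since array_diff() preserves keys.
$safe = array_values(array_diff($available, $unsafe));
```

Whether the remaining modules are actually safe still depends on how well the module boundaries line up with what is dangerous, which is exactly the open question here.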
Ollie Saunders
DevNet Master
Posts: 3179
Joined: Tue May 24, 2005 6:01 pm
Location: UK

Post by Ollie Saunders »

What are you asking exactly?

Post by Ambush Commander »

Exactly how incompatible is XHTML 1.1 with HTML 4.01?

Post by Ollie Saunders »

I'd say about..... 5

Post by Ambush Commander »

5... what?

I've already leaped into the frying pan. Quite frankly, the DTDs and XML Schemas are incomprehensible to me due to their level of abstraction, so I've been working from the Abstract Module specifications, which are, unfortunately, not normative and riddled with errors. For example, they specify the global dir attribute as required (this is only the case for bdo), and they've got the inclusions for the core attribute collections messed up (I'm not precisely sure how, though). But other than that, it's working very nicely. We'll see later on what got missed.
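As an illustration of the dir issue, here is one way the "optional everywhere, required on bdo" rule could be expressed. The `$common` collection, the `$elements` table and `missingRequired()` are hypothetical names for this sketch, and the collection is abbreviated rather than the real one.

```php
<?php
// Hypothetical sketch: attribute name => whether it is required.
$common = ['class' => false, 'id' => false, 'title' => false, 'dir' => false];

$elements = [
    'span' => $common,
    // bdo reuses the collection but overrides dir to be required.
    'bdo'  => array_merge($common, ['dir' => true]),
];

// Return the names of required attributes that were not supplied.
function missingRequired(array $attrDefs, array $given): array
{
    $missing = [];
    foreach ($attrDefs as $name => $required) {
        if ($required && !array_key_exists($name, $given)) {
            $missing[] = $name;
        }
    }
    return $missing;
}

$spanMissing = missingRequired($elements['span'], []);
$bdoMissing  = missingRequired($elements['bdo'], ['class' => 'rtl']);
```

Here `$spanMissing` comes back empty while `$bdoMissing` contains only `dir`, which matches the behaviour described above rather than what the Abstract Module tables say.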

Post by Ollie Saunders »

5 of ole's special compatibility points™

Sorry I was just joking, I have no idea.
Oh yeah, what does normative mean?

Post by Ambush Commander »

Normative = official, binding, establishing a standard

Normative is opposed to informative, which is far more comprehensible but does not actually specify anything.

The funny thing is, Abstract Modules are normative, but then they turn around and say "These expressions should in no way be considered normative or mandatory. They are an editorial convenience for this document. When used in the remainder of this section, it is the expansion of the term that is normative, not the term itself" and I reply "Huh?" ;-)