
XHTML 1.1 Modularization for HTML 4.01 parsing

Posted: Fri Feb 02, 2007 10:23 pm
by Ambush Commander
For a while now, I've been informing my parser about HTML's syntax using a PHP file that simulates the XHTML 1.0 DTD. The approach works, but it's not very extensible: a user who would like to add a custom element is totally out of luck, and even officially sanctioned flex points like Strict vs. Transitional require a spaghetti-mess of conditionals splattered all over the place. There is simply no way to give the user fine-grained control over the elements.

So I was ruminating on how to fix this problem, and I realized that W3C had already done this in XHTML 1.1, which is built on the modularization of XHTML. Every element and its related attributes and content sets are neatly packaged into modules, and you can then select which modules you'd like to allow. With this, I'd be able to factor out a lot of the spaghetti code and simply ask the user which modules they want ("oh, I want to support text and lists, but nothing else"). Users who desperately needed certain types of functionality would be able to implement it themselves as a module and not have to go mucking around in the actual code.
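
To make the module idea concrete, here is a minimal sketch in Python (chosen only for brevity; the parser discussed in this thread is PHP). Every name in it (`Module`, `REGISTRY`, `build_definition`) is hypothetical, and the element and attribute lists are heavily abridged for illustration rather than copied from the specification.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set


@dataclass
class Module:
    """A hypothetical module: a name plus the elements it defines."""
    name: str
    # Maps element name -> attribute names allowed on that element.
    elements: Dict[str, Set[str]] = field(default_factory=dict)


# Illustrative, abridged module contents (not the full spec).
TEXT = Module("Text", {"p": {"class"}, "em": {"class"}, "strong": {"class"}})
LIST = Module("List", {"ul": {"class"}, "ol": {"class"}, "li": {"class"}})
HYPERTEXT = Module("Hypertext", {"a": {"class", "href"}})

REGISTRY = {m.name: m for m in (TEXT, LIST, HYPERTEXT)}


def build_definition(selected: Iterable[str],
                     extra_modules: Iterable[Module] = ()) -> Dict[str, Set[str]]:
    """Merge the selected modules into one element -> attributes map.

    extra_modules lets a user register custom modules without
    touching the built-in registry.
    """
    registry = dict(REGISTRY)
    for module in extra_modules:
        registry[module.name] = module

    allowed: Dict[str, Set[str]] = {}
    for name in selected:
        for element, attrs in registry[name].elements.items():
            allowed.setdefault(element, set()).update(attrs)
    return allowed


# "I want to support text and lists, but nothing else."
definition = build_definition(["Text", "List"])

# A user-supplied module adding a made-up element.
custom = Module("Custom", {"spoiler": {"title"}})
extended = build_definition(["Text", "Custom"], extra_modules=[custom])
```

With this shape, Strict vs. Transitional stops being a set of conditionals and becomes a question of whether a Legacy-style module is in the selected list.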

However, I am slightly concerned by allegations that XHTML 1.1 breaks backwards compatibility with the earlier HTML 4.01 and XHTML 1.0 specifications. According to this page, the changes from Strict aren't too bad, and it appears that the Legacy module should enable me to support Transitional elements too, but I am still a little leery. While I suppose my suspicions will only be dispelled once I actually try it out, has anyone had experience with XHTML 1.1? Is there anything the Legacy module doesn't cover?

(It also strikes me that, if W3C thought the modularization through well enough, all one would have to do is disable the Structure, Applet, Forms (all of them), Object, Frames, Target, Iframe, Metainformation, Scripting, Link and Base modules in an XHTML 1.1 compliant implementation, and you'd have "safe" HTML.)
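
As a rough sketch of that blacklist idea, again in Python and with hypothetical names: the unsafe set below is the list from the paragraph above, with "Forms (all of them)" expanded to "Basic Forms" and "Forms" as an assumption, and `AVAILABLE_MODULES` is an arbitrary sample rather than the complete set of modules.

```python
# Modules to disable for "safe" HTML, per the list above.
# "Basic Forms" and "Forms" stand in for "Forms (all of them)" (an assumption).
UNSAFE_MODULES = {
    "Structure", "Applet", "Basic Forms", "Forms", "Object", "Frames",
    "Target", "Iframe", "Metainformation", "Scripting", "Link", "Base",
}

# An illustrative sample of modules an implementation might offer.
AVAILABLE_MODULES = [
    "Structure", "Text", "Hypertext", "List", "Forms",
    "Tables", "Image", "Scripting", "Link", "Base",
]


def safe_modules(available, unsafe=UNSAFE_MODULES):
    """Return the available modules that are not blacklisted, in order."""
    return [name for name in available if name not in unsafe]


print(safe_modules(AVAILABLE_MODULES))
# ['Text', 'Hypertext', 'List', 'Tables', 'Image']
```

Whether the result is actually safe still depends on what each remaining module lets through, such as the `href` values permitted on links, so this only covers the module-selection step.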

Posted: Sat Feb 03, 2007 4:07 pm
by Ollie Saunders
What are you asking exactly?

Posted: Sat Feb 03, 2007 4:09 pm
by Ambush Commander
Exactly how incompatible is XHTML 1.1 with HTML 4.01?

Posted: Sun Feb 04, 2007 11:46 am
by Ollie Saunders
I'd say about..... 5

Posted: Sun Feb 04, 2007 11:58 am
by Ambush Commander
5... what?

I've already leaped into the frying pan. Quite frankly, the DTDs and XML Schemas are incomprehensible to me due to their level of abstraction, so I've been working from the Abstract Module specifications, which are, unfortunately, not normative and riddled with errors. For example, they specify the global dir attribute as required (this is only the case for bdo), and they've got the inclusions for the core attribute collections messed up (I'm not precisely sure how, though). But other than that, it's working very nicely. We'll see later on what got missed.

Posted: Sun Feb 04, 2007 12:02 pm
by Ollie Saunders
5 of ole's special compatibility points™

Sorry, I was just joking, I have no idea.
Oh yeah, what does normative mean?

Posted: Sun Feb 04, 2007 12:06 pm
by Ambush Commander
Normative = official, binding, establishing a standard

Normative is opposed to informative, which is far more comprehensible but does not actually specify anything.

The funny thing is, Abstract Modules are normative, but then they turn around and say "These expressions should in no way be considered normative or mandatory. They are an editorial convenience for this document. When used in the remainder of this section, it is the expansion of the term that is normative, not the term itself" and I reply "Huh?" ;-)