XML Help Freeforall: Entities?


Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

Since my work has now shifted over to parsing HTML and then coercing it into a valid doctype, I need a more formal knowledge of HTML. This includes related technologies such as XHTML, XML, CSS, all related RFCs, DTDs, and SGML. It's a whole lotta work, and it's gonna be a lot of coding and brainthinking. If anyone is interested in the fruits of my work so far, see this DTD-aware HTML lexer + parser.

Most of these are well-documented on http://www.w3.org/ which is nice because I'm a big fan of the World Wide Web Consortium. However, SGML's formal specification seems to have disappeared down a black hole.

Now, XML is just a more restrictive subset of SGML, and I mostly plan on forming valid XHTML when I'm done, which can easily be ported to valid HTML 4.01 with a few style changes. I plan on studying the XML specification in depth, but I was wondering whether it's worth the effort to find a copy of the SGML declaration and get familiar with it, or whether I should just scratch that and go solely with XML.
Last edited by Ambush Commander on Mon Dec 12, 2005 8:10 pm, edited 1 time in total.
m3mn0n
PHP Evangelist
Posts: 3548
Joined: Tue Aug 13, 2002 3:35 pm
Location: Calgary, Canada

Post by m3mn0n »

XML.

Here is a quote for ya:
Some significant percentage of the pain suffered by the XML development community over the past 5 years is directly attributable to dealing with the legacy of SGML. It has, in other words, turned out to be much harder, much more complex to do "SGML on the Web" than many people thought it would be. A considerable amount of the early traction seized by XML was due to the confluence of two forces: first, the technical maturity of SGML; second, the early to middle years of exuberance about the Web itself.

In various ways then, XML has really been about trying to overcome the legacy of SGML. Perhaps "overcome" isn't quite right; perhaps "modify and contemporize" is better? At any rate, XML has been driven in part by a sense that SGML had things right, but not just right, and that work remains to be done to overcome SGML's failings.
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

Okay!

Now... time to hijack my own thread...

I don't understand all the entities. Aren't entities stuff like &amp;? So I get a bit confused when the spec starts talking about the other entities, %etc. Can someone clarify?
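
For anyone following along, here's a rough sketch of the distinction as I read the XML 1.0 spec. General entities are the &name; kind and get expanded in document content (the five predefined ones are amp, lt, gt, apos and quot). Parameter entities are declared with a % and referenced as %name;, and only mean anything inside the DTD. The names greeting, inline and the note element below are made up for illustration, and the parameter entity fragment is only shown as text, not parsed.

```python
import xml.etree.ElementTree as ET

# General entities: declared in the DTD, referenced as &name; in content.
# "greeting" is a made-up entity; amp and lt are predefined by XML.
DOC = """<!DOCTYPE note [
  <!ENTITY greeting "hello">
]>
<note>&greeting; &amp; goodbye, 1 &lt; 2</note>"""

root = ET.fromstring(DOC)
text = root.text  # the parser has already expanded every &name; reference
print(text)

# Parameter entities: declared with "%" and referenced as %name;,
# but only inside the DTD itself (typically an external DTD file).
# Hypothetical fragment, shown for its syntax only and never parsed here.
DTD_FRAGMENT = """
<!ENTITY % inline "em | strong">
<!ELEMENT p (#PCDATA | %inline;)*>
"""
print(DTD_FRAGMENT)
```

So the & ones are for document authors and the % ones are shorthand for whoever writes the DTD, which would be why the spec treats them separately.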