I'm no DTD expert, but I think the difference between PCDATA and CDATA is that PCDATA can contain further tags, while CDATA isn't parsed. PCDATA stands for Parsed Character Data...
So in your example, if the bold tag can't contain any child tags, then CDATA would be the better one. If it can, it would be PCDATA, but those child tags would have to be declared. And there's also the "ANY" option...
Last edited by shoebappa on Sat Dec 10, 2005 5:17 pm, edited 1 time in total.
Heh, I was hoping you wouldn't see the last part of my previous post before I edited it... I'm pretty sure that on the DTD side, CDATA doesn't have to be enclosed in <![CDATA[ ]]> markers, but when validating, it can't contain tags unless they are escaped inside a <![CDATA[ ]]> section.
Technically the snippet wouldn't be XML because it would need a root node such as <html></html>... Usually when I hear DTD I think XML, and your HTML might or might not be valid XML, and it might or might not validate against whatever DTD you are using.
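For what it's worth, here is a rough sketch of the root-element point using Python's standard xml.etree.ElementTree. The fragment string and the <root> wrapper name are made up for illustration (the original snippet isn't shown here), and this only checks well-formedness, not validity against a DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: text plus a bold tag, with no single root element.
fragment = "Some <b>bold</b> text"


def is_well_formed(text):
    """Return True if the parser accepts the text as a complete XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# The bare fragment is rejected because character data sits outside any root.
print(is_well_formed(fragment))

# Wrapping it in an arbitrary root element (the name is an assumption) fixes that.
wrapped_source = "<root>" + fragment + "</root>"
print(is_well_formed(wrapped_source))

wrapped = ET.fromstring(wrapped_source)
# Text before the child, the child's own text, and the text after the child.
print(repr(wrapped.text), repr(wrapped[0].text), repr(wrapped[0].tail))
```

Wrapping is only a workaround for the missing root; it won't help if the fragment itself has unclosed or mismatched tags.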
I think entities are parsed. They are also declared in the DTD... I know <![CDATA[ ]]> pretty much ignores everything enclosed in it, so &amp;amp; doesn't display as &amp;, it stays &amp;amp;. The same goes for tags (nodes).
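To see the difference in practice, here is a minimal sketch, again assuming Python's standard xml.etree.ElementTree (any conforming parser should behave the same way). The element names and sample text are invented for the example.

```python
import xml.etree.ElementTree as ET

# Ordinary (parsed) content: the entity is resolved and <b> becomes a child element.
parsed = ET.fromstring("<p>fish &amp; <b>chips</b></p>")

# The same characters inside a CDATA section: nothing is interpreted.
literal = ET.fromstring("<p><![CDATA[fish &amp; <b>chips</b>]]></p>")

print(repr(parsed.text))   # text before the first child, entity already resolved
print(len(parsed))         # number of child elements found
print(repr(literal.text))  # the raw characters, markup and all
print(len(literal))        # no child elements, because the tags were not parsed
```

So inside the CDATA section both the entity reference and the tags come through as plain characters.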
Last edited by shoebappa on Sat Dec 10, 2005 5:32 pm, edited 1 time in total.
Ah, so that's what I didn't get: Parsed means you parse both tags and entities, but when you're looking at a DOM, you'll have entities parsed but not tags, so it's still technically PCDATA although you're not parsing the tags.
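Something like this is what I mean by looking at a DOM. It's only a sketch using Python's xml.dom.minidom, with a made-up sample document: by the time you walk the tree, the entity has already been resolved into the text node, the tag has become its own element node, and a CDATA section shows up as a separate node with its content untouched.

```python
from xml.dom import minidom
from xml.dom.minidom import Node

# Hypothetical sample: parsed text with an entity, a child tag, and a CDATA section.
source = "<p>fish &amp; chips <b>bold</b><![CDATA[<i>raw</i> &amp;]]></p>"
doc = minidom.parseString(source)

# Readable names for the three node types this sample produces.
names = {
    Node.TEXT_NODE: "text",
    Node.ELEMENT_NODE: "element",
    Node.CDATA_SECTION_NODE: "cdata",
}

children = doc.documentElement.childNodes
kinds = [names[node.nodeType] for node in children]

for node in children:
    # Element nodes have no character data of their own, so show the tag name.
    value = node.tagName if node.nodeType == Node.ELEMENT_NODE else node.data
    print(names[node.nodeType], repr(value))
```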
True, XML does have to have a root element. I'm trying to bend the rules here because the data I'm going to be receiving will be loaded with errors.