Re: Specifying a doctype

Posted: Mon Feb 12, 2007 6:54 pm
by RobertGonzalez
Ambush Commander wrote: Suppose you were using a library, which could output certain types of HTML. HTML 4 Strict, Transitional, XHTML 1.0, XHTML 1.1, XHTML 2.0, you name it. And you needed to specify which language you wanted the library to output. How would you prefer to identify it?
So I am using a library, as a developer developing an HTML output app, that will eventually produce some form of markup document. Correct? It makes sense to me to utilize names that are the cleanest and easiest to recognize.

Something along the lines of what nickvd posted...

<?php
$doctypes = array(
  'html401s' => array(
    'name' => 'HTML 4.01 Strict',
    'doctype' => '<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">'
  ),
  'html401t' => array(
    'name' => 'HTML 4.01 Transitional', // the DTD file itself is named loose.dtd
    'doctype' => '<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">'
  ),
  'html401f' => array(
    'name' => 'HTML 4.01 Frameset',
    'doctype' => '<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd">'
  )
  // etc
);
?>
A structure like this seems flexible enough to use as a dropdown record source, a posted data validation set and it is clean. To me anyway.
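
To make the two uses concrete, here is a minimal sketch. The helper names doctype_options() and is_known_doctype() are made up for illustration, and the array is a trimmed copy with the same shape as the one above.

```php
<?php
// Trimmed copy of the structure above; only the 'name' field is needed here.
$doctypes = array(
  'html401s' => array('name' => 'HTML 4.01 Strict'),
  'html401f' => array('name' => 'HTML 4.01 Frameset'),
);

// Hypothetical helper: build <option> tags for a dropdown from key => name.
function doctype_options(array $doctypes, $selected = null) {
  $html = '';
  foreach ($doctypes as $key => $info) {
    $sel = ($key === $selected) ? ' selected="selected"' : '';
    $html .= '<option value="' . htmlspecialchars($key) . '"' . $sel . '>'
           . htmlspecialchars($info['name']) . "</option>\n";
  }
  return $html;
}

// Hypothetical helper: accept a posted value only if it is a known key.
function is_known_doctype(array $doctypes, $key) {
  return is_string($key) && array_key_exists($key, $doctypes);
}
```

The same array drives both the form and the check on the posted value, so the two cannot drift apart.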

Posted: Mon Feb 12, 2007 7:48 pm
by Ambush Commander
So I am using a library, as a developer developing an HTML output app, that will eventually produce some form of markup document. Correct?
Close, but there are some fundamental differences. In a library like the one you describe, changing the doctype is as simple as substituting another doctype at the top of the document and then making some minor aesthetic changes depending on whether it's XHTML or HTML.

In my case, an HTML filter, the output language is very important, because it actually determines the allowed tag set. From my standpoint, the most convenient syntax would tell me:
  • XHTML or HTML? (this determines attribute minimization and trailing slashes in empty tags)
  • Include support for legacy elements? (this would be Transitional for HTML 4.01 / XHTML 1.0, and with the Legacy module for XHTML 1.1/2.0 (actually, the two are subtly different, which makes it tough!))
  • Be lenient with input? (this doesn't fit into any of them, so I'm thinking of offering a separate parameter)
  • And, of course, which doctype version? (1.0, 1.1 or 2.0)
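
One way to picture those four questions is a lookup table keyed by doctype name. This is only a rough sketch: the table, its keys ('xml', 'legacy', 'version') and the filter_settings() function are hypothetical, not an existing API, and leniency is kept as a separate argument because it is not a property of any doctype.

```php
<?php
// Hypothetical table: the properties a filter might derive from a doctype.
$doctype_traits = array(
  'HTML 4.01 Strict'       => array('xml' => false, 'legacy' => false, 'version' => '4.01'),
  'HTML 4.01 Transitional' => array('xml' => false, 'legacy' => true,  'version' => '4.01'),
  'XHTML 1.0 Strict'       => array('xml' => true,  'legacy' => false, 'version' => '1.0'),
  'XHTML 1.0 Transitional' => array('xml' => true,  'legacy' => true,  'version' => '1.0'),
);

// Hypothetical function: resolve a doctype name to filter settings.
// Returns null for an unknown doctype.
function filter_settings(array $traits, $doctype, $lenient = false) {
  if (!isset($traits[$doctype])) {
    return null;
  }
  $settings = $traits[$doctype];
  $settings['lenient'] = (bool) $lenient; // independent of the doctype
  return $settings;
}
```

The XHTML 1.1/2.0 case with a Legacy module is left out on purpose, since it does not map cleanly onto a single 'legacy' flag.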
However, I think I'm misunderstanding the benefits of such an approach. Could you clarify what you mean by:
A structure like this seems flexible enough to use as a dropdown record source, a posted data validation set and it is clean. To me anyway.

Posted: Mon Feb 12, 2007 11:27 pm
by RobertGonzalez
What I mean is that it appears that you are setting up something in which a user will be able to select a doctype and enter some markup, then the app takes the markup, filters it against the doctype and returns a clean markup document.

If that is the case, I was thinking that having a 'list' of selectable doctypes would provide a selection for the user, an indexed list to use as your doctype writer, and a list of types that are allowed when demo'ing. I may be thinking on a smaller scale, but I think that having your doctypes in this manner allows for several different uses throughout the application.

Posted: Tue Feb 13, 2007 2:39 pm
by Ambush Commander
Aha. But I'm fairly certain the selection will be directly embedded in some programming code, so no dropdown lists. :-( Perhaps a config-file generator would be pretty interesting, but not yet.

Posted: Tue Feb 13, 2007 2:48 pm
by Kieran Huggins
AC: Have you considered using the Tidy module? It has built-in support for doctypes.

Posted: Tue Feb 13, 2007 2:50 pm
by Ambush Commander
Yeah, I have. I decided against it because:

1. It's hacky! If I'm building an HTML filtering library, it should darn well be able to handle all this stuff.
2. Tidy is not always available. In fact, the site that hosts HTML Purifier doesn't have Tidy installed.
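
For anyone who does want to lean on Tidy, the availability problem in point 2 would look roughly like this. This is a sketch, not a recommendation: tidy_with_doctype() is a made-up wrapper, while extension_loaded() and tidy_repair_string() are the real PHP functions and 'doctype' is a real Tidy option.

```php
<?php
// Hypothetical wrapper: returns the repaired markup, or null when the
// Tidy extension is not installed on the host.
function tidy_with_doctype($html, $doctype = 'strict') {
  if (!extension_loaded('tidy')) {
    return null; // caller has to fall back to its own handling
  }
  // 'doctype' tells Tidy which doctype declaration to emit.
  $config = array('doctype' => $doctype);
  return tidy_repair_string($html, $config, 'utf8');
}
```

Every caller has to handle the null case, so the library would still need its own doctype logic for hosts without Tidy.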