How much of CSS to implement?
- Ambush Commander
- DevNet Master
- Posts: 3698
- Joined: Mon Oct 25, 2004 9:29 pm
- Location: New Jersey, US
For a project I've been working on, I've avoided defining an overarching philosophy about which HTML tags are allowed, besides "Don't allow XSS!" As such, I've tended towards keeping even the more obscure HTML elements (q, bdo, tfoot). So I guess that means the library is headed towards the "Allow as much stuff as possible" camp.
Well, it's come back to bite me in the can.
The CSS 2.1 specification defines over one hundred properties, ranging from well used (color, border) to unbelievably obscure (azimuth, richness, table-layout). There's nothing XSS about azimuth, so, by this reasoning, I'll need to implement validation checks for the property. What about mainly layout-oriented CSS: widows, page-break-after, cursor?
Not even attributes are this bad (and that's pretty bad: a little less than 200 possible pairs, though I can nuke quite a few because they're only used in FORMs and whatnot).
Combine this with my propensity for well-written code (I could have just made sure the language attribute only had hyphens, letters and numbers, but instead, I read the RFC several times and then implemented all the syntactic constraints, and was mad at myself because I couldn't also package allowed language codes with it), and it's starting to look like the release date will need to be pushed to next year.
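To make the trade-off in the language-attribute example concrete, here is a minimal sketch comparing the "hyphens, letters and numbers" shortcut with a syntax check. It assumes the RFC in question is RFC 3066, whose grammar is a primary subtag of 1 to 8 letters followed by hyphen-separated subtags of 1 to 8 letters or digits. The function names are hypothetical, and this is not HTML Purifier's code.

```python
import re

# Shortcut: only check the character set.
LOOSE = re.compile(r"[A-Za-z0-9-]+")

# Syntax check, assuming RFC 3066:
#   Language-Tag   = Primary-subtag *( "-" Subtag )
#   Primary-subtag = 1*8ALPHA
#   Subtag         = 1*8(ALPHA / DIGIT)
STRICT = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*")


def loose_ok(tag):
    """Accept anything made of letters, digits and hyphens."""
    return LOOSE.fullmatch(tag) is not None


def strict_ok(tag):
    """Accept only tags that follow the assumed RFC 3066 grammar."""
    return STRICT.fullmatch(tag) is not None


samples = ["en", "en-US", "-en", "en--US", "toolongprimary-x", "123"]
results = {tag: (loose_ok(tag), strict_ok(tag)) for tag in samples}

for tag, (loose, strict) in results.items():
    print(f"{tag!r}: loose={loose} strict={strict}")
```

The loose check accepts every sample, including malformed ones such as "-en" and "en--US"; the strict check accepts only "en" and "en-US". Neither check knows whether a code is actually registered, which is the "allowed language codes" gap mentioned above.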
What do I do? I know there's a software maxim that you must define what you will not implement, but in this case, it's not very clear: I can see how all of these might be useful at one point or another, and one big selling point of the application is that it requires no configuration.
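One way to bound the CSS work is a whitelist: only properties with a written validator survive, and everything else is dropped, so an unimplemented property like azimuth costs nothing. The following is a minimal sketch under that assumption; the `ALLOWED` table, the two value patterns and `filter_style` are all hypothetical and much cruder than a real validator (for example, splitting on ";" ignores semicolons inside strings or url() values).

```python
import re

# Hypothetical, deliberately small value validators.
COLOR = re.compile(r"#[0-9a-fA-F]{3}(?:[0-9a-fA-F]{3})?|[a-zA-Z]+")
LENGTH = re.compile(r"0|\d+(?:\.\d+)?(?:px|em|ex|pt|%)")

# Hypothetical whitelist: property name -> validator for its value.
ALLOWED = {
    "color": COLOR,
    "background-color": COLOR,
    "font-size": LENGTH,
    "width": LENGTH,
}


def filter_style(style):
    """Keep only declarations whose property is whitelisted and whose value validates."""
    kept = []
    for declaration in style.split(";"):
        if ":" not in declaration:
            continue  # empty or malformed declaration
        name, value = declaration.split(":", 1)
        name, value = name.strip().lower(), value.strip()
        validator = ALLOWED.get(name)
        if validator is not None and validator.fullmatch(value):
            kept.append(f"{name}:{value}")
    return ";".join(kept)


cleaned = filter_style(
    "color: red; azimuth: left; width: expression(alert(1)); font-size: 12px"
)
print(cleaned)
```

With this structure, supporting another property means adding one table entry, so obscure properties can be added later without changing the filtering logic.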
- Chris Corbyn
- Breakbeat Nuttzer
- Posts: 13098
- Joined: Wed Mar 24, 2004 7:57 am
- Location: Melbourne, Australia
visibone has a nice set of charts. One of them is a CSS chart with browser compatibility noted using colors. You could start with the ones that are actually implemented by the majority of browsers rather than just specified by the W3C.
- Ambush Commander
- DevNet Master
- Posts: 3698
- Joined: Mon Oct 25, 2004 9:29 pm
- Location: New Jersey, US
Quote: Oh wow! I've been wondering what all the lexing and stuff you've been doing was for. This looks awesome :D Can't wait for it to be complete.
Thanks for the encouragement. I guess that helps too!
Chris Corbyn wrote: visibone has a nice set of charts. One of them is a CSS chart with browser compatibility noted using colors.
I don't think I'm actually going to purchase one, but that's a very good point. Sometimes I get hung up on some W3C definition that doesn't even work in most major browsers! (Like col.char... and it's a cool feature, so it's a little hard to let it die. I'll try to code an attribute transformation that fixes it after everything else is done.)
I'd rather not have to strain my eyes on the chart, but I think a few good Googles will lead me to some useful reference materials.
- Ollie Saunders
- DevNet Master
- Posts: 3179
- Joined: Tue May 24, 2005 6:01 pm
- Location: UK
Ambush Commander gets my respect. This sounds like a great project but a big undertaking. Where are you getting the manpower from?
I am an advocate of web standards myself and this is something I am building into my project.
Quote: HTML Purifier takes a different approach, one that doesn't use specification-ignorant regexes or narrow blacklists.
Does that mean you are going to parse DTDs?
Oh and to answer your question. Implement all of CSS 1 and then all of CSS 2 etc. Don't leave things out just because they seem obscure; if they are in the standards they can be used and therefore abused.
- Ambush Commander
- DevNet Master
- Posts: 3698
- Joined: Mon Oct 25, 2004 9:29 pm
- Location: New Jersey, US
Ollie Saunders wrote: This sounds like a great project but a big undertaking. Where are you getting the manpower from?
It's a one-person project for now. It takes too long to get new developers up to speed and inculcate them with the philosophies of the project. Still, I write profuse documentation about stuff like naming conventions in anticipation that some day I'll have to pass this on to someone else (if it ever gets finished).
I was actually quite surprised to find out that I had finished all the attributes. Creating progress tables helped a lot, and I'll be publishing those soon.
Ollie Saunders wrote: Does that mean you are going to parse DTDs?
Nope, because the DTDs 1) allow evil stuff (so I'd have to change them anyway) and 2) don't get the standards right! :-O
Here's an example: HTML was originally built on SGML, which allows tag exclusions that apply to all descendant elements. That is how HTML can say that you cannot have an A tag nested in an A tag.
XML does not allow similar constraints in its DTDs, so the DTD writers were forced to create specialized allowed-children definitions for the A tag. The only problem is that "This element is disallowed in all descendant elements" is different from "This element is disallowed in all child elements."
As such, this code is "theoretically" valid XHTML 1.0 Strict:
Code:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"><head>
<title>test</title>
</head>
<body>
<div>
<a href="about:blank">asdf
<span><a href="about:blank">Test</a></span>
</a>
</div>
</body></html>
I base my custom HTML definition off of the DTD, but after that, all bets are off.
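The gap between a child-level rule and a descendant-level exclusion can be sketched in a few lines. This is an illustrative check only, using Python's standard `xml.etree.ElementTree` on a namespace-free copy of the body above; the two function names are hypothetical and this is not how HTML Purifier is implemented.

```python
import xml.etree.ElementTree as ET

# Same nesting as the example document, without the XHTML namespace.
SNIPPET = """
<div>
  <a href="about:blank">asdf
    <span><a href="about:blank">Test</a></span>
  </a>
</div>
"""


def violates_child_rule(root, tag):
    """True if some <tag> has a <tag> as a direct child (all an XML DTD can express)."""
    return any(
        child.tag == tag
        for parent in root.iter(tag)
        for child in parent
    )


def violates_descendant_rule(root, tag):
    """True if some <tag> contains a <tag> at any depth (the SGML-style exclusion)."""
    for outer in root.iter(tag):
        # iter() yields the element itself first, so ignore that match.
        if any(inner is not outer for inner in outer.iter(tag)):
            return True
    return False


root = ET.fromstring(SNIPPET)
child_violation = violates_child_rule(root, "a")
descendant_violation = violates_descendant_rule(root, "a")
print(child_violation, descendant_violation)
```

The child-level check reports no problem because the inner A is wrapped in a SPAN, while the descendant-level check flags it. That is the case a validator built only on the DTD would miss.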
Ollie Saunders wrote: Oh and to answer your question. Implement all of CSS 1 and then all of CSS 2 etc. Don't leave things out just because they seem obscure; if they are in the standards they can be used and therefore abused.
Never thought of that, although it's a little too late... I've already gone ahead and categorized all of the CSS properties according to usage and dangerousness.
- Ollie Saunders
- DevNet Master
- Posts: 3179
- Joined: Tue May 24, 2005 6:01 pm
- Location: UK
Ambush Commander wrote: It's a one-person project for now. It takes too long to get new developers up to speed and inculcate them with the philosophies of the project. Still, I write profuse documentation about stuff like naming conventions in anticipation that some day I'll have to pass this on to someone else (if it ever gets finished).
Best of luck with it then.
Ambush Commander wrote: Nope, because the DTDs 1) allow evil stuff (so I'd have to change them anyway) and 2) don't get the standards right! :-O
:-O indeed, I'm learning today.
Ambush Commander wrote: Never thought of that, although it's a little too late... I've already gone ahead and categorized all of the CSS properties according to usage and dangerousness.
Well, that is still useful for deciding which to do inside the standards. Work from the most dangerous/used in CSS1 down to the least dangerous/used in CSS1, and then move on to CSS2 doing the same.
- Ambush Commander
- DevNet Master
- Posts: 3698
- Joined: Mon Oct 25, 2004 9:29 pm
- Location: New Jersey, US
- RobertGonzalez
- Site Administrator
- Posts: 14293
- Joined: Tue Sep 09, 2003 6:04 pm
- Location: Fremont, CA, USA
Looks cool man. A lot of this functionality can be found in HTML Tidy, which is written in C and available as an extension for PHP. You may consider creating a wrapper class and letting Tidy do the grinding stuff like fixing tables and bringing HTML up to spec given the DTD, and then implement whatever other functions are needed to prevent XSS and other exploits. Of course, adding the tidy extension as a requirement of your script may not be something you want. At the same time, with some abstraction you could allow it to take advantage of tidy if it is available.
I'm doing something similar with an AJAX framework I'm writing. I have an abstracted class for JSON serialization, which can use either the php_json extension or a json serialization class written in PHP. The code runs perfectly without the json extension, but with it, serialization sees a 2800% performance increase.
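The "use the extension if present, otherwise fall back" abstraction described above might look roughly like this. It is a sketch of the pattern, not the framework's code: `fastjson_ext` is a hypothetical optional accelerator module assumed to expose a `dumps()` compatible with the standard library's, and the `Serializer` class name is likewise made up.

```python
import importlib
import importlib.util
import json


def pick_json_backend(preferred="fastjson_ext"):
    """Return (name, dumps) for the best serializer that is installed.

    `preferred` is a hypothetical optional module; when it is missing,
    the always-available standard library implementation is used.
    """
    if importlib.util.find_spec(preferred) is not None:
        module = importlib.import_module(preferred)
        return preferred, module.dumps
    return "json", json.dumps


class Serializer:
    """Callers depend on this class only, never on a specific backend."""

    def __init__(self):
        self.backend_name, self._dumps = pick_json_backend()

    def encode(self, value):
        return self._dumps(value)


serializer = Serializer()
encoded = serializer.encode({"ok": True, "items": [1, 2, 3]})
print(serializer.backend_name, encoded)
```

Because the choice is made once in the constructor, the rest of the code runs unchanged whether or not the accelerator is installed. The same shape would apply to wrapping Tidy as an optional backend.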
Hey and when you take your SVN repos public, I suggest Google's code hosting. I got a project up there.
- Ollie Saunders
- DevNet Master
- Posts: 3179
- Joined: Tue May 24, 2005 6:01 pm
- Location: UK
RobertGonzalez wrote: Hey and when you take your SVN repos public, I suggest Google's code hosting. I got a project up there.
That looks cool, I could use that for my project.
Quote: AC, you are a freaking stud. I hope someday I can create something cool that developers will use. So many regulars have made so many cool things here that I feel like a useless soul sometimes.
You could help me with mine Everah, I know you'd be good :D
RobertGonzalez wrote: A lot of this functionality can be found in HTML Tidy
Something tells me AC won't be satisfied with Tidy because it's not 100% accurate, in fact probably not 90%.
- Ambush Commander
- DevNet Master
- Posts: 3698
- Joined: Mon Oct 25, 2004 9:29 pm
- Location: New Jersey, US
RobertGonzalez wrote: Looks cool man. A lot of this functionality can be found in HTML Tidy, which is written in C and available as an extension for PHP. You may consider creating a wrapper class and letting Tidy do the grinding stuff like fixing tables and bringing HTML up to spec given the DTD, and then implement whatever other functions are needed to prevent XSS and other exploits. Of course, adding the tidy extension as a requirement of your script may not be something you want. At the same time, with some abstraction you could allow it to take advantage of tidy if it is available.
I've thought about it, and you're right that Tidy is faster. However, it also has its quirks (unacceptable behavior) as well as an important difference: Tidy is meant for repairing entire HTML documents, HTMLPurifier for HTML sections. That may be changing later, though. (There are already some hacks in place to turn a snippet into a fully-fledged document.) Part of the reason I shun Tidy is that it's always been regarded by MediaWiki developers as a stop-gap fix, a band-aid for the hideous complexity of their parser.
I've done something similar with regard to PHP 5's DOM extension, which can parse HTML, and very quickly too. The library uses DOMLex when PHP 5 is present and falls back to DirectLex, a PHP implementation, otherwise.
Performance is going to be a problem, but the heavy optimization will have to happen after everything is written.
RobertGonzalez wrote: Hey and when you take your SVN repos public, I suggest Google's code hosting. I got a project up there.
Hmm... I think I'm going to have to open another thread about this.
@Everah: Thanks! Remember, you've got a company, so I wouldn't complain. (Not sure how this would be economically viable... I'll sell consulting/customization services or something)
- Ambush Commander
- DevNet Master
- Posts: 3698
- Joined: Mon Oct 25, 2004 9:29 pm
- Location: New Jersey, US
The Progress table has been published:
http://hp.jpsband.org/live/docs/progress.html
Scroll down for CSS.
- Ollie Saunders
- DevNet Master
- Posts: 3179
- Joined: Tue May 24, 2005 6:01 pm
- Location: UK