HTML validation
Posted: Tue Sep 02, 2008 7:25 am
Hi everyone
I'm designing a simple content management system for our website running on our intranet Apache server, and updating data stored on our host's MySQL server.
This is all working fine.
Only a couple of users will have access to the system, but I'm looking for a way to make sure that they enter valid HTML into the CMS.
I already have some primitive checks on special characters:
- Comparing the number of & with the number of &amp; to make sure they match. That ensures that all & are properly entity-ised
- Making sure the number of " chars is even
- Making sure the number of < equals the number of >
Obviously #2 is flawed because when people type quotes in text they tend to enclose the quote with "", which still results in an even count.
#1 and #3 seem to be fairly sound though, if not technically accurate/complete.
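For illustration, the three checks amount to roughly the following. This is a Python sketch only, not the CMS's actual code; the function name `primitive_checks` is made up for the example, and it assumes check #1 compares the count of & against the count of &amp;.

```python
def primitive_checks(text: str) -> dict:
    """Run the three character-count checks and report each result."""
    return {
        # Check 1: every "&" should be the start of "&amp;".
        "ampersands_escaped": text.count("&") == text.count("&amp;"),
        # Check 2: double quotes should come in pairs.
        "quotes_even": text.count('"') % 2 == 0,
        # Check 3: as many "<" as ">".
        "brackets_balanced": text.count("<") == text.count(">"),
    }


if __name__ == "__main__":
    # Well-formed input passes all three checks.
    print(primitive_checks("<p>Fish &amp; chips</p>"))
    # Mis-nested tags also pass, because only counts are compared.
    print(primitive_checks("<b><i>oops</b></i>"))
    # A legitimate entity other than &amp; fails check 1.
    print(primitive_checks("<p>a&nbsp;b</p>"))
```

The second and third calls show why the counts are not technically complete: mis-nested tags still balance, and any entity other than &amp; trips the first check.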
Does anyone know of any methods/classes out there that can interface with the W3C validator and return true/false depending on whether the markup is valid?
I was thinking about using curl to pass a link to the page to the validator and scan the returned content for text like "x Errors:" or something.
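A rough sketch of that scraping idea, in Python rather than curl. Everything specific here is an assumption: the validator address and its `uri` query parameter, the exact wording of the error count in the results page, and the helper names `fetch_validator_page` and `count_reported_errors`, which are invented for the example.

```python
import re
import urllib.parse
import urllib.request
from typing import Optional

# Assumed validator address; adjust to whatever the validator really uses.
VALIDATOR_URL = "https://validator.w3.org/check"


def fetch_validator_page(page_url: str, validator_url: str = VALIDATOR_URL) -> str:
    """Ask the validator to check page_url and return its results page as text."""
    query = urllib.parse.urlencode({"uri": page_url})  # "uri" is an assumed parameter name
    with urllib.request.urlopen(f"{validator_url}?{query}", timeout=30) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def count_reported_errors(page_text: str) -> Optional[int]:
    """Scan the results page for text like "3 Errors".

    Returns the first count found, or None if no such text is present,
    in which case validity is unknown rather than confirmed.
    """
    match = re.search(r"(\d+)\s+Errors?\b", page_text, flags=re.IGNORECASE)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    # Offline demonstration on made-up result text (no network access needed).
    print(count_reported_errors("<h2>3 Errors, 1 warning</h2>"))
    print(count_reported_errors("Nothing recognisable here"))
```

Scraping like this breaks as soon as the wording of the results page changes, which is the main reason a proper machine-readable interface would be preferable.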
There must be a better way, surely?
Does the W3C offer an XML-RPC interface, or an API of any kind?
Thanks, B