Hi everyone
I'm designing a simple content management system for our website. It runs on our intranet Apache server and updates data stored on our host's MySQL server.
This is all working fine.
There will only be a couple of users that will have access to the system, but I'm looking for a way to make sure that they enter valid HTML into the CMS.
I already have some primitive checks on special characters:
- Comparing the number of & with the number of &amp; to make sure they match. That ensures that all & are properly entity-ised
- Making sure the number of " chars is even
- Making sure the number of < equals the number of >
Obviously #2 is flawed because when people type quotes in text they tend to enclose the quote with "", which still results in an even count.
#1 and #3 seem to be fairly sound though, if not technically accurate/complete.
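For anyone following along, the three counting checks described above could be sketched roughly like this. It is written in Python purely for illustration (the same logic ports directly to PHP); the function name `primitive_checks` and the result keys are made up for this example, and the ampersand check is loosened slightly to accept any well-formed entity rather than only &amp;.

```python
import re

# Assumption: any named, decimal, or hex entity counts as "properly entity-ised",
# not only &amp; as in the original description.
ENTITY = re.compile(r"&(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);")


def primitive_checks(html):
    """Run the three character-counting checks and report each result.

    Hypothetical helper: the name and the returned keys are illustrative only.
    """
    return {
        # Every & must be the start of a complete entity reference.
        "ampersands_escaped": html.count("&") == len(ENTITY.findall(html)),
        # An even number of double quotes (the check admitted to be flawed).
        "quotes_even": html.count('"') % 2 == 0,
        # As many opening angle brackets as closing ones.
        "angle_brackets_match": html.count("<") == html.count(">"),
    }


if __name__ == "__main__":
    print(primitive_checks("<p>Fish &amp; chips</p>"))
    # Badly nested markup still passes all three counts:
    print(primitive_checks("<b><i>oops</b></i>"))
```

The second call shows the limit of counting characters: wrongly nested tags pass every check, which is why a real parser or validator is still worth having.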
Anyone know of any methods/classes out there that can interface with the W3C validator and return me a true/false on whether the code is valid or not?
I was thinking about using curl to check a link to the page and scan the returned content for any text like "x Errors:" or something.
There must be a better way, surely?
Does the W3C offer an XML RPC interface? Or API or anything?
Thanks, B
HTML validation
- jayshields
- DevNet Resident
- Posts: 1912
- Joined: Mon Aug 22, 2005 12:11 pm
- Location: Leeds/Manchester, England
Re: HTML validation
I think HTMLPurifier might fulfill your needs. Check it out.
Re: HTML validation
I think htmlspecialchars will be helpful for you.
Check it out.
- Ambush Commander
- DevNet Master
- Posts: 3698
- Joined: Mon Oct 25, 2004 9:29 pm
- Location: New Jersey, US
Re: HTML validation
If you don't mind disallowing stuff like forms and scripts, HTML Purifier will do the job perfectly. Otherwise, you'll probably want to look at the following options:
- Remotely requesting the w3c validator service to figure out if the page is valid
- Running the input through HTML Tidy and seeing what happens
- Parsing it with DOM and then running a DTD validation on it
- Enabling HTML Purifier's trusted mode and seeing if that is good enough
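As a rough illustration of the "parse it and see what happens" idea in the list above, here is a minimal sketch that checks tag nesting with a real parser instead of counting characters. It uses Python's standard `html.parser` as a stand-in; it is not DTD validation and not what HTML Purifier or HTML Tidy do internally. The names `NestingChecker` and `is_well_nested` are invented for this example, and the check is deliberately strict (XHTML-style): it flags omitted closing tags such as an unclosed p, even where HTML allows them.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag in HTML.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class NestingChecker(HTMLParser):
    """Hypothetical helper: tracks open tags on a stack and records mismatches."""

    def __init__(self):
        super().__init__()
        self.stack = []      # currently open, non-void elements
        self.problems = []   # human-readable descriptions of mismatches

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.problems.append("unexpected </%s>" % tag)


def is_well_nested(html):
    """Return True only if every non-void tag is closed in the right order."""
    checker = NestingChecker()
    checker.feed(html)
    checker.close()
    return not checker.problems and not checker.stack


if __name__ == "__main__":
    print(is_well_nested("<p>Hello <b>world</b><br></p>"))
    print(is_well_nested("<b><i>oops</b></i>"))
```

This catches the wrongly nested case that simple character counts let through, but it says nothing about which elements or attributes are allowed where; that still needs a validator or a filter like HTML Purifier.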
Re: HTML validation
I applaud you for trying to do this yourself.
I would install tinyMCE or FCKeditor. So much easier in my opinion.
- JAB Creations
- DevNet Resident
- Posts: 2341
- Joined: Thu Jan 13, 2005 6:44 pm
- Location: Sarasota Florida
Re: HTML validation
PHP + cURL + W3C validator. Export the (X)HTML data to a temporary file, create the URL, validate the URL using cURL. It creates a lot less load and you can let the W3C update their validator instead of doing it yourself. 
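A minimal sketch of that approach, in Python rather than PHP purely for illustration. It assumes the W3C Markup Validator's `check` endpoint with a `uri` parameter and the `X-W3C-Validator-Status` response header described in the validator's API documentation; whether that endpoint still behaves this way for your doctype is something to verify before relying on it. The function names and the User-Agent string are made up for this example. Reading the status header avoids scraping the returned page for text like "x Errors".

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: the legacy W3C Markup Validator endpoint.
VALIDATOR = "https://validator.w3.org/check"


def build_check_url(page_url):
    """Build the validator URL for a publicly reachable page."""
    return VALIDATOR + "?" + urlencode({"uri": page_url})


def interpret_headers(headers):
    """Map the validator's status header to True, False, or None (unknown)."""
    status = headers.get("X-W3C-Validator-Status")
    if status == "Valid":
        return True
    if status == "Invalid":
        return False
    return None  # e.g. "Abort", or the header is missing


def validate_url(page_url, timeout=30):
    """Ask the validator about page_url; performs a real network request."""
    request = Request(build_check_url(page_url),
                      headers={"User-Agent": "cms-html-check/0.1"})  # hypothetical UA
    with urlopen(request, timeout=timeout) as response:
        return interpret_headers(response.headers)
```

Calling `validate_url("http://example.com/page.php?id=1")` would return True or False when the validator gives a verdict, and None when it cannot (for instance if the page is unreachable from outside the intranet).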
Re: HTML validation
Yeah, I'm thinking I'll definitely go for some sort of cURL implementation on this.
Although the page lives on our intranet, there is a script on our website which will also output the page, so I will just use that as the link to give to the validator. Then I'll just scan through the text of the returned page looking for something like [Invalid] or xx Errors.
Thanks for the info. I thought I'd check in case there was anything else I was missing.
Thanks, B