HTMLPurifier 1.0.0 stable released


neophyte
DevNet Resident
Posts: 1537
Joined: Tue Jan 20, 2004 4:58 pm
Location: Minnesota

Post by neophyte »

Nice work AC! Does it support multiple DTDs (Transitional or Strict)?

I was playing with your "test it" box.

I tried this: an opening b tag, but with the b left out of the closing tag.

Code:

<b><font>Whatever</font></>
It gave me this as the source output:

Code:

<b><span>Whatever</span>></b>
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

It only supports Transitional currently; Strict is coming soon. There's already quite a bit of code for getting Strict to work. For instance, the font -> span conversion you saw is element transformation code that turns font tags into spans with CSS styling.

Other than that, everything worked as expected (the HTML is well-formed) except for that stray greater-than sign, which comes from the parser. We are filtering, after all, not trying to guess what the user meant.
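
To make the font -> span idea concrete, here is a minimal sketch in Python. It is not HTML Purifier's actual code: the function name font_to_span and the size table are assumptions made up for illustration, and the library's real mapping may differ. It only shows the general shape of an element transformation, where presentational attributes are moved into a style attribute.

```python
# Illustrative size table only (an assumption, not the library's real mapping).
FONT_SIZE_TO_CSS = {
    "1": "xx-small", "2": "small", "3": "medium",
    "4": "large", "5": "x-large", "6": "xx-large",
}

def font_to_span(attrs):
    """Return (tag_name, new_attrs) for the attributes of a <font> element."""
    declarations = []
    if "color" in attrs:
        declarations.append(f"color:{attrs['color']};")
    if "face" in attrs:
        declarations.append(f"font-family:{attrs['face']};")
    size = attrs.get("size")
    if size in FONT_SIZE_TO_CSS:
        declarations.append(f"font-size:{FONT_SIZE_TO_CSS[size]};")

    # Drop the presentational attributes; keep everything else untouched.
    new_attrs = {k: v for k, v in attrs.items()
                 if k not in ("color", "face", "size")}
    if declarations:
        # Generated declarations go first, any existing style is kept after them.
        new_attrs["style"] = "".join(declarations) + new_attrs.get("style", "")
    return "span", new_attrs
```

A bare font tag with no attributes, like the one in the test above, simply becomes a bare span, which matches the output neophyte saw.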

Post by Ambush Commander »

Okay, the trunk version now supports (X)HTML Strict. Test it out here: HTML Purifier live demo.

Post by Ambush Commander »

1.3.0 released. Lots of goodies:

* (X)HTML Strict now supported
* You can arbitrarily define which elements and attributes to allow by using %HTML.AllowedElements and %HTML.AllowedAttributes.
* Invalid images are now removed, rather than replaced with a dud <img src="" alt="Invalid image" /> image (which still results in an extra HTTP request). Revert to the previous behavior by setting %Core.RemoveInvalidImg to false.
* Rudimentary URI host blacklisting implemented with %URI.HostBlacklist.
* New directive %URI.Munge munges URIs, so you can route links through some sort of redirector service to avoid PageRank leaks or to warn users that they are leaving your site.
* <li value="4"> and <ol start="2"> now allowed in loose mode.
* These new configuration directives: %HTML.BlockWrapper, %HTML.Parent, %URI.DisableExternalResources, %URI.DisableResources and %Attr.DisableURI. Find out about these options and more in the configuration documentation.
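
For anyone wondering what host blacklisting and URI munging amount to, here is a rough Python sketch of the idea. It is not the library's implementation. The function names, the substring-style host match, and the %s placeholder in the redirector template are all assumptions for illustration, and redirect.example is a made-up host.

```python
from urllib.parse import quote, urlsplit

def is_blacklisted(uri, host_blacklist):
    """True if the URI's host contains any blacklisted fragment (assumed rule)."""
    host = urlsplit(uri).hostname or ""
    return any(fragment in host for fragment in host_blacklist)

def munge(uri, template):
    """Wrap the URI in a redirector URL; %s marks where the encoded URI goes."""
    return template.replace("%s", quote(uri, safe=""))

def filter_uri(uri, host_blacklist, munge_template=None):
    """Return the URI to emit, or None if the link should be dropped."""
    if is_blacklisted(uri, host_blacklist):
        return None
    # Only munge URIs that point at a host; relative links stay as they are.
    if munge_template and urlsplit(uri).hostname:
        return munge(uri, munge_template)
    return uri
```

With a template like http://redirect.example/?u=%s, every outbound link is rewritten to pass through the redirector, while relative links to your own site are left alone.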

d11wtq, since you use HTML Purifier for cleaning up emails, you may be especially interested in %URI.DisableResources, i.e. blocking external images.
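
As a rough picture of what resource blocking means for the email case, here is a small Python sketch. It is an assumption based on the directive names, not the library's code: resource_allowed, own_host and the two flags are hypothetical, and the check is meant for embedded resources such as an img src, not for ordinary links.

```python
from urllib.parse import urlsplit

def resource_allowed(uri, own_host,
                     disable_resources=False,
                     disable_external_resources=False):
    """Decide whether an embedded resource URI (e.g. an img src) may stay."""
    if disable_resources:
        # Assumed meaning: no embedded resources at all.
        return False
    host = urlsplit(uri).hostname
    if disable_external_resources and host is not None and host != own_host:
        # Assumed meaning: only resources served from our own host survive.
        return False
    return True
```

In a mail client this kind of check is what stops a remote tracking image from being fetched when the message is opened.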