Ambush Commander wrote:With a few more tricks, I've managed to slash it down to 12%. HTMLPurifier still is slow, but it's not as slow, and I think I'll now start implementing a few more features. (Unless, of course, Feyd says otherwise).
Just curious, but how slow is slow? Taking, for example, this post's page, how long does it take to get cleaned?
Need help optimizing a block of code
Moderator: General Moderators
-
- DevNet Resident
- Posts: 1027
- Joined: Thu Mar 10, 2005 5:27 pm
- Location: Southern Ontario
- Contact:
- Ambush Commander
- DevNet Master
- Posts: 3698
- Joined: Mon Oct 25, 2004 9:29 pm
- Location: New Jersey, US
Page one of this forum topic, weighing 120KB, takes 6 seconds (note that the actual time spent for the server is about 20 seconds because the data has to be sent there and sent back).
Bottom line is that for important stuff, you can't just drop it in: you'll also need to add a caching layer. That's the price you pay for power. We should get it fast enough to be on-demand for low-traffic sites, because caching is a very BIG change end-users will have to make to have this be viable.
Note that DevNetwork HTML is not a good standard to benchmark the library against: it's a lot larger than normal documents would be, and it also has loads and loads of tables (which are somewhat expensive to process).
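To make the caching idea concrete, here is a minimal sketch of what such a layer could look like. It is written in Python only to keep it short; `PurifyCache` and `slow_purify` are hypothetical names made up for illustration and are not part of HTMLPurifier. The assumption is that cleaned output depends only on the raw input, so a hash of the input can serve as the cache key.

```python
import hashlib


class PurifyCache:
    """Cache cleaned HTML, keyed by a SHA-256 hash of the raw input."""

    def __init__(self, purify):
        self._purify = purify  # the slow cleaning function (hypothetical stand-in)
        self._store = {}       # in-memory only; a real site would use files or a database
        self.misses = 0        # counts how often the slow path actually ran

    def get(self, raw_html):
        key = hashlib.sha256(raw_html.encode("utf-8")).hexdigest()
        if key not in self._store:
            # Cache miss: pay the cleaning cost once and remember the result.
            self.misses += 1
            self._store[key] = self._purify(raw_html)
        return self._store[key]


def slow_purify(html):
    # Placeholder for the expensive purifier; NOT the HTMLPurifier API.
    return html.replace("<script>", "").replace("</script>", "")


cache = PurifyCache(slow_purify)
first = cache.get("<b>hi</b><script>x</script>")   # cleaned on this call
second = cache.get("<b>hi</b><script>x</script>")  # served from the cache
```

With something like this, the multi-second cost is only paid the first time a given post is seen (or after it is edited, since the hash changes), which is why it matters less for low-traffic sites.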
- Ambush Commander
- DevNet Master
- Posts: 3698
- Joined: Mon Oct 25, 2004 9:29 pm
- Location: New Jersey, US
Well, the trouble is when you start serving moderately large documents. I had to write SLOW docs to give ideas on how to speed things up.
Ambush Commander wrote:Page one of this forum topic, weighing 120KB, takes 6 seconds (note that the actual time spent for the server is about 20 seconds because the data has to be sent there and sent back).
Bottom line is that for important stuff, you can't just drop it in: you'll also need to add a caching layer. That's the price you pay for power. We should get it fast enough to be on-demand for low-traffic sites, because caching is a very BIG change end-users will have to make to have this be viable.
Note that DevNetwork HTML is not a good standard to benchmark the library to: it's a lot larger than normal documents would be, and it also has loads and loads of tables (which is a somewhat expensive operation).

120kb I assume includes images? Correct me if I'm wrong. PHP just isn't made for this kind of data grinding. Something like this is much better suited as a PHP extension that can be written in C.
- Ambush Commander
- DevNet Master
- Posts: 3698
- Joined: Mon Oct 25, 2004 9:29 pm
- Location: New Jersey, US
Quote: 120kb I assume includes images? Correct me if I'm wrong.
118kb if I save the source to a file and check that file size. Still a lot.
Quote: PHP just isn't made for this kind of data grinding.
True, however...
Quote: Something like this is much better suited as a PHP extension that can be written in C.
Well, first of all, I don't know how to write C.

- Ambush Commander
- DevNet Master
- Posts: 3698
- Joined: Mon Oct 25, 2004 9:29 pm
- Location: New Jersey, US
Hmm... I'd have to reprofile the older versions of code for pre-optimization dumps, but I can give you one taken after the optimization.
http://www.thewritingpot.com/media/cach ... 5675463.gz