Need help optimizing a block of code

Not for 'how-to' coding questions but PHP theory instead, this forum is here for those of us who wish to learn about design aspects of programming with PHP.

Moderator: General Moderators

nickvd
DevNet Resident
Posts: 1027
Joined: Thu Mar 10, 2005 5:27 pm
Location: Southern Ontario

Post by nickvd »

Ambush Commander wrote: With a few more tricks, I've managed to slash it down to 12%. HTMLPurifier still is slow, but it's not as slow, and I think I'll now start implementing a few more features. (Unless, of course, Feyd says otherwise).
Just curious, but how slow is slow? Taking this post's page as an example, how long does it take to get cleaned?
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

Page one of this forum topic, weighing 120KB, takes 6 seconds (note that the actual time spent for the server is about 20 seconds because the data has to be sent there and sent back).
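
For anyone who wants to reproduce a number like this on their own pages, here is a rough timing sketch. It assumes a reasonably modern PHP (7 or later); `time_filter` and `$filter` are hypothetical names for illustration, not part of HTMLPurifier, and `$filter` stands for whatever callable does the cleaning.

```php
<?php
// Hypothetical helper for illustration; not part of any library.
// $filter is any callable that accepts the HTML string to process.
function time_filter(callable $filter, string $input): array
{
    $start = microtime(true);          // wall-clock time as a float
    $filter($input);                   // run the filter; result is discarded
    $seconds = microtime(true) - $start;

    $kb = strlen($input) / 1024;       // input size in kilobytes
    return [
        'seconds'       => $seconds,
        'kb'            => $kb,
        // Guard against division by zero on very fast runs.
        'kb_per_second' => $seconds > 0 ? $kb / $seconds : INF,
    ];
}
```

By that measure, 120KB in 6 seconds works out to about 20KB per second. This only times the filtering itself on the server, not the transfer there and back.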

Bottom line is that for important stuff, you can't just drop it in: you'll also need to add a caching layer. That's the price you pay for power. We should get it fast enough to be on-demand for low-traffic sites, because caching is a very BIG change end-users will have to make for this to be viable.

Note that DevNetwork HTML is not a good standard to benchmark the library against: it's a lot larger than normal documents would be, and it also has loads and loads of tables (which are somewhat expensive to process).
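
To make the caching idea concrete, here is a minimal sketch of one way such a layer could look, assuming a modern PHP (7 or later) and a writable cache directory. Everything here is an assumption for illustration: `purify_cached`, `$purify`, and `$cacheDir` are hypothetical names, and `$purify` stands in for whatever call actually cleans the HTML.

```php
<?php
// Hypothetical helper: none of these names come from HTMLPurifier itself.
// $purify is any callable that takes dirty HTML and returns clean HTML.
function purify_cached(string $html, callable $purify, string $cacheDir): string
{
    // Key the cache on the input, so edited content is re-purified automatically.
    $file = $cacheDir . '/' . sha1($html) . '.html';

    if (is_file($file)) {
        $cached = file_get_contents($file);
        if ($cached !== false) {
            return $cached; // cache hit: skip the expensive purification
        }
    }

    $clean = $purify($html);

    // Write to a temp file and rename it, so readers never see a half-written entry.
    $tmp = tempnam($cacheDir, 'tmp');
    if ($tmp !== false && file_put_contents($tmp, $clean) !== false) {
        rename($tmp, $file);
    }
    return $clean;
}
```

Keying on a hash of the input means nothing has to be invalidated by hand when a post is edited. A real setup would probably also fold the filter's configuration or version into the key, so stale output is not served after an upgrade.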
nickvd
DevNet Resident
Posts: 1027
Joined: Thu Mar 10, 2005 5:27 pm
Location: Southern Ontario

Post by nickvd »

6 seconds for 120kb.. that's not at all slow. It's roughly 20 times the size of my average complete page, so unless you're serving thousands of hits per hour, it seems to be quite fast...
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

Well, the trouble is when you start serving moderately large documents. I had to write SLOW docs to give ideas on how to speed things up.
bg
Forum Contributor
Posts: 157
Joined: Fri Sep 12, 2003 11:01 am

Post by bg »

Ambush Commander wrote: Page one of this forum topic, weighing 120KB, takes 6 seconds (note that the actual time spent for the server is about 20 seconds because the data has to be sent there and sent back).

Bottom line is that for important stuff, you can't just drop it in: you'll also need to add a caching layer. That's the price you pay for power. We should get it fast enough to be on-demand for low-traffic sites, because caching is a very BIG change end-users will have to make for this to be viable.

Note that DevNetwork HTML is not a good standard to benchmark the library against: it's a lot larger than normal documents would be, and it also has loads and loads of tables (which are somewhat expensive to process).
120kb I assume includes images? Correct me if I'm wrong. PHP just isn't made for this kind of data grinding. Something like this is much better suited as a PHP extension that can be written in C.
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

bg wrote: 120kb I assume includes images? Correct me if I'm wrong.
118kb if I save the source to a file and check the file size. Still a lot.
bg wrote: PHP just isn't made for this kind of data grinding.
True, however...
bg wrote: Something like this is much better suited as a PHP extension that can be written in C.
Well, first of all, I don't know how to write C. :-( Second of all, if anyone wants to port this to C and make it a standard PHP extension, be my guest. A pure PHP solution will still achieve maximum portability, especially for people on shared hosting environments.
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

..which is why I made SHA256 purely in PHP too. :)

Sorry, [image]
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

No, that's on topic. The price you pay for abstraction and portability is performance.
bg
Forum Contributor
Posts: 157
Joined: Fri Sep 12, 2003 11:01 am

Post by bg »

Can you post the xdebug profile dump? I'd be interested in seeing it.
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

Hmm... I'd have to reprofile the older versions of code for pre-optimization dumps, but I can give you one taken after the optimization.

http://www.thewritingpot.com/media/cach ... 5675463.gz