Tracking memory usage

Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

Developers often spend time ironing out slow sections of applications, and profiling is a good way of finding where problems like this crop up. However, far less attention is paid to the memory usage of an application.

In my case, I'm attempting to tune an algorithm that filters HTML (HTML Purifier). Anecdotal reports say that HTML Purifier can use up to a hundred times as much memory as the input data occupies. This is simply unacceptable. A lot of it has to do with the overhead of PHP and HTML Purifier, but there may be some lengthy strings that aren't being properly cleaned up by PHP's built-in garbage collection.

Besides performing a complete code audit, is there a utility similar to a profiler that will let me figure out where in the execution the maximum memory is reached?
Selkirk
Forum Commoner
Posts: 41
Joined: Sat Aug 23, 2003 10:55 am
Location: Michigan

Post by Selkirk »

Take a look at execution traces from XDebug.

Please let us know how it goes and what you find.
Selkirk
Forum Commoner
Posts: 41
Joined: Sat Aug 23, 2003 10:55 am
Location: Michigan

Post by Selkirk »

Wow. I ran the following program:

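Code:

<?php
require_once 'HTMLPurifier.php';

// Purify an empty string so that only the library's fixed overhead is loaded.
$purifier = new HTMLPurifier();
$result = $purifier->purify("");

// Count every file PHP has included so far, this script included.
echo count(get_included_files());

The result of which for me was 130! Holy cow, that's a lot of files. Is that right?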

Peak memory usage for the above program was 3.2MB.

I didn't spend any time on this, so I might have screwed something up. I just wanted to see what the traces looked like in XDebug 2.
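
For anyone who wants a comparable peak figure without a tracer, here is a minimal sketch using PHP's built-in memory_get_usage() and memory_get_peak_usage(). The measure_peak() helper and the stand-in workload are hypothetical and not part of HTML Purifier; the idea would be to swap the closure body for the purify() call. Note that these functions only see memory handled by PHP's own allocator.

```php
<?php
/**
 * Run $work and report how much memory it added, as seen by PHP's allocator.
 * measure_peak() is a hypothetical helper name, not a PHP or HTML Purifier API.
 */
function measure_peak(callable $work): array
{
    $before = memory_get_usage();
    $result = $work();
    $after = memory_get_usage();
    // The peak is tracked for the whole script, so it can be far above $after
    // when $work allocated temporaries that were freed before it returned.
    $peak = memory_get_peak_usage();

    return [
        'retained_bytes' => $after - $before,
        'peak_bytes' => $peak,
        'result' => $result,
    ];
}

// Stand-in workload: build a large array of strings, return only its size.
$stats = measure_peak(function () {
    $tokens = [];
    for ($i = 0; $i < 50000; $i++) {
        $tokens[] = str_repeat('x', 20) . $i;
    }
    return count($tokens);
});

printf(
    "retained: %d bytes, peak: %d bytes\n",
    $stats['retained_bytes'],
    $stats['peak_bytes']
);
```

The gap between the two numbers is the interesting part: a small retained figure next to a large peak suggests short-lived intermediate structures rather than a leak.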
sike
Forum Commoner
Posts: 84
Joined: Wed Aug 02, 2006 8:33 am

Post by sike »

I have used XDebug and WinCacheGrind (or its Linux counterpart) a lot lately to profile a huge PHP application successfully, so I'll second Selkirk's recommendation (:

cheers
Chris
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

Played around with XDebug's traces. They have lots of info, but I can't see any way to practically apply it yet. It looks like I'm going to have to build a trace parser or something.
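
A rough sketch of what such a trace parser might look like, assuming the trace was written in Xdebug's tab-separated "computerized" format (xdebug.trace_format=1), where each record starts with level, function number, an entry/exit flag, time index and memory usage. The find_memory_peak() name and the sample records are made up for illustration; real trace files may carry extra fields that this ignores.

```php
<?php
/**
 * Find the record with the highest memory reading in an Xdebug trace.
 *
 * Assumed record layout (tab-separated, xdebug.trace_format=1):
 *   level, function #, 0 (entry) or 1 (exit), time, memory, and on entry
 *   records also the function name, user-defined flag, include, file, line.
 * find_memory_peak() is a hypothetical helper, not part of Xdebug.
 */
function find_memory_peak(iterable $lines): ?array
{
    $names = [];   // function # => function name, filled from entry records
    $peak = null;

    foreach ($lines as $line) {
        $f = explode("\t", rtrim($line, "\r\n"));
        // Skip headers, footers, and anything that is not an entry/exit record.
        if (count($f) < 5 || !ctype_digit($f[0]) || ($f[2] !== '0' && $f[2] !== '1')) {
            continue;
        }
        if ($f[2] === '0' && isset($f[5])) {
            $names[$f[1]] = $f[5];
        }
        $memory = (int) $f[4];
        if ($peak === null || $memory > $peak['memory']) {
            $peak = [
                'memory' => $memory,
                'time' => (float) $f[3],
                'function' => $names[$f[1]] ?? '(unknown)',
                'event' => $f[2] === '0' ? 'entry' : 'exit',
            ];
        }
    }
    return $peak;
}

// Made-up sample records for illustration only; real traces come from Xdebug.
$sample = [
    "Version: 2.0.0",
    "TRACE START [2007-01-01 00:00:00]",
    "1\t0\t0\t0.0004\t400000\t{main}\t1\t\t/tmp/test.php\t0\t0",
    "2\t1\t0\t0.0100\t410000\ttokenize\t1\t\t/tmp/test.php\t5\t0",
    "2\t1\t1\t0.9000\t620000",
    "2\t2\t0\t0.9100\t620500\tgenerate\t1\t\t/tmp/test.php\t6\t0",
    "2\t2\t1\t1.2000\t450000",
    "1\t0\t1\t1.2100\t400500",
    "TRACE END   [2007-01-01 00:00:02]",
];

$peak = find_memory_peak($sample);
printf(
    "peak %d bytes at %.4f s (%s of %s)\n",
    $peak['memory'],
    $peak['time'],
    $peak['event'],
    $peak['function']
);
```

Since the function accepts any iterable of lines, passing an SplFileObject opened on a real trace file should work without loading the whole file into memory. Collecting every (time, memory) pair instead of only the maximum would give the data for a memory-over-time graph.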
guitarlvr
Forum Contributor
Posts: 245
Joined: Wed Mar 21, 2007 10:35 pm

Post by guitarlvr »

Would PEAR's Benchmark be of any use to you?
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

No, because I am attempting to measure memory, not time.

Spiffy image:

[Image: graph of memory usage in bytes against execution time]

Bottom axis is execution time, left axis is memory usage in bytes. Input document size was 65.6 KB, which is large, but not outrageously so. Timing is not representative due to tracing. Peak memory usage is ~6.3 MB. The initial 4 MB is simply overhead from HTML Purifier's extensive OOP architecture: there's not much I can do about that. The next 2 MB are from the tokenized representation of the HTML.

Starting at 3 sec, our regular strategies, which are the real workhorses of the application, kick in, and the level of memory stays constant until 8.6 sec. At that point the HTML is generated and, somehow, the memory usage is a lot smaller. I suspect it's because I no longer have parallel copies of the arrays around, even though PHP is quite good about allocating new memory only when it absolutely needs to. Once that finishes, memory drops to the pre-parsing level of 4 MB, the library's overhead.
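
A minimal sketch of the copy-on-write behaviour alluded to above: assigning an array to a second variable should cost almost nothing, and the real duplication should only happen on the first write. The cow_demo() function is a hypothetical name for illustration, and the exact byte counts will vary by PHP version, so none are quoted here.

```php
<?php
/**
 * Measure what an array assignment costs versus the first write to the copy.
 * cow_demo() is a hypothetical helper for illustration.
 */
function cow_demo(int $size): array
{
    // Build the array at runtime so it is an ordinary refcounted value.
    $tokens = range(1, $size);

    $base = memory_get_usage();

    // Plain assignment: both variables share one underlying array.
    $copy = $tokens;
    $afterAssign = memory_get_usage();

    // The first write forces PHP to separate $copy into a real second array.
    $copy[] = 0;
    $afterWrite = memory_get_usage();

    return [
        'assign_bytes' => $afterAssign - $base,
        'write_bytes' => $afterWrite - $afterAssign,
    ];
}

$delta = cow_demo(100000);
printf("assignment added %d bytes\n", $delta['assign_bytes']);
printf("first write added %d bytes\n", $delta['write_bytes']);
```

If a strategy modifies a token array that another variable still refers to, the second figure is the price paid each time, which is one way parallel copies can accumulate.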

This is quite sobering, because it means that the token format that is the lifeblood of this application is extremely memory hungry. DOMDocument->loadHTML, on the other hand, miraculously adds only 1 KB to the application's footprint, which makes me think something fishy is going on: its memory isn't being caught by the tracer until PHP allocates memory for it locally. That effectively makes XDebug useless for this.

ARGH!
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

It shouldn't be overly difficult to stick a (real) profiler on PHP's executable itself. That would give you exact information on what PHP itself (and all its child libraries) is using.
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

In that case, I curse Windows for making it very nearly impossible to debug or even compile executables (I still have not gotten PHP to compile). And it wouldn't tell me who's using up all that memory either.
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

The last profiler I used didn't require a compilation to integrate itself, as far as I recall. It simply wrapped itself around the executable and analyzed the commands it would issue.
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

Googled: are you referring to this: http://support.microsoft.com/kb/q94209/ ?
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

I'm pretty sure we used Intel's.
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

Mmm... that looks expensive. I'll keep that in mind though.
kyberfabrikken
Forum Commoner
Posts: 84
Joined: Tue Jul 20, 2004 10:27 am

Post by kyberfabrikken »

Ambush Commander wrote: Played around with XDebug's traces. They have lots of info, but I can't see any way to practically apply it yet. It looks like I'm going to have to build a trace parser or something.
Try Xdebug + WinCacheGrind (As sike suggested).
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

I have used the combo before and it's quite powerful, but only in terms of profiling for execution time. It gives no information with regard to memory usage.