Tracking memory usage
Posted: Sun Jul 29, 2007 8:37 pm
by Ambush Commander
Developers often spend time ironing out slow sections of applications, and profiling is a good way of finding out when problems like this crop up. However, a lot less attention is paid to the memory usage of an application.
In my case, I'm attempting to tune an algorithm that filters HTML (HTML Purifier). Anecdotal reports say that HTML Purifier can use up to a hundred times the amount of memory of the input data. This is simply unacceptable. A lot of it has to do with the overhead of PHP and HTML Purifier, but there may be some lengthy strings that aren't being properly cleaned up by PHP's built-in garbage collection.
Besides performing a complete code audit, is there a utility similar to a profiler that will let me figure out where in the execution the maximum memory is reached?
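Short of a dedicated tool, one low-tech way to approach this question is to record PHP's own memory counters at labelled points in the script. The sketch below assumes PHP 5.2 or later (for memory_get_peak_usage()). The helper name memory_checkpoint() and the stand-in workload are hypothetical and are not part of HTML Purifier.

```php
<?php
// Hypothetical helper: append the current and peak memory readings to $log.
function memory_checkpoint($label, array &$log)
{
    $log[] = array(
        'label'   => $label,
        'current' => memory_get_usage(),      // bytes PHP has allocated right now
        'peak'    => memory_get_peak_usage(), // highest reading so far in this request
    );
}

$log = array();
memory_checkpoint('start', $log);

// Stand-in workload: build a large array of strings, then release it.
$tokens = array();
for ($i = 0; $i < 50000; $i++) {
    $tokens[] = 'token ' . $i;
}
memory_checkpoint('after build', $log);

unset($tokens);
memory_checkpoint('after unset', $log);

// Print one line per checkpoint.
foreach ($log as $row) {
    printf("%-12s current=%d peak=%d\n", $row['label'], $row['current'], $row['peak']);
}
```

Placing checkpoints around each major stage narrows down which stage pushes the peak up, although it only sees memory that PHP's own allocator accounts for.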
Posted: Sun Jul 29, 2007 10:11 pm
by Selkirk
Take a look at execution traces from XDebug.
Please let us know how it goes and what you find.
Posted: Sun Jul 29, 2007 11:29 pm
by Selkirk
Wow. I ran the following program:
Code:
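<?php
require_once 'HTMLPurifier.php';
$purifier = new HTMLPurifier();
// Purify an empty string so that only the library's loading overhead is exercised.
$result = $purifier->purify("");
// Count every file PHP has included so far.
echo count(get_included_files());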
The result of which for me was 130! Holy cow, that's a lot of files. Is that right?
Peak memory usage for the above program was 3.2MB.
I didn't spend any time on this, so I might have screwed something up. I just wanted to see what the traces looked like in XDebug 2.
Posted: Mon Jul 30, 2007 2:09 am
by sike
i have used XDebug and WinCachegrind (or its linux counterpart) a lot lately to successfully profile a huge php application, so i'll second Selkirk's recommendation (:
cheers
Chris
Posted: Mon Jul 30, 2007 4:02 pm
by Ambush Commander
Played around with XDebug's traces, they have lots of info, but I can't see any way to practically apply it yet. It looks like I'm going to have to build a trace parser or something.
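A trace parser along those lines could be sketched as follows. It assumes the trace was written in Xdebug's tab-separated "computerized" format, with the columns level, call id, 0 for entry or 1 for exit, time index, and memory usage, followed on entry records by the function name. That column layout is an assumption to check against the installed Xdebug version. The function name find_peak_memory() and the sample records are made up for illustration.

```php
<?php
// Sketch: find the record at which memory usage peaks in an Xdebug trace.
// Assumed record layout (tab-separated):
//   level, call id, 0 (entry) or 1 (exit), time, memory, function name, ...
function find_peak_memory(array $lines)
{
    $peak  = array('memory' => -1, 'time' => null, 'function' => null);
    $names = array(); // call id => function name, so exit records can be labelled

    foreach ($lines as $line) {
        $f = explode("\t", rtrim($line, "\r\n"));
        // Skip header/footer lines and anything that is not an entry or exit record.
        if (count($f) < 5 || ($f[2] !== '0' && $f[2] !== '1') || !is_numeric($f[4])) {
            continue;
        }
        if ($f[2] === '0' && isset($f[5])) {
            $names[$f[1]] = $f[5];
        }
        $memory = (int) $f[4];
        if ($memory > $peak['memory']) {
            $peak = array(
                'memory'   => $memory,
                'time'     => (float) $f[3],
                'function' => isset($names[$f[1]]) ? $names[$f[1]] : null,
            );
        }
    }
    return $peak;
}

// Made-up sample records; a real run would read the lines of a trace file with file().
$sample = array(
    "Version: 2.0.0",
    "TRACE START [2007-07-30 16:02:00]",
    "1\t0\t0\t0.0003\t50000\t{main}\t1\t\ttest.php\t0",
    "2\t1\t0\t0.0100\t4000000\ttokenize\t1\t\ttest.php\t5",
    "2\t1\t1\t3.0000\t6300000",
    "2\t2\t0\t3.0100\t6200000\tgenerate\t1\t\ttest.php\t6",
    "2\t2\t1\t8.6000\t4100000",
    "1\t0\t1\t8.7000\t4000000",
);

$peak = find_peak_memory($sample);
printf("peak %d bytes at %.4f s in %s\n", $peak['memory'], $peak['time'], $peak['function']);
// prints "peak 6300000 bytes at 3.0000 s in tokenize"
```

Because every record carries a time index and a memory reading, the same loop could just as easily emit time/memory pairs for plotting instead of keeping only the maximum.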
Posted: Mon Jul 30, 2007 4:10 pm
by guitarlvr
Would PEAR's benchmark be of any use to you?
Posted: Mon Jul 30, 2007 7:54 pm
by Ambush Commander
No, because I am attempting to measure memory, not time.
Spiffy image:
Bottom axis is execution time, left axis is memory usage in bytes. Input document size was 65.6 KB, which is large, but not outrageously so. Timing is not representative due to tracing. Peak memory usage is ~6.3 MB. The initial 4 MB is simply overhead from HTML Purifier's extensive OOP architecture: there's not much I can do about that. The next 2 MB are from the tokenized representation of the HTML.
Starting at 3 sec, our regular strategies, which are the real workhorses of the application, kick in, and the level of memory stays constant, until 8.6 sec, when the HTML is to be generated and, somehow, the memory usage is a lot smaller (I suspect it's because I don't have parallel copies of the arrays running, even though PHP's quite good about re-allocating memory only when it absolutely needs to). Once that finishes, memory drops to pre-parsing levels of 4 MB, the library's overhead.
This is quite sobering, because it means that the token format that represents the life-blood of this application is extremely memory hungry. DOMDocument->loadHTML, on the other hand, miraculously adds only 1 KB to the application's footprint, which makes me think something fishy is going on: it isn't being caught by the tracer until PHP allocates memory for it locally. Which effectively makes XDebug useless.
ARGH!
Posted: Mon Jul 30, 2007 7:59 pm
by feyd
It shouldn't be overly difficult to stick a (real) profiler on PHP's executable itself. That would give you exact information on what PHP itself (and all its child libraries) are using.
Posted: Mon Jul 30, 2007 8:01 pm
by Ambush Commander
In that case, I curse Windows for making it very nearly impossible to debug or even compile executables (I still have not gotten PHP to compile). And it wouldn't tell me who's using up all that memory either.
Posted: Mon Jul 30, 2007 8:06 pm
by feyd
The last profiler I used didn't require compilation to integrate itself, as far as I recall. It simply wrapped itself around the executable and analyzed the commands it would issue.
Posted: Mon Jul 30, 2007 8:10 pm
by Ambush Commander
Posted: Mon Jul 30, 2007 8:17 pm
by feyd
I'm pretty sure we used Intel's.
Posted: Mon Jul 30, 2007 8:18 pm
by Ambush Commander
Mmm... that looks expensive. I'll keep that in mind though.
Posted: Tue Jul 31, 2007 2:50 am
by kyberfabrikken
Ambush Commander wrote: Played around with XDebug's traces, they have lots of info, but I can't see any way to practically apply it yet. It looks like I'm going to have to build a trace parser or something.
Try Xdebug + WinCacheGrind (As sike suggested).
Posted: Tue Jul 31, 2007 6:57 am
by Ambush Commander
I have used the combo before and it's quite powerful, but only in terms of profiling for execution time. It gives no information with regards to memory usage.