Finding Memory Leaks
Hi Forum,
I've written an import script for a fairly large image database. The script iterates over 20,000 database entries and as many
lines of a CSV file, creates some objects and updates all database entries, all in one big for-loop.
My problem is that every iteration takes about 20k of extra memory, eating up all the memory I feed it.
I tried to debug it using "get_defined_vars()", but it returns the same number of variables every time. (I think this is because
get_defined_vars only returns accessible variables.)
Now I'm a little out of options. What can I do to find the root of the problem? Is there a way to output _every_ memory-eating bastard
or to access the symbol tables directly and look for stuff which shouldn't be there? Or are there any PHP memory debuggers which could
help me?
Help would be very much appreciated.
Best regards
PePa
(PS: Please forgive me if this is the wrong forum, but I think the question goes fairly deep.)
Hi ReDucTor,
Thanks for the offer, but I can't do that. The import code is pretty simple, but the framework it uses isn't. It would involve quite a few thousand lines of code and roughly 10 classes. But perhaps some pseudo-code will illustrate the problem. (The application is an online image database, http://www.altrofoto.de, so the objects are images.)
"Image" is the big thing. Image is composed of many other classes (with references in all directions).
Code: Select all
open file.csv
for( $line of file.csv ){
    // Image from csv
    $i = new Image();
    $i->readFrom( $line );
    // Image from db
    $idb = (Image) select image from imagedb where $i.id = id
    case( compare $i $idb ){
        changed: update $i in database
        new: insert $i in database
    }
    do some more db things to connect images with other stuff
    throw away all stuff whose scope ends here, of course.
}
My guess was that some references didn't get released, so the garbage collection (does PHP have garbage collection anyway?)
can't free the memory. But all attempts to unset references have failed so far. The one thing I need is some tool/trick/command which
will tell me what variables PHP knows of and why it won't forget them...
Best regards
PePa
Hi pepa,
Incidentally, I'm currently working on a large PHP-based framework as well, which at one point began leaking lots of memory. It took two days of heavy profiling and debugging to solve the problem. There are a few things to look for, also with regard to your description.
- Database query results should be freed as soon as they are not needed anymore; especially in a loop with many iterations.
- Objects with circular references leak memory, due to a PHP bug (http://bugs.php.net/bug.php?id=33595). If you have a parent object containing a child object which has a back-reference to its parent, both objects cannot be deleted by PHP's garbage collector. To solve this problem, you can write a destructor method for your child class, which unsets the parent reference. Make explicit use of unset where you temporarily create such ring-reference objects. Leaving the scope will litter memory, this also applies to loops:
Code: Select all
while(condition) $a = new RingRefObject;
- create_function litters the global scope with "permanent" functions. They are not discarded when the scope of invocation is left. So be careful in making use of this function.
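The ring-reference point above can be sketched roughly as follows. `ParentNode`, `ChildNode`, `release()` and `buildRing()` are hypothetical names made up for illustration, not taken from any real framework. The sketch assumes PHP 5.3 or later, where `gc_disable()` exists; it is called here only to imitate the pure reference-counting behaviour described in this thread. Note that a destructor cannot run while the ring still holds the object alive, so the sketch breaks the ring with an explicit method instead.

```php
<?php
// Hypothetical classes for illustration only.
class ParentNode
{
    public $children = array();

    public function addChild(ChildNode $child)
    {
        $child->parent = $this;       // back-reference closes the ring
        $this->children[] = $child;
    }

    // Break the ring explicitly so plain reference counting can free both objects.
    public function release()
    {
        foreach ($this->children as $child) {
            $child->parent = null;
        }
        $this->children = array();
    }
}

class ChildNode
{
    public $parent = null;
    public static $destroyed = 0;     // number of destructor calls so far

    public function __destruct()
    {
        self::$destroyed++;
    }
}

function buildRing()
{
    $p = new ParentNode();
    $p->addChild(new ChildNode());
    return $p;
}

gc_disable();                         // assumption: PHP >= 5.3; mimics refcount-only cleanup

$p = buildRing();
unset($p);                            // ring is still intact, nothing is destroyed
$leaked = ChildNode::$destroyed;

$p = buildRing();
$p->release();                        // ring broken, child is destroyed right away
unset($p);
$freed = ChildNode::$destroyed;
```

In this sketch only the second ring is destroyed during the run; the first one stays in memory until the script ends, which matches the "messages appear at the end" symptom.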
Maybe there is a pointer in here that helps in your situation.
Good luck!
Hi Paw,
At last, someone who knows what I'm talking about.
To your post:
1. Are you talking about the single rows I iterate over (then there could be a problem), or about the complete result set (which I do indeed free)?
2. This is what I guessed, too. I spent quite some time going over all circular references and unsetting them, in the main object as well as in the sub-objects, to eliminate this possibility.
3. Luckily, this is something I haven't used in my project. So this isn't IT.
What really interests me is: what did you do to find your leaks? The only things I found are "memory_get_usage", to watch my RAM fade away, and "get_defined_vars", which tells me absolutely nothing ... What would be _very_ useful is "show_whats_eating_ram" or "show_all_known_vars", but someone forgot to put it in the documentation ...
Regards
PePa
pepa wrote: 1. Are you talking about the single rows I iterate over (then there could be a problem), or about the complete result set (which I do indeed free)?
I meant the whole result set. In our framework, we've got a database result wrapper which automatically frees the result on object destruction (if not already explicitly done by method invocation). Since these result objects hold references to database connection objects, they are normally freed at script termination, which is too late when many queries have been run.
PHP seems to prefer to do its major cleanups and __destruct calls at the end. However, if there are no circular references, destruction can also be forced during script execution by using unset. That's what I could see while analysing the problem.
pepa wrote: 2. This is what I guessed, too. I spent quite some time going over all circular references and unsetting them, in the main object as well as in the sub-objects, to eliminate this possibility.
<snip>
What really interests me is: what did you do to find your leaks? The only things I found are "memory_get_usage", to watch my RAM fade away, and "get_defined_vars", which tells me absolutely nothing ... What would be _very_ useful is "show_whats_eating_ram" or "show_all_known_vars", but someone forgot to put it in the documentation ...
Besides crying and cursing
At some point I was quite certain that the circular reference problem was a main cause of the memory leaks. So I tested the suspect classes by doing something like
Code: Select all
function __destruct() {
    echo get_class($this), ' just passed away<br>';
}
In our case, these messages appeared at the end of script execution. Some redesigns and explicit resource freeing/unset calls solved the leaking problem in most cases. However, a database data migration script which transforms thousands of various records using a quite complex ActiveRecord class still steadily consumes more and more memory during iteration, even though the "scripted" or intended object count stays the same. But this appears to be negligible, since normal website usage of the framework is stable now.
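The destructor trick can be taken one step further by counting live instances instead of echoing. The following is only a sketch: `Tracked`, `TrackedImage` and `report()` are hypothetical names, and a real class hierarchy would need to call the parent constructor and destructor wherever it defines its own.

```php
<?php
// Hypothetical base class: counts live instances per class name.
abstract class Tracked
{
    private static $live = array();

    public function __construct()
    {
        $name = get_class($this);
        if (!isset(self::$live[$name])) {
            self::$live[$name] = 0;
        }
        self::$live[$name]++;
    }

    public function __destruct()
    {
        self::$live[get_class($this)]--;
    }

    // Returns an array of class name => number of objects not yet destroyed.
    public static function report()
    {
        return self::$live;
    }
}

class TrackedImage extends Tracked {}   // stand-in for a real framework class

for ($n = 0; $n < 3; $n++) {
    $i = new TrackedImage();            // the previous object is freed on reassignment
}
$during = Tracked::report();            // one TrackedImage is still referenced by $i

unset($i);
$after = Tracked::report();             // nothing left alive
```

If the count for a class keeps growing from one iteration to the next, that class is a good candidate for a lingering reference.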
If your framework does not show memory-leaking behaviour in the context of a "normal web application", you could just break up your data processing script so that it processes fewer images per run.
EDIT: We had many memory-wasting problems in our framework. One of them was an XmlObject class, which builds tree structures. In order to free the occupied memory at run time, I had to write an explicit method that recursively unsets all references on destruction. So trees and other linked data structures are also something to look for.
And to answer your question about dedicated profiling tools: I'm afraid I have no experience with them myself. But as far as I know, they exist. Quick Google search: http://xdebug.org/
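Without a dedicated profiler, one low-tech option is to wrap suspect sections in memory_get_usage checkpoints and compare the growth. This is a rough sketch only; `memoryCheckpoint()` is a hypothetical helper, and the two loops merely simulate a harmless section and a leaking one.

```php
<?php
// Hypothetical helper: logs how much memory_get_usage() grew since the previous call.
function memoryCheckpoint($label, array &$log)
{
    static $last = null;
    $now = memory_get_usage();
    $log[] = array(
        'label' => $label,
        'delta' => $last === null ? 0 : $now - $last,
    );
    $last = $now;
}

$log  = array();
$kept = array();                        // simulated leak: everything stored here stays alive

memoryCheckpoint('start', $log);

for ($n = 0; $n < 1000; $n++) {
    $tmp = str_repeat('x', 1024);       // overwritten, and thereby freed, every iteration
}
memoryCheckpoint('temporary strings', $log);

for ($n = 0; $n < 1000; $n++) {
    $kept[] = str_repeat('x', 1024);    // retained across iterations
}
memoryCheckpoint('retained strings', $log);

foreach ($log as $entry) {
    echo $entry['label'], ': ', $entry['delta'], " bytes\n";
}
```

Moving the checkpoints closer together step by step narrows down which call is responsible for the growth.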
You could unset() each line after you insert it (or whatever you're doing).
I imported a CSV file of Canadian + US zip codes line by line, with close to a million records, and I didn't have a memory problem.
Just an annoyingly long script time.
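A line-by-line import along those lines might look like the sketch below. `importCsv()` and the `$handleRow` callback are hypothetical stand-ins for the real insert/update logic, the file contents are made up for the demo, and the closure syntax assumes PHP 5.3 or later.

```php
<?php
// Hypothetical importer: reads one CSV row at a time and hands it to a callback.
function importCsv($path, $handleRow)
{
    $fh = fopen($path, 'r');
    if ($fh === false) {
        throw new RuntimeException("cannot open $path");
    }
    $count = 0;
    while (($row = fgetcsv($fh, 0, ',', '"', '\\')) !== false) {
        if ($row === array(null)) {
            continue;                   // fgetcsv returns array(null) for a blank line
        }
        call_user_func($handleRow, $row);
        unset($row);                    // drop the row before reading the next one
        $count++;
    }
    fclose($fh);
    return $count;
}

// Demo with a small temporary file (made-up data).
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "1,a.jpg\n2,b.jpg\n\n3,c.jpg\n");

$ids = array();
$rows = importCsv($path, function ($row) use (&$ids) {
    $ids[] = $row[0];                   // stand-in for the database insert
});
unlink($path);
```

Only one row is held in memory at a time here, so any growth over a long run would have to come from whatever the callback keeps hold of.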
@Paw:
Ah. Good ol' crying and cursing. Tried that too. Didn't work either.
But explicitly monitoring the destruction is something I haven't done. I will give it a try and report back if it helps.
But before that, I now have a reason to upgrade to PHP 5. Hope it won't create too many new problems ...
@arborint:
I would do that, but the import has to do many other things too (convert explicit entries to foreign key references, convert units, ...)
@scottayy:
My guess is that it's not the (My)SQL calls but the OR mapper which causes the problem. One solution would be not to use it
(instead of using "new Image(); $i->readFromDb()" and "$i->writeToDb()" I could use only the MySQL
result and a hand-crafted insert query). But the main reason for using an ORM is not having to ...
Ah well, and yes, I tried unsetting my Image object after "$i->writeToDb()", but with no luck.
Regards,
PePa
So. Tried everything. Upgraded to PHP 5. Unset everything. Even single strings. Freed DB results. Installed Xdebug and got it running. Even tried crying again. Nothing.
There has to be a way to get PHP to tell me where it's wasting memory. My next step would be to reimplement the complete import in another programming language which doesn't have this kind of problem.
Regards,
PePa