Memory usage reduced with short attribute names ?!?

Mark Baker
Forum Regular
Posts: 710
Joined: Thu Oct 30, 2008 6:24 pm

Memory usage reduced with short attribute names ?!?

Post by Mark Baker »

Can anybody explain why memory usage seems better when using shorter attribute names?

I have the following test class:

$loopCount = 1024 * 512;
 
error_reporting(E_ALL);
 
 
class testClass
{
    public $veryLongAttributeName0 = NULL;
    public $veryLongAttributeName1 = NULL;
    public $veryLongAttributeName2 = NULL;
    public $veryLongAttributeName3 = NULL;
    public $veryLongAttributeName4 = NULL;
    public $veryLongAttributeName5 = NULL;
    public $veryLongAttributeName6 = NULL;
    public $veryLongAttributeName7 = NULL;
    public $veryLongAttributeName8 = NULL;
    public $veryLongAttributeName9 = NULL;
 
 
    function __construct($libraryType = null, $fileID = null)
    {
    }
 
    function _destructor()
    {
    } // function destructor
 
}   //  class testClass
 
 
$callStartTime = microtime(true);
 
$testArray = array();
for ($i = 0; $i < $loopCount; ++$i) {
    $testArray[] = new testClass();
}
 
$callEndTime = microtime(true);
$callTime = $callEndTime - $callStartTime;
echo '<br />Call time to instantiate '.$loopCount.' objects of testClass was '.sprintf('%.4f',$callTime)." seconds<br />\n";
 
 
echo date('H:i:s').' Peak memory usage: '.(memory_get_peak_usage(true) / 1024 / 1024).' MB<br />';
which produces the following result:

Call time to instantiate 524288 objects of testClass was 3.4530 seconds
09:34:08 Peak memory usage: 494.75 MB
 
If I simply change the attribute names

    public $veryLongAttributeName0 = NULL;
    public $veryLongAttributeName1 = NULL;
    public $veryLongAttributeName2 = NULL;
    public $veryLongAttributeName3 = NULL;
    public $veryLongAttributeName4 = NULL;
    public $veryLongAttributeName5 = NULL;
    public $veryLongAttributeName6 = NULL;
    public $veryLongAttributeName7 = NULL;
    public $veryLongAttributeName8 = NULL;
    public $veryLongAttributeName9 = NULL;
 
to

    public $v0 = NULL;
    public $v1 = NULL;
    public $v2 = NULL;
    public $v3 = NULL;
    public $v4 = NULL;
    public $v5 = NULL;
    public $v6 = NULL;
    public $v7 = NULL;
    public $v8 = NULL;
    public $v9 = NULL;
 
I get the following result:

Call time to instantiate 524288 objects of testClass was 3.2260 seconds
09:37:46 Peak memory usage: 374.75 MB
 
It appears to run fractionally faster (although that's harder to determine), but uses significantly less memory (494.75 MB reduced to 374.75 MB)
That shouldn't be right... should it? Even in a semi-compiled language such as PHP, the memory usage (and possibly speed of execution) shouldn't be affected by the length of an attribute name.
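To check the effect on a given PHP build without waiting for half a million objects, a smaller side-by-side measurement may help. This is only a sketch: `bytesPerInstance()`, `LongNames` and `ShortNames` are hypothetical names that do not appear in the thread, and the figures will depend on the PHP version, since newer versions may store declared properties differently and show a smaller gap or none at all.

```php
<?php
// Hypothetical helper (not from the thread): approximate bytes used per instance,
// including the array slot that holds each object.
function bytesPerInstance($className, $count)
{
    $objects = array();
    $before = memory_get_usage();
    for ($i = 0; $i < $count; ++$i) {
        $objects[] = new $className();
    }
    return (memory_get_usage() - $before) / $count;
}

class LongNames
{
    public $veryLongAttributeName0 = null;
    public $veryLongAttributeName1 = null;
    public $veryLongAttributeName2 = null;
}

class ShortNames
{
    public $v0 = null;
    public $v1 = null;
    public $v2 = null;
}

$count = 10000;
printf("PHP %s\n", PHP_VERSION);
printf("LongNames:  %.1f bytes per instance\n", bytesPerInstance('LongNames', $count));
printf("ShortNames: %.1f bytes per instance\n", bytesPerInstance('ShortNames', $count));
```

Both classes go through the same helper, so the array overhead is the same for each and only the difference between the two numbers is meaningful.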
PHPHorizons
Forum Contributor
Posts: 175
Joined: Mon Sep 14, 2009 11:38 pm

Re: Memory usage reduced with short attribute names ?!?

Post by PHPHorizons »

It makes sense to me that longer variable names mean longer execution times.

How many times did you run the test?
It seems to me that running it a few hundred times is the only way to get conclusive results. If you only ran it once, the result is anecdotal.
pickle
Briney Mod
Posts: 6445
Joined: Mon Jan 19, 2004 6:11 pm
Location: 53.01N x 112.48W

Re: Memory usage reduced with short attribute names ?!?

Post by pickle »

Without knowing a whole lot about how the PHP interpreter works, I think I know why it takes more memory. When you instantiate the object, it also sets its properties. Obviously it will take more memory to store the string "veryLongAttributeName0" than "v0". My initial thought was that behind the scenes, PHP would convert both of them to internal memory pointers, and it probably does. However, it's possible the interpreter doesn't release the memory it initially used to load "veryLongAttributeName0" until after the script has completed. That would explain the memory usage. As for the time, it probably takes longer for PHP to interpret "veryLongAttributeName0" than "v0".
Real programmers don't comment their code. If it was hard to write, it should be hard to understand.
PHPHorizons
Forum Contributor
Posts: 175
Joined: Mon Sep 14, 2009 11:38 pm

Re: Memory usage reduced with short attribute names ?!?

Post by PHPHorizons »

If property names are not stored in a hash table, it would make callback functions very difficult to achieve.

array_map(array($this, 'veryLongPropertyName'), $some_array);
(That would be valid with lambda functions ;) )

It also shows that method names must have their actual names stored in a hash table as well. Unless of course every string instance of a method/property is converted. But that would probably be impossible to do.
Mark Baker
Forum Regular
Posts: 710
Joined: Thu Oct 30, 2008 6:24 pm

Re: Memory usage reduced with short attribute names ?!?

Post by Mark Baker »

I've run the tests a good few times now, using different versions of PHP and different operating platforms.... the memory differences are a pretty constant 120MB on Windows, and 100MB on Linux. And other people have replicated those results, so (while not definitive) it's a little more than simply anecdotal.


Thinking about this, I'm beginning to understand the reasoning behind it.

The PHP bytecode still needs to know the actual variable/attribute and function/method names for use in error handling and serialize() (among others): lambda functions are another good example.

I was assuming that for OOP, it would handle this slightly differently, and maintain a "class definition" with all the long name details, and each instance would just contain pointers to this name map; so that each instance would use a minimal amount of memory, and functions such as serialize would cross reference the data and pointers from the instance with the class name map to generate their output.

However, that could only work if all attributes were predefined in the class definition.... but PHP's loose coding rules allow you to define new attributes or even methods dynamically within a script.... against a specific instance of the class, so these couldn't exist in a "class map" in advance. Therefore, PHP takes the quick and dirty approach of holding the names within each instance.
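The point about attributes being added to one specific instance can be seen in a few lines. This is a minimal sketch that uses `stdClass` (an assumption made here so the example runs without notices on any PHP version); the property name `addedAtRuntime` is hypothetical.

```php
<?php
// stdClass is used because it accepts dynamic properties in all PHP versions.
$a = new stdClass();
$b = new stdClass();

$a->addedAtRuntime = 42;    // exists only on $a, not on the class or on $b

var_dump(property_exists($a, 'addedAtRuntime'));    // bool(true)
var_dump(property_exists($b, 'addedAtRuntime'));    // bool(false)
print_r(array_keys(get_object_vars($a)));           // the name is recoverable from the instance
```

Because `$a` and `$b` share a class but not a property list, at least the dynamically added names have to be tracked per instance.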


PHPExcel has an instantiated object for every cell in every worksheet in a workbook. With large Excel files, that can easily hit several million instantiated cell objects.... and yes, we do hit memory problems with large files that we've been working hard to alleviate.
We're already looking at a form of caching so that cell instances are only memory resident when they're actually needed.
That would reduce the problem, although it does have speed implications that we need to look at as well. If we can get cell caching working with minimal overhead, then it's less of an issue.... we'd be able to work with one instance of the cell object in memory at any given time, swapping attribute values in and out as necessary.
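One way to picture the "single cell instance, values swapped in and out" idea is sketched below. Every name here (`Cell`, `CellStore`, `get()`, `flush()`) is hypothetical and is not PHPExcel's actual API; the sketch only assumes that a cell's state can be flattened into a small array keyed by its coordinate.

```php
<?php
// Hypothetical names throughout; this is not PHPExcel's actual API.
class Cell
{
    public $value;
    public $dataType;
}

class CellStore
{
    private $rows = array();    // coordinate => compact array(value, dataType)
    private $cell;              // the single reusable Cell instance
    private $current = null;    // coordinate currently loaded into $cell

    public function __construct()
    {
        $this->cell = new Cell();
    }

    // Write the live cell's values back, then load the requested coordinate.
    public function get($coordinate)
    {
        $this->flush();
        $row = isset($this->rows[$coordinate]) ? $this->rows[$coordinate] : array(null, null);
        $this->cell->value = $row[0];
        $this->cell->dataType = $row[1];
        $this->current = $coordinate;
        return $this->cell;
    }

    // Store the live cell's values under the coordinate it was loaded for.
    public function flush()
    {
        if ($this->current !== null) {
            $this->rows[$this->current] = array($this->cell->value, $this->cell->dataType);
        }
    }
}

$store = new CellStore();
$store->get('A1')->value = 'hello';
$store->get('B2')->value = 123;
echo $store->get('A1')->value, "\n";    // prints "hello"
```

The trade-off is the one mentioned above: every access pays for a swap, and any reference a caller keeps to the returned object silently starts pointing at a different cell after the next `get()`.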
Mark Baker
Forum Regular
Posts: 710
Joined: Thu Oct 30, 2008 6:24 pm

Re: Memory usage reduced with short attribute names ?!?

Post by Mark Baker »

Thanks to everybody that has provided help and advice on this problem. Based on the suggestions that have been given, I've come up with the following code using magic getters/setters:

 
class testClass
{
    private static $_propertyList = array( 'longVariableName0',
                                    'longVariableName1',
                                    'longVariableName2',
                                    'longVariableName3',
                                    'longVariableName4',
                                    'longVariableName5',
                                    'longVariableName6',
                                    'longVariableName7',
                                    'longVariableName8',
                                    'longVariableName9'
                                  );
    private $_data = array();
 
 
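    public function __set($name, $value) {
        // Static properties need the $ prefix: self::$_propertyList
        $key = array_search($name, self::$_propertyList);
        if ($key !== false) {
            $this->_data[$key] = $value;
        }
    }
 
    public function __get($name) {
        $key = array_search($name, self::$_propertyList);
        if ($key !== false && isset($this->_data[$key])) {
            return $this->_data[$key];
        }
        return null;    // unknown name, or a property that has not been set yet
    }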
 
    function __construct()
    {
    }
 
    function _destructor()
    {
    } // function destructor
 
}   //  class testClass
 
Compared with my original script (running on the same server)
Original script with long property/attribute names:
Call time to instantiate 524288 objects of testClass was 3.1759 seconds
09:48:34 Peak memory usage: 494.75 MB
Using magic getters/setters, which allows us to retain long property/attribute names
Call time to instantiate 524288 objects of testClass was 1.8602 seconds
09:48:05 Peak memory usage: 150.75 MB

That's an incredible gain, and we're really grateful to everybody on PHP Developers Network and other forums who has helped us explain the cause of the problem, and provided us with a solution that not only gives us the ability to handle significantly larger volumes of data, but to do so with improved speed as well.

We can apply this technique to many of the classes within the library, which should allow us to handle workbooks up to 3 times the size that we can now, without any changes being required by developers who are using the library.
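A possible variation on the class above, shown with usage, keys the static list by name so each lookup is an `isset()` check rather than an `array_search()` scan. This is a sketch under assumptions, not the code that was benchmarked: `TestClassFlipped` and `$_propertyIndex` are hypothetical names, and only three properties are listed to keep it short.

```php
<?php
class TestClassFlipped    // hypothetical variation, not the author's class
{
    // name => slot index; static, so it is held once per class
    private static $_propertyIndex = array(
        'longVariableName0' => 0,
        'longVariableName1' => 1,
        'longVariableName2' => 2,
    );
    private $_data = array();

    public function __set($name, $value)
    {
        if (isset(self::$_propertyIndex[$name])) {
            $this->_data[self::$_propertyIndex[$name]] = $value;
        }
    }

    public function __get($name)
    {
        if (isset(self::$_propertyIndex[$name])) {
            $key = self::$_propertyIndex[$name];
            return isset($this->_data[$key]) ? $this->_data[$key] : null;
        }
        return null;
    }
}

$obj = new TestClassFlipped();
$obj->longVariableName1 = 'abc';
$obj->notDeclared = 'ignored';          // silently dropped by __set
var_dump($obj->longVariableName1);      // string(3) "abc"
var_dump($obj->longVariableName2);      // NULL
var_dump($obj->notDeclared);            // NULL
```

Code that uses the object still reads and writes the long names; only the per-instance storage is keyed by small integers.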
Eran
DevNet Master
Posts: 3549
Joined: Fri Jan 18, 2008 12:36 am
Location: Israel, ME

Re: Memory usage reduced with short attribute names ?!?

Post by Eran »

That's pretty incredible. Who would have thought magical methods could actually improve performance.
Mark Baker
Forum Regular
Posts: 710
Joined: Thu Oct 30, 2008 6:24 pm

Re: Memory usage reduced with short attribute names ?!?

Post by Mark Baker »

pytrin wrote:That's pretty incredible. Who would have thought magical methods could actually improve performance.
It's not going to work in every case.... it all comes down to the number of attributes and the length of their names, and the number of instances of the object that are being created.
The magic getter/setter methods do add time overhead against every access to read/write/test an attribute; but that is offset against a smaller memory footprint (with fewer calls to malloc when instantiating). The potential is there for improving performance, but it'll take a bit more effort using "real world" classes rather than my simplistic test class, and potentially a lot of additional code streamlining to gain real benefits.
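The per-access overhead can be timed separately from instantiation. The sketch below is illustrative only: `DirectProps`, `MagicProps` and `timeAccess()` are hypothetical names, `MagicProps` is a simplified stand-in that keys its data by name, and no output is shown because the timings depend entirely on the machine and PHP version.

```php
<?php
class DirectProps
{
    public $longVariableName0 = null;
}

class MagicProps    // simplified stand-in for a class using magic accessors
{
    private $_data = array();

    public function __set($name, $value)
    {
        $this->_data[$name] = $value;
    }

    public function __get($name)
    {
        return isset($this->_data[$name]) ? $this->_data[$name] : null;
    }
}

// Hypothetical helper: time $iterations write/read pairs on $object.
function timeAccess($object, $iterations)
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; ++$i) {
        $object->longVariableName0 = $i;
        $tmp = $object->longVariableName0;
    }
    return microtime(true) - $start;
}

$iterations = 200000;
printf("direct: %.4f s\n", timeAccess(new DirectProps(), $iterations));
printf("magic:  %.4f s\n", timeAccess(new MagicProps(), $iterations));
```

Running this alongside the instantiation test gives both sides of the trade-off for a given workload.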
onion2k
Jedi Mod
Posts: 5263
Joined: Tue Dec 21, 2004 5:03 pm
Location: usrlab.com

Re: Memory usage reduced with short attribute names ?!?

Post by onion2k »

It's nice to know, but in the real world if your script is instantiating half a million objects you should probably be rethinking your approach anyway.
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Re: Memory usage reduced with short attribute names ?!?

Post by Christopher »

pytrin wrote:That's pretty incredible. Who would have thought magical methods could actually improve performance.
This is really interesting.

However, it is not actually clear that it would improve performance. If you look at what was done, Mark shortened 10 property names by 21 characters each and then instantiated 524288 objects. If you calculate 10 * 21 * 524288 you get about 110 MB, which is about the difference in the original memory usage numbers. The times are for instantiation, not execution. It does not say whether code will run faster or slower once instantiated. I recall that magic methods are slower than properties.
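The estimate can be reproduced from the names in the test class. This sketch assumes one byte per character and one stored copy of each name per instance, which is the premise of the estimate rather than a confirmed detail of PHP's internals. Measured with `strlen()`, the two names differ by 20 characters rather than 21, which works out to exactly 100 MB when divided by 1024 * 1024, the figure reported earlier in the thread for Linux.

```php
<?php
$longName  = 'veryLongAttributeName0';
$shortName = 'v0';

$charsSaved = strlen($longName) - strlen($shortName);    // saving per property
$properties = 10;
$instances  = 1024 * 512;

$bytes = $charsSaved * $properties * $instances;

printf("%d characters saved per property\n", $charsSaved);          // 20
printf("%d bytes = %.1f MB\n", $bytes, $bytes / 1024 / 1024);       // 104857600 bytes = 100.0 MB
```

The measured differences (120 MB on Windows, 100 MB on Linux) are close enough to this figure to make per-instance name storage a plausible explanation, though the arithmetic alone does not prove it.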

The question is whether instantiating 524288 objects is a useful real world test? If you reduce the number of objects and increase the number of calls to setters/getters then instantiation time may become a small percentage of execution time.

Note that in his second example, if he changed _propertyList and _data to _p and _d he would save 7 MB of memory.
Mark Baker
Forum Regular
Posts: 710
Joined: Thu Oct 30, 2008 6:24 pm

Re: Memory usage reduced with short attribute names ?!?

Post by Mark Baker »

arborint wrote:
pytrin wrote:That's pretty incredible. Who would have thought magical methods could actually improve performance.
This is really interesting.
At the moment it's academic. Producing those results against my simple test class (with its excessively long attribute names) isn't the same as real world code. But the difference in the test class was enough to persuade me that it was an avenue worth exploring as a potential solution to a real world problem.
arborint wrote:However, it is not actually clear that it would improve performance. If you look at what was done, Mark shortened 10 property names by 21 characters each and then instantiated 524288 objects. If you calculate 10 * 21 * 524288 you get about 110 MB, which is about the difference in the original memory usage numbers. The times are for instantiation, not execution. It does not say whether code will run faster or slower once instantiated. I recall that magic methods are slower than properties.
In the real world, it isn't so cut and dried. The magic setter/getter methods do increase the code size, and the memory benefits of reducing the property names may not be sufficient to offset this additional code size.

You're right, times are for instantiation, because I was simply testing the memory footprint at this point - the initial issue was all about the difference in memory usage between long and short names - because we had been hitting memory problems.
While speed is important, our current problem is memory usage when people try to work with very large workbooks. This solution may slow the code, but if it reduces the memory footprint by a significant amount, then that might be an overhead which can be justified.
arborint wrote:The question is whether instantiating 524288 objects is a useful real world test? If you reduce the number of objects and increase the number of calls to setters/getters then instantiation time may become a small percentage of execution time.
No it isn't a useful real world test, and that's why I'm running additional tests using the method with our real world code, looking at where it might benefit, and what the trade offs are.
It isn't always practical to reduce the number of objects... at least, not without scrapping the OOP approach completely; and the mechanism may not be appropriate to some of our classes.

arborint wrote:Note that in his second example, if he changed _propertyList and _data to _p and _d he would save 7 MB of memory.
_propertyList is static, and it would seem from my experimentation that statics are maintained within the global namespace, so only a single copy exists, no matter how many instances there are of the object in which that static property is defined. Reducing _data to _d (or even d) would reduce memory usage still further.
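The single-copy behaviour of statics is easy to confirm. In this minimal sketch the class `WithStatic` and its property names are hypothetical; it shows that a static property belongs to the class and does not appear among an instance's own properties.

```php
<?php
class WithStatic
{
    public static $shared = array('longVariableName0', 'longVariableName1');
    public $own = array();
}

$a = new WithStatic();
$b = new WithStatic();

WithStatic::$shared[] = 'addedLater';    // one change to the class-level copy
$a->own[] = 'only on a';                 // changes $a alone

var_dump(count(WithStatic::$shared));                           // int(3)
var_dump(array_key_exists('shared', get_object_vars($a)));      // bool(false)
var_dump(count($b->own));                                       // int(0)
```

So the length of a static property's name is paid for once per class, while the name of an instance property such as `_data` could in principle be paid for once per object.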
josh
DevNet Master
Posts: 4872
Joined: Wed Feb 11, 2004 3:23 pm
Location: Palm beach, Florida

Re: Memory usage reduced with short attribute names ?!?

Post by josh »

What happens if you set a memory limit just below what you expect the script to use? Does lengthening your variable names trip the limit, or does PHP detect the limit getting closer and free up old unused memory?
BDKR
DevNet Resident
Posts: 1207
Joined: Sat Jun 08, 2002 1:24 pm
Location: Florida

Re: Memory usage reduced with short attribute names ?!?

Post by BDKR »

arborint wrote:
pytrin wrote: The question is whether instantiating 524288 objects is a useful real world test? If you reduce the number of objects and increase the number of calls to setters/getters then instantiation time may become a small percentage of execution time.
Isn't this the kind of circumstance where the Flyweight pattern could be of use?

The link below does a good job of explaining. :D
http://www.javacamp.org/designPattern/flyweight.html
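As a rough idea of how Flyweight might apply to spreadsheet cells, the sketch below shares one style object among all cells that use the same formatting. The classes `CellStyle`, `StyleFactory` and `Cell` are hypothetical and not taken from PHPExcel; which parts of a real cell can be shared is an assumption made for illustration.

```php
<?php
// Hypothetical classes for illustration; not taken from PHPExcel.
class CellStyle
{
    public $font;
    public $bold;

    public function __construct($font, $bold)
    {
        $this->font = $font;
        $this->bold = $bold;
    }
}

class StyleFactory
{
    private static $pool = array();

    // Return the one shared CellStyle for this font/bold combination.
    public static function get($font, $bold)
    {
        $key = $font . '|' . ($bold ? '1' : '0');
        if (!isset(self::$pool[$key])) {
            self::$pool[$key] = new CellStyle($font, $bold);
        }
        return self::$pool[$key];
    }

    public static function poolSize()
    {
        return count(self::$pool);
    }
}

class Cell
{
    public $value;    // unique to each cell
    public $style;    // shared flyweight

    public function __construct($value, CellStyle $style)
    {
        $this->value = $value;
        $this->style = $style;
    }
}

$cells = array();
for ($i = 0; $i < 1000; ++$i) {
    $cells[] = new Cell($i, StyleFactory::get('Arial', $i % 2 === 0));
}
echo count($cells), " cells share ", StyleFactory::poolSize(), " style objects\n";
// prints "1000 cells share 2 style objects"
```

This reduces the size of each cell rather than the number of cells, so it would complement the caching and short-name approaches discussed above rather than replace them.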