Optimize complex dereferencing using references

Posted: Mon Apr 07, 2008 11:55 am
by AngusM
I'm wondering if it is sometimes faster to use references to reference data that is normally retrieved using complex dereferencing procedures.
I'm only a greenhorn PHP programmer, and a veteran intermediate C++ programmer. So it seemed to make sense to me to have a code block like this:

Code: Select all

$caveman = &$Bedrock["Flintstone"];
Bowl($caveman);
AttendWaterbuffaloMeeting($caveman);
FakeIllness($caveman);
...
I expected that dereferencing a reference 3 times would be much less expensive than dereferencing an associative array 3 times, since I assume that associative arrays are very expensive to dereference.
A colleague of mine, who is a novice PHP programmer, told me that in his experience this would not be optimal at all. I found this very hard to understand, assuming that under the hood a reference in PHP references the data directly. If my friend is right, that would imply something very primitive, like a PHP reference referencing the complex dereference operation itself, so that in dereferencing the reference 3 times the associative array is dereferenced 3 times! So how does it work?

Re: Optimize complex dereferencing using references

Posted: Mon Apr 07, 2008 1:29 pm
by Christopher
Yes, it was hard to get my head around coming from C/C++ too. The problem with scripting languages is that, unlike C/C++, they do all sorts of smart things under the hood. So if you try to be smart you can often make things slower. In your example, assignments are very common so they are optimized internally so that they are not always actually copying data -- sometimes they are internal references. References on the other hand are not very commonly used in PHP, so when you use them the internals are essentially working around all the optimizations.

It can be frustrating, but it is best to code it "straight" and then figure out bottlenecks later. Design right, code fast, optimize later.

Re: Optimize complex dereferencing using references

Posted: Mon Apr 07, 2008 1:36 pm
by AngusM
arborint wrote:Yes, it was hard to get my head around coming from C/C++ too. The problem with scripting languages is that, unlike C/C++, they do all sorts of smart things under the hood. So if you try to be smart you can often make things slower. In your example, assignments are very common so they are optimized internally so that they are not always actually copying data -- sometimes they are internal references. References on the other hand are not very commonly used in PHP, so when you use them the internals are essentially working around all the optimizations.

It can be frustrating, but it is best to code it "straight" and then figure out bottlenecks later. Design right, code fast, optimize later.
I'm afraid I don't quite get the bottom line of what you are saying. Which is more optimal? This:

Code: Select all

$caveman = &$Bedrock["Flintstone"];
Bowl($caveman);
AttendWaterbuffaloMeeting($caveman);
FakeIllness($caveman);
...
or this:

Code: Select all

Bowl($Bedrock["Flintstone"]);
AttendWaterbuffaloMeeting($Bedrock["Flintstone"]);
FakeIllness($Bedrock["Flintstone"]);
...
? How many associative dereferences are going to happen in each case? If the second is faster, why?

Re: Optimize complex dereferencing using references

Posted: Mon Apr 07, 2008 2:14 pm
by Ambush Commander
Since arrays are implemented with binary searches, the time for their usage in PHP is negligible. Nevertheless, I would go with the way that requires the least typing, which would involve accessing the array and allocating a temporary variable, which you then pass to each function. Under the hood, PHP will not reallocate the contents of $Bedrock["Flintstone"] until you change it, which, in your code sample, never happens.
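A rough way to check this on a given PHP build is to time the three variants side by side. The sketch below is only illustrative: the `visit` function, the `timeIt` helper, the sample data, and the iteration count are all made up for the example, and the timings it prints will vary between PHP versions and machines. It assumes PHP 7 or later because of the type declarations.

```php
<?php
// Hypothetical stand-in for Bowl(), FakeIllness(), etc.: it only reads its argument.
function visit(array $person): int
{
    return strlen($person['name']);
}

// Hypothetical helper: runs $fn once and returns [result, elapsed seconds].
function timeIt(callable $fn): array
{
    $start = microtime(true);
    $result = $fn();
    return [$result, microtime(true) - $start];
}

$Bedrock = ['Flintstone' => ['name' => 'Fred', 'lodge' => 'Water Buffalo']];
$iterations = 100000; // arbitrary; raise it for steadier timings

// Variant 1: look the element up in the array on every call.
list($direct, $tDirect) = timeIt(function () use ($Bedrock, $iterations) {
    $sum = 0;
    for ($i = 0; $i < $iterations; $i++) {
        $sum += visit($Bedrock['Flintstone']);
    }
    return $sum;
});

// Variant 2: plain temporary variable, assigned by value.
list($byValue, $tByValue) = timeIt(function () use ($Bedrock, $iterations) {
    $caveman = $Bedrock['Flintstone'];
    $sum = 0;
    for ($i = 0; $i < $iterations; $i++) {
        $sum += visit($caveman);
    }
    return $sum;
});

// Variant 3: explicit reference to the array element.
list($byRef, $tByRef) = timeIt(function () use ($Bedrock, $iterations) {
    $caveman = &$Bedrock['Flintstone'];
    $sum = 0;
    for ($i = 0; $i < $iterations; $i++) {
        $sum += visit($caveman);
    }
    return $sum;
});

printf(
    "direct lookup: %.4fs\ntemp variable: %.4fs\nreference:     %.4fs\n",
    $tDirect,
    $tByValue,
    $tByRef
);
```

All three variants compute the same sum, so any difference is purely in running time. Treat the numbers as a hint rather than a verdict, since a micro-benchmark like this is easily dominated by function-call overhead.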

Re: Optimize complex dereferencing using references

Posted: Mon Apr 07, 2008 2:22 pm
by AngusM
Ambush Commander wrote:Since arrays are implemented with binary searches, the time for their usage in PHP is negligible. Nevertheless, I would go with the way that requires the least typing, which would involve accessing the array and allocating a temporary variable, which you then pass to each function. Under the hood, PHP will not reallocate the contents of $Bedrock["Flintstone"] until you change it, which, in your code sample, never happens.
Binary searches suffer negligible time in PHP? That's a pretty slow interpreter. I was hoping associative arrays would be using hash tables.

Re: Optimize complex dereferencing using references

Posted: Mon Apr 07, 2008 2:33 pm
by Ambush Commander
Gah, vocabulary blip. I meant hash tables. :-)

Re: Optimize complex dereferencing using references

Posted: Mon Apr 07, 2008 4:09 pm
by Christopher
It is also not clear whether you are discussing using variables that are references to other variables, or passing by reference? Do you want those functions to modify that array element?

Re: Optimize complex dereferencing using references

Posted: Tue Apr 08, 2008 8:13 am
by AngusM
arborint wrote:It is also not clear whether you are discussing using variables that are references to other variables, or passing by reference? Do you want those functions to modify that array element?
No. Like I said, I'm discussing optimization by using references to save on complex dereference operations.

I should also point out that we got a bit off topic by concentrating on how associative arrays were dereferenced under the hood. Let's say we had to dereference more than just an associative array. Like if we had the code:

Code: Select all

$dinosaur = &$prehistory->$Bedrock["Flintstone"].the_dog;
BiteTheMailMan($dinosaur);
JumpOnOwner($dinosaur);
As an abstraction the only dereferencing here is of an object, an associative array, and then another object. The dereferencing operation could easily go much farther than that, especially in this office. References would be more efficient here, wouldn't they?

Re: Optimize complex dereferencing using references

Posted: Tue Apr 08, 2008 11:18 am
by Christopher
Again, no. References in PHP are really not at all like references in C/C++. They are not really meant to speed things up or save memory or do any of the things that they are for in some other languages. You can't increment a reference through an array or do any such tricks in PHP. They are used for a few very specific things, such as pass-by-reference parameters, but that is about it. They are not used for things like "$dinosaur = &$prehistory->$Bedrock["Flintstone"].the_dog;" and I am not even sure that I understand what that snippet of non-PHP is presenting or what "much farther than that" means.

PHP is a language where the entire program -- everything -- is created, run and destroyed in a fraction of a second for each request. Many of the problems of long running programs do not exist, hence the time spent programming solutions for those problems does not exist either.

Re: Optimize complex dereferencing using references

Posted: Tue Apr 08, 2008 12:17 pm
by AngusM
What non-PHP? The only thing I've written in this thread is either PHP or English.
"Much farther than that" means a dereference operation could take infinitely many sub-dereferences. Say we had $prehistory->$Bedrock["Flintstone"].$the_dog.$biology.$microbiology.$germs.$parasites.$dinopepticgerm. That could take a very long time to dereference, and I don't see how a reference to the $dinopepticgerm object wouldn't be faster if dereferencing was done repeatedly.

Re: Optimize complex dereferencing using references

Posted: Wed Apr 09, 2008 12:58 am
by Chris Corbyn
AngusM wrote:What non-PHP? The only thing I've written in this thread is either PHP or English.
"Much farther than that" means a dereference operation could take infinitely many sub-dereferences. Say we had $prehistory->$Bedrock["Flintstone"].$the_dog.$biology.$microbiology.$germs.$parasites.$dinopepticgerm. That could take a very long time to dereference, and I don't see how a reference to the $dinopepticgerm object wouldn't be faster if dereferencing was done repeatedly.
The dot syntax in PHP is for string concatenation only. It's not a dereference operator like -> is. -> has a different meaning in C++ than it does in PHP, as does &.

The only reason to use references (&) is to offer a way to modify the original variable rather than copying it. There's no optimization at all if you are not going to write to the variable/reference. PHP uses "copy on write" for variable creation, so this:

Code: Select all

$x = 10;
$y = $x;
Does NOT create two copies of the value 10.

But this does:

Code: Select all

$x = 10;
$y = $x; //Not yet copied under-the-surface
$y = 13; //Now it's copied because we need to modify it
The following code should result in an error since you'd be trying to reference a non-variable (i.e. "the string value of $whatever->foo['bar'] with the string value of $x appended to it").

Code: Select all

$ref = &$whatever->foo['bar'].$x;
I think ~arborint was referring to non-PHP syntax for that reason.... it's invalid. In most OO languages the dot is the dereferencing operator, but in PHP it's not.
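The copy-on-write behaviour described above is easier to see with a value large enough to show up in memory statistics. The following is a small sketch, not a rigorous measurement: the array size is arbitrary, the variable names are made up, and the exact byte counts reported by `memory_get_usage()` depend on the PHP version.

```php
<?php
// Build a large array at run time so any copy is visible in memory_get_usage().
$x = range(1, 100000);

$before = memory_get_usage();
$y = $x;                         // assignment only: the elements should not be copied yet
$afterAssign = memory_get_usage();

$y[] = 0;                        // first write to $y: PHP now has to separate it from $x
$afterWrite = memory_get_usage();

$assignCost = $afterAssign - $before;
$writeCost  = $afterWrite - $afterAssign;

printf("assignment: %d bytes, first write: %d bytes\n", $assignCost, $writeCost);
```

The assignment should cost next to nothing, while the first write should cost roughly the size of the array, and `$x` keeps its original 100000 elements.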

Re: Optimize complex dereferencing using references

Posted: Wed Apr 09, 2008 1:08 am
by Christopher
Exactly what I meant. And if you are actually doing something like $prehistory->Bedrock["Flintstone"]->the_dog->biology->microbiology->germs->parasites->dinopepticgerm in PHP (not that you can't) then you may have taken a wrong design turn somewhere. ;)
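For what it's worth, a chain like that written with `->` can be held in an ordinary variable without `&`. The sketch below uses `stdClass` objects and a made-up `name` property purely for illustration. It relies on the fact that in PHP 5 and later a variable holding an object holds a handle to it, so plain assignment does not duplicate the object.

```php
<?php
// Illustrative data only: stdClass objects standing in for real classes.
$prehistory = new stdClass();
$prehistory->Bedrock = ['Flintstone' => new stdClass()];
$prehistory->Bedrock['Flintstone']->the_dog = new stdClass();
$prehistory->Bedrock['Flintstone']->the_dog->name = 'Dino';

// Plain assignment, no "&": $dinosaur is a handle to the same object.
$dinosaur = $prehistory->Bedrock['Flintstone']->the_dog;
$dinosaur->name = 'Dino the dinosaur';

// The change is visible through the original chain as well.
echo $prehistory->Bedrock['Flintstone']->the_dog->name, "\n";
```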

Re: Optimize complex dereferencing using references

Posted: Wed Apr 09, 2008 8:14 am
by AngusM
I see. Now I understand the PHP engine a little better. Thanks :)