PHP5 foreach behaviour and some questions about array_keys

BDKR
DevNet Resident
Posts: 1207
Joined: Sat Jun 08, 2002 1:24 pm
Location: Florida

Post by BDKR »

I had an interview last week that ultimately sucked when you get right down to it.

One of the things I was asked about during what I think was the 3rd wave of attackers was how I deal with iterating over or traversing multi-dimensional arrays. Part of my answer was, of course, foreach.

From there he started talking about how array_keys was faster and more efficient than foreach. This is the second time I've heard this. I didn't believe it the first time, but after hearing this squawk I came home and wrote some code to see if I could get to the bottom of it.

First things first: why are people saying this? If it's faster, faster in comparison to what? Does it depend on the nature of the data you are iterating over? I hate being questioned on my ability to express logic, so some FRACKIN' (Sorry. :oops: Too much Battlestar Galactica) context would go a long way.

Anyway, I did some benchmarking and found that using foreach over an array of objects is far faster than array_keys. So which is it? Is there something I'm missing?

And did the behaviour of PHP5 and foreach change? It seems to be passing elements by ref by default now. Before, the default behaviour was to do it by copy.

Cheers
RobertGonzalez
Site Administrator
Posts: 14293
Joined: Tue Sep 09, 2003 6:04 pm
Location: Fremont, CA, USA

Post by RobertGonzalez »

PHP5 does everything by reference natively, from what I understand. And I think that the folks who were interviewing you may have been looking for you to counter their claims. Maybe they wanted to know for sure that you knew your stuff.
BDKR
DevNet Resident
Posts: 1207
Joined: Sat Jun 08, 2002 1:24 pm
Location: Florida

Post by BDKR »

Everah wrote: PHP5 does everything by reference natively, from what I understand. And I think that the folks who were interviewing you may have been looking for you to counter their claims. Maybe they wanted to know for sure that you knew your stuff.
My earliest recollection of PHP5 and iterating over an array of objects using foreach was a disaster. That's why the manual says...
Note: Unless the array is referenced, foreach operates on a copy of the specified array and not the array itself. Therefore, the array pointer is not modified as with the each() construct, and changes to the array element returned are not reflected in the original array. However, the internal pointer of the original array is advanced with the processing of the array. Assuming the foreach loop runs to completion, the array's internal pointer will be at the end of the array.

As of PHP 5, you can easily modify array's elements by preceding $value with &. This will assign reference instead of copying the value.
Now you are right that they may have been trying to trick me. Some of the stuff seemed rather strange.

Oh well...
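
To make the manual note quoted above concrete, here is a minimal sketch of the two foreach forms. The $prices array and the other variable names are made up for illustration; the only assumption is PHP 5 or later, where the & form is available.

```php
<?php
// Hypothetical sample data, for illustration only.
$prices = array('apple' => 1, 'pear' => 2, 'plum' => 3);

// By value: $price is a copy of each element, so $prices is untouched.
foreach ($prices as $price) {
    $price *= 2;
}
$afterByValue = $prices;     // still 1, 2, 3

// By reference (PHP 5+): $price aliases each element in turn.
foreach ($prices as &$price) {
    $price *= 2;
}
unset($price);               // drop the alias to the last element

$afterByReference = $prices; // now 2, 4, 6

print_r($afterByValue);
print_r($afterByReference);
```

The unset() after the second loop matters: without it, $price stays bound to the last element, and a later assignment to $price would silently overwrite that element.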
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Post by Christopher »

Everah wrote: PHP5 does everything by reference natively, from what I understand. And I think that the folks who were interviewing you may have been looking for you to counter their claims. Maybe they wanted to know for sure that you knew your stuff.
I am not sure this is correct. It is my understanding that PHP5 deals with objects using handles -- which are different than references (i.e. no count). Otherwise other variables are handled generally the same in PHP4 and PHP5. Perhaps someone else can shed more light on this.
RobertGonzalez
Site Administrator
Posts: 14293
Joined: Tue Sep 09, 2003 6:04 pm
Location: Fremont, CA, USA

Post by RobertGonzalez »

arborint wrote:Perhaps someone else can shed more light on this.
I would certainly appreciate it. I can't effectively answer people's questions if I don't understand the answer myself :oops:.
Maugrim_The_Reaper
DevNet Master
Posts: 2704
Joined: Tue Nov 02, 2004 5:43 am
Location: Ireland

Post by Maugrim_The_Reaper »

My would-be answer:

It depends on whether the performance gain amounts to a significant benefit in the target application. If it doesn't, then it's not all that important (premature optimisation). If it is, then I don't know, but I could easily set up a quick benchmark to check.

Silly questions invite curt responses... If that wasn't good enough, I'd start wondering if the interviewer had a clue. Would the next question ponder the performance benefits of using require() over require_once()? ;).
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

All variables are reference counted in PHP, no exceptions. The only real difference in variables between 4 and 5 is how assignment and passing (of objects) works. Instead of copying, it now passes a reference. If you want a copy of an object, you have to clone it.

All other variables are copied.
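
A minimal sketch of the behaviour described above, assuming PHP 5 or later. The Counter class and the variable names are made up for illustration.

```php
<?php
// Hypothetical class, for illustration only.
class Counter
{
    public $n = 0;
}

$a = new Counter();
$b = $a;        // no copy is made: $a and $b name the same object
$c = clone $a;  // a separate object with the same property values

$b->n = 5;      // visible through $a as well, but not through $c

$list = array(1, 2, 3);
$copy = $list;  // arrays (and scalars) behave as copies on assignment
$copy[] = 4;    // $list is unchanged

echo $a->n, ' ', $c->n, ' ', count($list), ' ', count($copy), "\n"; // prints "5 0 3 4"
```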
BDKR
DevNet Resident
Posts: 1207
Joined: Sat Jun 08, 2002 1:24 pm
Location: Florida

Post by BDKR »

Maugrim_The_Reaper wrote: It depends on whether the performance gain amounts to a significant benefit in the target application. If it doesn't, then it's not all that important (premature optimisation). If it is, then I don't know, but I could easily set up a quick benchmark to check.
This is the interesting thing. They are a dead serious OO shop using PHP5, but they are concerned about performance and efficiency to a fault. I was a little surprised to be honest with you, but oh well.

As for the benchmark, I've already proven to myself via some code that using array_keys over an associative array of objects is slower. The slowest option was the built-in SPL array iterator object, and the fastest was plain foreach($array as &$val).

South Florida must be in some different dimension. :roll:
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Post by Christopher »

feyd wrote:All variables are reference counted in PHP, no exceptions. The only real difference in variables between 4 and 5 is how assignment and passing (of objects) works. Instead of copying, it now passes a reference. If you want a copy of an object, you have to clone it.
I have read in a number of places that objects use a different internal data structure called a handle, which is different from a reference, though the two seem to have similarities. Does anyone know what the specific differences are?
dbevfat
Forum Contributor
Posts: 126
Joined: Tue Jun 28, 2005 2:47 pm
Location: Ljubljana, Slovenia

Post by dbevfat »

From php|architect (September 2006), article Is PHP 4 Really Faster Than PHP 5? by Andi Gutmans and Dmitry Stogov:
In PHP 4, objects were treated as primitive data types. On assignment, parameter passing and function returns, the default behaviour was to copy the entire object. In order to avoid this automatic cloning of objects, programmers were required to master by-reference assignment, parameter passing and function returns.
...
In PHP 5, objects are no longer native types, but are represented instead by a handle that refers to the object. The operations mentioned previously [assignments, function returns] no longer auto-clone the object, but its handle, i.e. the value that tells us where the object is located in memory - a simple integer value.
I hope I'm not in serious violation of some copyright rules for quoting the article. As the authors tell us, the handle gets cloned. I believe this means that the handle alone is not enough for reference counting, so I'm guessing that there must be additional logic somewhere, which should work just like good old references.
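
The difference between a copied handle and a real reference can be observed from userland. A minimal sketch, assuming PHP 5 or later; the Box class and variable names are made up for illustration. It shows only the visible behaviour and says nothing about how the engine counts references internally.

```php
<?php
// Hypothetical class, for illustration only.
class Box
{
    public $value = 'original';
}

// Case 1: plain assignment copies the handle.
$first = new Box();
$second = $first;           // two variables, one object
$second->value = 'edited';  // changes the shared object
$second = null;             // only $second lets go of the object

$firstSurvives = ($first instanceof Box);   // true

// Case 2: a true PHP reference makes two names for one variable.
$third = new Box();
$fourth = &$third;          // $fourth is an alias of $third
$fourth = null;             // assigning through the alias changes $third too

$thirdSurvives = ($third instanceof Box);   // false

var_dump($first->value, $firstSurvives, $thirdSurvives);
```

In the first case the object stays alive because $first still holds a handle to it; in the second case both names refer to the same variable, so nulling one nulls the other.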

As for the speed test, see http://www.blueshoes.org/en/developer/php_bench/ and http://www.php.lt/benchmark/phpbench.php for a reference comparison with your benchmarks, although I think the tests are run in PHP 4.

Regards
jmut
Forum Regular
Posts: 945
Joined: Tue Jul 05, 2005 3:54 am
Location: Sofia, Bulgaria

Post by jmut »

Hm, could someone give me a code example of array_keys vs foreach usage?
I have some idea what is meant, but it sounds so weird and pointless.
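
For reference, the two idioms being compared look roughly like this. It is a minimal sketch with made-up data and function names, not the code anyone in the thread benchmarked.

```php
<?php
// Hypothetical sample data, for illustration only.
$stock = array('apples' => 3, 'pears' => 0, 'plums' => 7);

// Idiom 1: fetch the keys once, then index back into the array.
function describeViaArrayKeys(array $items)
{
    $keys = array_keys($items);
    $size = count($keys);
    $lines = array();
    for ($i = 0; $i < $size; ++$i) {
        $lines[] = $keys[$i] . '=' . $items[$keys[$i]];
    }
    return $lines;
}

// Idiom 2: let foreach hand over each key and value directly.
function describeViaForeach(array $items)
{
    $lines = array();
    foreach ($items as $name => $count) {
        $lines[] = $name . '=' . $count;
    }
    return $lines;
}

print_r(describeViaArrayKeys($stock));
print_r(describeViaForeach($stock));
```

Both functions return the same list; the array_keys version simply needs an extra array and two lookups per element to get there.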
BDKR
DevNet Resident
Posts: 1207
Joined: Sat Jun 08, 2002 1:24 pm
Location: Florida

Post by BDKR »

dbevfat wrote: As for the speed test, see http://www.blueshoes.org/en/developer/php_bench/ and http://www.php.lt/benchmark/phpbench.php for a reference comparison with your benchmarks, although I think the tests are run in PHP 4.
A lot of those results were to be expected I suppose. However, my tests are a bit different. I'm not using the while(list()=each()) mechanism and within each iteration there is a sub-loop that's happening where the true differences between various constructs are being tested.

There was one big surprise, though it's perfectly understandable when you pull over and think about it for a sec. The length of the variable name has a huge effect on performance. I have two separate tests using the array_keys()/for() loop construct, and one was made faster simply by reducing the lengths of the names to something ridiculous and unmaintainable. But even with that, foreach() with the '&' operator was still faster.

So yes, array_keys() does seem to be pretty fast, but not faster, so using it doesn't make sense to me as there are just more hoops to jump through to set it up.

Code:

<?php

class testObject
	{	
	protected $y=0;
	protected $z=0;
	var $name='';
	
	public function meth1($val)
		{ $this->y=($val+1); }
		
	public function meth2()
		{ ++$this->z;	}
		
	public function meth3($val)
		{ $this->name=$val; }		
		
	function reset()
		{ 
		$this->y=0;
		$this->z=0;
		}
	}

# For some reason, calling this before getting started has a good effect on the performance 
# of each subsequent call
getmicrotime(); 			
$iters=1000000;
$list='jean_grey|rouge|storm|sprite|psylocke';
$girls=explode('|', $list);
$girls_size=count($girls);
# Build four identical associative arrays of objects, one per test below
$girl_objs1=$girl_objs2=$girl_objs3=$girl_objs4=array();
for($x=0; $x<$girls_size; ++$x)
	{ 
	$girl_objs1[$girls[$x]]=new testObject; 
	$girl_objs1[$girls[$x]]->name=$girls[$x];

	$girl_objs2[$girls[$x]]=new testObject; 	
	$girl_objs2[$girls[$x]]->name=$girls[$x];

	$girl_objs3[$girls[$x]]=new testObject; 	
	$girl_objs3[$girls[$x]]->name=$girls[$x];

	$girl_objs4[$girls[$x]]=new testObject; 	
	$girl_objs4[$girls[$x]]->name=$girls[$x];	
	}
reset($girls);

############################################################################
# Using the spl arrayIterator
############################################################################
$time_start=0;
$arrayObj = new arrayObject($girl_objs3);
$iterator=$arrayObj->getIterator();
$time_start=getmicrotime();												// Start the timer here
$x=0;																	// Reset the counter; it still holds the value left by the setup loop above
while($x<$iters)
	{ 
	$iterator->rewind();
	while($iterator->valid())
		{
		$girl=$iterator->current();
		$girl->meth1($x);
		$girl->meth2();
		$iterator->next();
		}
	++$x; 	
	}
echo "Using the spl arrayIterator Object took: " . number_format( ((getmicrotime()) - $time_start),  4) . " seconds.\n\n";
$time_start=0;
############################################################################


############################################################################
# Using the existing key list / array size generated while creating the array of objects
############################################################################
$q=&$girl_objs4;
$z=array_keys($girl_objs4);
$s=sizeof($z);
$time_start=getmicrotime();												// Start the timer here
$x=0;
while($x<$iters)
	{ 
	for($i=0; $i<$s; ++$i)
		{ 
		$q[$z[$i]]->meth1($x);
		$q[$z[$i]]->meth2();		
		}
	++$x; 	
	}
echo "Using array keys and an alias with small var name sizes took: " . number_format( ((getmicrotime()) - $time_start), 4) . " seconds.\n\n";
$time_start=0;
############################################################################


############################################################################
# Using array keys. This is awkward anyways as the object array is essentially an associative array.
############################################################################
$gn=array_keys($girl_objs1);
$j=sizeof($gn);
$time_start=getmicrotime();												// Start the timer here
$x=0;
while($x<$iters)
	{ 
	for($i=0; $i<$j; ++$i)
		{ 
		$girl_objs1[$gn[$i]]->meth1($x);
		$girl_objs1[$gn[$i]]->meth2();
		}
	++$x; 	
	}
echo 'Using array_keys() took: ' . number_format( ((getmicrotime()) - $time_start), 4) . " seconds.\n\n";
$time_start=0;
############################################################################


############################################################################
# Using the plain jane foreach() 
############################################################################
$time_start=getmicrotime();												// Start the timer here
$x=0;
while($x<$iters)
	{ 
	foreach($girl_objs2 as &$girl)
		{ 
		$girl->meth1($x);
		$girl->meth2();
		}
	++$x; 	
	}
echo 'Using foreach() with the \'&\' operator took: ' . number_format( ((getmicrotime()) - $time_start), 4) . " seconds.\n\n";
unset($girl);	// Break the reference that foreach(... as &$girl) leaves on the last element
############################################################################

# Timing courtesy of
function getmicrotime()
  {
  list($usec, $sec) = explode(" ",microtime());
  return ((float)$usec + (float)$sec);
  }

function my_print_r($val)
	{ echo '<pre>'; print_r($val); echo '</pre>'; }

?>
I ran these tests on the command line using PHP 5.05. I know, it's old as dirt. I've been too busy writing code and working on my car to install 5.2, but I'll get that done soon enough.

If anyone can see problems in the code that affect the performance, please try it or chime in. I've made mistakes with this stuff in the past, so it wouldn't be a first.

Cheers
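
One variation that may be worth trying against the script above: since objects are passed by handle in PHP 5, a plain foreach without '&' still calls methods on the original objects. The sketch below checks that with a small timing helper. Everything in it is hypothetical (timeIt, Tally, makeTallies), and it assumes PHP 5.3 or later for anonymous functions, which is newer than the version used in this thread. The helper adds the same call overhead to both loops, so treat the numbers as a rough comparison only.

```php
<?php
// Hypothetical helper: runs $fn $iters times and returns elapsed seconds.
// microtime(true) returns the current time as a float.
function timeIt($fn, $iters)
{
    $start = microtime(true);
    for ($i = 0; $i < $iters; ++$i) {
        call_user_func($fn);
    }
    return microtime(true) - $start;
}

// Hypothetical stand-in for the objects being iterated.
class Tally
{
    public $hits = 0;

    public function hit()
    {
        ++$this->hits;
    }
}

function makeTallies()
{
    return array('a' => new Tally(), 'b' => new Tally(), 'c' => new Tally());
}

$viaForeach = makeTallies();
$viaKeys = makeTallies();
$keys = array_keys($viaKeys);
$size = count($keys);
$iters = 1000;

// Plain foreach, no '&': each $tally is a handle to the original object.
$foreachTime = timeIt(function () use ($viaForeach) {
    foreach ($viaForeach as $tally) {
        $tally->hit();
    }
}, $iters);

// array_keys() plus an indexed for loop.
$keysTime = timeIt(function () use ($viaKeys, $keys, $size) {
    for ($k = 0; $k < $size; ++$k) {
        $viaKeys[$keys[$k]]->hit();
    }
}, $iters);

printf("foreach: %.4f s, array_keys: %.4f s\n", $foreachTime, $keysTime);
printf("hits: %d and %d\n", $viaForeach['a']->hits, $viaKeys['a']->hits);
```

Both hit counts come out equal to $iters, which shows that the by-value foreach reached the same objects that were placed in the array.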
BDKR
DevNet Resident
Posts: 1207
Joined: Sat Jun 08, 2002 1:24 pm
Location: Florida

Post by BDKR »

A quick note about the above code. I used a mad number of iterations (1 million I believe) so the differences in performance are more easily grokked.

Cheers