Page 1 of 1

foreach by reference bug

Posted: Wed Jun 13, 2007 10:43 pm
by aredamkrik
Can anyone explain this one to me? I've had similar problems with the way foreach handles values by reference. This is the simplest example I could come up with:

Code: Select all

<?php

$array1 = array('a' => 1);
$array2 = array('b' => 2);
$array3 = array('c' => 3);

$matrix = array($array1, $array2, $array3);

foreach($matrix as &$array)
    $array['d'] = 4;

foreach($matrix as $array)
    print_r($array);
   
?>

This should just add the key / value pair ['d'] = 4 to each array.

The expected output is then:
Array ( [a] => 1 [d] => 4 ) Array ( => 2 [d] => 4 ) Array ( [c] => 3 [d] => 4 )

The actual output is:
Array ( [a] => 1 [d] => 4 ) Array ( => 2 [d] => 4 ) Array ( => 2 [d] => 4 )

My guess:
The first foreach loop disrupts the pointer making the second foreach loop point to the last element twice.

Is anyone able to explain this better or create a simpler example where this happens?

(using PHP 5)

Posted: Wed Jun 13, 2007 11:04 pm
by feyd
Incorrect.. it is the reuse of the reference that creates the problem.

Code: Select all

[feyd@home]>php -r "$matrix = array(array('a'=>1),array('b'=>2),array('c'=>3)); foreach($matrix as &$array) $array['d'] = 4; foreach($matrix as $array) print_r($array);"
Array
(
    [a] => 1
    [d] => 4
)
Array
(
    [b] => 2
    [d] => 4
)
Array
(
    [b] => 2
    [d] => 4
)

[feyd@home]>php -r "$matrix = array(array('a'=>1),array('b'=>2),array('c'=>3)); foreach($matrix as &$array) $array['d'] = 4; unset($array); foreach($matrix as $array) print_r($array);"
Array
(
    [a] => 1
    [d] => 4
)
Array
(
    [b] => 2
    [d] => 4
)
Array
(
    [c] => 3
    [d] => 4
)

[feyd@home]>php -r "$matrix = array(array('a'=>1),array('b'=>2),array('c'=>3)); foreach($matrix as &$array) $array['d'] = 4; foreach($matrix as $arr) print_r($arr);"
Array
(
    [a] => 1
    [d] => 4
)
Array
(
    [b] => 2
    [d] => 4
)
Array
(
    [c] => 3
    [d] => 4
)

Re: Reuse of reference

Posted: Thu Jun 14, 2007 11:08 am
by aredamkrik
My understanding of this:

The variable $array persists after the first foreach loop as a reference pointer to the last array. When the second foreach runs through it's loop, it recognizes $array as on of its iterations causing the loop to exit early. It then executes the print_r($array) immediately following the foreach loop which prints the most recent value in $array. Or however it happens, the existence of $array messes up the indexing of the second foreach loop. So you get $array1, $array2, $array2.

Then the question is, why doesn't foreach automatically unset any vars with the same name as the alias variable?

If I say foreach($matrix as $array), I very definitely don't want to use the $array I defined in a totally separate area at the top of the page. I can't see a good reason for not doing this.

Posted: Thu Jun 14, 2007 1:01 pm
by stereofrog
This is a bug in php. A testcase in case someone bothers to report it:

Code: Select all

$ary = array(1, 2);

foreach($ary as & $loop_var)
	// do absolutely nothing at all
;

echo implode(',', $ary), "\n";

foreach($ary as $loop_var)
	// do absolutely nothing at all
;

echo implode(',', $ary), "\n";
expected:
1,2
1,2

actual:
1,2
1,1

php 5.2.3 / winxp

Posted: Thu Jun 14, 2007 6:52 pm
by feyd
foreach() doesn't know, doesn't care and shouldn't have to care that a variable you are using is a reference prior to evaluation. I don't consider it a bug.

Posted: Fri Jun 15, 2007 4:37 am
by stereofrog
As an ordinary programmer, I don't care how foreach is implemented and what it does or doesn't "know". I just see that absolutely useless empty loop suddenly destroys my array. If this is not a bug, what else would be one?

Posted: Fri Jun 15, 2007 8:19 am
by feyd
This sort of problem will happen whether or not you use foreach as long as you are reusing a reference. It's all a consequence of using references improperly.

Posted: Fri Jun 15, 2007 8:55 am
by stereofrog
feyd, I agree with your premise that this "behavior" is "by design", however our conclusions are different. You (and php group) are saying: "this bug is by design, therefore it's not a bug". I say: "this bug is by design, therefore this design should be fixed".

Posted: Fri Jun 15, 2007 9:14 am
by feyd
Incorrectly reusing references isn't something PHP should baby you on. It can't read your mind; it doesn't know your intentions.

Only consider using references if you're modifying the array during iteration, otherwise there's no point.

Posted: Fri Jun 15, 2007 9:42 am
by stereofrog
I don't see any reason to argue.
If "2+2" were "5" in php, you probably would find a way to advocate that. ;)

Posted: Fri Jun 15, 2007 9:54 am
by volka
stereofrog wrote:If "2+2" were "5" in php, you probably would find a way to advocate that. ;)
That would be inconsistent. The way php handles the reference in foreach is not. Inconvenient maybe but not inconsistent.

Posted: Fri Jun 15, 2007 11:22 am
by aredamkrik
I am using a reference for the correct reason. I want to modify the matrix using a foreach loop. I then proceed to do something which logically makes perfect sense, cycle through the array I just updated and it gives me an unexpected result. If this is by design please tell me the reasoning. I agree with not updating code for people who use it improperly. Please explain to me how this would be improper use of php.

By not unsetting any predefined vars of the name $array (since all php vars are global and they will conflict), we are saying that there's a possibility that we would want to use the global var $array rather than the specific iteration of $array within $matrix. This should never be the case.

Code: Select all

<?php

//Assume $matrix is defined as above
$array = array(1, 2, 3, 4, 5);

foreach($matrix as $array)
    print_r($array);

//Would you ever expect to see this output from within the foreach loop?
//Array ( [0] => 1 [1] => 2 [2] => 3 [3] => 4 [4] => 5 )

?>

For those of you that would argue that a foreach loop should not affect the global var $array:
It already does, because of the nature of a global scope. Even if you are using foreach by value, $array will always be the last iteration of the foreach loop. So why not unset it at the start to avoid this indexing problem?

Another solution?
It would at least be a step in the right direction to issue a PHP warning if the current scope already contains another variable of the same name as the alias var.

Posted: Fri Jun 15, 2007 11:25 am
by volka
try

Code: Select all

<?php
$array1 = array('a' => 1);
$array2 = array('b' => 2);
$array3 = array('c' => 3);

$matrix = array($array1, $array2, $array3);

foreach($matrix as &$array)
  $array['d'] = 4;
unset($array);

foreach($matrix as $array)
  print_r($array);
?>
unset($array); will not remove the actual array but the reference, i.e. $array will not be a reference when the second foreach() starts.

Posted: Fri Jun 15, 2007 4:58 pm
by feyd
volka's code is oddly familiar.... Image

Posted: Fri Jun 15, 2007 7:20 pm
by volka
amazing coincidence Image