
Need help with duplicate in array

Posted: Sun Dec 04, 2005 4:58 pm
by Extremest
I have a multi-dimensional array that is filled from a file. I sort it, and then I want to remove the rows whose first value is a duplicate, keeping only one row per first value. The second value of each removed duplicate should be appended to the second value of the row that stays.
Example Data

[0][0]aaaaa[0][1]hello
[1][0]aabba[1][1]happy
[2][0]aaaaa[2][1]doh

output

[0][0]aaaaa[0][1]hello doh
[1][0]aabba[1][1]happy

I cannot figure out how to do this other than by using the first value as a key and the second one as the value for that key. PHP gives warnings when I do that, and it takes forever. Can anyone please help me with this?
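A minimal sketch of the key-based idea described above, assuming the warnings are notices raised by appending to a key that does not exist yet. Checking `isset()` first avoids that. The function name `mergeDuplicateRows` is hypothetical, and the rows are assumed to be two-element arrays like the example data.

```php
<?php
// Hypothetical helper: merges rows that share the same first value.
// Assumes each row looks like array(firstValue, secondValue).
function mergeDuplicateRows(array $rows)
{
    $merged = array();
    foreach ($rows as $row) {
        $key = $row[0];
        if (isset($merged[$key])) {
            // Key seen before: append the second value.
            $merged[$key] .= ' ' . $row[1];
        } else {
            // First occurrence: create the entry before anything is appended.
            $merged[$key] = $row[1];
        }
    }

    // Order by first value, then rebuild the [i][0] / [i][1] layout.
    ksort($merged, SORT_STRING);
    $result = array();
    foreach ($merged as $key => $value) {
        // PHP casts integer-like keys to int, so cast back to string.
        $result[] = array((string) $key, $value);
    }
    return $result;
}

$rows = array(
    array('aaaaa', 'hello'),
    array('aabba', 'happy'),
    array('aaaaa', 'doh'),
);
print_r(mergeDuplicateRows($rows));
```

Each row is visited once, so the input does not need to be sorted beforehand.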

Posted: Sun Dec 04, 2005 4:59 pm
by John Cartwright
If array_unique does not satisfy what you are trying to do, read the user comments as they offer several functions for multi-dimensional arrays.

Posted: Sun Dec 04, 2005 6:59 pm
by Extremest
I have looked through there but have not found anything that works. If I use the first value as the index and append the second value to that index's value, it works without array_unique. Yet this takes forever, even if the array is sorted.

Posted: Sun Dec 04, 2005 7:27 pm
by John Cartwright
Oh, sorry, I misunderstood your question.

So let's say you have your unclean array:

Code: Select all

function mergeArrayValues($uncleanArray) {
	$cleanArray = array();
	foreach ($uncleanArray as $valueArray) {
		foreach ($valueArray as $key => $value) {
			if (array_key_exists($key, $cleanArray)) {
				$cleanArray[$key] .= ' '.$value;
			}
			else {
				$cleanArray[$key] = $value;
			}
		}
	}
	return $cleanArray;
}

$uncleanArray = array(
	array(
		'aaa' => 'value1'
	),
	array(
		'bbb' => 'value2'
	),
	array(
		'ddd' => 'value3',
	),
	array(
		'aaa' => 'value4',
	)
);

$cleanArray = mergeArrayValues($uncleanArray);

print_r($cleanArray);
will output

Code: Select all

Array
(
    [aaa] => value1 value4
    [bbb] => value2
    [ddd] => value3
)

Posted: Sun Dec 04, 2005 7:41 pm
by Extremest
OK, this is the code that loads the values from the text file into the array. If I am doing something wrong, please let me know. I am going to see if I can get yours to work; I'm just not sure how, because of the way your array is set up.

Code: Select all

<?php
$handle = fopen("c:/temp/temp/temp.txt", "r");
$sorted = array();
$line_sorted = array();
$i = 0;
// Loop until fgets() returns false, so no empty row is added at end of file.
while (($line = fgets($handle)) !== false) {
	$sorted = explode(",", trim($line));
	if (count($sorted) < 2) {
		continue; // skip blank or malformed lines
	}
	$line_sorted[$i][0] = trim($sorted[0]);
	$line_sorted[$i][1] = trim($sorted[1]);
	$i++;
}
fclose($handle);
sort($line_sorted);
?>

Posted: Sun Dec 04, 2005 7:48 pm
by John Cartwright
I was trying to fit it around your example...

change

Code: Select all

$line_sorted[$i][0] = trim($sorted[0]);
$line_sorted[$i][1] = trim($sorted[1]);
to

Code: Select all

$line_sorted[$i] = array(
   trim($sorted[0]) => trim($sorted[1])
);

Posted: Sun Dec 04, 2005 8:10 pm
by Extremest
Thank you very much, that seems to have done the trick. It's not running too badly now.

Posted: Sun Dec 04, 2005 8:21 pm
by John Cartwright

Code: Select all

$file = file('c:/temp/temp/temp.txt');
$cleanArray = array();

foreach ($file as $line) {
	$lineFeed = explode(",",str_replace("\n","",$line));
	
	if (array_key_exists($lineFeed[0], $cleanArray)) {
		$cleanArray[$lineFeed[0]] .= ' '.$lineFeed[1];
	}
	else {
		$cleanArray[$lineFeed[0]] = $lineFeed[1];
	}     	
}

ksort($cleanArray); // sort by key; sort() would replace the keys with 0, 1, 2, ...
$cleanArray = array_map('trim', $cleanArray);

echo '<pre>';
print_r($cleanArray);
echo '</pre>';
This should give you much better performance.
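A small sketch of why the choice of sort function matters for a keyed array like `$cleanArray`: `sort()` orders by value and throws the keys away, while `ksort()` orders by key and keeps each key attached to its value. The sample data here is made up for illustration.

```php
<?php
// Made-up sample data in the same shape as $cleanArray.
$merged = array('bbb' => 'value2', 'aaa' => 'value1 value4');

$byValue = $merged;
sort($byValue);   // orders by value and replaces the keys with 0, 1, ...

$byKey = $merged;
ksort($byKey);    // orders by key and keeps each key with its value

print_r($byValue);
print_r($byKey);
```

If the first values are still needed after sorting, only the `ksort()` result keeps them.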

Posted: Mon Dec 05, 2005 7:41 am
by Extremest
I open 256 files and do each one. I tried your second approach and it seems to go a lot slower than your first one. Not sure why. But the first one seems to be working great.

Posted: Mon Dec 05, 2005 9:27 am
by John Cartwright

Code: Select all

$file = file('c:/temp/temp/temp.txt');
$cleanArray = array();

foreach ($file as $line) {
    $lineFeed = explode(",",str_replace("\n","",$line));
    
    if (array_key_exists($lineFeed[0], $cleanArray)) {
        $cleanArray[$lineFeed[0]] .= ' '.trim($lineFeed[1]);
    }
    else {
        $cleanArray[$lineFeed[0]] = trim($lineFeed[1]);
    }         
}

ksort($cleanArray); // sort by key; sort() would replace the keys with 0, 1, 2, ...

echo '<pre>';
print_r($cleanArray);
echo '</pre>';

Perhaps that array_map was slowing it down. Otherwise I don't see why this would be slower. :?

Posted: Mon Dec 05, 2005 5:54 pm
by Extremest
I will try that out. I am going to take the sort out as well, which should help even more. Will let you know.

Posted: Mon Dec 05, 2005 6:47 pm
by Extremest
OK, that is twice as fast: doing just 5 files went from 8.65 minutes to 4.14 minutes. I'm not sure why it takes that long, though, when a single file takes only 10 seconds. I unset the arrays just in case, but that doesn't make a difference. Here is the code I am working with so you can get a feel for all of it.

Code: Select all

$starttime = explode(' ', microtime());
$starttime = $starttime[1] + $starttime[0];
$handle = fopen("c:/temp/temp/in.txt", "a");
$data_path = "k:/";
$sorted = array();
$line_sorted = array();
for ($i = 0; $i < 256; $i++) {
    $file_name = "$data_path";
    $file_name .= str_pad(dechex($i), 2, "0", STR_PAD_LEFT);
    $file_name .= ".txt";
    $file_name_open[$i] = file($file_name) or die("Cannot open file!");

    $cleanArray = array();

    foreach ($file_name_open[$i] as $line) {
        $lineFeed = explode(",", str_replace("\n", "", $line));

        if (array_key_exists($lineFeed[0], $cleanArray)) {
            $cleanArray[$lineFeed[0]] .= ' ' . trim($lineFeed[1]);
        }
        else {
            $cleanArray[$lineFeed[0]] = trim($lineFeed[1]);
        }
    }

    foreach ($cleanArray as $key => $val) {
        fwrite($handle, "$key,$val\n");
    }
    unset($cleanArray, $lineFeed);
}
$mtime = explode(' ', microtime());
$totaltime = $mtime[0] + $mtime[1] - $starttime;
printf('Page loaded in %.3f seconds.', $totaltime);
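For comparing timings per file rather than for the whole run, a shorter timing idiom may help. This is only a sketch, assuming PHP 5 or later, where `microtime(true)` returns a float. The names `timeIt` and `sampleWork` are hypothetical, and `sampleWork` is a stand-in for the real per-file processing.

```php
<?php
// Hypothetical helper: runs $callback and returns the elapsed seconds.
// Assumes PHP 5 or later, where microtime(true) returns a float.
function timeIt($callback)
{
    $start = microtime(true);
    call_user_func($callback);
    return microtime(true) - $start;
}

// Stand-in workload; replace with the processing of one file.
function sampleWork()
{
    $total = 0;
    for ($i = 0; $i < 100000; $i++) {
        $total += $i;
    }
    return $total;
}

printf("Took %.3f seconds.\n", timeIt('sampleWork'));
```

Timing each file separately would show whether later files get slower, which would point at something accumulating across iterations.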

Posted: Mon Dec 05, 2005 8:12 pm
by Extremest
OK, if I try to have it do too many files, it seems to run me out of RAM. I'm not sure what is not being unloaded, though.
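One likely suspect in the code above is `$file_name_open[$i] = file($file_name)`: it stores every file's lines under a new index, and only `$cleanArray` and `$lineFeed` are unset, so the contents of all files read so far stay in memory. A sketch of an alternative that reads one line at a time with `fgets()` follows. The function name `mergeFile` is hypothetical, and the temporary file only stands in for one of the 256 inputs.

```php
<?php
// Hypothetical helper: merges one "key,value" file line by line.
// Only the merged result is held in memory, never the whole file.
function mergeFile($path)
{
    $merged = array();
    $in = fopen($path, 'r');
    if ($in === false) {
        return $merged;
    }
    while (($line = fgets($in)) !== false) {
        // Limit of 2 keeps any further commas inside the value.
        $parts = explode(',', trim($line), 2);
        if (count($parts) < 2) {
            continue; // skip blank or malformed lines
        }
        $key = trim($parts[0]);
        $val = trim($parts[1]);
        if (isset($merged[$key])) {
            $merged[$key] .= ' ' . $val;
        } else {
            $merged[$key] = $val;
        }
    }
    fclose($in);
    return $merged;
}

// Demo with a temporary file standing in for one input file.
$path = tempnam(sys_get_temp_dir(), 'dup');
file_put_contents($path, "aaaaa,hello\naabba,happy\naaaaa,doh\n");
$result = mergeFile($path);
unlink($path);
print_r($result);
```

Calling a function like this once per file inside the loop means each file's data goes out of scope as soon as its merged lines have been written out.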