Page 1 of 1

removing union duplicates

Posted: Sun May 21, 2006 3:44 am
by s.dot
I'm trying to run a script on one of my tables to delete records where $a = $b and $b = $a, except for the current record

Code: Select all

$duper = mysql_query("SELECT * FROM `data_f` ORDER BY `id` ASC",$dbc2);

while($dupea = mysql_fetch_assoc($duper)){
	$id = $dupea['id'];
        $a = $dupea['find'];
        $b = $dupea['found'];

	//dupes?
	mysql_query("DELETE FROM `data_f` WHERE `find` = '$a' AND `found` = '$b' AND `id` != '$id'",$dbc2) or die(mysql_error());

	mysql_query("DELETE FROM `data_f` WHERE `found` = '$a' AND `find` = '$b' AND `id` != '$id'",$dbc2) or die(mysql_error());
	
}
This is um... working. But it appears to be removing a lot more records than I expected.

Re: removing union duplicates

Posted: Sun May 21, 2006 4:01 am
by timvw
scottayy wrote:I'm trying to run a script on one of my tables to delete records where $a = $b and $b = $a, except for the current record
Please define your problem better, since the 'is equal to' relation implies that when a equals b, that b will also equal a.

Posted: Sun May 21, 2006 4:07 am
by s.dot
I have it set up so that there are two fields in each record that are an id from another table

Sample Records

Code: Select all

ID    Find    Found
1      1        2
2      5        7
3      2        1

If I looped through those 3 records, I would want the script to delete record #3 because it is a duplicate of record number one (except the fields are reversed)

So if I were looking to delete "duplicates" of record # 1 I would want to delete the obvious duplicate, which would be an excact duplicate of the row (except for id)

Code: Select all

DELETE FROM `table` WHERE `find` = 1 AND `found` = 2 AND `id` != '$currentid'
Then I want to delete records that are duplicates of record #1 but the fields are reversed

Code: Select all

DELETE FROM `table` WHERE `found` = 1 AND `find` = 2 AND `id` != '$currentid'

Posted: Sun May 21, 2006 10:16 pm
by s.dot
anyone? =/

when I run this query

Code: Select all

$numf1 = mysql_query("SELECT count(*) AS `num1` FROM `data_f` WHERE `find` = '$theperson'") or die(mysql_error());

$numf2 = mysql_query("SELECT count(*) AS `num2` FROM `data_f` WHERE `found` = '$theperson'") or die(mysql_error());
I get 433 results.

However, when I run this query...

Code: Select all

$fid_result = mysql_query("SELECT DISTINCT `find` AS `user` FROM `data_f` WHERE `found` = '$theperson' UNION SELECT DISTINCT `found` AS `user` FROM `data_f` WHERE `find` = '$theperson'") or die(mysql_error());
I come up with 281 results.

Posted: Mon May 22, 2006 12:01 am
by Christopher
You could do it by brute force in PHP by:

Code: Select all

query("SELECT DISTINCT `find` FROM `data_f");
while ($row = fetch()) {
    $rows[] = $row['find'];
}
query("SELECT`found` FROM `data_f");
while ($row = fetch()) {
    if (in_array($row['found'], $rows)) {
        query("DELETE FROM `data_f" WHERE found=" . $row['found']);
        unset($rows[$row['found']]);
    }
}

Posted: Tue May 23, 2006 2:25 am
by s.dot
Thanks for the code aborint, but unfortunately that systematically deleted all records. 8O

But, after 7 headaches, and 3 days.. I came up with this script that does what I want. Forgive the variable naming, but after dealing with this for so long, i used the shortest ones possible.

Code: Select all

$r = mysql_query("SELECT `id` FROM `users` ORDER BY `id` ASC",$dbc2);
while($a = mysql_fetch_assoc($r)){
	
	$r2 = mysql_query("
		SELECT `id` AS `id_field`,`find` AS `user`  FROM `data_f` WHERE `found` = '{$a['id']}'
		UNION
		SELECT `id` AS `id_field`,`found` AS `user` FROM `data_f` WHERE `find` = '{$a['id']}'
	",$dbc2) or die(mysql_error());
	
	$a3 = array();
	while($a2 = mysql_fetch_assoc($r2)){
		$a3[$a2['id_field']] = $a2['user'];
	}
	
	$test = array_unique($a3);
	
	if(count($test) > 0){
	
		$ids = array_keys($test);
		$ids_list = implode(',',$ids);
	
		mysql_query("DELETE FROM `data_f` WHERE `id` NOT IN($ids_list) AND `find` = '{$a['id']}'",$dbc2) or die(mysql_error());
		mysql_query("DELETE FROM `data_f` WHERE `id` NOT IN($ids_list) AND `found` = '{$a['id']}'",$dbc2) or die(mysql_error());
	
	}
	
}