removing union duplicates

Questions about the MySQL, PostgreSQL, and most other databases, as well as using it with PHP can be asked here.

Moderator: General Moderators

Post Reply
User avatar
s.dot
Tranquility In Moderation
Posts: 5001
Joined: Sun Feb 06, 2005 7:18 pm
Location: Indiana

removing union duplicates

Post by s.dot »

I'm trying to run a script on one of my tables to delete records where $a = $b and $b = $a, except for the current record

Code: Select all

$duper = mysql_query("SELECT * FROM `data_f` ORDER BY `id` ASC",$dbc2);

while($dupea = mysql_fetch_assoc($duper)){
	$id = $dupea['id'];
        $a = $dupea['find'];
        $b = $dupea['found'];

	//dupes?
	mysql_query("DELETE FROM `data_f` WHERE `find` = '$a' AND `found` = '$b' AND `id` != '$id'",$dbc2) or die(mysql_error());

	mysql_query("DELETE FROM `data_f` WHERE `found` = '$a' AND `find` = '$b' AND `id` != '$id'",$dbc2) or die(mysql_error());
	
}
This is um... working. But it appears to be removing a lot more records than I expected.
Set Search Time - A google chrome extension. When you search only results from the past year (or set time period) are displayed. Helps tremendously when using new technologies to avoid outdated results.
timvw
DevNet Master
Posts: 4897
Joined: Mon Jan 19, 2004 11:11 pm
Location: Leuven, Belgium

Re: removing union duplicates

Post by timvw »

scottayy wrote:I'm trying to run a script on one of my tables to delete records where $a = $b and $b = $a, except for the current record
Please define your problem better, since the 'is equal to' relation implies that when a equals b, that b will also equal a.
User avatar
s.dot
Tranquility In Moderation
Posts: 5001
Joined: Sun Feb 06, 2005 7:18 pm
Location: Indiana

Post by s.dot »

I have it set up so that there are two fields in each record that are an id from another table

Sample Records

Code: Select all

ID    Find    Found
1      1        2
2      5        7
3      2        1

If I looped through those 3 records, I would want the script to delete record #3 because it is a duplicate of record number one (except the fields are reversed)

So if I were looking to delete "duplicates" of record # 1 I would want to delete the obvious duplicate, which would be an excact duplicate of the row (except for id)

Code: Select all

DELETE FROM `table` WHERE `find` = 1 AND `found` = 2 AND `id` != '$currentid'
Then I want to delete records that are duplicates of record #1 but the fields are reversed

Code: Select all

DELETE FROM `table` WHERE `found` = 1 AND `find` = 2 AND `id` != '$currentid'
Set Search Time - A google chrome extension. When you search only results from the past year (or set time period) are displayed. Helps tremendously when using new technologies to avoid outdated results.
User avatar
s.dot
Tranquility In Moderation
Posts: 5001
Joined: Sun Feb 06, 2005 7:18 pm
Location: Indiana

Post by s.dot »

anyone? =/

when I run this query

Code: Select all

$numf1 = mysql_query("SELECT count(*) AS `num1` FROM `data_f` WHERE `find` = '$theperson'") or die(mysql_error());

$numf2 = mysql_query("SELECT count(*) AS `num2` FROM `data_f` WHERE `found` = '$theperson'") or die(mysql_error());
I get 433 results.

However, when I run this query...

Code: Select all

$fid_result = mysql_query("SELECT DISTINCT `find` AS `user` FROM `data_f` WHERE `found` = '$theperson' UNION SELECT DISTINCT `found` AS `user` FROM `data_f` WHERE `find` = '$theperson'") or die(mysql_error());
I come up with 281 results.
Set Search Time - A google chrome extension. When you search only results from the past year (or set time period) are displayed. Helps tremendously when using new technologies to avoid outdated results.
User avatar
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Post by Christopher »

You could do it by brute force in PHP by:

Code: Select all

query("SELECT DISTINCT `find` FROM `data_f");
while ($row = fetch()) {
    $rows[] = $row['find'];
}
query("SELECT`found` FROM `data_f");
while ($row = fetch()) {
    if (in_array($row['found'], $rows)) {
        query("DELETE FROM `data_f" WHERE found=" . $row['found']);
        unset($rows[$row['found']]);
    }
}
(#10850)
User avatar
s.dot
Tranquility In Moderation
Posts: 5001
Joined: Sun Feb 06, 2005 7:18 pm
Location: Indiana

Post by s.dot »

Thanks for the code aborint, but unfortunately that systematically deleted all records. 8O

But, after 7 headaches, and 3 days.. I came up with this script that does what I want. Forgive the variable naming, but after dealing with this for so long, i used the shortest ones possible.

Code: Select all

$r = mysql_query("SELECT `id` FROM `users` ORDER BY `id` ASC",$dbc2);
while($a = mysql_fetch_assoc($r)){
	
	$r2 = mysql_query("
		SELECT `id` AS `id_field`,`find` AS `user`  FROM `data_f` WHERE `found` = '{$a['id']}'
		UNION
		SELECT `id` AS `id_field`,`found` AS `user` FROM `data_f` WHERE `find` = '{$a['id']}'
	",$dbc2) or die(mysql_error());
	
	$a3 = array();
	while($a2 = mysql_fetch_assoc($r2)){
		$a3[$a2['id_field']] = $a2['user'];
	}
	
	$test = array_unique($a3);
	
	if(count($test) > 0){
	
		$ids = array_keys($test);
		$ids_list = implode(',',$ids);
	
		mysql_query("DELETE FROM `data_f` WHERE `id` NOT IN($ids_list) AND `find` = '{$a['id']}'",$dbc2) or die(mysql_error());
		mysql_query("DELETE FROM `data_f` WHERE `id` NOT IN($ids_list) AND `found` = '{$a['id']}'",$dbc2) or die(mysql_error());
	
	}
	
}
Set Search Time - A google chrome extension. When you search only results from the past year (or set time period) are displayed. Helps tremendously when using new technologies to avoid outdated results.
Post Reply