Page 1 of 1
How do you find the second, of the two duplicates?
Posted: Mon Nov 05, 2012 9:45 am
by simonmlewis
Code: Select all
SELECT a,b
FROM tbl
GROUP BY a,b
HAVING COUNT(*)>1;
This script works, but it always shows the FIRST entry to say "this has a duplicate".
How do I find "the duplicate", thus being able to delete all duplicates?
Re: How do you find the second, of the two duplicates?
Posted: Mon Nov 05, 2012 12:28 pm
by Weirdan
If you want to delete duplicates, you don't have to select them. The most efficient way I know is to just add unique key ignoring any errors (if you want to keep the first record from any set of duplicates):
Code: Select all
alter ignore table tbl add unique index a_b_uniq(a, b);
Or, with more control over the order of insertion:
Code: Select all
create table tbl_new like tbl;
alter table tbl_new add unique index a_b_uniq (a,b);
insert ignore into tbl_new select * from tbl order by id asc;
rename table tbl to tbl_old, tbl_new to tbl;
drop table tbl_old;
-- optionally, drop unique index
alter table tbl drop index a_b_uniq;
Re: How do you find the second, of the two duplicates?
Posted: Mon Nov 05, 2012 3:01 pm
by requinix
You can also LEFT JOIN the table against itself on the same data but a different primary key, then keep only the rows where the joined data is null.
Code: Select all
SELECT a, b FROM tbl t1 LEFT JOIN tbl t2 ON t1.a = t2.a AND t1.b = t2.b AND t1.pk != t2.pk WHERE t2.pk IS NULL
Re: How do you find the second, of the two duplicates?
Posted: Tue Nov 06, 2012 3:24 pm
by pickle
If they're duplicates, why does it matter which one you delete?
Re: How do you find the second, of the two duplicates?
Posted: Wed Nov 07, 2012 2:03 am
by simonmlewis
Because I made the mistake of allowing as many as they want, and want only to keep the first one.
Tho you do have a point; if they don't realise the others have gone, why does it matter.