Quick Line for Duplicates
Moderator: General Moderators
- SheDesigns
- Forum Commoner
- Posts: 42
- Joined: Tue Nov 18, 2008 9:51 am
- Location: Buffalo, NY
Quick Line for Duplicates
I have a mailing list of over 500 records, some of which, I know are duplicates. I don't have time to through an monitor them all. The table only has one field so far "address".
How would I write a query to find the entries that are duplicate?
SELECT * FROM `mailinglist` WHERE address ....
How would I write a query to find the entries that are duplicate?
SELECT * FROM `mailinglist` WHERE address ....
Re: Quick Line for Duplicates
Code: Select all
$sql = "SELECT distinct `address` from `mailinglist`";
- SheDesigns
- Forum Commoner
- Posts: 42
- Joined: Tue Nov 18, 2008 9:51 am
- Location: Buffalo, NY
Re: Quick Line for Duplicates
I need to find the ones that aren't distinct however. 
- Christopher
- Site Administrator
- Posts: 13596
- Joined: Wed Aug 25, 2004 7:54 pm
- Location: New York, NY, US
Re: Quick Line for Duplicates
I think this will show you how many of each email you have.
Code: Select all
SELECT email,COUNT(email) FROM mailinglist GROUP BY email ORDER BY COUNT(email) DESC(#10850)
Re: Quick Line for Duplicates
I'd spend a little time sorting removing the duplicates manually. Just sort by the field - the duplicates should show up. Then you could set that column as 'UNIQUE' and not have to worry about this again.
Real programmers don't comment their code. If it was hard to write, it should be hard to understand.
- SheDesigns
- Forum Commoner
- Posts: 42
- Joined: Tue Nov 18, 2008 9:51 am
- Location: Buffalo, NY
Re: Quick Line for Duplicates
From now on I will have them unique, I got these addresses out of my client's AOL address book that's been sloppily upkept since 2001.
Thank you though.
Thank you though.
Re: Quick Line for Duplicates
Extending upon arborint's idea:
Or an alternative, which may or may not run faster (depending on indexes):
Code: Select all
SELECT email,COUNT(email) AS count FROM mailinglist GROUP BY email HAVING count > 1Code: Select all
SELECT email FROM mailinglist WHERE email NOT IN (SELECT DISTINCT email FROM mailinglist AS m)Re: Quick Line for Duplicates
Great way to remove duplicates:
Code: Select all
CREATE TABLE email_temp AS SELECT * FROM mailinglist WHERE 1 GROUP BY email;
DROP TABLE mailinglist;
RENAME TABLE email_temp TO mailinglist;
Re: Quick Line for Duplicates
This one will not workpytrin wrote:Or an alternative, which may or may not run faster (depending on indexes):Code: Select all
SELECT email FROM mailinglist WHERE email NOT IN (SELECT DISTINCT email FROM mailinglist AS m)
There are 10 types of people in this world, those who understand binary and those who don't
Re: Quick Line for Duplicates
Niceastions wrote:Great way to remove duplicates:
Code: Select all
CREATE TABLE email_temp AS SELECT * FROM mailinglist WHERE 1 GROUP BY email; DROP TABLE mailinglist; RENAME TABLE email_temp TO mailinglist;
But why "WHERE 1" must present in the qyery?
There are 10 types of people in this world, those who understand binary and those who don't
- Christopher
- Site Administrator
- Posts: 13596
- Joined: Wed Aug 25, 2004 7:54 pm
- Location: New York, NY, US
Re: Quick Line for Duplicates
I knew I left something out.pytrin wrote:Extending upon arborint's idea:Code: Select all
SELECT email,COUNT(email) AS count FROM mailinglist GROUP BY email HAVING count > 1
(#10850)
Re: Quick Line for Duplicates
Actually the having is because count is the result of an aggregate function and this can't be used in the where clause.
oops.. my bad
This one will not workIt will always give an empty row set.
Re: Quick Line for Duplicates
Probably for the GROUP BY clause to work.VladSun wrote:But why "WHERE 1" must present
Re: Quick Line for Duplicates
The GROUP BY clause does not interfere with the WHERE clause (or it presence or absence) - try it.astions wrote:Probably for the GROUP BY clause to work.VladSun wrote:But why "WHERE 1" must present
There are 10 types of people in this world, those who understand binary and those who don't
Re: Quick Line for Duplicates
Cool, it should work either way then.