Quick Line for Duplicates

PHP programming forum. Ask questions or help people concerning PHP code. Don't understand a function? Need help implementing a class? Don't understand a class? Here is where to ask. Remember to do your homework!

Moderator: General Moderators

Post Reply
User avatar
SheDesigns
Forum Commoner
Posts: 42
Joined: Tue Nov 18, 2008 9:51 am
Location: Buffalo, NY

Quick Line for Duplicates

Post by SheDesigns »

I have a mailing list of over 500 records, some of which, I know are duplicates. I don't have time to through an monitor them all. The table only has one field so far "address".

How would I write a query to find the entries that are duplicate?


SELECT * FROM `mailinglist` WHERE address ....
User avatar
infolock
DevNet Resident
Posts: 1708
Joined: Wed Sep 25, 2002 7:47 pm

Re: Quick Line for Duplicates

Post by infolock »

Code: Select all

 
$sql = "SELECT distinct `address` from `mailinglist`";
 
User avatar
SheDesigns
Forum Commoner
Posts: 42
Joined: Tue Nov 18, 2008 9:51 am
Location: Buffalo, NY

Re: Quick Line for Duplicates

Post by SheDesigns »

I need to find the ones that aren't distinct however. :|
User avatar
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Re: Quick Line for Duplicates

Post by Christopher »

I think this will show you how many of each email you have.

Code: Select all

SELECT email,COUNT(email) FROM mailinglist GROUP BY email ORDER BY COUNT(email) DESC
(#10850)
User avatar
pickle
Briney Mod
Posts: 6445
Joined: Mon Jan 19, 2004 6:11 pm
Location: 53.01N x 112.48W
Contact:

Re: Quick Line for Duplicates

Post by pickle »

I'd spend a little time sorting removing the duplicates manually. Just sort by the field - the duplicates should show up. Then you could set that column as 'UNIQUE' and not have to worry about this again.
Real programmers don't comment their code. If it was hard to write, it should be hard to understand.
User avatar
SheDesigns
Forum Commoner
Posts: 42
Joined: Tue Nov 18, 2008 9:51 am
Location: Buffalo, NY

Re: Quick Line for Duplicates

Post by SheDesigns »

From now on I will have them unique, I got these addresses out of my client's AOL address book that's been sloppily upkept since 2001.

Thank you though.
User avatar
Eran
DevNet Master
Posts: 3549
Joined: Fri Jan 18, 2008 12:36 am
Location: Israel, ME

Re: Quick Line for Duplicates

Post by Eran »

Extending upon arborint's idea:

Code: Select all

SELECT email,COUNT(email) AS count FROM mailinglist GROUP BY email HAVING count > 1
Or an alternative, which may or may not run faster (depending on indexes):

Code: Select all

SELECT email FROM mailinglist WHERE email NOT IN (SELECT DISTINCT email FROM mailinglist AS m)
User avatar
Benjamin
Site Administrator
Posts: 6935
Joined: Sun May 19, 2002 10:24 pm

Re: Quick Line for Duplicates

Post by Benjamin »

Great way to remove duplicates:

Code: Select all

 
CREATE TABLE email_temp AS SELECT * FROM mailinglist WHERE 1 GROUP BY email;
DROP TABLE mailinglist;
RENAME TABLE email_temp TO mailinglist;
 
User avatar
VladSun
DevNet Master
Posts: 4313
Joined: Wed Jun 27, 2007 9:44 am
Location: Sofia, Bulgaria

Re: Quick Line for Duplicates

Post by VladSun »

pytrin wrote:Or an alternative, which may or may not run faster (depending on indexes):

Code: Select all

SELECT email FROM mailinglist WHERE email NOT IN (SELECT DISTINCT email FROM mailinglist AS m)
This one will not work :) It will always give an empty row set.
There are 10 types of people in this world, those who understand binary and those who don't
User avatar
VladSun
DevNet Master
Posts: 4313
Joined: Wed Jun 27, 2007 9:44 am
Location: Sofia, Bulgaria

Re: Quick Line for Duplicates

Post by VladSun »

astions wrote:Great way to remove duplicates:

Code: Select all

 
CREATE TABLE email_temp AS SELECT * FROM mailinglist WHERE 1 GROUP BY email;
DROP TABLE mailinglist;
RENAME TABLE email_temp TO mailinglist;
 
Nice :)
But why "WHERE 1" must present in the qyery?
There are 10 types of people in this world, those who understand binary and those who don't
User avatar
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Re: Quick Line for Duplicates

Post by Christopher »

pytrin wrote:Extending upon arborint's idea:

Code: Select all

SELECT email,COUNT(email) AS count FROM mailinglist GROUP BY email HAVING count > 1
I knew I left something out. ;) Using HAVING is necessary instead of WHERE because HAVING operates after the GROUP BY, whereas WHERE operates before the GROUP BY. Important to remember, and very handy. As ever pytrin...excellent!
(#10850)
User avatar
Eran
DevNet Master
Posts: 3549
Joined: Fri Jan 18, 2008 12:36 am
Location: Israel, ME

Re: Quick Line for Duplicates

Post by Eran »

Actually the having is because count is the result of an aggregate function and this can't be used in the where clause.
This one will not work :) It will always give an empty row set.
:lol: oops.. my bad
User avatar
Benjamin
Site Administrator
Posts: 6935
Joined: Sun May 19, 2002 10:24 pm

Re: Quick Line for Duplicates

Post by Benjamin »

VladSun wrote:But why "WHERE 1" must present
Probably for the GROUP BY clause to work.
User avatar
VladSun
DevNet Master
Posts: 4313
Joined: Wed Jun 27, 2007 9:44 am
Location: Sofia, Bulgaria

Re: Quick Line for Duplicates

Post by VladSun »

astions wrote:
VladSun wrote:But why "WHERE 1" must present
Probably for the GROUP BY clause to work.
The GROUP BY clause does not interfere with the WHERE clause (or it presence or absence) - try it.
There are 10 types of people in this world, those who understand binary and those who don't
User avatar
Benjamin
Site Administrator
Posts: 6935
Joined: Sun May 19, 2002 10:24 pm

Re: Quick Line for Duplicates

Post by Benjamin »

Cool, it should work either way then.
Post Reply