Fastest query to select dupes from two columns?

Questions about the MySQL, PostgreSQL, and most other databases, as well as using it with PHP can be asked here.

Moderator: General Moderators

Post Reply
User avatar
voltrader
Forum Contributor
Posts: 223
Joined: Wed Jul 07, 2004 12:44 pm
Location: SF Bay Area

Fastest query to select dupes from two columns?

Post by voltrader »

I have a database of over 100,000 records.

I want to flag dupes based on latitude and longitude.

I'm thinking of writing a script to loop through all records starting from the 1st; I would compare the latitude and longitude of the 1st record to every other record and do that for all records. This doesn't seem terribly efficient.

Is there a faster way to do this?

Perhaps sort by latitude AND longitude?
User avatar
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

Code: Select all

SELECT *, COUNT(`id`) `DUPLICATES` FROM `table` GROUP BY `long`, `lat` HAVING `DUPLICATES` > 1
I believe
ruchit
Forum Commoner
Posts: 53
Joined: Mon Sep 26, 2005 6:03 am

Post by ruchit »

slightly twaeking feyd's query to make it even faster

Code: Select all

SELECT `id`, COUNT(`id`) `DUPLICATES` FROM `table` GROUP BY `id` HAVING `DUPLICATES` > 1
since we only need to find duplicates.. and in using *; the database internally has to identify column names & column count, also there would be overhead involved in retrieving multiple columns & i believe they are are not required in context of this issue.
ryanlwh
Forum Commoner
Posts: 84
Joined: Wed Sep 14, 2005 1:29 pm

Post by ryanlwh »

that would work only if the duplicate records has the same ids... i believe he's asking about finding dupes of lat and long. so i guess feyd was right.
Post Reply