Remove Duplicates
Moderator: General Moderators
Remove Duplicates
Is there a query I can run to remove duplicate rows from a database table? The rows would be identical except for the primary key id. I could write a script for it but I'm afraid I would mess it up.
Set Search Time - A google chrome extension. When you search only results from the past year (or set time period) are displayed. Helps tremendously when using new technologies to avoid outdated results.
- Chris Corbyn
- Breakbeat Nuttzer
- Posts: 13098
- Joined: Wed Mar 24, 2004 7:57 am
- Location: Melbourne, Australia
Code: Select all
mysql> create table example (id int auto_increment primary key, col1 varchar(10), col2 varchar(10), col3 varchar(10)); Query OK, 0 rows affected (0.03 sec)
mysql> insert into example (col1, col2, col3) values ('one', 'two', 'three'); Query OK, 1 row affected (0.00 sec)
mysql> insert into example (col1, col2, col3) values ('one', 'four', 'three'); Query OK, 1 row affected (0.00 sec)
mysql> insert into example (col1, col2, col3) values ('one', 'two', 'three'); Query OK, 1 row affected (0.00 sec)
mysql> insert into example (col1, col2, col3) values ('one', 'two', 'four'); Query OK, 1 row affected (0.00 sec)
mysql> insert into example (col1, col2, col3) values ('four', 'two', 'three'); Query OK, 1 row affected (0.00 sec)
mysql> select * from example;
+----+------+------+-------+
| id | col1 | col2 | col3 |
+----+------+------+-------+
| 1 | one | two | three |
| 2 | one | four | three |
| 3 | one | two | three |
| 4 | one | two | four |
| 5 | four | two | three |
+----+------+------+-------+
5 rows in set (0.00 sec)
mysql> select distinct col1, col2, col3 from example;
+------+------+-------+
| col1 | col2 | col3 |
+------+------+-------+
| one | two | three |
| one | four | three |
| one | two | four |
| four | two | three |
+------+------+-------+
4 rows in set (0.00 sec)
mysql> create table example_copy (id int auto_increment primary key, col1 varchar(10), col2 varchar(10), col3 varchar(10));
Query OK, 0 rows affected (0.02 sec)
mysql> insert into example_copy (col1, col2, col3) select distinct col1, col2, col3 from example;
Query OK, 4 rows affected (0.00 sec)
Records: 4 Duplicates: 0 Warnings: 0
mysql> select * from example_copy;
+----+------+------+-------+
| id | col1 | col2 | col3 |
+----+------+------+-------+
| 1 | one | two | three |
| 2 | one | four | three |
| 3 | one | two | four |
| 4 | four | two | three |
+----+------+------+-------+
4 rows in set (0.00 sec)
mysql>
I think I might have a problem with this, because my mysql version doesn't support subqueries. but i'll give it a try and let you know the results.
Set Search Time - A google chrome extension. When you search only results from the past year (or set time period) are displayed. Helps tremendously when using new technologies to avoid outdated results.
Worked perfectly. Thanks 
I also went command line with this. go 1337 me.
I also went command line with this. go 1337 me.
Set Search Time - A google chrome extension. When you search only results from the past year (or set time period) are displayed. Helps tremendously when using new technologies to avoid outdated results.
- Chris Corbyn
- Breakbeat Nuttzer
- Posts: 13098
- Joined: Wed Mar 24, 2004 7:57 am
- Location: Melbourne, Australia
- Chris Corbyn
- Breakbeat Nuttzer
- Posts: 13098
- Joined: Wed Mar 24, 2004 7:57 am
- Location: Melbourne, Australia
I had written a script before using PHP to try to loop through the records, check for a dupe, then insert into the new record if it wasn't found.
Ended up my script was flawed and I lost a few thousand rows and had to restore a backup table.
Doing it this way was quick (like 1.7 seconds) and painless (no script writing!).
Ended up my script was flawed and I lost a few thousand rows and had to restore a backup table.
Doing it this way was quick (like 1.7 seconds) and painless (no script writing!).
Set Search Time - A google chrome extension. When you search only results from the past year (or set time period) are displayed. Helps tremendously when using new technologies to avoid outdated results.