Can you spot if one of more fields have the same data?

Questions about the MySQL, PostgreSQL, and most other databases, as well as using it with PHP can be asked here.

Moderator: General Moderators

Post Reply
simonmlewis
DevNet Master
Posts: 4435
Joined: Wed Oct 08, 2008 3:39 pm
Location: United Kingdom
Contact:

Can you spot if one of more fields have the same data?

Post by simonmlewis »

We have an issue where one or more fields of the database has the same data in it - ie same product stock codes.
We use this to show tickboxes on screen to select "additional accessories".
Trouble is, for whatever reason we have "mini ***" and then "small ***", which are actually the same product.

I'm not bothered about disabling one to be seen, but moreso creating a simple page that shows those products where there are duplicates.

Such as: "SELECT code1, code2, code3, code4 where <any of them> = <any of them>".

I'm trying to find it in a Google search, but not having luck, as most of these duplicate queries are to see if more than one row has the same entry.

ie. http://stackoverflow.com/questions/9239 ... e-in-mysql
Love PHP. Love CSS. Love learning new tricks too.
All the best from the United Kingdom.
User avatar
requinix
Spammer :|
Posts: 6617
Joined: Wed Oct 15, 2008 2:35 am
Location: WA, USA

Re: Can you spot if one of more fields have the same data?

Post by requinix »

Best thing you can do is normalize the table. As you've seen having multiple columns for the same thing can make queries awkward. Really, the current table shouldn't include any codes at all but instead they get moved out to a second table. Each row in the second table matches one row from the first table (ie, using an ID) with one code.
Once that's done you can JOIN the second table against itself where (1) two codes match and (2) they come from different products.

Code: Select all

SELECT l.productID, r.productID, l.code FROM product_codes AS l JOIN product_codes AS r ON l.code = r.code AND l.productID != r.productID
("productID" being the product ID and "product_codes" being the new table)

You can do it without the normalization (which I strongly recommend) with the same idea but a much more complicated JOIN.

Code: Select all

SELECT l.ID, r.ID, l.code1, r.code1, l.code2, r.code2, l.code3, r.code3, l.code4, r.code4
FROM table AS l JOIN table AS r ON (l.code1 IN (r.code1, r.code2, r.code3, r.code4) OR l.code2 IN (r.code1, r.code2, r.code3, r.code4) OR l.code3 IN (r.code1, r.code2, r.code3, r.code4) OR l.code4 IN (r.code1, r.code2, r.code3, r.code4)) AND l.ID != r.ID
Post Reply