Page 1 of 2
Duplicate MD5's
Posted: Wed Aug 27, 2008 4:14 pm
by GeXus
I have a problem where I'm apparently generating a ton of the same "unique" md5's... This is the code I'm using to try and build a unique ID
Code: Select all
$guid = md5(uniqid(mt_rand(), true));
I just ran this on my DB to check for dups
Code: Select all
SELECT count( * ) AS guid, item_id
FROM clicks
GROUP BY item_id
HAVING guid >1
and I got this...
guid item_id
1676850 0
33162 1
1108 2
1546 3
7056 4
456 6
422 7
5490 11
3716 20
6309 37
4 42
118479 46
333 47
548 48
517 49
71094 56
106 58
761 59
70677 60
667223 61
106 62
234 64
840 67
2200 68
1030 71
3 72
31 73
13 74
6 75
810 81
So as you can see there are a TON of duplicates... is there a better way to handle this?
Re: Duplicate MD5's
Posted: Wed Aug 27, 2008 4:33 pm
by califdon
What data type is your item_id field? Since the output of MD5() is a string, I'm not sure what happens if you try to store it in a numeric field. I would expect you would get a data type mismatch error.
Re: Duplicate MD5's
Posted: Wed Aug 27, 2008 4:47 pm
by GeXus
There are two columns, guid and item_id that are listed there, item_id is an int and the guid column is varchar
Re: Duplicate MD5's
Posted: Wed Aug 27, 2008 4:53 pm
by onion2k
Are you saying that you store $guid, which is the output of the md5() function, and an example of what it returned is "1676850"? That's not possible.
Re: Duplicate MD5's
Posted: Wed Aug 27, 2008 5:55 pm
by GeXus
No, I'm storing the actual md5 string in the guid field... the numbers you see there are from the query above it that ("I think") represents duplicate entries for the field guid...
Re: Duplicate MD5's
Posted: Wed Aug 27, 2008 6:09 pm
by ghurtado
Your query is correct, it shows the counts of repeated GUIDs. Can you show us what your longest GUID looks like?
Re: Duplicate MD5's
Posted: Wed Aug 27, 2008 6:23 pm
by GeXus
Their all 32 characters (i.e., 00000131cfb395a39e4e268732fc77ea)
Re: Duplicate MD5's
Posted: Wed Aug 27, 2008 6:39 pm
by ghurtado
So is it possible that the other GUIDs were created at an earlier time, say when your code wasnt working properly? The high numbers of duplicates make it very suspicious.
Re: Duplicate MD5's
Posted: Wed Aug 27, 2008 7:24 pm
by califdon
I'm not so sure that your query is showing what you think it is. You said that your table has a numeric `item_id` field and a varchar `guid` field, but in your query, you alias the count as `guid`:
Code: Select all
SELECT count( * ) AS guid, item_id
FROM clicks
GROUP BY item_id
HAVING guid >1
I would try this:
Code: Select all
SELECT count(*) AS CountIDs, guid
FROM clicks
GROUP BY guid
HAVING CountIDs > 1
and see what you get.
Re: Duplicate MD5's
Posted: Wed Aug 27, 2008 7:27 pm
by GeXus
This is just a snippet form today, there are a lot more..
Guid count - Item id - created dt
2 179 2008-08-27 18:51:45
2 145 2008-08-27 18:52:32
2 196 2008-08-27 18:53:08
3 0 2008-08-27 18:53:34
2 172 2008-08-27 18:53:37
2 0 2008-08-27 18:53:44
2 179 2008-08-27 18:53:50
2 0 2008-08-27 18:53:52
2 163 2008-08-27 18:54:22
Re: Duplicate MD5's
Posted: Wed Aug 27, 2008 7:29 pm
by GeXus
califdon wrote:I'm not so sure that your query is showing what you think it is. You said that your table has a numeric `item_id` field and a varchar `guid` field, but in your query, you alias the count as `guid`:
Code: Select all
SELECT count( * ) AS guid, item_id
FROM clicks
GROUP BY item_id
HAVING guid >1
I would try this:
Code: Select all
SELECT count(*) AS CountIDs, guid
FROM clicks
GROUP BY guid
HAVING CountIDs > 1
and see what you get.
This returned one result - 421
Re: Duplicate MD5's
Posted: Wed Aug 27, 2008 7:42 pm
by califdon
GeXus wrote:califdon wrote:I would try this:
Code: Select all
SELECT count(*) AS CountIDs, guid
FROM clicks
GROUP BY guid
HAVING CountIDs > 1
and see what you get.
This returned one result - 421
Uhhh, what? That SQL asks to return two columns, CountIDs and guid. How could it return "421"?
[That reminds me of the Hitchhiker's Guide to the Galaxy, where Douglas Adams told the story of a race of hyper-intelligent pan-dimensional beings who built a computer named Deep Thought to calculate the Answer to the Ultimate Question of Life, the Universe, and Everything. When the answer was revealed as 42, Deep Thought predicts that another computer, more powerful than itself would be made and designed by it to calculate the question for the answer. (I had to look that up in Wikipedia to refresh my memory!) ]
Re: Duplicate MD5's
Posted: Wed Aug 27, 2008 7:58 pm
by GeXus
haha... the 'guid' was blank...
Re: Duplicate MD5's
Posted: Wed Aug 27, 2008 8:24 pm
by Ziq
Try to execute this SQL:
What are you get?
Re: Duplicate MD5's
Posted: Wed Aug 27, 2008 11:15 pm
by GeXus
guid varchar(35) NO MUL
and the rest of the columns, don't want to post them all though
