Duplicate MD5's

PHP programming forum. Ask questions or help people concerning PHP code. Don't understand a function? Need help implementing a class? Don't understand a class? Here is where to ask. Remember to do your homework!

Moderator: General Moderators

GeXus
Forum Regular
Posts: 631
Joined: Sat Mar 11, 2006 8:59 am

Duplicate MD5's

Post by GeXus »

I have a problem where I'm apparently generating a ton of the same "unique" md5's... This is the code I'm using to try and build a unique ID

Code: Select all

 
$guid = md5(uniqid(mt_rand(), true));
 
I just ran this on my DB to check for dups

Code: Select all

 
SELECT count( * ) AS guid, item_id
FROM clicks
GROUP BY item_id
HAVING guid >1
 
and I got this...

guid item_id
1676850 0
33162 1
1108 2
1546 3
7056 4
456 6
422 7
5490 11
3716 20
6309 37
4 42
118479 46
333 47
548 48
517 49
71094 56
106 58
761 59
70677 60
667223 61
106 62
234 64
840 67
2200 68
1030 71
3 72
31 73
13 74
6 75
810 81


So as you can see there are a TON of duplicates... is there a better way to handle this?
User avatar
califdon
Jack of Zircons
Posts: 4484
Joined: Thu Nov 09, 2006 8:30 pm
Location: California, USA

Re: Duplicate MD5's

Post by califdon »

What data type is your item_id field? Since the output of MD5() is a string, I'm not sure what happens if you try to store it in a numeric field. I would expect you would get a data type mismatch error.
GeXus
Forum Regular
Posts: 631
Joined: Sat Mar 11, 2006 8:59 am

Re: Duplicate MD5's

Post by GeXus »

There are two columns, guid and item_id that are listed there, item_id is an int and the guid column is varchar
User avatar
onion2k
Jedi Mod
Posts: 5263
Joined: Tue Dec 21, 2004 5:03 pm
Location: usrlab.com

Re: Duplicate MD5's

Post by onion2k »

Are you saying that you store $guid, which is the output of the md5() function, and an example of what it returned is "1676850"? That's not possible.
GeXus
Forum Regular
Posts: 631
Joined: Sat Mar 11, 2006 8:59 am

Re: Duplicate MD5's

Post by GeXus »

No, I'm storing the actual md5 string in the guid field... the numbers you see there are from the query above it that ("I think") represents duplicate entries for the field guid...
User avatar
ghurtado
Forum Contributor
Posts: 334
Joined: Wed Jul 23, 2008 12:19 pm

Re: Duplicate MD5's

Post by ghurtado »

Your query is correct, it shows the counts of repeated GUIDs. Can you show us what your longest GUID looks like?
GeXus
Forum Regular
Posts: 631
Joined: Sat Mar 11, 2006 8:59 am

Re: Duplicate MD5's

Post by GeXus »

Their all 32 characters (i.e., 00000131cfb395a39e4e268732fc77ea)
User avatar
ghurtado
Forum Contributor
Posts: 334
Joined: Wed Jul 23, 2008 12:19 pm

Re: Duplicate MD5's

Post by ghurtado »

So is it possible that the other GUIDs were created at an earlier time, say when your code wasnt working properly? The high numbers of duplicates make it very suspicious.
User avatar
califdon
Jack of Zircons
Posts: 4484
Joined: Thu Nov 09, 2006 8:30 pm
Location: California, USA

Re: Duplicate MD5's

Post by califdon »

I'm not so sure that your query is showing what you think it is. You said that your table has a numeric `item_id` field and a varchar `guid` field, but in your query, you alias the count as `guid`:

Code: Select all

SELECT count( * ) AS guid, item_id
FROM clicks
GROUP BY item_id
HAVING guid >1
I would try this:

Code: Select all

SELECT count(*) AS CountIDs,  guid
FROM clicks
GROUP BY guid
HAVING CountIDs > 1
and see what you get.
GeXus
Forum Regular
Posts: 631
Joined: Sat Mar 11, 2006 8:59 am

Re: Duplicate MD5's

Post by GeXus »

This is just a snippet form today, there are a lot more..

Guid count - Item id - created dt

2 179 2008-08-27 18:51:45
2 145 2008-08-27 18:52:32
2 196 2008-08-27 18:53:08
3 0 2008-08-27 18:53:34
2 172 2008-08-27 18:53:37
2 0 2008-08-27 18:53:44
2 179 2008-08-27 18:53:50
2 0 2008-08-27 18:53:52
2 163 2008-08-27 18:54:22
GeXus
Forum Regular
Posts: 631
Joined: Sat Mar 11, 2006 8:59 am

Re: Duplicate MD5's

Post by GeXus »

califdon wrote:I'm not so sure that your query is showing what you think it is. You said that your table has a numeric `item_id` field and a varchar `guid` field, but in your query, you alias the count as `guid`:

Code: Select all

SELECT count( * ) AS guid, item_id
FROM clicks
GROUP BY item_id
HAVING guid >1
I would try this:

Code: Select all

SELECT count(*) AS CountIDs,  guid
FROM clicks
GROUP BY guid
HAVING CountIDs > 1
and see what you get.

This returned one result - 421
User avatar
califdon
Jack of Zircons
Posts: 4484
Joined: Thu Nov 09, 2006 8:30 pm
Location: California, USA

Re: Duplicate MD5's

Post by califdon »

GeXus wrote:
califdon wrote:I would try this:

Code: Select all

SELECT count(*) AS CountIDs,  guid
FROM clicks
GROUP BY guid
HAVING CountIDs > 1
and see what you get.
This returned one result - 421
Uhhh, what? That SQL asks to return two columns, CountIDs and guid. How could it return "421"?

[That reminds me of the Hitchhiker's Guide to the Galaxy, where Douglas Adams told the story of a race of hyper-intelligent pan-dimensional beings who built a computer named Deep Thought to calculate the Answer to the Ultimate Question of Life, the Universe, and Everything. When the answer was revealed as 42, Deep Thought predicts that another computer, more powerful than itself would be made and designed by it to calculate the question for the answer. (I had to look that up in Wikipedia to refresh my memory!) ]
GeXus
Forum Regular
Posts: 631
Joined: Sat Mar 11, 2006 8:59 am

Re: Duplicate MD5's

Post by GeXus »

haha... the 'guid' was blank...
User avatar
Ziq
Forum Contributor
Posts: 194
Joined: Mon Aug 25, 2008 12:43 am
Location: Russia, Voronezh

Re: Duplicate MD5's

Post by Ziq »

Try to execute this SQL:

Code: Select all

DESCRIBE clicks;
What are you get?
GeXus
Forum Regular
Posts: 631
Joined: Sat Mar 11, 2006 8:59 am

Re: Duplicate MD5's

Post by GeXus »

guid varchar(35) NO MUL

and the rest of the columns, don't want to post them all though ;-)
Post Reply