That seems to be a great idea. Wouldn't the hashing still take time if the message were thousands of characters long? It would cause a significant load, right? (I don't know how fast these functions are, but I'm giving my 2 cents.)

jshpro2 wrote: No single person runs the site, every member here runs the site. The mods vote on important issues though.
a94060 wrote: Doesn't d11wtq run the site?
What I would recommend is storing an MD5 hash of each message in the database, and then checking whether the MD5 hash of the current message matches that of any stored message. This will knock out exact dupes, but you still have that 10% difference and the issue that jcart brought up. Could you tell us how this is going to be used? There are different algorithms, soundex for example, that could be used here.
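As a rough illustration of the hash check described above, here is a minimal sketch in PHP. The names normalizeMessage, isDuplicate and $seenHashes are made up for this example, the array merely stands in for an indexed hash column in the database, and the normalisation step is an extra assumption that was not part of the suggestion.

```php
<?php
// Hypothetical in-memory stand-in for a database table with an indexed
// hash column. Keys are MD5 hashes of messages already stored.
$seenHashes = [];

/**
 * Normalise a message before hashing so that trivial differences
 * (letter case, leading/trailing or repeated whitespace) do not
 * defeat the exact-match check. This step is an assumption.
 */
function normalizeMessage(string $message): string
{
    return strtolower(trim(preg_replace('/\s+/', ' ', $message)));
}

/**
 * Returns true if an identical (normalised) message was seen before.
 * Otherwise records its hash and returns false.
 */
function isDuplicate(string $message, array &$seenHashes): bool
{
    $hash = md5(normalizeMessage($message));
    if (isset($seenHashes[$hash])) {
        return true;
    }
    $seenHashes[$hash] = true;
    return false;
}
```

md5() makes a single pass over the input, so hashing a message of a few thousand characters should be cheap next to running similar_text() against every stored message. As noted above, this only catches exact dupes (after normalisation); near-duplicates would still need a fuzzy comparison on whatever candidates remain.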
similar_text too slow!
- Chris Corbyn