Page 1 of 1

finding similar matches

Posted: Fri Jan 19, 2007 6:45 pm
by titaniumdoughnut
Is there a way to query a MySQL database (through PHP) and ask it to use some kind of "similar matching" algorithms or something? I'm basically inventing terms for a technology I imagine might exist right now. Something that would search on a first and last name, and then present similar name combinations or variants of the names searched?

Posted: Fri Jan 19, 2007 8:52 pm
by feyd
Possibly using the SOUNDEX() function.. but really it's often better to have a separate table with similarity information.. even then, it's complicated. I've seen this referred to as fuzzy searching, or similar names, stemming from fuzzy logic.

Posted: Fri Jan 19, 2007 9:48 pm
by Kieran Huggins
also look at the levenshtein() function as well as similar_text() and metaphone().

I've used the levenshtein() distance of two soundex() strings before, but it's expensive resource-wise.

MySQL supports soundex as well, which is nice, but I seem to remember there was some issue about the length of the resulting string or something...

metaphone() is supposedly better than soundex() but MySQL doesn't support it to my knowledge.

SOUNDEX in SQL is faster than php, IIRC. You'll likely want something like:

Code: Select all

SELECT * FROM table WHERE SUBSTRING(SOUNDEX(`haystack`),1,4) = SUBSTRING(SOUNDEX({$needle}),1,4)
ok... I just went and looked it up. I did the above, then further ranked them by comparing the levenshtein() distances of the metaphone() of the two strings in PHP.

Posted: Sun Jan 21, 2007 11:25 am
by titaniumdoughnut
amazing! thanks so much. i'm going to read up on these magical bits of code.