Finding how much of one string matches another

Posted: Tue May 23, 2006 11:59 pm
by Extremest
I was wondering whether, if I have a sentence or subject and check it against another subject, there is a way to tell whether, say, 90% of it matches. Meaning that if all but the last words were the same, it would count as a match.

Posted: Wed May 24, 2006 12:01 am
by Burrito
that would require a very sophisticated regular expression.

the alternative would be to store the string you're checking against in a MySQL table with a full text index and run a MATCH AGAINST query off of it.

Posted: Wed May 24, 2006 12:03 am
by Flamie
fast way I can think of:
take the sentence you're searching for, explode() it, which will give you an array with the words, then use strpos() in a for loop to check how many of those words are in your main string.

Posted: Wed May 24, 2006 12:04 am
by Burrito
^^^^

that'd work too :D

Posted: Wed May 24, 2006 12:07 am
by Extremest
ok so then pretty much like do a count of the array to see how many words and then add up how many it finds and do a divide to get a percent...if above this amount then it is a match.....Am I getting that right?

Posted: Wed May 24, 2006 12:10 am
by Extremest
would it possibly be better to find out how many chars there are in the string then just remove say the last 10% and then do a match from there?
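
A minimal sketch of that character-based idea, assuming "remove the last 10%" means comparing only the first 90% of the characters of one subject against the start of the other. The function name prefix_matches and the 0.9 default are made up for illustration; they are not from the thread.

```php
<?php
// Hypothetical helper: true when the first $fraction of $a (counted in
// characters) is also the beginning of $b. The comparison ignores case.
function prefix_matches($a, $b, $fraction = 0.9)
{
    // Number of leading characters of $a that must match.
    $keep = (int) floor(strlen($a) * $fraction);
    if ($keep === 0) {
        return false; // nothing meaningful left to compare
    }
    // strncasecmp() compares at most $keep characters and returns 0 on equality.
    return strncasecmp($a, $b, $keep) === 0;
}

$a = "Looking for a fast string comparison function in PHP 4";
$b = "Looking for a fast string comparison function in PHP 5";
var_dump(prefix_matches($a, $b));                              // bool(true)
var_dump(prefix_matches($a, "Completely different subject"));  // bool(false)
```

This only catches subjects that differ at the very end; a change near the beginning makes it fail even if everything else is identical.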

Posted: Wed May 24, 2006 12:17 am
by Flamie
Pimptastic | Please use code and [syntax="..."] tags where appropriate when posting code. Your post has been edited to reflect how we'd like it posted. Please read: [url=http://forums.devnetwork.net/viewtopic.php?t=21171]Posting Code in the Forums[/url] to learn how to do it too.

[quote="Extremest"]ok so then pretty much like do a count of the array to see how many words and then add up how many it finds and do a divide to get a percent...if above this amount then it is a match.....Am I getting that right?[/quote]

that's the best way...

Code: Select all

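$string1 = "hello1 hello2 hello3 hello4";
$string2 = "hello1 hello2 hello3";
// explode() takes the delimiter first, then the string to split
$arr = explode(" ", $string2);
$numstring2 = count($arr);
$numofmatch = 0;
for($i=0;$i<$numstring2;$i++)
{
     // strpos() returns false when the word is not found in $string1
     if(strpos($string1, $arr[$i]) !== false)
                 $numofmatch++;
}
// percentage of the words in $string2 that also appear in $string1
$ratio = ($numofmatch/$numstring2)*100;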
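
One thing to watch with the loop above: strpos() matches substrings, so "hello" would count as found inside "hello1". A rough sketch of a whole-word variant follows. The function name word_match_percent is invented for this example, and lower-casing both strings is an assumption about what should count as a match.

```php
<?php
// Hypothetical helper: percentage of the distinct words in $needle that
// also occur as whole words in $haystack (case-insensitive).
function word_match_percent($needle, $haystack)
{
    // Split on whitespace and drop empty pieces.
    $needleWords = array_unique(
        preg_split('/\s+/', strtolower($needle), -1, PREG_SPLIT_NO_EMPTY)
    );
    $haystackWords = preg_split('/\s+/', strtolower($haystack), -1, PREG_SPLIT_NO_EMPTY);

    if (count($needleWords) === 0) {
        return 0.0; // avoid dividing by zero on an empty search string
    }

    // Words of $needle that are present in $haystack.
    $common = array_intersect($needleWords, $haystackWords);
    return 100.0 * count($common) / count($needleWords);
}

echo word_match_percent("hello1 hello2 hello3", "hello1 hello2 hello3 hello4"), "\n"; // 100
echo word_match_percent("hello", "hello1 hello2"), "\n";                              // 0
```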

Posted: Wed May 24, 2006 12:24 am
by Extremest
ok just came up with another problem...what if there is no space in the subject? They could use dashes or something... Just kinda curious about the whole strlen thing.

Posted: Wed May 24, 2006 12:40 am
by Flamie
explode works with any character

Code: Select all

explode(<character that separates the elements>, <string you're exploding>)
so for example
$string1 = "gggaggaggag";
$arr = explode("a", $string1);
would give you:
$arr[0] = "ggg";
$arr[1] = "gg";
$arr[2] = "gg";
$arr[3] = "g";
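
explode() only splits on one fixed delimiter at a time. If subjects may use spaces, dashes or other punctuation interchangeably, one possible approach is preg_split() with a character class. This is a sketch, not something from the thread: split_words is a made-up name, and treating every non-alphanumeric character as a separator is an assumption.

```php
<?php
// Hypothetical helper: split a subject on any run of characters that is
// not an ASCII letter or digit. PREG_SPLIT_NO_EMPTY drops the empty
// pieces that leading or trailing separators would otherwise produce.
function split_words($subject)
{
    return preg_split('/[^a-z0-9]+/i', $subject, -1, PREG_SPLIT_NO_EMPTY);
}

print_r(split_words("some-subject_with mixed.separators"));
// words: some, subject, with, mixed, separators
```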

Posted: Wed May 24, 2006 12:48 am
by Extremest
I understand that....it's just that I have a db full of posts and would like to find ones that the subject somewhat matches...only problem is I don't want to have to manually do each set...would like to try and automate it.

Posted: Wed May 24, 2006 12:53 am
by Flamie
ok here's what you do:
So you first have a set of words someone entered, right? Let's say they are in this variable:

Code: Select all

$search ="string containing a set of words";
$arr = explode(" ", $search); //delimiter first, then the string
$numofwords = count($arr);
$numofmatches = 0;
//note: the mysql_* functions were removed in PHP 7; mysqli or PDO are the current equivalents
$result = mysql_query("SELECT * FROM <tablename>");
$num = mysql_num_rows($result); //get the number of rows
for($i=0;$i<$num;$i++) //loop thru them
{
$row = mysql_fetch_object($result); //grab the rows
for($j=0;$j<$numofwords;$j++)
{
 if(strpos($row->post, $arr[$j]) !== false) //index the words with $j, not $i
          $numofmatches++;
}
//right here you just searched the current post, $numofmatches contains the number of words that the post and the search string have in common
$numofmatches = 0; //set your counter back to 0, then loop again thru the next post.
}
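
To automate the "is it a match" decision, the same loop can be wrapped in a function that works on subjects already fetched into an array, so it can be tried without a database. This is only a sketch: find_similar_subjects, the array input and the 90% default threshold are assumptions for illustration.

```php
<?php
// Hypothetical helper: return index => percent for every subject in which
// at least $threshold percent of the search words appear.
function find_similar_subjects($search, array $subjects, $threshold = 90.0)
{
    $words = preg_split('/\s+/', strtolower($search), -1, PREG_SPLIT_NO_EMPTY);
    $matches = array();
    if (count($words) === 0) {
        return $matches; // nothing to search for
    }
    foreach ($subjects as $index => $subject) {
        $found = 0;
        foreach ($words as $word) {
            // Same substring test as in the thread, made case-insensitive.
            if (stripos($subject, $word) !== false) {
                $found++;
            }
        }
        $percent = 100.0 * $found / count($words);
        if ($percent >= $threshold) {
            $matches[$index] = $percent;
        }
    }
    return $matches;
}

$subjects = array(
    "PHP string matching help needed",
    "MySQL full text index question",
    "php String matching help wanted",
);
print_r(find_similar_subjects("PHP string matching help needed", $subjects, 80.0));
// keeps index 0 (all five words found) and index 2 (four of five words found)
```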

Posted: Wed May 24, 2006 9:51 am
by feyd
Flamie, please use the syntax highlighting tags.

Extremest, you could just use similar_text() instead.
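
For reference, a short sketch of how similar_text() might be used here. The function returns the number of matching characters and, if a third argument is passed, writes the similarity as a percentage into it. The two example strings and the 90% cut-off are assumptions taken from the original question, not a recommendation.

```php
<?php
$a = "Finding how much of one string matches another";
$b = "Finding how much of one string matches";

// $common receives the count of matching characters;
// $percent is filled in by reference with the similarity percentage.
$common = similar_text($a, $b, $percent);

if ($percent >= 90) {
    echo "match\n";
} else {
    echo "no match\n";
}
```

similar_text() compares characters rather than words, so it needs no splitting on spaces or dashes, but it can be slow on long strings.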

Posted: Wed May 24, 2006 10:27 am
by Flamie
sorry i will from now on 8(

Posted: Wed May 24, 2006 4:16 pm
by Extremest
Thanks feyd, will check it out. Did not know that function existed....but figured that more people than just me could use something like that.