Fuzzy Name Search - Could this be more efficient?

Coding Critique is the place to post source code for peer review by other members of DevNetwork. Any kind of code can be posted. Code posted does not have to be limited to PHP. All members are invited to contribute constructive criticism with the goal of improving the code. Posted code should include some background information about it and what areas you specifically would like help with.

Popular code excerpts may be moved to "Code Snippets" by the moderators.

Moderator: General Moderators

Post Reply
Parantaja
Forum Newbie
Posts: 2
Joined: Wed Jan 13, 2010 7:06 pm

Fuzzy Name Search - Could this be more efficient?

Post by Parantaja »

pickle | Please use [ code=php ], [ code=text ], etc tags where appropriate when posting code. Your post has been edited to reflect how we'd like it posted. Please read: :arrow: Posting Code in the Forums to learn how to do it too.


Let's say I have a site that houses statistics for pro athletes and I want users to be able to search for an athlete by name. I have created some code that allows for a little margin of error in spelling and it does work. Bu, I'm new to PHP (and programming in general).

Is this code fairly efficient? Or is there a lot of room for improvement?

Code: Select all

 
<?php
mysql_connect("localhost", "--REDACTED--", "--REDACTED--") or die(mysql_error());
mysql_select_db("--REDACTED--") or die(mysql_error());
mysql_query("SET NAMES 'utf8'");
 
$AllMatches = array();
$OtherMatches = array();
$ApproximateMatchDivTop=140;
$ApproximateMatchDivHeight=0;
$OtherMatchDivTop=140;
$OtherMatchDivHeight=0;
$PerfectMatchDivHeight=0;
 
include("z_menu.php");
?>
 
<div style="position:absolute; width:640px; height:30px; top:120px; left:320px">
<form method="post" action="<?php echo $_SERVER["PHP_SELF"]; ?>">
<input name="Mod" type="text" size="40" />
<input name="submit" type="submit" value="submit" />
</form>
</div>
 
 
<?php
if (isset($_POST[submit])) //*** -IF FORM SUBMITTED- ***//
{
$UserIn=mysql_real_escape_string(strip_tags($_POST["Mod"]));
 
 
//***************************  DETERMINE MATCHES  ***************************//
$ArrCount=0;
$result = mysql_query("SELECT DISTINCT `names`.`name`,`names`.`index`,`stats`.`sample`,`stats`.`source` FROM `names`,`stats` WHERE `sample`<>'mybanners/unavailable.jpg' && `stats`.`index`=`names`.`index`");
$modcount = mysql_num_rows($result);
for($count=0; $count<$modcount; $count++)
{ 
  $row = mysql_fetch_array ($result);
  similar_text(strtolower($UserIn), strtolower($row[0]), $percent);
  if ($percent>85)
  {
    if ($ArrCount==0)
    {
      $AllMatches[$ArrCount][0]=round($percent,2);
      $AllMatches[$ArrCount][1]=$row[0];
      $AllMatches[$ArrCount][2]=(int)$row[1];
      $AllMatches[$ArrCount][3]=$row[2];
      $AllMatches[$ArrCount][4]=$row[3];
      $test="unique";
      $ArrCount++;
    }
    if ($ArrCount>0)
    {
      for($subcount=0; $subcount<(count($AllMatches)); $subcount++)
      {
        if (in_array((int)$row[1],$AllMatches[$subcount]))
        {  
          $test="duplicate";
        }
      }
      if ($test=="unique")
      {
        $AllMatches[$ArrCount][0]=round($percent,2);
        $AllMatches[$ArrCount][1]=$row[0];
        $AllMatches[$ArrCount][2]=(int)$row[1];
        $AllMatches[$ArrCount][3]=$row[2];
        $AllMatches[$ArrCount][4]=$row[3];
        $ArrCount++;
      }
      else
      {
        $test="unique";
      }
    }
  }   
}
 
//***************************  SORT MATCHES  ***************************//
if (count($AllMatches)>0)
{
  $count=0;
  $PerfectMatchCount=0;
  foreach ($AllMatches as $key => $row)
  {
    $mod[$key]  = $row[1];
    $match[$key] = $row[0];
    if ($AllMatches[$count][0]==100)
    {
      $PerfectMatchCount++;
    }
    $count++;
  }
  array_multisort($match, SORT_DESC, $mod, SORT_ASC, $AllMatches);
  $ApproximateMatchCount=count($AllMatches)-$PerfectMatchCount;
}
 
//***************************  OTHER POSSIBLE MATCHES  ***************************//
$result=mysql_query("SELECT DISTINCT `names`.`name`,`names`.`index`,`stats`.`sample`,`stats`.`source` FROM `names`,`stats` WHERE `names`.`name` LIKE '".$UserIn." %' && `sample`<>'mybanners/unavailable.jpg' && `stats`.`index`=`names`.`index` ORDER BY `name` ASC ");
$OtherMatchCount=0;
while ($row=mysql_fetch_array($result))
{
  $test="unique";
  for($count=0; $count<(count($AllMatches)); $count++)
  {
    if (in_array((int)$row[1],$AllMatches[$count]))
    {  
      $test="duplicate";
    }
  }
  for($count=0; $count<(count($OtherMatches)); $count++)
  {
    if (in_array((int)$row[1],$OtherMatches[$count]))
    {  
      $test="duplicate";
    }
  }
  if ($test=="unique")
  {
    $OtherMatches[$OtherMatchCount][0]=$row[0];
    $OtherMatches[$OtherMatchCount][1]=(int)$row[1];
    $OtherMatches[$OtherMatchCount][2]=$row[2];
    $OtherMatches[$OtherMatchCount][3]=$row[3];
    $OtherMatchCount++;
  }
  else
  {
    $test="unique";
  }
}
 
//***************************  OUTPUT EXACT MATCHES  ***************************//
if ($PerfectMatchCount!=0)
{
  $PerfectMatchDivHeight=0;
  if ($PerfectMatchCount%6!=0)
  {
    $PerfectMatchDivHeight=285;
  }
  $PerfectMatchDivHeight=($PerfectMatchDivHeight+(floor($PerfectMatchCount/6))*285)+50;
  echo "<div style=\"position:absolute; width:1020px; height:".$PerfectMatchDivHeight."px; top:140px; left:118px\">";
  echo "<h3>Perfect Matches</h3>";
  $top=50;
  $left=0;
  for ($count=0; $count<$PerfectMatchCount; $count++)
  {
    echo "<div style=\"width:170px; height:275px; position:absolute; left:".$left."px; top:".$top."px; overflow:hidden;\">";
    echo $AllMatches[$count][1]."<br />".$AllMatches[$count][4];
   echo "<a href=\"http://www.site.com/z_details.php?name=".$AllMatches[$count][2]."\"><img src=\"z_thumbs/tn_".$AllMatches[$count][3]."\" width=\"100%;\" alt=\"".$AllMatches[$count][0]."\"/></a>\n";
    echo "</div>";
    $left=$left+170;
    if ((($count+1)%6==0)&&($count!=0))
    {
      $top=$top+285;
      $left=0;
    }
  }
  echo "</div>";
}
 
//***************************  OUTPUT APPROXIMATE MATCHES  ***************************//
if ($ApproximateMatchCount!=0)
{
  if ($ApproximateMatchCount%6!=0)
  {
    $ApproximateMatchDivHeight=285;
  }
  $ApproximateMatchDivHeight=($ApproximateMatchDivHeight+(floor($ApproximateMatchCount/6))*285)+50;
  if ($PerfectMatchCount!=0)
  {
    $ApproximateMatchDivTop=$PerfectMatchDivHeight+150;
  }
  echo "<div style=\"position:absolute; width:1020px; height:".$ApproximateMatchDivHeight."px; top:".$ApproximateMatchDivTop."px; left:118px;\">";
  echo "<h3>Approximate Matches</h3>";
  $left=0;
  $top=50;
  for ($count=0; $count<$ApproximateMatchCount; $count++)
  {
    echo "<div style=\"width:170px; height:275px; position:absolute; left:".$left."px; top:".$top."px; overflow:hidden;\">";
    echo $AllMatches[$count+$PerfectMatchCount][1]."<br />".$AllMatches[$count+$PerfectMatchCount][4];
    echo "<a href=\"http://www.site.com/z_details.php?name=".$AllMatches[$count+$PerfectMatchCount][2]."\"><img src=\"z_thumbs/tn_".$AllMatches[$count+$PerfectMatchCount][3]."\" width=\"100%;\" alt=\"".$AllMatches[$count+$PerfectMatchCount][0]."\"/></a>\n";
    echo "</div>";
    $left=$left+170;
    if (($count+1)%6==0)
    {
      $top=$top+285;
      $left=0;
    }
  }
  echo "</div>";
}
 
//***************************  OUTPUT 'STARTS WITH' MATCHES  ***************************//
if ($OtherMatchCount!=0)
{
  if ($OtherMatchCount%6!=0)
  {
    $OtherMatchDivHeight=285;
  }
  $OtherMatchDivHeight=($OtherMatchDivHeight+(floor($OtherMatchCount/6))*285)+50;
  if (($ApproximateMatchCount!=0) || ($PerfectMatchCount!=0))
  {
    $OtherMatchDivTop=$PerfectMatchDivHeight+190+$ApproximateMatchDivHeight;
  }
  echo "<div style=\"position:absolute; width:1020px; height:".$OtherMatchDivHeight."px; top:".$OtherMatchDivTop."px; left:118px;\">";
  echo "<h3>'Begins With' Matches</h3>";
  $left=0;
  $top=50;
  for ($count=0; $count<count($OtherMatches); $count++)
  {
    echo "<div style=\"width:170px; height:275px; position:absolute; left:".$left."px; top:".$top."px; overflow:hidden;\">";
    echo $OtherMatches[$count][0]."<br />".$OtherMatches[$count][3];
    echo "<a href=\"http://www.site.com/z_details.php?name=".$OtherMatches[$count][1]."\"><img src=\"z_thumbs/tn_".$OtherMatches[$count][2]."\" width=\"100%;\" alt=\"".$OtherMatches[0]."\"/></a>\n";
    echo "</div>";
    $left=$left+170;
    if (($count+1)%6==0)
    {
      $top=$top+285;
      $left=0;
    }
  }
  echo "</div>";
}
 
 
mysql_close;
} //*** END -IF FORM SUBMITTED- ***//
 
?>

pickle | Please use [ code=php ], [ code=text ], etc tags where appropriate when posting code. Your post has been edited to reflect how we'd like it posted. Please read: :arrow: Posting Code in the Forums to learn how to do it too.
User avatar
pickle
Briney Mod
Posts: 6445
Joined: Mon Jan 19, 2004 6:11 pm
Location: 53.01N x 112.48W
Contact:

Re: Fuzzy Name Search - Could this be more efficient?

Post by pickle »

I think there's potential to do some approximate matching right in the query itself. You may be able to use a FULL TEXT index to find approximate matches or possibly use SOUNDEX
Real programmers don't comment their code. If it was hard to write, it should be hard to understand.
Parantaja
Forum Newbie
Posts: 2
Joined: Wed Jan 13, 2010 7:06 pm

Re: Fuzzy Name Search - Could this be more efficient?

Post by Parantaja »

Apologies for not using the proper tags to post my code. I'll try not to let it happen again.

Thanks for the advice regarding soundex. I hadn't thought of that. If anyone else has any suggestions, I'd love to hear them as well.
User avatar
VladSun
DevNet Master
Posts: 4313
Joined: Wed Jun 27, 2007 9:44 am
Location: Sofia, Bulgaria

Re: Fuzzy Name Search - Could this be more efficient?

Post by VladSun »

Maybe Levenshtein or Damerau-Levenshtein distance...
There are 10 types of people in this world, those who understand binary and those who don't
Post Reply