Page 1 of 1

Help Optimizing This php/mysql code

Posted: Wed Jul 18, 2007 1:15 am
by cobraroll98
feyd | Please use

Code: Select all

,

Code: Select all

and [syntax="..."] tags where appropriate when posting code. Your post has been edited to reflect how we'd like it posted. Please read:  [url=http://forums.devnetwork.net/viewtopic.php?t=21171]Posting Code in the Forums[/url] to learn how to do it too.[/color]


I have always considered myself a 'functional' coder in that I can build proof of concept projects but  I currently find myself (while proving a concept) to be a bit over my head.  

I have  a search engine that searches against 1million  items in a mysql 5.x database.  Currently the search is only against one indexed field 'title'.  I am trying to squeeze every ounce of speed out of the query / display of the search results (code below).  Does anyone have any pointers or sites to help me improve this code?
Thanks!!!!
-Andy

Code: Select all

<?php

//catch GET search value
if(isset($_GET['q'])) {
	$search_string = preg_replace("/[^a-zA-Z0-9\ s]/", "", $_GET['q']);
	
//connect to db
include('Connections/yop.php');
require_once('funcs.mysql.boolean.php');
$fulltext_key = 'title';

mysql_select_db($mysqldb,$db_connect);

$rowsPerPage = 10; // how many rows to show per page
$pageNum = 1; // by default we show first page

// if $_GET['page'] defined, use it as page number
if(isset($_GET['page'])) 
{
	$pageNum = $_GET['page'];
}

// counting the offset
$offset = ($pageNum - 1) * $rowsPerPage;

// the boolean functions take the search value passed through GET and cleans, and builds an 'and' 'or' search string. 
//I believe I read that there is now a way to do this with sql??
$query = "SELECT recipes.title, recipes.url, recipes.description,recipes.id,sources.name,\n"
			.boolean_sql_select(
				boolean_inclusive_atoms($search_string),
				$fulltext_key)." as relevance \n"
			."FROM recipes \n"
			."Inner Join sources ON recipes.sid = sources.id \n"
			."WHERE \n"
			.boolean_sql_where($search_string,$fulltext_key)." \n"
			."HAVING relevance>0 \n"
			."ORDER BY relevance DESC \n"	
			."LIMIT $offset, $rowsPerPage";

//a real life execution of the above query may look like this:

/*SELECT recipes.title, recipes.url, recipes.description,recipes.id,sources.name, DATEDIFF(NOW(),recipes.date_added) as howlong, match (title) against (' marinade ') as relevance FROM recipes Inner Join sources ON recipes.sid = sources.id WHERE match (title) against ('marinade')>0 HAVING relevance>0 ORDER BY relevance DESC LIMIT 0, 10
*/

$result = mysql_query($query) or die(mysql_error().$sql);
$num_rows = mysql_num_rows($result);
    if($num_rows == NULL)
		echo "No rows returned by query";
    else
		$row_searchResults = mysql_fetch_assoc($result) or die(mysql_error().$sql);




$html = "<table width='500' border='0' cellpadding='0' cellspacing='0'>";


if ($num_rows > 0) {
	do {

	$clean_name = str_replace('II','',preg_replace("/[^a-zA-Zs\ ]/", " ", $row_searchResults['title']));
	
	$html .= "<tr><td width='100%' height='21' valign='top'><span class='r_name'>";
	$html .= "<a href=".$row_searchResults['url']." class=lightwindow page-options >".$row_searchResults['title']."</a>";
	$html .= "</span><span class='r_desc'> From ".$row_searchResults['name']." <a href='".$PHP_SELF."?q=".$clean_name."'>LikeThis</a></span></td></tr>";
	
	$html .= " <tr><td height='18' valign='top' style='padding-top:5px; padding-bottom:7px';><span class='r_tags'>";
	$html .= "<a href='#'>breakfast</a> <a href='#'>chicken</a> <a href='#'>glazed</a> <a href='#'>baked</a></span></td></tr>";
	if($row_searchResults['description'] <> '') {
	$html .= "<tr><td height='18' valign='top' class='r_desc'>".$row_searchResults['description']."...</td></tr>";
	}
	$html .= "<tr><td><br></td></tr>";
  
	
	} while ($row_searchResults = mysql_fetch_assoc($result));
	$html .= "</table>";
} else {
	$html = "No results";

}

////////////////////////////////////
//HANDLE THE PAGING OF THE RESULTS
///////////////////////////////////

$query   = "SELECT COUNT(*) AS numrows FROM recipes WHERE ".boolean_sql_where($search_string,$fulltext_key);
			
//$query  = "SELECT COUNT(*) FROM recipes where title like '%salmon%' LIMIT $offset, $rowsPerPage";
$result  = mysql_query($query) or die($query);
$row     = mysql_fetch_array($result, MYSQL_ASSOC);
$numrows = $row['numrows'];
$search = "q=".$search_string."&";
   
	
	
// how many pages we have when using paging?
$maxPage = ceil($numrows/$rowsPerPage);

$self = $_SERVER['PHP_SELF'];

// creating 'previous' and 'next' link
// plus 'first page' and 'last page' link

// print 'previous' link only if we're not
// on page one
if ($pageNum > 1)
{
	$page = $pageNum - 1;
	$prev = " <a href=\"$self?".$search."page=$page\">Prev</a> ";
	
	$first = " <a href=\"$self?".$search."page=1\">First</a> ";
} 
else
{
	$prev  = '';       // we're on page one, don't enable 'previous' link
	$first = ''; // nor 'first page' link
}

// print 'next' link only if we're not
// on the last page


if ($pageNum < $maxPage)
{
	$page = $pageNum + 1;
	$next = " <a href=\"$self?".$search."page=$page\">Next</a> ";
	
	$last = " <a href=\"$self?".$search."page=$maxPage\">Last</a> ";
} 
else
{
	$next = '';      // we're on the last page, don't enable 'next' link
	$last = ''; // nor 'last page' link
}

// print the page navigation link
$html .= "<span class='t1'>".$first . $prev . " Showing page <strong>$pageNum</strong> of <strong>$maxPage</strong> pages " . $next . $last."</span>";




?>

feyd | Please use

Code: Select all

,

Code: Select all

and [syntax="..."] tags where appropriate when posting code. Your post has been edited to reflect how we'd like it posted. Please read:  [url=http://forums.devnetwork.net/viewtopic.php?t=21171]Posting Code in the Forums[/url] to learn how to do it too.[/color]

Posted: Wed Jul 18, 2007 11:49 pm
by Benjamin
The code is a bit messy, but I don't see any bad bottle necks. You might want to look into stored procedures or attempt to generate the pagination without a separate query. I don't know how big your result sets are though. Is it running slow?

Just FYI you don't need to add line feeds to your database queries.

Posted: Thu Jul 19, 2007 8:53 pm
by cobraroll98
Thanks for the input! The database is going to be over a millions records soon, currently speed is fine but I'm not sure what will happen once the public starts to pound on it. I had planned to switch to stored procs soon
in theory that should increase my speed as well as protect the query from hacks right?

Thanks Again for your great feedback!

Posted: Fri Aug 10, 2007 7:48 am
by dancaragea
Procedures might help you a bit but not much. The main query which orders by relevance will create temporary tables and there's nothing you can do about it.
Make sure you have a fulltext key on recipes.title.
Make sure your sources.id is primary key OR index but not both.
Remove the having relevance>0 part. It is always > 0 and it slows you down.

Now the count query:
Why does it have a LIMIT offset,result? It shouldn't since it is supposed to get all rows.
Why does it have a LIKE when the main query has a MATCH AGAINST?
The only advice for this one is to stay away from LIKE conditions starting with %. You could get away with a `field` LIKE 'abcd%' if you put an index on `field` but if it starts with % there's nothing you can do, it will be slow as hell.

And one more thing: save the result of the count query in session or somewhere else. This way you'll only read the count once and you will reuse it on the next pages.