Page 1 of 1

Large Data Set + MySQL + PHP = SLOWWWWW. Help!

Posted: Tue Mar 27, 2012 3:41 pm
by amateurhour
I just hacked together down and dirty query tool to generate a single composite report of data held from three tables in a MySQL database.

There are 500,000+ records.

Being an amateur programmer, simple mysql_query routines are not cutting it from a performance standpoint. Haven't spent the time to re-code using mysql_buffered_query. Conceivably I could map all records into a single table, but I'm not sure that would help.

Using a single form to capture SearchBy and SearchString data. Posting it to a PHP page to process and report on screen.

The output is super slow. Even boosting the max_execution and max_input environment variables in PHP.ini doesn't resolve (I'm patient and can wait).

functions.inc.php
--------------------------------------------------------------

Code: Select all


<?

/////////////////////////////////////
// VARIABLES
/////////////////////////////////////

$server   	= "localhost";
$username 	= "root";
$password 	= "";
$database 	= "5500_data";
$link     	= mysql_connect($server,$username,$password);
$db_selected	= mysql_select_db($database,$link);

/////////////////////////////////////


function buildquery($searchby,$searchstring)
{

global $server, $username, $password, $database, $link, $db_selected;


	if ($searchby == "Plan")
	{
		$query	= "SELECT ACK_ID FROM form5500 WHERE PLAN_NAME LIKE '%$searchstring%'";
		$result	= mysql_query($query,$link) or die(mysql_error());
	}
	else if ($searchby == "Auditor")
	{
		$query	= "SELECT ACK_ID FROM scheduleh WHERE ACCOUNTANT_FIRM_NAME LIKE '%$searchstring%'";
		$result	= mysql_query($query,$link) or die(mysql_error());
	}
	else if ($searchby == "Platform")
	{
		$query	= "SELECT ACK_ID FROM schedulec1 WHERE PROVIDER_ELIGIBLE_NAME LIKE '%$searchstring%'";
		$result	= mysql_query($query,$link) or die(mysql_error());
	}
	else if ($searchby == "ZIP")
	{
		$query	= "SELECT ACK_ID FROM form5500 WHERE SPONS_DFE_MAIL_US_ZIP = '$searchstring'";
		$result	= mysql_query($query,$link) or die(mysql_error());
	}
	else if ($searchby = "AssetsAbove")
	{
		$query	= "SELECT ACK_ID FROM scheduleh WHERE TOT_ASSETS_EOY_AMT >= '$searchstring'";
		$result	= mysql_query($query,$link) or die(mysql_error());
	}
	else if ($searchby = "AssetsBelow")
	{
		$query	= "SELECT ACK_ID FROM scheduleh WHERE TOT_ASSETS_EOY_AMT <= '$searchstring'";
		$result	= mysql_query($query,$link) or die(mysql_error());
	}
	else if ($searchby = "EIN_Plan")
	{
		$query	= "SELECT ACK_ID FROM form5500 WHERE SPONS_DFE_EIN = '$searchstring'";
		$result	= mysql_query($query,$link) or die(mysql_error());
	}
	else if ($searchby = "EIN_Auditor")
	{
		$query	= "SELECT ACK_ID FROM scheduleh WHERE ACCOUNTANT_FIRM_EIN = '$searchstring'";
		$result	= mysql_query($query,$link) or die(mysql_error());
	}
	else if ($searchby = "EIN_Platform")
	{
		$query	= "SELECT ACK_ID FROM schedulec1 WHERE PROVIDER_ELIGIBLE_EIN = '$searchstring'";
		$result	= mysql_query($query,$link) or die(mysql_error());
	}

return $result;

}



function print_5500data($ackid)
{

global $server, $username, $password, $database, $link, $db_selected;

	$query	= "SELECT * FROM form5500 WHERE ACK_ID = '$ackid' ";
	$result	= mysql_query($query,$link) or die(mysql_error());

	$form5500data = "";

	while($row = mysql_fetch_array($result))
	{
		$form5500data =		"<tr><td>" . $row["PLAN_NAME"] 			. "</td>" .
					"<td>" . $row["SPONS_DFE_MAIL_US_ADDRESS1"] 	. "</td>" . 
					"<td>" . $row["SPONS_DFE_MAIL_US_ADDRESS2"] 	. "</td>" .
					"<td>" . $row["SPONS_DFE_MAIL_US_CITY"]		. "</td>" . 
					"<td>" . $row["SPONS_DFE_MAIL_US_STATE"]	. "</td>" .
					"<td>" . $row["SPONS_DFE_MAIL_US_ZIP"]		. "</td>" .
					"<td>" . $row["SPONS_SIGNED_NAME"]		. "</td>" .
					"<td>" . $row["DFE_SIGNED_NAME"]		. "</td>" .
					"<td>" . $row["TOT_PARTCP_BOY_CNT"]		. "</td>";
	}

echo $form5500data;

}


function print_schedulehdata($ackid)
{

global $server, $username, $password, $database, $link, $db_selected;

	$query	= "SELECT * FROM scheduleh WHERE ACK_ID = '$ackid' ";
	$result	= mysql_query($query,$link) or die(mysql_error());

	$schedulehdata = "";

	while($row = mysql_fetch_array($result))
	{
		$schedulehdata =	"<td>" . $row["TOT_ASSETS_EOY_AMT"]		. "</td>" .
					"<td>" . $row["TOT_CONTRIB_AMT"] 		. "</td>" . 
					"<td>" . $row["ACCOUNTANT_FIRM_NAME"]		. "</td>";
	}

echo $schedulehdata;

}


function print_schedulec1data($ackid)
{

global $server, $username, $password, $database, $link, $db_selected;

	$query	= "SELECT * FROM schedulec1 WHERE ACK_ID = '$ackid' ";
	$result	= mysql_query($query,$link) or die(mysql_error());

	$schedulec1data = "";

	while($row = mysql_fetch_array($result))
	{
		$schedulec1data =	"<td>" . $row["PROVIDER_ELIGIBLE_NAME"]		. "</td></tr>";
	}

echo $schedulec1data;

}

?>

planlist.php
--------------------------------------------------------------

Code: Select all


<html>
<head>
<title>DOL/EBSA Form 5500 Data Query Engine - Plan List</title>

</head>

<body bgcolor=#d4d0c8>

<h3>DOL/EBSA Form 5500 Data Query Engine</h3>

<hr size=0 noshade>

<?

	include("functions.inc.php"); 

?>

<hr size=0 noshade>


	<table border=0 cellpadding=0 cellpadding=0 width=100%>
	<tr>
		<td valign=top align=left>
	
			<table border=0 cellpadding=0 cellspacing=0 width=100%>
			<tr>
				<td valign=top align=middle>

					<table border=0 cellpadding=0 cellspacing=1 width=100%>
					<tr>
						<td align=left><b>Plan Name</b></td>
						<td align=left><b>Address1</b></td>
						<td align=left><b>Address2</b></td>
						<td align=left><b>City</b></td>
						<td align=left><b>State</b></td>
						<td align=left><b>ZIP</b></td>
						<td align=left><b>Signer1</b></td>
						<td align=left><b>Signer2</b></td>
						<td align=left><b># Participants</b></td>
						<td align=left><b>EOY Assets</b></td>
						<td align=left><b>Flows</b></td>
						<td align=left><b>Auditor</b></td>
						<td align=left><b>Platform</b></td>
					</tr>
					<tr>
						<td colspan=13><hr size=0 noshade></td>
					</tr>


<?

	$result = buildquery($_POST["SearchBy"],$_POST["SearchString"]);


	while($row = mysql_fetch_array($result))
	{
		$ackid = $row['ACK_ID'];

		print_5500data($ackid);
		print_schedulehdata($ackid);
		print_schedulec1data($ackid);
	}

?>



					</table>
				</td>
			</tr>
			</table>
		</td>
	</tr>
	</table>


<hr size=0 noshade>

<br clear=all><br clear=all>

<font size=2>Source:  DOL/EBSA 2010</font>

<br><br>

<input type="button" value="Back to Data" onclick="javascript:history.go(-1)">
&nbsp;&nbsp;
<input type="button" value="Print" onclick="javascript:window.print()"><br clear=all><br>


</body>
</html>


Re: Large Data Set + MySQL + PHP = SLOWWWWW. Help!

Posted: Tue Mar 27, 2012 6:08 pm
by Celauran
Are you sure the bottleneck is PHP? Have you run your queries manually?

Re: Large Data Set + MySQL + PHP = SLOWWWWW. Help!

Posted: Wed Mar 28, 2012 9:17 am
by mikosiko
there are several considerations to have in mind here,

starting for the fact that none of this queries will use an index, therefore that will be your first bottleneck depending on the table's size

Code: Select all

$query  = "SELECT ACK_ID FROM form5500 WHERE PLAN_NAME LIKE '%$searchstring%'";
$query  = "SELECT ACK_ID FROM scheduleh WHERE ACCOUNTANT_FIRM_NAME LIKE '%$searchstring%'";
$query  = "SELECT ACK_ID FROM schedulec1 WHERE PROVIDER_ELIGIBLE_NAME LIKE '%$searchstring%'";
then, in this queries, the usage of an index or not will depend obviously first if the index exists, and then if the optimizer can use it without making any type conversion in case the columns are dissimilar.

Code: Select all

$query  = "SELECT ACK_ID FROM scheduleh WHERE TOT_ASSETS_EOY_AMT <= '$searchstring'";
$query  = "SELECT ACK_ID FROM form5500 WHERE SPONS_DFE_EIN = '$searchstring'";
$query  = "SELECT ACK_ID FROM scheduleh WHERE ACCOUNTANT_FIRM_EIN = '$searchstring'";
$query  = "SELECT ACK_ID FROM schedulec1 WHERE PROVIDER_ELIGIBLE_EIN = '$searchstring'";
Reference for both situations could be found here http://dev.mysql.com/doc/refman/5.0/en/ ... dexes.html

So, as Celauran suggested your first step is evaluate the performance of those queries alone... use EXPLAIN to do that.

Next, your code evidently is doing queries inside loops (bad idea in general); functions print_5500data(), print_schedulehdata() and print_schedulecldata() are called probably 100's of time from planlist.php. Because we don't know exactly your goals we can't offer more adjusted advice, however seems to me that you need to redesign completely your code and use JOINS in your SELECTS to eliminate hitting the DB multiples times in loops.

and lastly, please.. drop all those global variables ... if you need to use variables in the function's scope then pass those values as arguments for the function.