All queries very fast, except one

Questions about the MySQL, PostgreSQL, and most other databases, as well as using it with PHP can be asked here.

Moderator: General Moderators

pschmehl
Forum Newbie
Posts: 14
Joined: Fri Oct 19, 2012 8:57 pm

All queries very fast, except one

Post by pschmehl »

I'm designing an app that does nothing but selects from a pre-populated db. (FreeBSD 8.3, mysql 5.5.22, php 5.4.7). I'm constructing queries based on user selected data from a form.

The basic architecture is an index page, a result page and several includes - view_tables.php which constructs the display, search_form.php which takes in the user selected data using POST and search_functions.php which constructs the queries based on the user selected data. View_tables is an include on the results page and calls the displayTables() function from search_functions.

Here's the db schema:

Code: Select all

CREATE TABLE `argData` (
	`ts` TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
	`seq` INT (12) UNSIGNED NOT NULL,
	`stime` CHAR (18) NOT NULL,
	`saddr` VARCHAR (15),
	`sport` SMALLINT (5) UNSIGNED NOT NULL,
	`daddr` VARCHAR (15),
	`dport` SMALLINT (5) UNSIGNED NOT NULL,
	`pkts` MEDIUMINT (8) UNSIGNED NOT NULL,
	`bytes` BIGINT (20) UNSIGNED NOT NULL,
	`state` TINYTEXT ASCII NOT NULL,
	`proto` VARCHAR (15),
	`record` BLOB,
	PRIMARY KEY (`seq`,`stime`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
Running queries directly against the db is very fast. For example:

Code: Select all

SELECT * FROM argData WHERE daddr='98.139.112.123' AND saddr BETWEEN '10.0.0.0' AND '10.255.255.255' AND stime BETWEEN 1349740800.000000 AND 1350950400.000000; - 0.09 seconds

Code: Select all

SELECT * FROM argData WHERE daddr='98.139.112.123' AND saddr BETWEEN '10.0.0.0' AND '10.255.255.255' AND stime like '13497%'; - 0.09 seconds
Running this query takes four times as much time, but still very fast.

Code: Select all

SELECT * FROM argData WHERE stime like '13497%'; - 0.35 seconds
The problem is, all of these queries display in the browser almost instantly except the last one. It takes 95 seconds to complete.

I've run EXPLAIN on the queries. All three are identical:

Code: Select all

mysql> EXPLAIN SELECT * FROM argData WHERE saddr='10.21.22.2' and stime like '13497%';
+----+-------------+---------+------+---------------+------+---------+------+-------+-------------+
| id | select_type | table   | type | possible_keys | key  | key_len | ref  | rows  | Extra       |
+----+-------------+---------+------+---------------+------+---------+------+-------+-------------+
|  1 | SIMPLE      | argData | ALL  | NULL          | NULL | NULL    | NULL | 91226 | Using where |
+----+-------------+---------+------+---------------+------+---------+------+-------+-------------+
1 row in set (0.00 sec)
I've looked at the query log to see if the query is being altered somehow:

Code: Select all

121023 19:27:09	   20 Query	SELECT * FROM argData WHERE stime like '13497%'
		   21 Query	SELECT * FROM argData WHERE stime like '13497%'
121023 19:41:35	   20 Query	EXPLAIN SELECT * FROM argData WHERE stime like '13497%'
121023 19:42:38	   20 Query	EXPLAIN EXTENDED SELECT * FROM argData WHERE stime like '13497%'
121023 19:46:24	   24 Query	SELECT * FROM argData WHERE daddr='98.139.112.123' AND saddr BETWEEN '10.0.0.0' AND '10.255.255.255' AND stime like '13497%'
		   26 Query	SELECT * FROM argData WHERE daddr='98.139.112.123' AND saddr BETWEEN '10.0.0.0' AND '10.255.255.255' AND stime like '13497%'
121023 19:54:54	   24 Query	explain select stime from argData where stime like '13497%'
121023 20:19:37	   24 Query	EXPLAIN SELECT * FROM argData WHERE stime like '13497%'
121023 20:20:45	   24 Query	EXPLAIN SELECT * FROM argData WHERE saddr='10.21.22.2' and stime like '13497%'
121023 20:21:34	   24 Query	EXPLAIN SELECT * FROM argData WHERE daddr='98.139.112.123' AND saddr BETWEEN '10.0.0.0' AND '10.255.255.255' AND stime like '13497%'
121023 20:22:09	   24 Query	EXPLAIN EXTENDED SELECT * FROM argData WHERE daddr='98.139.112.123' AND saddr BETWEEN '10.0.0.0' AND '10.255.255.255' AND stime like '13497%'
121023 20:23:25	   29 Query	SELECT * FROM argData WHERE stime like '13497%'
121023 20:24:51	   29 Query	SELECT * FROM argData WHERE daddr='98.139.112.123' AND saddr BETWEEN '10.0.0.0' AND '10.255.255.255' AND stime like '13497%'
121023 20:39:51	   32 Query	EXPLAIN SELECT * FROM argData WHERE saddr='10.21.22.2' and stime like '13497%'
The stime column is an index and a primary key. This problem doesn't appear to be with the db. It seems to be something in my code, but I can't figure out what. Here's how I'm constructing the queries. The three key var names are $whereStr, $rangeStr and $timeClause. $whereStr is a concatenated string of the saddr, sport, daddr and dport columns like this saddr='x.x.x.x' and daddr='x.x.x.x'. The $rangeStr is a range of network addresses (e.g. BETWEEN 10.0.0.0 and 10.255.255.255.) The $timeClause is a range of dates in unixtime format.

Code: Select all

# Dislay a table of the results of the query
                        if ($timeClause && $rangeStr && $whereStr) {
                                echo "SELECT * FROM $dbtable WHERE $whereStr AND $rangeStr AND $timeClause";
                                $query = mysql_query("SELECT * FROM {$dbtable} WHERE {$whereStr} AND {$rangeStr} AND {$timeClause}");
                        }elseif ($timeClause && $whereStr) {
                                echo "SELECT * FROM $dbtable WHERE $wherestr AND $timeClause";
                                $query = mysql_query("SELECT * FROM {$dbtable} WHERE {$whereStr} AND {$timeClause}");
                        }elseif ($rangeStr && $whereStr) {
                                echo "SELECT * FROM $dbtable WHERE $whereStr AND $rangeStr";
                                $query = mysql_query("SELECT * FROM {$dbtable} WHERE {$whereStr} AND {$rangeStr}");
                        }elseif ($imeClause && $rangeStr) {
                                echo "SELECT * FROM $dbtable WHERE $rangeStr AND $timeClause";
                                $query = mysql_query("SELECT * FROM {$dbtable} WHERE {$rangeStr} AND {$timeClause}");
                        }elseif ($rangeStr) {
                                echo "SELECT * FROM $dbtable WHERE $rangeStr";
                                $query = mysql_query("SELECT * FROM {$dbtable} WHERE {$rangeStr}");
                        }elseif ($timeClause) {
                                echo "SELECT * FROM $dbtable WHERE $timeClause";
                                $query = mysql_query("SELECT * FROM {$dbtable} WHERE {$timeClause}");
                        }else{
                                echo "SELECT * FROM {$dbtable} WHERE {$whereStr}";
                                $query = mysql_query("SELECT * FROM {$dbtable} WHERE {$whereStr}");
                        }
                        if (!$query) {
                                die("Query {$query} failed");
                        }
                        return $query;
I'm not sure where to go from here. Is there a performance measurement tool that can analyze what's going on under the sheets and tell me where the delay is being introduced?
User avatar
requinix
Spammer :|
Posts: 6617
Joined: Wed Oct 15, 2008 2:35 am
Location: WA, USA

Re: All queries very fast, except one

Post by requinix »

You don't have any indexes on the table. You must not have that much data in it for queries to take .09s.
You have a bunch of fields in there: pick a handful that you think will be queried frequently (besides seq and seq+stime which the primary key already covers) and stick an INDEX on them. Taking a guess I'd pick the time and two addresses, and maybe the two ports.

Oh, and the stime shouldn't be a CHAR. It's numeric data so it should go in a numeric field.
And are the addresses, proto, and record data all actually nullable? I wouldn't have expected so.

But I don't see anything wrong with that code snippet, besides the fact that it could be reduced down quite a bit. How about the rest of it?
prodigytech
Forum Newbie
Posts: 2
Joined: Tue Oct 23, 2012 5:35 pm

Re: All queries very fast, except one

Post by prodigytech »

As the prior poster mentioned you need to add an index on any contents where you perform "like" type searches. I would add a full text index if you want to be able to perform searches using partial queries.
User avatar
Weirdan
Moderator
Posts: 5978
Joined: Mon Nov 03, 2003 6:13 pm
Location: Odessa, Ukraine

Re: All queries very fast, except one

Post by Weirdan »

Oh, and the stime shouldn't be a CHAR.
Addresses shouldn't be either. That's what inet_aton(), inet_ntoa() functions are there for.
pschmehl
Forum Newbie
Posts: 14
Joined: Fri Oct 19, 2012 8:57 pm

Re: All queries very fast, except one

Post by pschmehl »

Thanks for all the responses.

The stime field consists of 16 numbers and a period. (E.g. 1234567890.123456.) Can that be an int type even with the period?

I indexed saddr and daddr for test purposes. I was running a short dump of about 90,000 rows, so searching was fast. (Easy to test the code that way.) Indexing dropped some queries from 0.09 secs to 0.00 secs, so it's definitely worthwhile. It didn't however, change the speed of this problematic query.

The problem is that I have no control over the program that's inserting the data into the database. So INET_ATON() isn't available to me unless I patch the code and rebuild the program(which I can certainly do.)

What is "a full text index"? I wasn't aware that there were different types of indexes.

After getting the code working I dumped the table and loaded a day's worth of data. The table is 9.1GB and has over 120,000,000 rows.

Code: Select all

mysql> select count(seq) from argData_2012_10_02;
+------------+
| count(seq) |
+------------+
|  120135349 |
+------------+
1 row in set (1 min 18.17 sec)
Here's the schema. (Previously I was creating the table for test purposes, but the program creates it automatically - one table per day.)
mysql> describe argData_2012_10_02;
+-------+-----------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+-----------------------+------+-----+---------+-------+
| seq | int(10) unsigned | YES | | NULL | |
| stime | double(18,6) unsigned | NO | | NULL | |
| saddr | varchar(64) | NO | | NULL | |
| sport | varchar(10) | NO | | NULL | |
| daddr | varchar(64) | NO | | NULL | |
| dport | varchar(10) | NO | | NULL | |
| pkts | bigint(20) | YES | | NULL | |
| bytes | bigint(20) | YES | | NULL | |
| state | varchar(32) | YES | | NULL | |
| proto | varchar(16) | NO | | NULL | |
+-------+-----------------------+------+-----+---------+-------+
10 rows in set (0.01 sec)
As you can see, nothing is indexed, so I'm going to have to write a script that fixes that after the table is created.

Here's a sample of the data.

Code: Select all

mysql> select * from argData_2012_10_02 limit 10;
+-----------+-------------------+----------------+-------+---------------+-------+------+-------+-------+-------+
| seq       | stime             | saddr          | sport | daddr         | dport | pkts | bytes | state | proto |
+-----------+-------------------+----------------+-------+---------------+-------+------+-------+-------+-------+
| 105807958 | 1349121595.971430 | 10.174.37.162  | 63460 | 68.71.220.62  | 443   |    2 |   120 | A_A   | tcp   |
| 105807959 | 1349121595.971592 | 10.21.16.47    | 53600 | 64.208.241.65 | 80    |   75 | 68566 | PA_PA | tcp   |
| 105807960 | 1349121595.971961 | 129.110.31.40  | 39605 | 217.70.185.0  | 53    |    1 |    87 | INT   | udp   |
| 105807961 | 1349121595.972484 | 74.125.225.134 | 80    | 10.21.1.146   | 61692 |    1 |    60 | _RA   | tcp   |
| 105807962 | 1349121595.973084 | 10.21.16.178   | 53433 | 74.125.225.99 | 80    |    5 |  2519 | PA_PA | tcp   |
| 105807963 | 1349121595.974383 | 10.190.104.195 | 59254 | 66.94.240.25  | 80    |    2 |   120 | A_FA  | tcp   |
| 105807964 | 1349121595.974762 | 10.190.101.245 | 62995 | 75.102.13.140 | 80    |   89 | 48592 | A_PA  | tcp   |
| 105807965 | 1349121595.974790 | 10.110.143.140 | 54014 | 64.4.44.48    | 1863  |    3 |   182 | PA_PA | tcp   |
| 105807966 | 1349121595.974874 | 10.21.21.50    | 15326 | 121.9.201.100 | 17788 |    2 |   241 | INT   | udp   |
| 105807967 | 1349121595.975586 | 10.170.40.197  | 49993 | 70.20.201.40  | 18879 |    8 |   668 | A_PA  | tcp   |
+-----------+-------------------+----------------+-------+---------------+-------+------+-------+-------+-------+
10 rows in set (0.40 sec)
pschmehl
Forum Newbie
Posts: 14
Joined: Fri Oct 19, 2012 8:57 pm

Re: All queries very fast, except one

Post by pschmehl »

Indexing makes a very obvious difference.

Code: Select all

mysql> SELECT * FROM argData_2012_10_02 WHERE saddr='10.21.21.50';

49617 rows in set (1 min 20.41 sec)

mysql> CREATE INDEX saddr_index ON argData_2012_10_02 (saddr) USING BTREE;
Query OK, 120135349 rows affected (17 min 57.77 sec)
Records: 120135349  Duplicates: 0  Warnings: 0

49617 rows in set (0.85 sec)
I'm not sure dramatic is an apt description. Maybe supercalifragilistically awesomely dramatic.

I'm reading up on fulltext indexing and partitioning now. I'm not too excited about doing joins across multple 120 million column tables....

EDIT: So I altered the tables to use fulltext indexing. The query was actually slower, but not by much.

Code: Select all

mysql> alter table argData_2012_10_02 add fulltext(saddr,sport,daddr,dport);
Query OK, 120135349 rows affected (32 min 11.82 sec)
Records: 120135349  Duplicates: 0  Warnings: 0

49617 rows in set (0.91 sec)
pschmehl
Forum Newbie
Posts: 14
Joined: Fri Oct 19, 2012 8:57 pm

Re: All queries very fast, except one

Post by pschmehl »

More testing. I wrapped a query in a microtime statement, like this;
$time=microtime(true);
$query = do some stuff;
$time=microtime(true)-$time;
echo "Query took ".$time." seconds."

The delay is apparently in php or apache or the network (unlikely), because the query time for one quick query I ran was .49 seconds, but the browser was tied up for 39 seconds.

I then wrapped the table that renders the results in a similar time statement. Here's the results:
Query took 0.46583485603333 seconds.
Total time for rendering the table was 53.010699987411 seconds.

Running the query directly in the db results in this:
49617 rows in set (0.43 sec).

So there appears to be a 3 second latency between running the query and the end of its run. But the rendering of the results took more than 10 times as long as the query.

It appears the problem might be in my code. Here it is:
# cat includes/view_tables.php

Code: Select all

<?php
include_once('includes/search_functions.php');
#echo var_dump($_POST);
	$queryStart=microtime(true);
	$query=displayTable();
	$fields_num = mysql_num_fields($query);

	echo "<h2>Table: {$dbtable}</h2>";
	echo "<table>";
	echo "<tr style=\"font-weight:bold\">";
	// printing table headers
	for($i=0; $i<$fields_num; $i++)
	{
		$field = mysql_fetch_field($query);
		switch ($field->name) {
			case "seq" :
				echo "<td>Argus Sequence Number</td>";
				continue 2;
			case "stime" :
				echo "<td>Packet Start Time</td>";
				continue 2;
			case "saddr" :
				echo "<td>Source IP</td>";
				continue 2;
			case "sport" :
				echo "<td>Src Port</td>";
				continue 2;
			case "daddr" :
				echo "<td>Destination IP</td>";
				continue 2;
			case "dport" :
				echo "<td>Dest Port</td>";
				continue 2;
			case "pkts" :
				echo "<td>Total Packets</td>";
				continue 2;
			case "bytes" :
				echo "<td>Total Bytes</td>";
				continue 2;
			case "state" :
				echo "<td>Packet Flags</td>";
				continue 2;
			case "proto" :
				echo "<td>Protocol</td>";
				continue 2;
		}
		echo "<td></td>";
	}
	echo "</tr>\n";
	// printing table rows
	while($row = mysql_fetch_row($query))
	{
		echo "<tr>";

	// $row is array... foreach( .. ) puts every element
	// of $row to $cell variable
	foreach($row as $key => $cell) {
		if($key==1) {
		$HRDate=date("n/j/Y h:m:s",$cell);
                echo "<td>$HRDate</td>";
            }else{
		echo "<td>$cell</td>";
	    }
	}
	echo "</tr>\n";
	}
	echo "</table>\n";
	$queryStart=microtime(true)-$queryStart;
	echo "<br>Total time for rendering the table was ".$queryStart." seconds<br>";
	mysql_free_query($query);
Is there a faster way to loop through the code and render the results?

Or could this be related to php memory settings? Cache settings?
User avatar
requinix
Spammer :|
Posts: 6617
Joined: Wed Oct 15, 2008 2:35 am
Location: WA, USA

Re: All queries very fast, except one

Post by requinix »

I don't see anything particularly nasty in there. How long does displayTable() by itself take? How about the part that prints field names, or the mysql_fetch_row() loop?
pschmehl
Forum Newbie
Posts: 14
Joined: Fri Oct 19, 2012 8:57 pm

Re: All queries very fast, except one

Post by pschmehl »

requinix wrote:I don't see anything particularly nasty in there. How long does displayTable() by itself take? How about the part that prints field names, or the mysql_fetch_row() loop?
Query took 0.46391606330872 seconds
Elapsed time before fields_num is 0.4641900062561 seconds.
Elasped time from start of header display is 0.064227104187012 seconds.
Total time for rendering the rows was 32.84171795845 seconds
The elapsed time for the entire page was 33.370176076889 seconds

So almost all the time is consumed rendering the table rows. Is there a better way to do that?
pschmehl
Forum Newbie
Posts: 14
Joined: Fri Oct 19, 2012 8:57 pm

Re: All queries very fast, except one

Post by pschmehl »

I commented out the while and foreach loops and did this:

Code: Select all

while($row = mysql_fetch_row($query)) {
                echo var_dump($row);
        }
It still took 22 seconds to return.
Adding a break between lines added an additional 3 seconds. That's just 20 seconds less than displaying the entire table.

Code: Select all

while($row = mysql_fetch_row($query)) {
                echo var_dump($row);
                echo "<br>";
        }
I guess there's a price to pay for rendering that I can't get around.

This makes me wonder how mysql can display the result in a table so much faster. Is there a way to simply display the mysql rendered table in a <pre> tag? Looks matter, but performance matters more.
User avatar
requinix
Spammer :|
Posts: 6617
Joined: Wed Oct 15, 2008 2:35 am
Location: WA, USA

Re: All queries very fast, except one

Post by requinix »

If I read your posts right, you're rendering some ~50,000 rows of content?

For kicks, what if you use an unbuffered query?
pschmehl
Forum Newbie
Posts: 14
Joined: Fri Oct 19, 2012 8:57 pm

Re: All queries very fast, except one

Post by pschmehl »

requinix wrote:If I read your posts right, you're rendering some ~50,000 rows of content?
Yes, and searching a 120 million + row table.
requinix wrote:For kicks, what if you use an unbuffered query?
I'll try that tomorrow. I'm also going to try this: PHP/MySQL Tutorial - Learn how to display MySQL table data. Given what I read there, if I understand it correctly, the query results are already in the array $query within 46 seconds. The rest of the time is spent processing the rows and displaying them.

Anyway, tomorrow's another day.
User avatar
requinix
Spammer :|
Posts: 6617
Joined: Wed Oct 15, 2008 2:35 am
Location: WA, USA

Re: All queries very fast, except one

Post by requinix »

That tutorial won't help this problem. Besides that, please don't follow it: deprecated HTML 4 tags, invalid PHP syntax, SQL injection, the error suppression operator, "or die()"-style error messages, and it uses the impractical mysql_result().

Do you really need to show all 50K records on one page? That's seems entirely unreasonable.
pschmehl
Forum Newbie
Posts: 14
Joined: Fri Oct 19, 2012 8:57 pm

Re: All queries very fast, except one

Post by pschmehl »

requinix wrote:That tutorial won't help this problem. Besides that, please don't follow it: deprecated HTML 4 tags, invalid PHP syntax, SQL injection, the error suppression operator, "or die()"-style error messages, and it uses the impractical mysql_result().
I was referring strictly to using the while()for i=0, etc. loop. I thought it might possibly be faster than the while() foreach() loop. It seemed worth testing at least.
requinix wrote:Do you really need to show all 50K records on one page? That's seems entirely unreasonable.
I hadn't thought of splitting it up into multiple pages. I would have no clue how to do that. I do need to make all the results available to the viewer. These records will be viewed by security analysts (that's what I am), and we need to see all the results, because you never know where the important information might be in all those records.

So all the results need to be available, but they wouldn't have to be on the same page. Is there a way to paginate during the foreach loop? I guess I could use $num to determine how many records to display on a page, but then how would I send the following rows to a different page?

This project reveals why I don't like programming. Every time you think you're done, another issue comes up that needs to be addressed. And many times that means essentially scrapping your code and starting over (which I've done several times already with this one.)
User avatar
requinix
Spammer :|
Posts: 6617
Joined: Wed Oct 15, 2008 2:35 am
Location: WA, USA

Re: All queries very fast, except one

Post by requinix »

The magic search term is "pagination".

You actually do it in the query. For MySQL,

Code: Select all

SELECT ... LIMIT 0, 100
will start at offset 0 and pick 100 records. LIMIT 100,100 would be the next page of results.

Code: Select all

offset = (page number - 1) * number per page
Determining the number of pages means knowing how many total records there are. Essentially repeat the query but only with a COUNT() and no LIMIT.

Code: Select all

SELECT COUNT(1) FROM table...
There are other methods but my anecdotal evidence says a COUNT() is the best.

The loop remains the same. The only thing left to deal with is the navigation:
* Link for the first and previous pages (if they're not on the first page)
* Links for pages around where they are now (don't forget about the edges at 1 and the number of pages). You've got flexibility for which pages to show but +/- 2 or 3 is typical
* Link to the next and last pages (if they're not on the last page)
Post Reply