Query takes longer and longer...

Questions about the MySQL, PostgreSQL, and most other databases, as well as using it with PHP can be asked here.

Moderator: General Moderators

Post Reply
User avatar
Todd_Z
Forum Regular
Posts: 708
Joined: Thu Nov 25, 2004 9:53 pm
Location: U Michigan

Query takes longer and longer...

Post by Todd_Z »

For some reason the following query is fast when t_relation is empty, then when t_relation starts to get populated, it gets slower and slower and slower. For example, when the size of the database is 2500, the query will take 3 minutes. When the size is 5000, it takes over 7 minutes. Is there a way to optimize this?

The reason for the limit 2500 is that I want each iteration of the script to work very quickly, independent of the size of the database. When I first run this script to calculate 5000 address against 120 t-stops [t=Trolley], I want it to take a couple iterations.

The script is on a crontab running every 5 minutes, most times there will be nothing to do, but if an address is added, then obviously, it would calculate it.

Code: Select all

SELECT address.id AS aid, t_station.id AS tid FROM address, t_station WHERE address.status = 'Active' AND 
t_station.status = 'Active' AND ( SELECT COUNT(*) FROM t_relation WHERE t_relation.status = 'Active' AND address_id = address.id AND t_station_id = t_station.id AND MD5(CONCAT_WS('-',address.latitude,address.longitude)) = md5_address AND MD5(CONCAT_WS('-',t_station.latitude,t_station.longitude)) = md5_station ) = 0 LIMIT 2500

Code: Select all

CREATE TABLE `t_relation` (
  `t_station_id` int(2) unsigned default NULL,
  `md5_station` varchar(32) default NULL,
  `address_id` int(36) default NULL,
  `md5_address` varchar(32) default NULL,
  `miles` double default NULL,
  `timestamp` datetime default NULL,
  `status` enum('Active','Disabled') NOT NULL,
  `id` int(16) unsigned NOT NULL auto_increment,
  PRIMARY KEY  (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

Code: Select all

CREATE TABLE `address` (
  `street_number` varchar(8) default NULL,
  `street_name` varchar(64) default NULL,
  `street_type` varchar(8) default NULL,
  `city` varchar(32) default NULL,
  `state` char(2) default NULL,
  `zip` decimal(5,0) unsigned zerofill default NULL,
  `longitude` double default NULL,
  `latitude` double default NULL,
  `exact` enum('Yes','No') NOT NULL default 'Yes',
  `timestamp` datetime default NULL,
  `status` enum('Active','Disabled') NOT NULL,
  `id` int(36) unsigned NOT NULL auto_increment,
  PRIMARY KEY  (`id`),
  UNIQUE KEY `street_number` (`street_number`,`street_name`,`street_type`,`city`,`state`,`zip`),
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

Code: Select all

CREATE TABLE `t_station` (
  `title` varchar(32) default NULL,
  `line` varchar(8) default NULL,
  `longitude` double default NULL,
  `latitude` double default NULL,
  `timestamp` datetime default NULL,
  `status` enum('Active','Disabled') NOT NULL,
  `id` int(2) unsigned NOT NULL auto_increment,
  PRIMARY KEY  (`id`),
  UNIQUE KEY `title` (`title`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
User avatar
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

Much better looking now. :) (Highlighted)

hmm, well subqueries aren't the fastest things I've noticed, nor is running large numbers of MD5().
User avatar
RobertGonzalez
Site Administrator
Posts: 14293
Joined: Tue Sep 09, 2003 6:04 pm
Location: Fremont, CA, USA

Post by RobertGonzalez »

Are the queries slow, or is it the PHP script (or are you weven usign a PHP script)?
User avatar
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

Forgot to mention this earlier: Something to look into is an EXPLAIN on the query to see what columns are being used and such to see if you can add indexes or change up the query to be more specific by ruling out data earlier rather than later.
User avatar
Todd_Z
Forum Regular
Posts: 708
Joined: Thu Nov 25, 2004 9:53 pm
Location: U Michigan

Post by Todd_Z »

There is a php calculation being done after I pull the data, but that's not the bottle neck, it's just the query.

I'm gonna do some research using EXPLAIN and report back.

In terms of replacing md5, got any suggestions? Chances are all the lats will be between 40 and 45, and longs will be -70 to -72. All I need to do is make sure that the values haven't changed, thats the only purpose of the md5.
User avatar
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

Todd_Z wrote:In terms of replacing md5, got any suggestions? Chances are all the lats will be between 40 and 45, and longs will be -70 to -72. All I need to do is make sure that the values haven't changed, thats the only purpose of the md5.
Straight string compare? You're mashing the fields together as is, so maybe a straight string compare would be faster/easier for MySQL to perform.
User avatar
Todd_Z
Forum Regular
Posts: 708
Joined: Thu Nov 25, 2004 9:53 pm
Location: U Michigan

Post by Todd_Z »

Fair enough.
User avatar
Todd_Z
Forum Regular
Posts: 708
Joined: Thu Nov 25, 2004 9:53 pm
Location: U Michigan

Post by Todd_Z »

The result of my explain on the select query:

Code: Select all

+----+--------------------+-----------+------+---------------+------+---------+------+------+-----------------------------------------------------+
| id | select_type        | table     | type | possible_keys | key  | key_len | ref  | rows | Extra |
+----+--------------------+-----------+------+---------------+------+---------+------+------+-----------------------------------------------------+
|  1 | PRIMARY            | t_station | ALL  | NULL          | NULL | NULL    | NULL |  112 | Using where |
|  1 | PRIMARY            | address   | ALL  | NULL          | NULL | NULL    | NULL | 5734 | Using where |
|  2 | DEPENDENT SUBQUERY | NULL      | NULL | NULL          | NULL | NULL    | NULL | NULL | Impossible WHERE noticed after reading const tables |
+----+--------------------+-----------+------+---------------+------+---------+------+------+-----------------------------------------------------+
User avatar
RobertGonzalez
Site Administrator
Posts: 14293
Joined: Tue Sep 09, 2003 6:04 pm
Location: Fremont, CA, USA

Post by RobertGonzalez »

IMPOSSIBLE does not sound good.
User avatar
AKA Panama Jack
Forum Regular
Posts: 878
Joined: Mon Nov 14, 2005 4:21 pm

Post by AKA Panama Jack »

Also, you haven't indexed any of the fields you use in your query. Indexing will speed up your query quite a bit.

As it stands now you are doing MULTIPLE full table scans because of the lack of indexing. This is why it takes so long and gets longer as more records are added.

And you might want to change your query to something like this...

Code: Select all

SELECT address.id AS aid, t_station.id AS tid FROM address, t_station WHERE address.status = 'Active' AND 
t_station.status = 'Active' AND ( SELECT COUNT(*) FROM t_relation WHERE t_relation.status = 'Active' AND t_relation.address_id = address.id AND t_relation.t_station_id = t_station.id AND MD5(CONCAT_WS('-',address.latitude,address.longitude)) = md5_address AND MD5(CONCAT_WS('-',t_station.latitude,t_station.longitude)) = t_relation.md5_station ) = 0 LIMIT 2500
Post Reply