Page 1 of 2
Can this be optimized?
Posted: Sat Feb 21, 2009 5:38 pm
by Citizen
I've got this bit of code that I wrote a long time ago that finds the total number of games the users have the highscore for...
Code: Select all
// highest scores this month
$sql = "SELECT `gameid`,`isreverse` FROM `arcade_games`";
$result = mysql_query($sql);
$arcade_champs = array();
$checktime = time() - (86400 * 30);
while($row = mysql_fetch_assoc($result)){
$gameid = $row['gameid'];
$isreverse = $row['isreverse'];
if($isreverse == 1)
$sql2 = "SELECT `username` FROM `arcade_highscores` WHERE `gamename` = '$gameid' AND `time` > '$checktime' ORDER BY `score` ASC LIMIT 1";
else
$sql2 = "SELECT `username` FROM `arcade_highscores` WHERE `gamename` = '$gameid' AND `time` > '$checktime' ORDER BY `score` DESC LIMIT 1";
$result2 = mysql_query($sql2);
if(mysql_num_rows($result2) == 1){
$row2 = mysql_fetch_assoc($result2);
$user = $row2['username'];
if(isset($arcade_champs[$user]))
$arcade_champs[$user] += 1;
else
$arcade_champs[$user] = 1;
}
}
But with a few hundred games it starts using a ton of queries. Does anyone have any optimization tips here?
Re: Can this be optimized?
Posted: Sat Feb 21, 2009 6:15 pm
by Eran
Use a join to fetch all the games with their highscores in one query.
http://dev.mysql.com/doc/refman/5.0/en/join.html
Re: Can this be optimized?
Posted: Sat Feb 21, 2009 6:17 pm
by Citizen
I did try that before. Wouldn't subquerying be adding the same number of requests and load?
Edit: I remember the other problem I ran into, I have to check to see if isreverse is 1 or 0 before I know how whether I want the highest or lowest score for the particular game.
Re: Can this be optimized?
Posted: Sat Feb 21, 2009 7:13 pm
by Eran
I said join, not subquery. Have a look at the link I gave. It is certainly doable with a simple join - and regardless it would still be one query instead of several hundreds
Re: Can this be optimized?
Posted: Sat Feb 21, 2009 7:34 pm
by Weirdan
pytrin wrote:I said join, not subquery. Have a look at the link I gave.
Pytrin, the original poster wanted to get champions, not highest scores. It's almost impossible to get in one query (without subqueries) with his table structure, unless you're doing some black string magic in sql (which I would not recommend).
Re: Can this be optimized?
Posted: Sat Feb 21, 2009 8:04 pm
by Eran
Yes you are correct, didn't notice that bit about the champions. Still would be just one query (join + subquery) instead of hundreds
Re: Can this be optimized?
Posted: Sat Feb 21, 2009 8:21 pm
by wpsd2006
How about "view" ? will it help ?
Re: Can this be optimized?
Posted: Sat Feb 21, 2009 8:43 pm
by Eran
This query is not as simple as I first thought because of the different aggregations. So here is my suggested solution:
Code: Select all
SELECT ah.`username` , COUNT( ah.username ) AS champion
FROM (
SELECT MAX( score ) AS maxscore, MIN( score ) AS minscore, gamename
FROM `arcade_highscores`
GROUP BY gamename
) AS m
INNER JOIN `arcade_games` AS ag ON `ag`.`gameid` = `m`.`gamename`
INNER JOIN `arcade_highscores` AS ah ON (
m.maxscore = ah.score
AND m.gamename = ah.gamename
AND ag.isreverse =0
) OR (
m.minscore = ah.score
AND m.gamename = ah.gamename
AND ag.isreverse =1
)
GROUP BY ah.username
This query looks somewhat complicated, but can actually run pretty fast if indexed correctly and the right column types are used (something tells me that the gamename column is not an integer..)
Re: Can this be optimized?
Posted: Sat Feb 21, 2009 9:06 pm
by Benjamin
Nice

Re: Can this be optimized?
Posted: Sun Feb 22, 2009 9:15 am
by Citizen
pytrin wrote:This query is not as simple as I first thought because of the different aggregations. So here is my suggested solution:
Code: Select all
SELECT ah.`username` , COUNT( ah.username ) AS champion
FROM (
SELECT MAX( score ) AS maxscore, MIN( score ) AS minscore, gamename
FROM `arcade_highscores`
GROUP BY gamename
) AS m
INNER JOIN `arcade_games` AS ag ON `ag`.`gameid` = `m`.`gamename`
INNER JOIN `arcade_highscores` AS ah ON (
m.maxscore = ah.score
AND m.gamename = ah.gamename
AND ag.isreverse =0
) OR (
m.minscore = ah.score
AND m.gamename = ah.gamename
AND ag.isreverse =1
)
GROUP BY ah.username
This query looks somewhat complicated, but can actually run pretty fast if indexed correctly and the right column types are used (something tells me that the gamename column is not an integer..)
Thanks! I will certainly test that out. Does anyone by chance know of a way for me to natively check performance difference between the two? (with either php or mysql)
Re: Can this be optimized?
Posted: Sun Feb 22, 2009 9:26 am
by Eran
If you run a query in either the mysql command line or in phpMyAdmin you'll get the time it took to execute the query. You can also set up a simple benchmark yourself using microtime() before and after the query.
Re: Can this be optimized?
Posted: Sun Feb 22, 2009 9:54 am
by Citizen
Old query way: 10.6981248856 seconds
New query way: 4.7131280899 seconds
50% optimization is good
One more question, I've tried a few variations to fix the one last problem. In the event of a tie (the top score is say, 50, for a particular game and two users have the same highscore) the winner is determined by which one was submitted first, the column is `time` in the arcade_highscores table. I tried adding an ORDER BY clause to the inner query with no luck.
Re: Can this be optimized?
Posted: Sun Feb 22, 2009 10:41 am
by Eran
You can probably still optimize this query to run in sub miliseconds range with the proper indexes. Run EXPLAIN on the query and post the results here.
Regarding your other requirement - this might require another nested subquery, at which point I think it might be better to do it in the application code - by removing the COUNT and GROUP BY and manually deciding which is the champion (you still get all the data in one query though). Maybe Vlad has a better idea..
Re: Can this be optimized?
Posted: Sun Feb 22, 2009 11:04 am
by Citizen
A few problems:
Code: Select all
SELECT ah.gamename, ah.score
FROM arcade_highscores ah
JOIN (SELECT gamename,
case
when ag.isreverse = 1 then min(score)
else max(score)
end as m_score
FROM arcade_highscores mh JOIN arcade_games ag ON mh.gamename = ag.gameid
GROUP BY gamename) high
ON high.m_score = ah.score AND high.gamename = ah.gamename
1. This does not return the username column, which I added to the first select but I'm pretty sure its just returning a random username
2. This query returns 1000+ records when there are only 751 games
Here's the result of EPLAIN
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 704
1 PRIMARY ah ALL NULL NULL NULL NULL 31295 Using where
2 DERIVED mh ALL NULL NULL NULL NULL 31295 Using temporary; Using filesort
2 DERIVED ag eq_ref PRIMARY PRIMARY 4 gamerbio_gbdb.mh.gamename 1 Using where
If it helps, I do have complete control over the db and can add indexes, etc if needed.
Re: Can this be optimized?
Posted: Sun Feb 22, 2009 11:21 am
by Citizen
Actually, I just noticed that pyrtin changed the query...
Here's the results for:
Code: Select all
SELECT ah.`username` , COUNT( ah.username ) AS champion
FROM (
SELECT MAX( score ) AS maxscore, MIN( score ) AS minscore, gamename
FROM `arcade_highscores`
GROUP BY gamename
) AS m
INNER JOIN `arcade_games` AS ag ON `ag`.`gameid` = `m`.`gamename`
INNER JOIN `arcade_highscores` AS ah ON (
m.maxscore = ah.score
AND m.gamename = ah.gamename
AND ag.isreverse =0
) OR (
m.minscore = ah.score
AND m.gamename = ah.gamename
AND ag.isreverse =1
)
GROUP BY ah.username
Execution time: 9.6353828907
Explain:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 707 Using temporary; Using filesort
1 PRIMARY ag eq_ref PRIMARY PRIMARY 4 m.gamename 1 Using where
1 PRIMARY ah ALL NULL NULL NULL NULL 31295 Using where
2 DERIVED arcade_highscores ALL NULL NULL NULL NULL 31295 Using temporary; Using filesort
Returns a the username and the amount of highscores the user has, but when I add them all up, it comes to 1086 even though there are only 751 games.