Can this be optimized?

PHP programming forum. Ask questions or help people concerning PHP code. Don't understand a function? Need help implementing a class? Don't understand a class? Here is where to ask. Remember to do your homework!

Moderator: General Moderators

Citizen
Forum Contributor
Posts: 300
Joined: Wed Jul 20, 2005 10:23 am

Can this be optimized?

Post by Citizen »

I've got this bit of code that I wrote a long time ago that finds the total number of games the users have the highscore for...

Code: Select all

// highest scores this month
$sql = "SELECT `gameid`,`isreverse` FROM `arcade_games`";
$result = mysql_query($sql);
    
$arcade_champs = array();
$checktime = time() - (86400 * 30);
    
while($row = mysql_fetch_assoc($result)){
    $gameid = $row['gameid'];
    $isreverse = $row['isreverse'];
    if($isreverse == 1)
        $sql2 = "SELECT `username` FROM `arcade_highscores` WHERE `gamename` = '$gameid' AND `time` > '$checktime' ORDER BY `score` ASC LIMIT 1";
    else
        $sql2 = "SELECT `username` FROM `arcade_highscores` WHERE `gamename` = '$gameid' AND `time` > '$checktime' ORDER BY `score` DESC LIMIT 1";
    $result2 = mysql_query($sql2);
    if(mysql_num_rows($result2) == 1){
        $row2 = mysql_fetch_assoc($result2);
        $user = $row2['username'];
        if(isset($arcade_champs[$user]))
            $arcade_champs[$user] += 1;
        else
            $arcade_champs[$user] = 1;
    }
}
But with a few hundred games it starts using a ton of queries. Does anyone have any optimization tips here?
User avatar
Eran
DevNet Master
Posts: 3549
Joined: Fri Jan 18, 2008 12:36 am
Location: Israel, ME

Re: Can this be optimized?

Post by Eran »

Use a join to fetch all the games with their highscores in one query.
http://dev.mysql.com/doc/refman/5.0/en/join.html
Citizen
Forum Contributor
Posts: 300
Joined: Wed Jul 20, 2005 10:23 am

Re: Can this be optimized?

Post by Citizen »

I did try that before. Wouldn't subquerying be adding the same number of requests and load?

Edit: I remember the other problem I ran into, I have to check to see if isreverse is 1 or 0 before I know how whether I want the highest or lowest score for the particular game.
User avatar
Eran
DevNet Master
Posts: 3549
Joined: Fri Jan 18, 2008 12:36 am
Location: Israel, ME

Re: Can this be optimized?

Post by Eran »

I said join, not subquery. Have a look at the link I gave. It is certainly doable with a simple join - and regardless it would still be one query instead of several hundreds
User avatar
Weirdan
Moderator
Posts: 5978
Joined: Mon Nov 03, 2003 6:13 pm
Location: Odessa, Ukraine

Re: Can this be optimized?

Post by Weirdan »

pytrin wrote:I said join, not subquery. Have a look at the link I gave.
Pytrin, the original poster wanted to get champions, not highest scores. It's almost impossible to get in one query (without subqueries) with his table structure, unless you're doing some black string magic in sql (which I would not recommend).
User avatar
Eran
DevNet Master
Posts: 3549
Joined: Fri Jan 18, 2008 12:36 am
Location: Israel, ME

Re: Can this be optimized?

Post by Eran »

Yes you are correct, didn't notice that bit about the champions. Still would be just one query (join + subquery) instead of hundreds
wpsd2006
Forum Commoner
Posts: 66
Joined: Wed Jan 07, 2009 12:43 am

Re: Can this be optimized?

Post by wpsd2006 »

How about "view" ? will it help ?
User avatar
Eran
DevNet Master
Posts: 3549
Joined: Fri Jan 18, 2008 12:36 am
Location: Israel, ME

Re: Can this be optimized?

Post by Eran »

This query is not as simple as I first thought because of the different aggregations. So here is my suggested solution:

Code: Select all

SELECT ah.`username` , COUNT( ah.username ) AS champion
FROM (
    SELECT MAX( score ) AS maxscore, MIN( score ) AS minscore, gamename
    FROM `arcade_highscores`
    GROUP BY gamename
) AS m
INNER JOIN `arcade_games` AS ag ON `ag`.`gameid` = `m`.`gamename`
INNER JOIN `arcade_highscores` AS ah ON ( 
    m.maxscore = ah.score
    AND m.gamename = ah.gamename
    AND ag.isreverse =0 
) OR (
    m.minscore = ah.score
    AND m.gamename = ah.gamename
    AND ag.isreverse =1
)
GROUP BY ah.username
This query looks somewhat complicated, but can actually run pretty fast if indexed correctly and the right column types are used (something tells me that the gamename column is not an integer..)
Last edited by Eran on Sat Feb 21, 2009 9:10 pm, edited 1 time in total.
User avatar
Benjamin
Site Administrator
Posts: 6935
Joined: Sun May 19, 2002 10:24 pm

Re: Can this be optimized?

Post by Benjamin »

Nice :)
Citizen
Forum Contributor
Posts: 300
Joined: Wed Jul 20, 2005 10:23 am

Re: Can this be optimized?

Post by Citizen »

pytrin wrote:This query is not as simple as I first thought because of the different aggregations. So here is my suggested solution:

Code: Select all

SELECT ah.`username` , COUNT( ah.username ) AS champion
FROM (
    SELECT MAX( score ) AS maxscore, MIN( score ) AS minscore, gamename
    FROM `arcade_highscores`
    GROUP BY gamename
) AS m
INNER JOIN `arcade_games` AS ag ON `ag`.`gameid` = `m`.`gamename`
INNER JOIN `arcade_highscores` AS ah ON ( 
    m.maxscore = ah.score
    AND m.gamename = ah.gamename
    AND ag.isreverse =0 
) OR (
    m.minscore = ah.score
    AND m.gamename = ah.gamename
    AND ag.isreverse =1
)
GROUP BY ah.username
This query looks somewhat complicated, but can actually run pretty fast if indexed correctly and the right column types are used (something tells me that the gamename column is not an integer..)
Thanks! I will certainly test that out. Does anyone by chance know of a way for me to natively check performance difference between the two? (with either php or mysql)
User avatar
Eran
DevNet Master
Posts: 3549
Joined: Fri Jan 18, 2008 12:36 am
Location: Israel, ME

Re: Can this be optimized?

Post by Eran »

If you run a query in either the mysql command line or in phpMyAdmin you'll get the time it took to execute the query. You can also set up a simple benchmark yourself using microtime() before and after the query.
Citizen
Forum Contributor
Posts: 300
Joined: Wed Jul 20, 2005 10:23 am

Re: Can this be optimized?

Post by Citizen »

Old query way: 10.6981248856 seconds

New query way: 4.7131280899 seconds

50% optimization is good :)

One more question, I've tried a few variations to fix the one last problem. In the event of a tie (the top score is say, 50, for a particular game and two users have the same highscore) the winner is determined by which one was submitted first, the column is `time` in the arcade_highscores table. I tried adding an ORDER BY clause to the inner query with no luck.
User avatar
Eran
DevNet Master
Posts: 3549
Joined: Fri Jan 18, 2008 12:36 am
Location: Israel, ME

Re: Can this be optimized?

Post by Eran »

You can probably still optimize this query to run in sub miliseconds range with the proper indexes. Run EXPLAIN on the query and post the results here.

Regarding your other requirement - this might require another nested subquery, at which point I think it might be better to do it in the application code - by removing the COUNT and GROUP BY and manually deciding which is the champion (you still get all the data in one query though). Maybe Vlad has a better idea..
Citizen
Forum Contributor
Posts: 300
Joined: Wed Jul 20, 2005 10:23 am

Re: Can this be optimized?

Post by Citizen »

A few problems:

Code: Select all

SELECT ah.gamename, ah.score
    FROM arcade_highscores ah
    JOIN (SELECT gamename, 
                 case 
                   when ag.isreverse = 1 then min(score) 
                   else max(score)
                 end as m_score
         FROM arcade_highscores mh JOIN arcade_games ag ON mh.gamename = ag.gameid
         GROUP BY gamename) high 
      ON high.m_score = ah.score AND high.gamename = ah.gamename
1. This does not return the username column, which I added to the first select but I'm pretty sure its just returning a random username
2. This query returns 1000+ records when there are only 751 games

Here's the result of EPLAIN
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 704
1 PRIMARY ah ALL NULL NULL NULL NULL 31295 Using where
2 DERIVED mh ALL NULL NULL NULL NULL 31295 Using temporary; Using filesort
2 DERIVED ag eq_ref PRIMARY PRIMARY 4 gamerbio_gbdb.mh.gamename 1 Using where
If it helps, I do have complete control over the db and can add indexes, etc if needed.
Citizen
Forum Contributor
Posts: 300
Joined: Wed Jul 20, 2005 10:23 am

Re: Can this be optimized?

Post by Citizen »

Actually, I just noticed that pyrtin changed the query...

Here's the results for:

Code: Select all

SELECT ah.`username` , COUNT( ah.username ) AS champion
        FROM (
            SELECT MAX( score ) AS maxscore, MIN( score ) AS minscore, gamename
            FROM `arcade_highscores`
            GROUP BY gamename
        ) AS m
        INNER JOIN `arcade_games` AS ag ON `ag`.`gameid` = `m`.`gamename`
        INNER JOIN `arcade_highscores` AS ah ON (
            m.maxscore = ah.score
            AND m.gamename = ah.gamename
            AND ag.isreverse =0
        ) OR (
            m.minscore = ah.score
            AND m.gamename = ah.gamename
            AND ag.isreverse =1
        )
        GROUP BY ah.username
Execution time: 9.6353828907

Explain:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 707 Using temporary; Using filesort
1 PRIMARY ag eq_ref PRIMARY PRIMARY 4 m.gamename 1 Using where
1 PRIMARY ah ALL NULL NULL NULL NULL 31295 Using where
2 DERIVED arcade_highscores ALL NULL NULL NULL NULL 31295 Using temporary; Using filesort
Returns a the username and the amount of highscores the user has, but when I add them all up, it comes to 1086 even though there are only 751 games.
Post Reply