Page 1 of 1

Optimizing random row selection.

Posted: Mon Feb 15, 2010 4:29 pm
by tymlls05
The query's below often cause this error: Fatal error: Maximum execution time of 30 seconds exceeded in /home/autoEvent.php on line 228

This is the code causing the error:

Code: Select all

 
//Select RANDOM EVENT (AGENT) TO ASSIGN TO DAY
$offset_result = mysql_query( " SELECT FLOOR(RAND() * COUNT(*)) AS `offset` FROM `event_categories_events` WHERE event_category_id = '2'  ") or die(mysql_error());
$offset_row = mysql_fetch_object( $offset_result );
$offset = $offset_row->offset;
$floor_events = mysql_query( " SELECT * FROM `event_categories_events` WHERE event_category_id = '2' LIMIT $offset, 1 " ); 
 
//Grab Random Event from All Events  (Agents) Assigned To The Floor Schedule Category
while($grabFloorEvent = mysql_fetch_array($floor_events)){
    //For every event id we grab, grab the event (agent) details.
    $agent=$grabFloorEvent['event_id'];
    while(in_array($agent, $_SESSION['used'])) {
        while($grabFloorEvent = mysql_fetch_array($floor_events)){
            $agent=$grabFloorEvent['event_id'];
            break;
        }
    }
What can I do to keep this loop from taking so many times. Can I do a select from where EVENT_ID NOT in array?


EDIT: I am trying this to with no success: mysql_query( " SELECT * FROM `event_categories_events` WHERE event_category_id = '2' && event_id NOT IN ('".implode(',', $chosen)."') LIMIT $offset, 1 " );

Code: Select all

 
$chosen=$_SESSION['used'];
$floor_events = mysql_query( " SELECT * FROM `event_categories_events` WHERE event_category_id = '2' && event_id NOT IN ('".implode(',', $chosen)."') LIMIT $offset, 1 " ); 
                   

Re: Optimizing random row selection.

Posted: Mon Feb 15, 2010 6:07 pm
by Darhazer
At first look:
Select count(*) in the first query, than use the PHP's rand() function to calculate the offset.
Oh, you are trying to get a random row?
Usually it's performed just by:

Code: Select all

SELECT * FROM `event_categories_events` WHERE event_category_id = '2' ORDER BY RAND() LIMIT 0, 1

Re: Optimizing random row selection.

Posted: Mon Feb 15, 2010 6:16 pm
by Eran
Your NOT IN approach would have worked if you hadn't included the extra pair of single quotes inside the brackets - you either quote each member in the list or you don't use quotes at all, and assuming event_id is an integer I would go for the latter.

Re: Optimizing random row selection.

Posted: Mon Feb 15, 2010 10:46 pm
by requinix
Darhazer wrote:At first look:
Select count(*) in the first query, than use the PHP's rand() function to calculate the offset.
Oh, you are trying to get a random row?
Usually it's performed just by:

Code: Select all

SELECT * FROM `event_categories_events` WHERE event_category_id = '2' ORDER BY RAND() LIMIT 0, 1
The problem with that is if the table is really large, the query can take some time. It has to go through each row, assign a random number, then resort the table on that number.

The best solution to date is what OP has.

Re: Optimizing random row selection.

Posted: Tue Feb 16, 2010 3:41 am
by Eran
http://www.mysqlperformanceblog.com/200 ... imization/
You can read in the comments for more solutions

Re: Optimizing random row selection.

Posted: Wed Feb 17, 2010 4:01 pm
by tymlls05
Looking at the website you provided, I decided that MySQL really shouldn't be doing all the random calculations anyway. I don't have to worry much about scalability with this script so the solution I found was to not leave it up to the database to do all the calculations of selecting a random row.

My Solution was to create an array that contained the row ID's and let php do the processing of random selection.

Code: Select all

 
//Find all rows and combine them to an array to be sorted through later
     $floor_events=mysql_query("SELECT event_id FROM event_categories_events WHERE event_category_id=2"); 
     $events_to_use=array();
     while($grab_events_to_use = mysql_fetch_row($floor_events)){
          //Add Each Event To $events_to_use
           $events_to_use[]=$grab_events_to_use[0];
      }
      print_r($events_to_use);
 
Implementation:

Code: Select all

 
//Grab Random ROW ID from All Rows previously combined as an array in $floor_events
        $agent=array_rand($events_to_use, 1);
 
I went from server script processing time-outs to split second processing doing this. Thanks for all of the recommendations!!

Re: Optimizing random row selection.

Posted: Wed Feb 17, 2010 6:21 pm
by requinix
No offense but that's a bad solution. The script is buffering everything in that table matching that condition... and then only using one row of it. It's using a lot more memory than it needs to. And the kicker: it's slow. Relatively speaking, of course.

The method you had before is better: figure out how many items are available, pick a random number, then tell MySQL to fetch that result. If your script was having timeout issues then this process isn't at fault and something else is causing problems.

oooh the little things in life... err programming.

Posted: Thu Feb 18, 2010 10:56 am
by tymlls05
Wow. If I had a penny for every time something small was the root of all the evil I'd be rich given the fact that they usually occur in infinite loops.

The following code was used to select only what had not been used.

Code: Select all

                $agent=array_rand($events_to_use, 1);
                if ($_SESSION['used']){
                    while(in_array($agent, $_SESSION['used'])) {
                        $agent=array_rand($events_to_use, 1);
                    }
                }
Every *total number of row* times I had to unset $_session['used'] since it was full. I wasn't keeping count properly so by the end of the script the while loop was looking for what had not been used, when all of the rows had been.

The reason I couldn't figure out exactly what was going wrong in the loop was (disregarding the fact I spent too much time messing with this yesterday) the function I used to keep count of what had been used was working fine the first two times it was called upon and it also it was out of site, out of mind. it wasn't being directly being demanded by the line reported by error.