Would like advice retrieving a large data set

Questions about the MySQL, PostgreSQL, and most other databases, as well as using it with PHP can be asked here.

Moderator: General Moderators

Post Reply
User avatar
Kadanis
Forum Contributor
Posts: 180
Joined: Tue Jun 20, 2006 8:55 am
Location: Dorset, UK
Contact:

Would like advice retrieving a large data set

Post by Kadanis »

Hi All

Basically I put this in here cos its mainly MySQL relate, but there is elements of PHP as well.

For Starters the system is currently running MySQL 4.1.20 and we won't be upgrading to 5+ until the new server goes in. PHP Version is 5.0.4.

Ok, in the MySQL server are 2 databases. 1 contains numerous (and potentially massive i.e. 1million records +) tables. The other contains general information tables, 1 of which at the moment stands at approximately 600'000 records called tracking. The CSV file needs to be a cross reference between the tracking table and any 1 of the other tables.

I've been using JOIN statments to get the data sets, and then attempting to write the whole thing to CSV, but I'm getting timeouts right left and center.

Anyone have any suggestions on how I could improve this so that in the few cases where data gets this big, we can still download it.

The CSV doesn't have to be created in real-time so using something trigger from CRON etc could also work

Thanks in advance to anyone who makes a suggestion ;)
Paw
Forum Newbie
Posts: 20
Joined: Tue Jul 17, 2007 10:27 am

Post by Paw »

Since there's hardly a way to improve speed on such a massive operation, you probably need to change PHP's timeout limit, either locally in your download script (recommended) or globally in PHP.ini (not so good).

See http://www.php.net/manual/en/function.s ... -limit.php for more information.
User avatar
califdon
Jack of Zircons
Posts: 4484
Joined: Thu Nov 09, 2006 8:30 pm
Location: California, USA

Post by califdon »

Good reference, Paw, but note in that reference:
Note: The set_time_limit() function and the configuration directive max_execution_time only affect the execution time of the script itself. Any time spent on activity that happens outside the execution of the script such as system calls using system(), stream operations, database queries, etc. is not included when determining the maximum time that the script has been running.
Post Reply