Page 1 of 1

Newbie Issue With Super-Slow Load of Text file into MySQL

Posted: Mon Nov 03, 2008 2:46 pm
by mb2442
I'm attempting to automate the loading of an 80,000 record (31 fields) tab and space delimited syslog into MySQL using PHP. It works, but takes 40 minutes to run, so I'm obviously missing something basic and critical. Any suggestions on what I'm missing here is greatly appreciated.... (the array_pad is used because the latter half of the record, the space-delimited part, is frequently missing.)

$handle= @fopen($filename, "r");
if ($handle) {
while (!feof($handle)) {
$line = fgets($handle, 4096);
$myArray = preg_split('/[\s\t]+/', $line);
$dataArray = array_pad($myArray, 30, "NULL");
mysql_query('INSERT INTO `databasename`.`tablename` (`auto_id`, `field1`, `field2`, `field3`) VALUES (NULL, \'' . $dataArray[0] . '\', \'' . $dataArray[1] . '\', \'' . $dataArray[2] . '\')');
}
fclose($handle);
}

Re: Newbie Issue With Super-Slow Load of Text file into MySQL

Posted: Mon Nov 03, 2008 4:18 pm
by Syntac
Might be a little faster...

Code: Select all

$text = file_get_contents( $file );
$lines = explode( "\n", $text );
 
foreach( $lines as $line )
{
    // do stuff
} 
...not sure, though.

Re: Newbie Issue With Super-Slow Load of Text file into MySQL

Posted: Mon Nov 03, 2008 4:33 pm
by infolock
Just a question, was the idea to use both tabs and spaces an delimiters your idea? if so, change it to csv and it will greatly reduce the speed/headache of the crunching you're doing.

Also, try commenting out the MySQL insert queries and see how much that speeds it up. Might be issues with your schema.

Re: Newbie Issue With Super-Slow Load of Text file into MySQL

Posted: Wed Nov 05, 2008 8:05 am
by mb2442
Thanks very much for the suggestions, I certainly will look into them.

I'm loading a webfilter syslog dumped into a text file by someone else, and so I have no control over the text file's format. My understanding is that this webfiltering appliance cannot produce the syslog in any other format. The structure arises from the webfilter using tab delimiters, then the final field is a large amount of space delimited data which is the actual activity report.