Loading really large files

PHP programming forum. Ask questions or help people concerning PHP code. Don't understand a function? Need help implementing a class? Don't understand a class? Here is where to ask. Remember to do your homework!

Moderator: General Moderators

User avatar
Luke
The Ninja Space Mod
Posts: 6424
Joined: Fri Aug 05, 2005 1:53 pm
Location: Paradise, CA

Loading really large files

Post by Luke »

I have a csv file that is over 5MB and I need to parse it and convert it to an array so I can check it against a database. PHP's file_get_contents() reads it just fine, but when I go to explode() it, it chokes and says Allowed memory size of 8388608 bytes exhausted. How should I go about this? I've never worked with a file this large before.
User avatar
volka
DevNet Evangelist
Posts: 8391
Joined: Tue May 07, 2002 9:48 am
Location: Berlin, ger

Post by volka »

maybe fgetcsv?
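Something along these lines — fgetcsv() reads one row per call, so only the current line is ever held in memory. (The sample file written here is just a stand-in for the real 5 MB CSV.)

```php
<?php
// Minimal sketch of the fgetcsv() approach: write a tiny sample CSV,
// then read it back one row at a time instead of loading it whole.
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "SKU100,5\nSKU200,12\n");

$rows = 0;
$fh = fopen($path, 'r');
while (($data = fgetcsv($fh)) !== FALSE) {
    // $data is a numerically indexed array of this line's fields;
    // process it here instead of building one giant array
    $rows++;
}
fclose($fh);
unlink($path);
echo $rows; // prints 2
```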
User avatar
Kieran Huggins
DevNet Master
Posts: 3635
Joined: Wed Dec 06, 2006 4:14 pm
Location: Toronto, Canada
Contact:

Post by Kieran Huggins »

You could increase the memory limit in php.ini

If that's not an option and you have PHP 5.1.0 or greater installed, you can use the optional arguments (offset & maxlen):
http://www.php.net/manual/en/function.file-get-contents.php wrote:file_get_contents ( string filename [, bool use_include_path [, resource context [, int offset [, int maxlen]]]] )
to load the file in segments.
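A rough sketch of that segmented read (the chunk size and the generated sample file are placeholders for the real data):

```php
<?php
// Read a file in segments using the offset/maxlen arguments of
// file_get_contents() (PHP 5.1.0+), so the whole file never has to
// fit in memory at once.
$path = tempnam(sys_get_temp_dir(), 'big');
file_put_contents($path, str_repeat('x', 10000));

$chunkSize = 4096;
$offset    = 0;
$total     = 0;
$size      = filesize($path);
while ($offset < $size) {
    $chunk = file_get_contents($path, false, null, $offset, $chunkSize);
    // process $chunk here instead of holding the whole file
    $total  += strlen($chunk);
    $offset += strlen($chunk);
}
unlink($path);
echo $total; // prints 10000
```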
User avatar
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

I'll agree with volka here, fgetcsv().
User avatar
Kieran Huggins
DevNet Master
Posts: 3635
Joined: Wed Dec 06, 2006 4:14 pm
Location: Toronto, Canada
Contact:

Post by Kieran Huggins »

fgetcsv()
8O Cool!
User avatar
Luke
The Ninja Space Mod
Posts: 6424
Joined: Fri Aug 05, 2005 1:53 pm
Location: Paradise, CA

Post by Luke »

that works perfectly... wow what a cool function! thanks.
User avatar
s.dot
Tranquility In Moderation
Posts: 5001
Joined: Sun Feb 06, 2005 7:18 pm
Location: Indiana

Post by s.dot »

I've successfully used fgetcsv() to import a postal code database into mysql. The file was about 600 MB. 8O
Set Search Time - A google chrome extension. When you search only results from the past year (or set time period) are displayed. Helps tremendously when using new technologies to avoid outdated results.
User avatar
Kieran Huggins
DevNet Master
Posts: 3635
Joined: Wed Dec 06, 2006 4:14 pm
Location: Toronto, Canada
Contact:

Post by Kieran Huggins »

:cry: I manually cut and uploaded a 50 MB CSV in 3MB chunks once. That's time I'll never get back!
User avatar
Luke
The Ninja Space Mod
Posts: 6424
Joined: Fri Aug 05, 2005 1:53 pm
Location: Paradise, CA

Post by Luke »

OK, let me explain why I need this functionality and see if anybody has any better ideas. The fgetcsv function works great, except that now I am trying to load in two other files (not quite as large, but still big). These three files are exports from vendor databases. I need to take each csv file, add up the total amount in stock for each sku (which is a field in each csv file), and then update a mysql database with the results. Anybody know what the best way to do this would be? Just about everything I've tried has resulted in:
Fatal error: Allowed memory size of 8388608 bytes exhausted (tried to allocate xxx bytes)
I'm on a shared server, so my options are limited. :(
User avatar
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

Keep dumping them into the database, query afterward?

How many products are we talking here? It may be possible to keep an aggregate array (or variable) for just this total around.
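That aggregate idea in sketch form: keep one running total per SKU rather than every row, so memory scales with the number of distinct SKUs, not the file size. (The sample rows here stand in for the vendor exports.)

```php
<?php
// Aggregate-array sketch: sum the in-stock quantity per SKU while
// streaming the CSV with fgetcsv(), never holding more than one row
// plus the totals array in memory.
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "SKU100,5\nSKU200,12\nSKU100,3\n");

$totals = array(); // sku => summed in-stock count
$fh = fopen($path, 'r');
while (($data = fgetcsv($fh)) !== FALSE) {
    $sku = $data[0];
    $qty = (integer) $data[1];
    $totals[$sku] = (isset($totals[$sku]) ? $totals[$sku] : 0) + $qty;
}
fclose($fh);
unlink($path);
echo $totals['SKU100']; // prints 8
```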
User avatar
Luke
The Ninja Space Mod
Posts: 6424
Joined: Fri Aug 05, 2005 1:53 pm
Location: Paradise, CA

Post by Luke »

I think (and I'll have to double check) that it's about 10,000 per file, but the first one contains a lot more fields (and that's why it's so much larger).
User avatar
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

I'd go with dump to database, pull afterward then.
User avatar
Luke
The Ninja Space Mod
Posts: 6424
Joined: Fri Aug 05, 2005 1:53 pm
Location: Paradise, CA

Post by Luke »

like...

Code: Select all

LOAD DATA INFILE ... 
:?:
User avatar
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

If possible, sure, but fgetcsv() works too. :P
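For reference, a sketch of the full statement (table and column names are made up here, and the MySQL user needs the FILE privilege, which a shared host may not grant):

```sql
-- Hypothetical table/column names; the path must be readable by the MySQL server
LOAD DATA INFILE '/path/to/vendor.csv'
INTO TABLE vendor_import
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
(sku, in_stock);
```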
User avatar
Luke
The Ninja Space Mod
Posts: 6424
Joined: Fri Aug 05, 2005 1:53 pm
Location: Paradise, CA

Post by Luke »

feyd wrote:If possible, sure, but fgetcsv() works too. :P
:? OK, so something like:

Code: Select all

foreach ($file as $val)
{
	$fh = fopen($val['file_name'], 'r');
	while (($data = fgetcsv($fh, 1000, ',', '"')) !== FALSE)
	{
		$vendor_sku = getAlnum($data[0]);
		$available  = (integer) getDigits($data[$val['stock_position']]);
		// UPDATE products SET in_stock = in_stock + $available WHERE pSku = '$vendor_sku'
	}
	fclose($fh);
}
Another question... if I do it like this, it will add the three csv files' in_stock values on top of whatever is already in the database's in_stock column. I don't want that. Should I set all in_stock to zero before running this? That seems like the obvious solution, but it also seems like it could lead to problems. What do you think?

EDIT: sigh... it seems no matter what I do, it exceeds allowed memory size. :(
Post Reply