Page 1 of 1

Parsing large XML documents - XMLReader, PHP, MySQL

Posted: Thu Nov 08, 2007 4:20 pm
by ska
feyd | Please use

Code: Select all

,

Code: Select all

and [syntax="..."] tags where appropriate when posting code. Your post has been edited to reflect how we'd like it posted. Please read:  [url=http://forums.devnetwork.net/viewtopic.php?t=21171]Posting Code in the Forums[/url] to learn how to do it too.[/color]


Hi guys,

I'm a bit stuck on this one so could do with some pointers. I'm reading largish XML documents (around 10MB) from URLs (http://something.com/file.xml) into PHP and need to parse the nodes into their respective tables in a MySQL database. I set up one script to do it using SimpleXML (PHP5 is on the server I'm using) but was getting memory problems. I need to use a stream based parser so have turned to XMLReader but the documentation is somewhat lacking and I can't find many examples.

Can anyone give me an example to follow that will allow me to parse through an XML document in the following format;

[syntax="xml"]
<?xml version="1.0" encoding="ISO-8859-1"?>
<products>
<product>
<name><![CDATA[]]></name >
<price><![CDATA[]]></price >
<description><![CDATA[]]></description >
</product>
etc.
Bearing in mind I hae to read it into a stream and because of the number of products in the feed, it will be more efficient to create multiple INSERTS SQL, perhaps in batches of 100. i.e.[/syntax]

Code: Select all

INSERT INTO products (name, price, description)
VALUES
(x, y, z)
(a, b, c)
etc.
I won't complicate matters at the moment by saying that some of the products will need to be UPDATES rather than INSERTS just yet though... ;) Firstly I want to just learn how to use XMLReader to get from the former to the latter and I'm well-stuck with it...!

Oh, just to clarify, when I say XMLReader, I mean the one built into PHP5. i.e.

Code: Select all

$reader = new XMLReader();
$reader->open($url);
while ($reader->read())
{
	echo $reader->name.": ".$reader->value."<br />";
}
Thanks.


feyd | Please use

Code: Select all

,

Code: Select all

and [syntax="..."] tags where appropriate when posting code. Your post has been edited to reflect how we'd like it posted. Please read:  [url=http://forums.devnetwork.net/viewtopic.php?t=21171]Posting Code in the Forums[/url] to learn how to do it too.[/color]

Posted: Thu Nov 08, 2007 7:17 pm
by yanisdon

Posted: Fri Nov 09, 2007 1:50 am
by ska
Thanks for the suggestion. Anyone know if magpie can handle large XML files? Bear in mind this project is on a shared host so I have memory and timeout issues to take into consideration and can"t start tweaking the ini file to suit me. I googled 'Magpie large xml" and this thread comes up number one so not much use there...!

Anyone know? Or ideally, going back to my original question, can anyone give me a XMLReader example?

Posted: Sat Nov 10, 2007 3:34 pm
by ska
Guessing that no-one's used XMLReader on here then? Well if that's the case, has anyone got any other ideas for reading a large XML document into MySQL without running into timeout and memory issues?