First, let me explain what I am trying to achieve and give some context.
I have an online service that picks up .xml files from another person's web server and stores them locally. Once a file is saved locally, I validate each item in it and write the data into MySQL: bad records go to a "bad" table, good records go to a "valid" table.
The above already works as designed; however, I wish to make the process a little more secure. Before attempting to load the XML file into the DB, I want to check that the file really is XML. After all, somebody could have placed, say, a .exe file with a .xml extension on the server, and my script would download it. So, is there a way to verify that a file with a .xml extension is actually an XML file?
To make things a little more complex, I need to avoid loading the entire file into memory, because a single XML file can be up to 128 MB in size.
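To make the requirement concrete, here is a minimal sketch of the kind of check I am after, written in Python purely for illustration. It feeds the file to a streaming parser in fixed-size chunks, so memory use stays small regardless of file size. The function name `is_well_formed_xml` and the 64 KB chunk size are my own assumptions, and the check only confirms that the file is well-formed XML; it does not validate the contents against any schema.

```python
import xml.parsers.expat


def is_well_formed_xml(path, chunk_size=64 * 1024):
    """Return True if the file at `path` parses as well-formed XML.

    Hypothetical helper: the file is read in `chunk_size`-byte pieces and
    handed to the expat streaming parser, so it is never held in memory
    as a whole.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        with open(path, "rb") as fh:
            while True:
                chunk = fh.read(chunk_size)
                if not chunk:
                    break
                # False = more data may follow this chunk.
                parser.Parse(chunk, False)
        # True = end of input; raises if the document is empty or truncated.
        parser.Parse(b"", True)
    except xml.parsers.expat.ExpatError:
        return False
    return True
```

A renamed binary such as an .exe should fail on the first chunk, because its leading bytes are not valid XML, so very little of a bad file is read before it is rejected.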
I've spent some time looking into this, but have yet to come across a solution.
Any suggestions?
Cheers