Data aggregation, scripting & MySQL

Posted: Thu Mar 11, 2010 12:22 am
by marvinzzz
Hello PHP pros!

Pardon my ignorance, I'm an utter newb!
So I'm building an aggregator website and I wanted to ask some questions about the process, logic and workflow behind maintaining the data.

I have around 20 different products from different suppliers, and their details come in PDF or XLS format. Each supplier describes its product information differently, so I obviously can't extract the product data with ONE mechanism; I need 20. Further, the product data changes every once in a while, and I don't want to maintain the database manually. I need a system that I can run to extract the data from the PDF/XLS sources and keep that overhead down.
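To make the "one mechanism per supplier" idea concrete, here is a rough sketch of the shape such a system could take. It is written in Python only because that keeps the example short; the same structure would carry over to PHP. Everything in it is an assumption for illustration: it supposes each supplier file has already been exported to CSV text, and the supplier names, column headers (`Code`, `Title`, `ItemNo`, and so on) and the standard record fields are all made up. Writing the normalized rows into MySQL is left out.

```python
import csv
import io

# Hypothetical standard record: every supplier is mapped onto these keys.
STANDARD_FIELDS = ("supplier", "sku", "name", "price")


def parse_supplier_a(text):
    """Hypothetical supplier A: comma-separated, columns Code,Title,Price."""
    for row in csv.DictReader(io.StringIO(text)):
        yield {
            "supplier": "A",
            "sku": row["Code"].strip(),
            "name": row["Title"].strip(),
            "price": float(row["Price"]),
        }


def parse_supplier_b(text):
    """Hypothetical supplier B: semicolon-separated, decimal comma in Cost."""
    for row in csv.DictReader(io.StringIO(text), delimiter=";"):
        yield {
            "supplier": "B",
            "sku": row["ItemNo"].strip(),
            "name": row["Description"].strip(),
            # Convert '12,50' to '12.50' before parsing the number.
            "price": float(row["Cost"].replace(",", ".")),
        }


# One small parser per supplier, looked up by a supplier key.
PARSERS = {"A": parse_supplier_a, "B": parse_supplier_b}


def normalize(supplier, text):
    """Return a list of standard records for one supplier's CSV text."""
    return list(PARSERS[supplier](text))
```

With this layout, supporting a 21st supplier would mean writing one more small parser function and registering it in `PARSERS`; the code that consumes the standard records would not need to change. A real version would probably store prices as decimals rather than floats.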

So I have the following questions please:
  • What is the 'best' way to go about my problem: extracting data from raw sources, converting it into a standard form, inserting it into MySQL, scripting the automation process, etc.? Can you explain broadly the basic steps required to do what I need?
  • Should I convert the PDF/XLS files into HTML, CSV, plain text or some other format? Given that my website runs on PHP/MySQL, what is the 'best' format to convert my source documents into, and why?
Any ideas/suggestions? Thanks!