Grabbing data within a table tag from another website
Posted: Tue Apr 25, 2006 8:31 am
by angelena
Hello,
I'm trying to grab data from a website, such as stock market figures, and save the data into my database.
So far I have tried using curl, and I am able to fetch the site and write the whole page into a txt file.
But I'm stuck there, as I don't know how to proceed to read the data enclosed within the table tags.
Can anyone give me some guidance?
Thanks
Posted: Tue Apr 25, 2006 9:15 am
by litebearer
What you are trying to do is often called 'scraping'.
Here is a class (robot) that I use and find easy to implement.
http://www.free-php.org.uk/free2.php
I generally use the text option, then do a series of different 'search/replace/delete' functions to cull out the data I am seeking. This works well, provided the site you are culling from has a consistent structure.
For example, suppose the page always has 5 tables and we always want the info in table 3. Put all the text into one string, then use a string function to locate the position of the third occurrence of <table and delete everything in front of that point. Now take the remaining string, locate the first occurrence of </table, and delete everything from that point to the end.
In the remaining string, find the first occurrence of > and delete everything up to and including it, which removes the opening table tag.
You should now be left with just the row and cell tags. Develop a simple routine to extract the data for placement in your database.
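The steps above can be sketched roughly as follows. This is only an illustration in Python (the thread does not name a language, so that choice is an assumption), and it is not the robot class linked earlier. The function names find_nth, nth_table_body and rows_and_cells are made up for this example. It assumes the tags are written in lowercase as <table and </table, and that the tables are not nested.

```python
import re


def find_nth(haystack, needle, n):
    """Return the index of the n-th (1-based) occurrence of needle, or -1."""
    pos = -1
    for _ in range(n):
        pos = haystack.find(needle, pos + 1)
        if pos == -1:
            return -1
    return pos


def nth_table_body(html, n):
    """Return the text between the n-th <table ...> tag and its </table."""
    start = find_nth(html, "<table", n)
    if start == -1:
        return None  # the page has fewer than n tables
    rest = html[start:]            # delete everything in front of the n-th <table
    end = rest.find("</table")
    if end != -1:
        rest = rest[:end]          # delete from </table to the end
    gt = rest.find(">")
    if gt != -1:
        rest = rest[gt + 1:]       # drop the opening table tag itself
    return rest


def rows_and_cells(body):
    """Split the remaining row and cell tags into a list of rows of cell text."""
    rows = []
    for row in re.findall(r"<tr\b[^>]*>(.*?)</tr>", body, flags=re.S | re.I):
        cells = re.findall(r"<t[dh]\b[^>]*>(.*?)</t[dh]>", row, flags=re.S | re.I)
        # Strip any tags left inside a cell (links, bold, etc.) and trim spaces.
        rows.append([re.sub(r"<[^>]+>", "", cell).strip() for cell in cells])
    return rows


# Made-up sample page: the data we want sits in the third table.
sample = (
    "<table><tr><td>a</td></tr></table>"
    "<table><tr><td>b</td></tr></table>"
    '<table border="1"><tr><th>Symbol</th><th>Price</th></tr>'
    "<tr><td>ABC</td><td>1,234.50</td></tr></table>"
    "<table></table>"
)
rows = rows_and_cells(nth_table_body(sample, 3))
```

If the site ever changes the number or order of its tables, the 3 passed to nth_table_body has to change too, so this only holds up on pages with a stable layout.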
I use a little routine that uses the pipe |, the tilde ~, or a combination of both as a delimiter. This allows me to have commas and quotes in my variables.
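A delimiter routine of that kind might look something like the sketch below. It is a guess at the idea in Python, not the actual routine described here, and the names to_delimited and from_delimited are hypothetical. It assumes that no cell contains the delimiter itself or a line break.

```python
def to_delimited(rows, sep="|"):
    """Join each row of cell strings with sep, one row per line."""
    lines = []
    for row in rows:
        for cell in row:
            if sep in cell:
                # The delimiter must never appear inside the data itself.
                raise ValueError("cell contains the delimiter: %r" % cell)
        lines.append(sep.join(row))
    return "\n".join(lines)


def from_delimited(text, sep="|"):
    """Split delimited text back into a list of rows of cell strings."""
    return [line.split(sep) for line in text.splitlines()]


# Commas and quotes pass through untouched because neither is the delimiter.
quotes = [["ABC", "1,234.50", 'rated "buy"'], ["XYZ", "98.10", "n/a"]]
packed = to_delimited(quotes)
unpacked = from_delimited(packed)

# A combination of both characters works the same way.
packed_combo = to_delimited(quotes, sep="|~")
```

Raising an error when a cell contains the delimiter is a design choice made for this sketch; switching to the combined |~ delimiter is another way to handle data that might contain a lone pipe.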
Hope this is of some help.
Lite...