
Grabbing data within a table tag from another website

Posted: Tue Apr 25, 2006 8:31 am
by angelena
Hello..

I'm trying to grab data, such as stock market quotes, from another website and save it into my database.
So far I have used curl to fetch the page and write the whole of it to a text file.

But I'm stuck there, as I don't know how to go on and read the data enclosed within the table tag.

Can anyone give me some guidance?


Thanks

Posted: Tue Apr 25, 2006 9:15 am
by litebearer
What you are trying to do is often called 'scraping'.

Here is a class (robot) that I use and find easy to implement.

http://www.free-php.org.uk/free2.php

I generally use the text option, then run a series of search/replace/delete functions to cull out the data I am seeking. This works well, provided the site you are culling from has a consistent structure.

For example, say the page always has 5 tables and we always want the info in table 3. Put all the text into one string, then use a string function to locate the position of the third occurrence of <table. Delete everything in front of that point. Now take the remaining string and locate the first occurrence of </table. Delete everything from that point to the end.

In the remaining string, find the first occurrence of > and delete everything up to and including it, which removes what is left of the opening <table ...> tag.
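
The slicing steps above can be sketched roughly as follows. This is only an illustration in Python, not the robot class linked above; the function names nth_index and table_body are made up for the example. It assumes the tags are written in lowercase, that tables are not nested, and that no > appears inside the table tag's attribute values. The same steps map onto PHP's strpos and substr.

```python
def nth_index(haystack, needle, n):
    """Return the index of the n-th (1-based) occurrence of needle, or -1."""
    pos = -1
    for _ in range(n):
        pos = haystack.find(needle, pos + 1)
        if pos == -1:
            return -1
    return pos


def table_body(page, n):
    """Return the text between the n-th <table ...> tag and its </table.

    Assumes lowercase tags and no nested tables (hypothetical helper).
    Returns None when the page does not have that many tables.
    """
    # Step 1: find the n-th "<table"; everything before it is dropped.
    start = nth_index(page, "<table", n)
    if start == -1:
        return None
    # Step 2: find the first "</table" after it; drop from there to the end.
    end = page.find("</table", start)
    if end == -1:
        return None
    # Step 3: find the ">" that closes the opening tag; drop up to and including it.
    tag_close = page.find(">", start)
    if tag_close == -1 or tag_close > end:
        return None
    return page[tag_close + 1:end]
```

Called as table_body(page, 3), it should return only the row and cell markup of the third table, or None if the page has fewer than three tables.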

You should now be left with just the row and cell tags. Develop a simple routine to extract the data for placement in your database.
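
One possible shape for such a routine, again as a Python sketch rather than anything from the robot class: it takes a string that holds only row and cell tags and returns a list of rows, each a list of cell texts. The name rows_from_table is hypothetical, and the regular expressions assume reasonably tidy markup (every tr, td and th is closed, and there are no nested tables).

```python
import re
from html import unescape

# Non-greedy patterns; \b keeps <tr from matching tags such as <track
# and <th from matching <thead.
ROW_RE = re.compile(r"<tr\b[^>]*>(.*?)</tr>", re.IGNORECASE | re.DOTALL)
CELL_RE = re.compile(r"<t[dh]\b[^>]*>(.*?)</t[dh]>", re.IGNORECASE | re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")


def rows_from_table(table_inner):
    """Turn row/cell markup into a list of rows, each a list of cell strings."""
    rows = []
    for row_html in ROW_RE.findall(table_inner):
        cells = []
        for cell_html in CELL_RE.findall(row_html):
            text = TAG_RE.sub("", cell_html)      # drop inline tags like <b>
            cells.append(unescape(text).strip())  # decode &amp; and similar
        if cells:
            rows.append(cells)
    return rows
```

Each inner list can then be passed to whatever insert statement your database layer uses.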

I use a little routine that uses the pipe |, the tilde ~, or a combination of both as a delimiter. This allows me to have commas and quotes in my values.
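
A minimal sketch of that delimiter idea, in Python and with made-up names (DELIM, pack, unpack); it is not the actual routine. It assumes the combined delimiter |~ never occurs inside a value, and refuses to pack a value that breaks that assumption.

```python
DELIM = "|~"  # assumed delimiter: pipe followed by tilde


def pack(fields):
    """Join field strings into one line; commas and quotes need no escaping."""
    for field in fields:
        if DELIM in field:
            raise ValueError("field contains the delimiter: %r" % field)
    return DELIM.join(fields)


def unpack(line):
    """Split a packed line back into its list of fields."""
    return line.split(DELIM)
```

For instance, pack(['ACME, Inc.', 'He said "buy"', '12.50']) gives one line that unpack turns back into the same three values.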

Hope this is of some help.

Lite...