extract html
Posted: Thu May 11, 2006 12:06 pm
by ziggy1621
Hello all,
I have a client who wants a script to pull info off of a webpage and update it on his own. I remember reading somewhere that there are RSS scripts that can search the HTML of a page, pull everything between pointA and pointB, and display it with a PHP RSS viewer script (which I already have).
So my question is: does anyone know of a script or tutorial on how to extract the info?
Thanks
Posted: Thu May 11, 2006 12:12 pm
by Burrito
I don't know of any RSS tools that will do it, but you could use file_get_contents() to grab the raw HTML, then use a regular expression to pull out what you need.
If you need help with the regular expression, we can provide it.
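As a rough sketch of that approach (not a tested solution for any particular page): the helper name extract_between(), the marker strings, and the sample HTML below are all made up for illustration, and the type hints assume PHP 7.1 or later. The HTML is hard-coded so the snippet runs on its own; for a live page you would fill $html from file_get_contents() instead, which needs allow_url_fopen enabled for remote URLs.

```php
<?php
// Hypothetical helper: returns the text between two literal markers,
// or null when the markers are not found in $html.
function extract_between(string $html, string $start, string $end): ?string
{
    // preg_quote() escapes any regex metacharacters in the markers.
    // The "s" modifier lets "." match newlines, and ".*?" is non-greedy,
    // so the match stops at the first end marker after the start marker.
    $pattern = '/' . preg_quote($start, '/') . '(.*?)' . preg_quote($end, '/') . '/s';

    if (preg_match($pattern, $html, $matches) === 1) {
        return trim($matches[1]);
    }
    return null;
}

// For a live page this would be something like:
// $html = file_get_contents('http://example.com/page.html'); // placeholder URL
$html = "<html><body><!-- start -->\n<p>Latest news</p>\n<!-- end --></body></html>";

echo extract_between($html, '<!-- start -->', '<!-- end -->');
```

The markers are treated as plain text here, so pointA and pointB can be any unique strings in the source page, such as an HTML comment or an opening tag with its id. If the remote page changes its markup, the function returns null rather than a partial match, so check for that before displaying anything.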

Posted: Thu May 11, 2006 12:27 pm
by ziggy1621
Burrito wrote: I don't know of any RSS tools that will do it, but you could use file_get_contents() to grab the raw HTML, then use a regular expression to pull out what you need.
If you need help with the regular expression, we can provide it.

I'm playing with file_get_contents and it seems to be doing the same thing as include($file)... please do send me to a good tutorial on regular expressions. I'm pretty fluent in PHP as it relates to MySQL, but outside of that, I'm still learning.
thx
Posted: Thu May 11, 2006 12:43 pm
by Burrito
file_get_contents is not the same as include. With file_get_contents you set a string variable to the contents of a file (the HTML in your case).
The best regex tutorial on the net is right here:
viewtopic.php?t=33147 
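To make the difference concrete, here is a small sketch. The temp file and its contents are invented for the demo, and output buffering is used only to capture what include would otherwise print straight to the page.

```php
<?php
// Write a small sample file to a temporary location (made-up content).
$file = tempnam(sys_get_temp_dir(), 'demo');
file_put_contents($file, "<p>Hello</p><?php echo 'executed'; ?>");

// file_get_contents() returns the file's contents as a string.
// Nothing is printed, and any PHP inside the file is NOT run.
$asString = file_get_contents($file);

// include evaluates the file: the HTML is sent to output and the PHP in it runs.
// ob_start()/ob_get_clean() capture that output so it can be compared below.
ob_start();
include $file;
$included = ob_get_clean();

unlink($file); // clean up the temporary file

echo "As string: ", htmlspecialchars($asString), "\n";
echo "Included:  ", htmlspecialchars($included), "\n";
```

Because $asString is just a variable, you can search it, run a regular expression over it, or throw away most of it before anything reaches the browser, which is what makes it the right tool for extracting part of a page.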
Posted: Thu May 11, 2006 12:52 pm
by ziggy1621
Burrito wrote: file_get_contents is not the same as include. With file_get_contents you set a string variable to the contents of a file (the HTML in your case).
The best regex tutorial on the net is right here:
viewtopic.php?t=33147 
Cool, well, I've got my reading cut out for me... thanks for the help.