extract html from url
Posted: Sat Jan 26, 2008 5:04 pm
I want a PHP script that goes through the links on a website and extracts whatever I want from the HTML those links point to. I'm pretty familiar with Perl regex, and I think PHP has something similar, so the actual extraction shouldn't be much of a problem for me.
What I don't really know is how to write the PHP code that grabs the HTML from a URL. (The page is actually a PHP file rather than a static HTML file, so the URL would look something like this: http://somesite.com/somepage.php.) I also want the PHP code to be able to grab the HTML from the linked pages. I've heard about libcurl, but I'm having trouble installing it on my Linux distro.
Could someone give me working examples of how to use curl with PHP (if that is what I need to use) so that it extracts the HTML from a specified URL and from all the links within that page?
Hope that makes sense. Thanks!
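For reference, here is a minimal sketch of the kind of script being asked about. It is not a definitive implementation. The function names fetch_html, extract_links and crawl_one_level are made up for illustration. It assumes PHP 7.1 or later with the DOM extension, uses the cURL extension when it is loaded, and otherwise falls back to file_get_contents(), which needs allow_url_fopen enabled. Relative links are resolved in a simplified way that ignores ports, "../" segments and <base> tags.

```php
<?php
// Hypothetical helper: return the body of $url, or null on failure.
function fetch_html(string $url): ?string
{
    if (function_exists('curl_init')) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow HTTP redirects
        curl_setopt($ch, CURLOPT_TIMEOUT, 15);          // give up after 15 seconds
        $body = curl_exec($ch);
        return $body === false ? null : $body;
    }
    // Fallback when the cURL extension is not installed.
    $body = @file_get_contents($url);
    return $body === false ? null : $body;
}

// Hypothetical helper: collect the http(s) links found in <a href="..."> tags.
function extract_links(string $html, string $baseUrl): array
{
    if ($html === '') {
        return []; // loadHTML() rejects empty input
    }
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // real-world HTML is rarely valid
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $parts  = parse_url($baseUrl);
    $origin = $parts['scheme'] . '://' . $parts['host'];
    $path   = $parts['path'] ?? '/';
    $dir    = substr($path, 0, strrpos($path, '/') + 1); // directory of the base page

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        $href = trim($a->getAttribute('href'));
        if ($href === '' || $href[0] === '#') {
            continue;                                   // empty or same-page anchor
        }
        if (preg_match('#^https?://#i', $href)) {
            $links[] = $href;                           // already absolute
        } elseif (preg_match('#^[a-z][a-z0-9+.-]*:#i', $href)) {
            continue;                                   // mailto:, javascript:, etc.
        } elseif (strpos($href, '//') === 0) {
            $links[] = $parts['scheme'] . ':' . $href;  // scheme-relative
        } elseif ($href[0] === '/') {
            $links[] = $origin . $href;                 // root-relative
        } else {
            $links[] = $origin . $dir . $href;          // relative to the page's directory
        }
    }
    return array_values(array_unique($links));
}

// Hypothetical helper: fetch the start page plus every page it links to.
// Returns an array mapping each URL to its HTML (failed fetches are skipped).
function crawl_one_level(string $startUrl): array
{
    $pages = [];
    $html = fetch_html($startUrl);
    if ($html === null) {
        return $pages;
    }
    $pages[$startUrl] = $html;
    foreach (extract_links($html, $startUrl) as $link) {
        $linked = fetch_html($link);
        if ($linked !== null) {
            $pages[$link] = $linked;
        }
    }
    return $pages;
}
```

Calling crawl_one_level('http://somesite.com/somepage.php') would return the HTML of that page and of each page it links to, keyed by URL. Each value can then be run through preg_match() or preg_match_all(), PHP's Perl-compatible regex functions, to pull out whatever is needed.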