How To Extract/Fetch HTML source code from another website?

Posted: Fri Jun 22, 2007 10:05 pm
by zerodevice
Hi, I'm trying to write PHP code that fetches the HTML source of another website. I'll then filter it myself to keep only the specific text I want and echo it directly on my page.

e.g. you go to my page, and it displays a list of Google's search results based on a fixed search string I code into the page.

e.g. search "asdf"

In Google it will show "http://www.google.com.my/search?hl=en&q ... arch&meta="

On my page it will show:
asdf
http://www.asdf.com/ - 3k - Cached - Similar pages

What is asdf?
http://www.asdf.com/whatisasdf.html - 5k - Cached - Similar pages

CLiki : asdf
http://www.cliki.net/asdf - 17k - Cached - Similar pages

CLiki : ASDF-Install
http://www.cliki.net/ASDF-Install - 34k - Cached - Similar pages

Association Of Synchronous Data Formats
http://www.asdf.org/ - 4k - Cached - Similar pages

Home row - Wikipedia, the free encyclopedia
en.wikipedia.org/wiki/Home_row - 16k - Cached - Similar pages

asdf Manual
constantly.at/lisp/asdf/ - 11k - Cached - Similar pages

ASDF - A Simple DVD Frontend for MPlayer
asdf-mplayer.sourceforge.net/ - 4k - Cached - Similar pages

asdf-jkl - Google Code
code.google.com/p/asdf-jkl/ - 7k - Cached - Similar pages

http://www.myspace.com/asdfrock
profile.myspace.com/index.cfm?fuseaction=user.viewprofile&friendid=31856324 - 138k - 21 Jun 2007 - Cached -
This text and these hyperlinks are extracted the moment a visitor goes to my site.


I know it's a dumb function, but I have my reasons.

Please help me.

Thanks.

Posted: Fri Jun 22, 2007 10:06 pm
by Benjamin
regex curl lawyer
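
To unpack the first two words, here is a minimal sketch of fetching a page with cURL and pulling the links out with a regex. The function names fetch_html() and extract_links() are made up for illustration, and it assumes the cURL extension is installed. The pattern only handles simple quoted href attributes, so treat it as a starting point rather than a robust parser.

```php
<?php
// Fetch the raw HTML of a page with cURL. Returns null on failure.
// (fetch_html is a hypothetical helper name, not a built-in.)
function fetch_html(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow HTTP redirects
        CURLOPT_TIMEOUT        => 10,   // give up after 10 seconds
    ]);
    $html = curl_exec($ch);
    return $html === false ? null : $html;
}

// Collect the URL and visible text of each <a href="..."> in the HTML.
// (extract_links is a hypothetical helper name as well.)
function extract_links(string $html): array
{
    $links = [];
    $pattern = '/<a\s[^>]*href=["\']([^"\']+)["\'][^>]*>(.*?)<\/a>/is';
    if (preg_match_all($pattern, $html, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $links[] = [
                'url'  => $m[1],
                'text' => trim(strip_tags($m[2])), // drop nested tags like <b>
            ];
        }
    }
    return $links;
}
```

Usage would look something like calling extract_links(fetch_html('http://www.example.com/')) after checking that fetch_html() did not return null, then looping over the result and echoing each 'text' and 'url'. The example.com address is a placeholder. For the third word, check the target site's terms of service first.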

Posted: Fri Jun 22, 2007 10:38 pm
by The Phoenix
astions wrote:regex curl lawyer
That was concise, informative, and dead accurate. Awesome.

Posted: Fri Jun 22, 2007 11:07 pm
by John Cartwright
Just to follow up, in case it wasn't clear: Google prohibits HTML scraping of their search engine, as noted in their terms of service.

Posted: Sat Jun 23, 2007 8:21 am
by zerodevice
Jcart wrote:Just to follow up, in case it wasn't clear: Google prohibits HTML scraping of their search engine, as noted in their terms of service.
Yes, I understand that Google doesn't allow this, but I am not going to apply it to Google.
I am using it for some other websites that have information I need.

Google is just an example so that most people will understand what I want.

Posted: Sun Jun 24, 2007 9:13 am
by Gente
fopen() can also be useful, since it can open a URL like a file when allow_url_fopen is enabled.
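
A rough sketch of that approach, assuming allow_url_fopen is turned on in php.ini so fopen() accepts http:// URLs. The name fetch_with_fopen() is made up for this example, and the 10-second timeout is an arbitrary choice.

```php
<?php
// Read a whole URL (or local file) through PHP's stream wrappers.
// Returns null if the stream cannot be opened or read.
// (fetch_with_fopen is a hypothetical helper name, not a built-in.)
function fetch_with_fopen(string $url): ?string
{
    // The 'http' options only apply when $url is an http:// address.
    $context = stream_context_create([
        'http' => ['timeout' => 10], // seconds
    ]);

    // @ suppresses the warning fopen() raises on failure; we check the result instead.
    $handle = @fopen($url, 'r', false, $context);
    if ($handle === false) {
        return null;
    }

    $html = stream_get_contents($handle); // read until end of stream
    fclose($handle);

    return $html === false ? null : $html;
}
```

file_get_contents() does the same open, read, and close in a single call if you don't need the handle. Either way you get the page back as one string, which you can then filter however you like.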