Parse content from other pages


cybernike
Forum Newbie
Posts: 3
Joined: Sun Feb 08, 2009 9:34 am

Parse content from other pages

Post by cybernike »

I am totally new to PHP. I need a server-side programming language because I need the ability to write files.
Could someone tell me how to parse pages that are not on my server, but external?

My situation is that I need to parse an external webpage, extract some information from it, and write that information to a file. I could parse the page with JavaScript, but I can't write the information to a file without ActiveX.
Last edited by cybernike on Sun Feb 08, 2009 10:09 am, edited 1 time in total.
John Cartwright
Site Admin
Posts: 11470
Joined: Tue Dec 23, 2003 2:10 am
Location: Toronto

Re: Parse content from other pages

Post by John Cartwright »

file_get_contents() or cURL + preg_match_all()

Probably a bit too difficult for someone who doesn't know PHP. However, post some sample HTML along with the site you want to scrape (assuming you have their permission to do so), and we can help you along further.
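
As a rough sketch of that approach: the example below assumes the values you want sit in a hypothetical `<span class="hp">` element (the real markup is unknown), and the names `extractHitPoints` and `saveValues` are made up for illustration. It pulls the numbers out with preg_match_all() and writes them to a file with file_put_contents().

```php
<?php
// Extract every number wrapped in <span class="hp">...</span>.
// The markup is an assumption; adjust the pattern to the real page.
function extractHitPoints(string $html): array
{
    preg_match_all('/<span class="hp">(\d+)<\/span>/', $html, $matches);
    // $matches[1] holds the captured digit strings, in page order.
    return array_map('intval', $matches[1]);
}

// Write one value per line; returns false if the file could not be written.
function saveValues(array $values, string $path): bool
{
    return file_put_contents($path, implode("\n", $values) . "\n") !== false;
}
```

To use it on a live page you would fetch the HTML first, for example `$html = file_get_contents($url);` (this needs allow_url_fopen enabled and returns false on failure), then call `saveValues(extractHitPoints($html), 'hp.txt');`.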
cybernike
Forum Newbie
Posts: 3
Joined: Sun Feb 08, 2009 9:34 am

Re: Parse content from other pages

Post by cybernike »

The page is actually from a Facebook game, so I can't show it to you because it requires FB login information. I am not sure if you are familiar with Greasemonkey, which can insert JavaScript into pages in your own browser (client-side). I wonder if I could do the same thing with PHP, that is, display an external page in my own browser (client-side) with my desired JavaScript inserted. That way, with PHP, I could write the information I need from the page to a file (which Greasemonkey cannot do).
John Cartwright
Site Admin
Posts: 11470
Joined: Tue Dec 23, 2003 2:10 am
Location: Toronto

Re: Parse content from other pages

Post by John Cartwright »

Indeed it would be possible using cURL and some regular expressions.
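
A minimal sketch of the fetch-and-inject idea, assuming a publicly reachable page: `fetchPage` and `injectScript` are hypothetical helper names, and the script URL is a placeholder. A page behind a login would also need session cookies (for example via CURLOPT_COOKIEFILE), which this sketch leaves out. The insertion itself uses plain string functions rather than a regular expression.

```php
<?php
// Fetch a page with cURL and return its body, or null on failure.
function fetchPage(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow HTTP redirects
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}

// Insert a <script> tag just before </body>, or append it if there is no </body>.
function injectScript(string $html, string $scriptUrl): string
{
    $tag = '<script src="' . htmlspecialchars($scriptUrl, ENT_QUOTES) . '"></script>';
    $pos = stripos($html, '</body>');
    if ($pos === false) {
        return $html . $tag;
    }
    return substr_replace($html, $tag, $pos, 0);
}
```

Serving the result would then be something like `echo injectScript($html, 'my.js');` once `fetchPage()` has returned a non-null body.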

P.S. I have Facebook too, so it doesn't hurt to post a link.
cybernike
Forum Newbie
Posts: 3
Joined: Sun Feb 08, 2009 9:34 am

Re: Parse content from other pages

Post by cybernike »

The name is Dragon Wars: http://apps.facebook.com/dragonwars/

I would like to have two of my own characters, in different FB accounts, communicate with each other (for example, to exchange information about their HP, the number of attacks available, etc.). I was planning to output the information to a simple HTML file on my webserver so that it can be parsed by JavaScript with Greasemonkey. Do you have a better suggestion as to how to do this?