PHP programming forum. Ask questions or help people concerning PHP code. Don't understand a function? Need help implementing a class? Don't understand a class? Here is where to ask. Remember to do your homework!
I've been trying (unsuccessfully) to scrape a site called Managerzone.com to make a help tool for the site. Many other people have done this but simply don't care to help me out. I'm not a very experienced PHP programmer so here goes...
The problem with scraping the site is that whenever you attempt to scrape any of the internal pages (i.e., the pages you can access after you've logged in), it redirects you to the main index page. Does this make sense? What I'd like to do is somehow log in to the website via PHP using my login information and then screen scrape the pages inside. Is this even possible or am I blowing smoke?
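Regardless, using cURL you can send multiple requests (one curl_exec() call each) in a single page load, for instance one to log in and one to fetch the content, as long as you use the same cURL handle and have cookie handling enabled on it (CURLOPT_COOKIEFILE / CURLOPT_COOKIEJAR), so the session cookie set at login is sent along with the second request.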
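To make the two-request idea concrete, here is a rough sketch of what that could look like. It is untested against Managerzone itself: the function name fetch_behind_login() is made up for this post, and the URLs and form field names you pass in are assumptions that you would have to read off the site's real login form.

```php
<?php
// Sketch only: fetch_behind_login() is a hypothetical helper, and the URLs
// and form field names passed to it are placeholders, not Managerzone's.

/**
 * Log in with one request, then fetch a members-only page with a second
 * request on the same cURL handle.
 * Returns the page body, or null if either request fails.
 */
function fetch_behind_login(string $loginUrl, array $loginFields, string $pageUrl): ?string
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow the redirect that usually follows a login
        CURLOPT_COOKIEFILE     => '',   // empty string: enable in-memory cookie handling
    ]);

    // Request 1: POST the login form.
    curl_setopt_array($ch, [
        CURLOPT_URL        => $loginUrl,
        CURLOPT_POST       => true,
        CURLOPT_POSTFIELDS => http_build_query($loginFields, '', '&'),
    ]);
    if (curl_exec($ch) === false) {
        return null;
    }

    // Request 2: switch the same handle back to GET and fetch the inner page.
    // Cookies received during request 1 are sent along automatically.
    curl_setopt_array($ch, [
        CURLOPT_URL     => $pageUrl,
        CURLOPT_HTTPGET => true,
    ]);
    $html = curl_exec($ch);

    return $html === false ? null : $html;
}
```

You would call it along the lines of fetch_behind_login('https://example.com/login', ['username' => '...', 'password' => '...'], 'https://example.com/members'), with the real URLs and field names substituted. Note that it only reports transport failures; a rejected password still comes back as an ordinary page, so you would need to inspect the returned HTML to tell whether the login actually worked.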
Everah wrote: If the content you are scraping is behind a login screen, doesn't it seem kinda shady to make that content available to users that are not logged in?
I didn't want to say it.. Been holding it in for hours... But yeah, if they put it behind a login, they probably don't want it to be publicly accessed. Then again, for all I know, this is your account for an online game or something.
This is my account for the game. I'm simply using my login info to get in and retrieving the info (for my own use only) from there. It's also a free site.
Thanks for pointing me at cURL. Hopefully I'll get it to work. A quick question, when I POST variables do they need to be urlencoded?
arkdm wrote: This is my account for the game. I'm simply using my login info to get in and retrieving the info (for my own use only) from there. It's also a free site.
I wrote: Then again, for all I know, this is your account for an online game or something.
I must be psychic :-p
arkdm wrote: Thanks for pointing me at cURL. Hopefully I'll get it to work. A quick question, when I POST variables do they need to be urlencoded?
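On the urlencoding question, a short sketch may help. With PHP's cURL extension, if you give CURLOPT_POSTFIELDS a string, it has to be urlencoded already, and http_build_query() will do that for you. If you give it an array instead, cURL encodes the values itself and sends the request as multipart/form-data. The field names and values below are invented for illustration.

```php
<?php
// Placeholder field names and values, for illustration only.
$fields = ['username' => 'my user', 'password' => 'p&ss=word'];

// Option 1: build the urlencoded string yourself and pass that string to
// CURLOPT_POSTFIELDS (sent as application/x-www-form-urlencoded).
$body = http_build_query($fields, '', '&');
echo $body, "\n"; // prints: username=my+user&password=p%26ss%3Dword

// Option 2: pass the array as-is and let cURL do the encoding
// (sent as multipart/form-data):
//   curl_setopt($ch, CURLOPT_POSTFIELDS, $fields);
```

Most plain login forms expect the first form, so the http_build_query() route is probably the safer default, but that is a guess about the target site rather than something confirmed here.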