file_get_contents to CURL

EricS
Forum Contributor
Posts: 183
Joined: Thu Jul 11, 2002 12:02 am
Location: Atlanta, Ga

Post by EricS »

I've got an application that retrieves hundreds of web pages when it runs. I was using file_get_contents() to retrieve each page, and the script would take several minutes to run. Even though it took several minutes, the browser would never time out; it would just wait for the application to finish running.

I decided to take a stab at using CURL in the hope that it would make things more efficient. Namely, I wanted the application to stop trying to retrieve a site after a specified period of time and go on to the next site. Since I couldn't find a way to do that with file_get_contents(), I started writing CURL code in its place.

What's happening now is that the browser times out when I use the CURL code, and I can't find a way to keep that from happening. Here is the code I'm currently using.

Code:

<?php
$curlResource = curl_init();
// set CURL options
curl_setopt($curlResource, CURLOPT_URL, 'http://'.$protocolessURL); // set destination.
curl_setopt($curlResource, CURLOPT_FOLLOWLOCATION, true); // allow CURL to follow redirect headers.
curl_setopt($curlResource, CURLOPT_MAXREDIRS, 2); // allow at most two redirects before failure.
// curl_setopt($curlResource, CURLOPT_MUTE, true); // run without errors being displayed.
curl_setopt($curlResource, CURLOPT_RETURNTRANSFER, true); // return output as string rather than to screen.
curl_setopt($curlResource, CURLOPT_CONNECTTIMEOUT, $connectionTimeOut); // set time limit on connection
curl_setopt($curlResource, CURLOPT_LOW_SPEED_TIME, $attemptTimeOut); // seconds the transfer may stay below CURLOPT_LOW_SPEED_LIMIT before aborting (no effect unless that limit is also set)
curl_setopt($curlResource, CURLOPT_TIMEOUT, $maxTimeOut); // maximum time a curl function can run
curl_setopt($curlResource, CURLOPT_USERAGENT, $userAgent); // the user agent to be sent in http requests
curl_setopt($curlResource, CURLOPT_NOSIGNAL, true);
// execute CURL operation
$contents = curl_exec($curlResource);
print curl_error($curlResource);
// close CURL session
curl_close($curlResource);
?>
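
On the per-site time limit mentioned above: file_get_contents() accepts a stream context, and the http wrapper has a timeout option, so a limit may be possible without switching to CURL. The following is only a rough sketch. The helper names buildTimeoutContext() and fetchWithTimeout() and the default user agent string are made up for illustration, and the timeout here is a read timeout, which is not the same as the overall cap that CURLOPT_TIMEOUT gives.

```php
<?php
// Sketch only: both helper names below are hypothetical, not part of PHP.

/**
 * Build a stream context whose HTTP reads give up after $seconds.
 * The default user agent string is a placeholder.
 */
function buildTimeoutContext(float $seconds, string $userAgent = 'ExampleFetcher/1.0')
{
    return stream_context_create([
        'http' => [
            'timeout'    => $seconds,   // read timeout in seconds (may be fractional)
            'user_agent' => $userAgent, // sent as the User-Agent request header
        ],
    ]);
}

/**
 * Fetch one page. Returns the body, or null if the request failed or timed out.
 */
function fetchWithTimeout(string $url, float $seconds): ?string
{
    // @ hides the warning file_get_contents() raises on failure;
    // the false return value is checked instead.
    $body = @file_get_contents($url, false, buildTimeoutContext($seconds));
    return $body === false ? null : $body;
}
```

In a loop over many sites, a null result would simply mean "skip this one and move on", for example: $contents = fetchWithTimeout('http://'.$protocolessURL, 5.0);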
Joe
Forum Regular
Posts: 939
Joined: Sun Feb 29, 2004 1:26 pm
Location: UK - Glasgow

Post by Joe »

EricS
Forum Contributor
Posts: 183
Joined: Thu Jul 11, 2002 12:02 am
Location: Atlanta, Ga

Post by EricS »

So does a call to sleep() signal the browser to restart its connection timeout? I don't see anything in the PHP Manual that says this.
rehfeld
Forum Regular
Posts: 741
Joined: Mon Oct 18, 2004 8:14 pm

Post by rehfeld »

Take a look at the headers your web page sends when using CURL; there must be something different.

This forum, for example, sends this header:

Keep-Alive: timeout=25, max=100
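
One way to compare the headers is to request the page from a small script and dump what comes back. This is a rough sketch, not a definitive tool: parseHeaderLine() and fetchResponseHeaders() are hypothetical helper names, the 10 second default is arbitrary, and repeated headers with the same name simply overwrite each other here.

```php
<?php
// Sketch only: both helper names below are hypothetical, not part of PHP or cURL.

/**
 * Split one raw header line ("Name: value") into [lowercased name, value].
 * Returns null for lines without a colon, such as the status line
 * and the blank line that ends the header block.
 */
function parseHeaderLine(string $line): ?array
{
    $parts = explode(':', $line, 2);
    if (count($parts) !== 2) {
        return null;
    }
    return [strtolower(trim($parts[0])), trim($parts[1])];
}

/**
 * Request $url and return its response headers as name => value.
 * Requires the cURL extension.
 */
function fetchResponseHeaders(string $url, int $timeoutSeconds = 10): array
{
    $headers = [];
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);     // keep the body off the screen
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds); // overall limit for the request
    curl_setopt($ch, CURLOPT_HEADERFUNCTION, function ($ch, $line) use (&$headers) {
        $parsed = parseHeaderLine($line);
        if ($parsed !== null) {
            $headers[$parsed[0]] = $parsed[1];
        }
        return strlen($line); // cURL expects the number of bytes handled
    });
    curl_exec($ch);
    return $headers;
}
```

Calling print_r(fetchResponseHeaders($url)) once against the file_get_contents() version of the page and once against the CURL version would show whether a header such as Keep-Alive differs between the two.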
ol4pr0
Forum Regular
Posts: 926
Joined: Thu Jan 08, 2004 11:22 am
Location: ecuador

Post by ol4pr0 »

And no, sleep() does not signal the browser to restart its timeout.

sleep() means sleep :)