PHP: Using remote proxies - Unreliability issues
Posted: Thu May 04, 2006 7:33 am
** Background Info **
I have a function that fetches content from a page using a random remote proxy. The proxy list is updated daily, so there should be no connectivity issues. If a proxy fails, fetch() returns false and the calling page decides how many times to retry fetching that page with a different proxy. That part works fine.
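To make the retry arrangement concrete, here is a minimal sketch of what such a caller-side loop might look like. It is an illustration, not the actual calling script: fetch_with_retries(), $fetcher and $max_attempts are made-up names, and treating an empty string as a failure is an assumption.

```php
<?php
// Hypothetical caller-side retry loop. fetch_with_retries(), $fetcher and
// $max_attempts are illustrative names, not taken from the real script.
function fetch_with_retries(callable $fetcher, $host, $url, $max_attempts = 5)
{
    for ($attempt = 1; $attempt <= $max_attempts; $attempt++) {
        // Each call is expected to go through a (possibly different) proxy.
        $contents = $fetcher($host, $url);

        // Assumption: both false and an empty string count as a failed attempt.
        if ($contents !== false && $contents !== '') {
            return $contents;
        }
    }

    // All attempts failed; the caller would report "Too many proxies tried".
    return false;
}
```

Passing the fetch function in as a callable (for example the string 'fetch') keeps the loop independent of how the proxy is chosen, which also makes it easy to try with a stub.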
** The problem **
The function fails too often: the calling script reports "Too many proxies tried" several times, yet a fraction of pages are fetched successfully. Can you suggest a possible cause for this?
Any help will be greatly appreciated and hopefully repaid.
** The PHP code **
<?php
function get_new_proxy()
{
    // all you need to know is this function
    // gets a proxy in the format array([URL], [port])
}

// Fetch page, returns content and headers
function fetch($host, $url)
{
    global $dir;
    global $retries;
    static $current_proxy_fetches;

    // Attempt to connect to the proxy server to retrieve the remote page
    if (!@$current_proxy_fetches || $current_proxy_fetches++ > 10) {
        $current_proxy_fetches = 0;
        if (!ereg("-noproxy-?", $modifiers)) {
            list($proxy_address, $proxy_port) = get_new_proxy();
        }
        if (!$socket = @fsockopen($proxy_address, $proxy_port, $errno, $errstring, 20)) {
            $filename = "{$dir['data']}/proxy_blacklist.txt";
            $fp = fopen($filename, 'a+');
            fwrite($fp, date("d/m/y H:i") . " $proxy_address:$proxy_port")
                or log_error("Could not write to file '$filename'");
            fclose($fp);
            $retries++;
            if ($retries < 3) {
                list($_proxy_address, $_proxy_port) = get_new_proxy();
                $contents = fetch($_domain, $_path, $_proxy_address, $_proxy_port);
                return $contents;
            } else {
                $retries = 0;
                return false;
            }
        }
        $current_proxy_fetches++;
    }

    // If socket connection successful, reset retries counter
    $retries = 0;

    // HTTP commands
    $headers = "GET $url HTTP/1.1\r\n";
    $headers .= "Host: $host\r\n";
    $headers .= "Connection: Close\r\n";
    $headers .= "\r\n";

    // Init. $contents var
    $contents = "";

    // Get the contents
    if ($socket) {
        fwrite($socket, $headers);
        while (!feof($socket)) {
            $contents .= fgets($socket, 128);
        }
        fclose($socket);
    }

    /* Contents contains both the html headers and the html of the page. */
    return $contents;
}
?>
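One possible cause is visible in the code itself. $socket is only assigned inside the block that selects a new proxy, and neither the socket nor the proxy address is static. Any call that skips that block has nothing to write to and returns an empty string. Since the request sends "Connection: Close", the connection could not be reused across calls anyway. The recursive retry also passes $_domain and $_path, which are never defined, and $modifiers is not set anywhere in the function.

The sketch below shows one way the function might be restructured: remember the proxy between calls, but open a fresh socket for every request. It is untested against real proxies. build_request(), next_proxy(), fetch_via_proxy() and the $get_proxy callable (standing in for get_new_proxy()) are hypothetical names, and the rotation limit of 10 is an assumption based on the counter in the original.

```php
<?php
// Sketch only: build_request(), next_proxy() and fetch_via_proxy() are
// hypothetical names; $get_proxy stands in for get_new_proxy().

// Build the HTTP request text that is sent through the proxy.
function build_request($host, $url)
{
    return "GET $url HTTP/1.1\r\n"
         . "Host: $host\r\n"
         . "Connection: Close\r\n"
         . "\r\n";
}

// Return the proxy to use, rotating to a new one after $limit fetches.
// Pass $discard = true to forget a proxy that has just failed.
function next_proxy(callable $get_proxy, $limit = 10, $discard = false)
{
    static $proxy = null;
    static $uses = 0;

    if ($discard) {
        $proxy = null;
        return null;
    }
    if ($proxy === null || $uses >= $limit) {
        $proxy = $get_proxy();   // expected format: array(address, port)
        $uses = 0;
    }
    $uses++;
    return $proxy;
}

// Open a fresh socket on every call; return false on any failure so the
// caller can decide whether to retry.
function fetch_via_proxy($host, $url, callable $get_proxy)
{
    list($address, $port) = next_proxy($get_proxy);

    $socket = @fsockopen($address, $port, $errno, $errstring, 20);
    if (!$socket) {
        next_proxy($get_proxy, 10, true);   // drop the failed proxy
        return false;
    }

    fwrite($socket, build_request($host, $url));

    $contents = '';
    while (!feof($socket)) {
        $line = fgets($socket, 128);
        if ($line === false) {
            break;                          // read error or timeout
        }
        $contents .= $line;
    }
    fclose($socket);

    return $contents === '' ? false : $contents;
}
```

Keeping the retry count out of this function and leaving it to the calling page avoids the recursion and the global $retries counter. Blacklist logging could be added back in the failure branch.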