There has to be a quicker way to do this...

Posted: Sun Aug 15, 2004 8:00 pm
by hawleyjr
I am trying to determine whether a link is valid (an active link or not) before posting it to the bookmark section of my site.

I am using the following code, which works but is super slow. For each URL in the list of URLs, I call the doesLinkWork() function: first it tries to open the link using [php_man]curl[/php_man], and if that doesn't work, it tries to open it using [php_man]fopen[/php_man]. I'm open to any suggestions.

If this looks familiar, it is because this code was derived from the following post.

Code: Select all

<?php
function doesFileExist($url){

	$handle = @fopen($url, "r");
	if(!$handle)
		return FALSE;

	fclose($handle);
	return TRUE;
}

function getHeader($url)
{
	$ch = curl_init();
	curl_setopt($ch, CURLOPT_URL,            $url);
	curl_setopt($ch, CURLOPT_HEADER,         1);
	curl_setopt($ch, CURLOPT_TIMEOUT,        1);
	curl_setopt($ch, CURLOPT_NOBODY,         1);
	curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

	$headers = curl_exec($ch);
	curl_close($ch); // free the cURL handle
	return $headers;
}

function doesLinkWork($url){

	$headers = getHeader($url);
	if(substr($headers, 0, 4) == 'HTTP'){
		return TRUE;
	}
	// unable to get headers via cURL; fall back to fopen()
	return doesFileExist($url);
}

doesLinkWork('any url');
?>

Posted: Sun Aug 15, 2004 10:44 pm
by Buddha443556
You don't need the entire page, just the response code, so: first, try using a range parameter (CURLOPT_RANGE) with a GET request to retrieve a small number of bytes. Or try using a HEAD request to retrieve just the headers, though I've found some servers don't respond properly to such requests.
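A rough sketch of the HEAD-request idea, checking only the response code via curl_getinfo() instead of parsing the returned text. The function names (isGoodStatus, checkLinkByHead), the timeout, and the "2xx/3xx means working" rule are my own assumptions, not from the original post; the CURLOPT_RANGE fallback is shown in a comment.

```php
<?php
// Treat 2xx and 3xx response codes as a working link.
// (Assumption: redirects count as "active".)
function isGoodStatus($code) {
    return $code >= 200 && $code < 400;
}

// Issue a HEAD request and inspect only the HTTP response code.
function checkLinkByHead($url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL,            $url);
    curl_setopt($ch, CURLOPT_NOBODY,         1); // HEAD: headers only, no body
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // don't echo the response
    curl_setopt($ch, CURLOPT_TIMEOUT,        5); // give slow servers a chance
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // follow redirects
    // For servers that mishandle HEAD, a GET with a byte range is an
    // alternative: curl_setopt($ch, CURLOPT_RANGE, '0-127');
    curl_exec($ch);
    $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    return isGoodStatus($code);
}
?>
```

curl_getinfo() returns 0 for the code when the connection itself failed, which isGoodStatus() correctly treats as a dead link.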

Don't forget: just because a URL failed once does not mean it's bad. So some means of tracking how many times a URL has failed may be helpful.
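One way to sketch that failure-tracking idea: keep a per-URL failure counter and only drop a link after several consecutive failures. The function names and the threshold of 3 are my own choices for illustration, not from the post.

```php
<?php
// Update the failure map after checking $url: reset the counter on
// success, increment it on failure.
function recordCheck($failures, $url, $worked) {
    if ($worked) {
        $failures[$url] = 0;
    } else {
        $failures[$url] = isset($failures[$url]) ? $failures[$url] + 1 : 1;
    }
    return $failures;
}

// Only treat a URL as dead after $threshold consecutive failures.
function shouldRemove($failures, $url, $threshold = 3) {
    return isset($failures[$url]) && $failures[$url] >= $threshold;
}
?>
```

The failure map could be kept in the database alongside the bookmarks, so a transient outage doesn't delete a good link.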