follow a url?

GeXus
Forum Regular
Posts: 631
Joined: Sat Mar 11, 2006 8:59 am

follow a url?

Post by GeXus »

Is it possible, with cURL or something else, to type in a URL and see all of the pages that URL redirects to?
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

Yes. Can you be more specific?
GeXus
Forum Regular
Posts: 631
Joined: Sat Mar 11, 2006 8:59 am

Post by GeXus »

Well, basically like how tracking URLs will redirect several times before you actually get to the "real" URL... this would show each step, or each page being accessed, until it reaches the "real" (last) URL that has no further redirects.
John Cartwright
Site Admin
Posts: 11470
Joined: Tue Dec 23, 2003 2:10 am
Location: Toronto

Post by John Cartwright »

Hmm.. this is kind of a long shot (I know feyd has something better in mind), but try:

1. turning off CURLOPT_FOLLOWLOCATION so cURL does not follow redirects automatically
2. recording the Location header
3. sending a new request to that location
4. recording any further headers sent
5. repeating until no Location header is present

This assumes they are not using meta redirects.
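
The steps above can be sketched roughly as follows. This is only a minimal sketch, not a tested implementation: `extract_location()` and `trace_redirects()` are hypothetical names, the hop limit of 10 and the 5 second timeout are arbitrary assumptions, and a relative Location value is not resolved against the current URL.

```php
<?php

// Hypothetical helper: pull the Location value out of a raw header block.
function extract_location($raw_headers)
{
	// Header names are case-insensitive; /m lets ^ and $ match on each line.
	if (preg_match('/^Location:[ \t]*(.+?)[ \t]*\r?$/mi', $raw_headers, $m))
	{
		return $m[1];
	}
	return null;
}

// Hypothetical helper: follow redirects one hop at a time with cURL.
// Returns the list of URLs requested, in order.
function trace_redirects($url, $max_hops = 10)
{
	$chain = array();
	while ($url !== null && count($chain) < $max_hops)
	{
		$chain[] = $url;
		$ch = curl_init($url);
		curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false); // step 1: no auto redirects
		curl_setopt($ch, CURLOPT_NOBODY, true);          // HEAD-style request, headers only
		curl_setopt($ch, CURLOPT_HEADER, true);          // include headers in the output
		curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return output instead of printing
		curl_setopt($ch, CURLOPT_TIMEOUT, 5);
		$headers = curl_exec($ch);
		if ($headers === false)
		{
			break; // request failed; return what was collected so far
		}
		$url = extract_location($headers); // steps 2 to 5: null ends the loop
	}
	return $chain;
}
```

Calling `trace_redirects('http://example.com/')` (a placeholder URL) would return an array with one entry per hop. Some servers answer HEAD requests differently from GET, so dropping `CURLOPT_NOBODY` may be necessary in practice.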
GeXus
Forum Regular
Posts: 631
Joined: Sat Mar 11, 2006 8:59 am

Post by GeXus »

Jcart wrote: Hmm.. this is kind of a long shot (I know feyd has something better in mind), but try:

1. turning off CURLOPT_FOLLOWLOCATION so cURL does not follow redirects automatically
2. recording the Location header
3. sending a new request to that location
4. recording any further headers sent
5. repeating until no Location header is present

This assumes they are not using meta redirects.

Hmm.. the only problem is that the redirect could potentially be different if you request the intermediate URL directly, i.e. by sending a new request to the new location, instead of arriving there through the redirect chain....
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

no, it's not.
jamiel
Forum Contributor
Posts: 276
Joined: Wed Feb 22, 2006 5:17 am
Location: London, United Kingdom

Post by jamiel »

You can pass options to wget to make it follow links and URLs for specified domains.
bokehman
Forum Regular
Posts: 509
Joined: Wed May 11, 2005 2:33 am
Location: Alicante (Spain)

Post by bokehman »

Sorry if this is a bit untidy but I only had 5 minutes to spare. Returns an array of URLs on success or false on failure.

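<?php

// Follows HTTP Location redirects starting at $url.
// Returns an array of every URL visited on success, or false on failure.
// $location carries the chain through the recursion (a static variable
// would leak results from one top-level call into the next).
function list_redirects($url, array $location = array(), $max_hops = 10)
{
	// Assume http:// when the caller omits the scheme.
	if (!preg_match('#^[a-z][a-z0-9+.-]*://#i', $url))
	{
		$url = 'http://'.$url;
	}
	$parts = parse_url($url);
	// Give up on unparsable URLs and on redirect loops.
	if ($parts === false || empty($parts['host']) || count($location) >= $max_hops)
	{
		return false;
	}
	$location[] = $url;
	$host = $parts['host'];
	// https needs the ssl:// transport (requires the openssl extension).
	$ssl = strtolower($parts['scheme']) === 'https';
	$port = isset($parts['port']) ? $parts['port'] : ($ssl ? 443 : 80);
	$path = isset($parts['path']) ? $parts['path'] : '/';
	if (isset($parts['query'])) $path .= '?'.$parts['query'];
	$out = "HEAD $path HTTP/1.0\r\n";
	$out .= "Host: $host\r\n";
	$out .= "Connection: Close\r\n\r\n";
	if (!$fp = @fsockopen(($ssl ? 'ssl://' : '').$host, $port, $errno, $errstr, 5))
	{
		return false;
	}
	fwrite($fp, $out);
	// A larger buffer keeps long Location headers from being cut off.
	while (($s = fgets($fp, 8192)) !== false)
	{
		// Header names are case-insensitive, so match "location:" as well.
		if (preg_match('/^Location:\s*(.+)$/i', trim($s), $m))
		{
			fclose($fp);
			// Note: a relative Location value is not resolved against $url here.
			return list_redirects($m[1], $location, $max_hops);
		}
		if (preg_match('/^HTTP\/\S+\s+200\b/i', $s))
		{
			fclose($fp);
			return $location;
		}
	}
	fclose($fp);
	return false;
}

?>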
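
One case the function above does not cover is a relative Location value such as `/login`, which some servers send. A rough sketch of resolving it against the URL that produced the redirect is below; `resolve_location()` is a hypothetical helper, not part of the original code, and it assumes the base URL is absolute and makes no attempt to collapse `../` segments.

```php
<?php

// Hypothetical helper: turn a Location value into an absolute URL,
// using the URL that produced the redirect as the base.
function resolve_location($base, $location)
{
	// Already absolute (has a scheme): use as-is.
	if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $location))
	{
		return $location;
	}
	$b = parse_url($base);
	// Protocol-relative: //host/path
	if (substr($location, 0, 2) === '//')
	{
		return $b['scheme'].':'.$location;
	}
	$root = $b['scheme'].'://'.$b['host'].(isset($b['port']) ? ':'.$b['port'] : '');
	// Host-relative: /path
	if (substr($location, 0, 1) === '/')
	{
		return $root.$location;
	}
	// Path-relative: replace everything after the last slash of the base path.
	$path = isset($b['path']) ? $b['path'] : '/';
	$dir = substr($path, 0, strrpos($path, '/') + 1);
	return $root.$dir.$location;
}
```

The idea would be to pass each Location value through this helper, together with the current URL, before requesting the next hop.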
GeXus
Forum Regular
Posts: 631
Joined: Sat Mar 11, 2006 8:59 am

Post by GeXus »

bokehman wrote:Sorry if this is a bit untidy but I only had 5 minutes to spare. Returns an array of URLs on success or false on failure.

Nice! Thanks, I wasn't expecting all that.. seems to work great.... except it won't catch JavaScript redirects.. hmm.
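
Client-side redirects never show up in the response headers, so catching them means fetching the page body (GET rather than HEAD) and inspecting it. A very rough heuristic, assuming the page uses either a meta refresh tag or a plain string assignment to `location`, could look like the sketch below. `find_client_redirect()` is a hypothetical name, and anything computed at runtime, or a call such as `location.replace(...)`, will be missed.

```php
<?php

// Hypothetical helper: look for a client-side redirect target in an HTML body.
// Returns the target URL as written in the page, or null if none is found.
function find_client_redirect($html)
{
	// e.g. <meta http-equiv="refresh" content="0; url=http://example.com/">
	if (preg_match('/<meta[^>]+http-equiv\s*=\s*["\']?refresh["\']?[^>]*>/i', $html, $tag)
		&& preg_match('/url\s*=\s*["\']?([^"\'>\s]+)/i', $tag[0], $m))
	{
		return $m[1];
	}
	// e.g. window.location = "http://example.com/"; or location.href = '...';
	if (preg_match('/\blocation(?:\.href)?\s*=\s*["\']([^"\']+)["\']/i', $html, $m))
	{
		return $m[1];
	}
	return null;
}
```

A proper solution for JavaScript redirects would need something that actually executes the script, which is beyond a regex like this.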