Complicated URL completion...

Posted: Thu May 26, 2005 9:46 am
by Todd_Z
I have an array of URLs that I parse from any given site. Here are a few examples from my site:

Code: Select all

?page=Review&movie=Happy Together
mailto:Tuna@acdrifter.com
/index.php?start_from=6&ucat=&archive=&subaction=&id=&
http://www.mozilla.org/products/firefox/start/
Then I have a variable which could be in any of the following formats:

Code: Select all

www.acdrifter.com/
www.acdrifter.com
http://www.acdrifter.com/
http://www.acdrifter.com
acdrifter.com/
acdrifter.com
any of the above with something like ?page=this page, or gallery/ at the end, etc.
First step:
Transform all of the above URL formats into http://www.acdrifter.com

Code: Select all

function createProperURL ( $link ) {
	$regexFullURL = "#(?:(http://))?(?:(www\.))?(\w+)(?:(\.[a-z]+))?(?:(/))?(?:(.+))?#i";
	preg_match( $regexFullURL, $link, $parts );
	// No domain or extension (.com,.net,etc.)
	if ( !isset($parts[3]) || !isset($parts[4]) )
		return false;
	$newUrl = "http://www.{$parts[3]}{$parts[4]}";
	// This is for the post domain stuff, i.e. index.php?page=News
	if ( isset( $parts[6] ) )
		$newUrl .= "/{$parts[6]}";
	// Deal with subdomains
	if ( preg_match( "#http://www\.(\w+)\.(\w+)/(\..+)#i", $newUrl, $parts ) )
		$newUrl = "http://www.{$parts[1]}.{$parts[2]}{$parts[3]}";
	return $newUrl;
}
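For comparison, here is a minimal sketch of the first step built on PHP's parse_url() instead of one large regex. The function name normalize_site_root is hypothetical, and the sketch rests on a few assumptions: the result always uses http, always carries a www. prefix, and drops any path or query string, so it returns only the site root.

```php
<?php
// Hypothetical helper (not from the original post): reduce a loosely
// written site address to the form http://www.example.com
function normalize_site_root($input) {
	$input = trim($input);
	// parse_url() only reports a host when a scheme is present, so add one.
	if (!preg_match('#^[a-z][a-z0-9+.-]*://#i', $input)) {
		$input = 'http://' . $input;
	}
	$host = parse_url($input, PHP_URL_HOST);
	// Reject input with no host or no extension (.com, .net, etc.)
	if (!is_string($host) || strpos($host, '.') === false) {
		return false;
	}
	$host = strtolower($host);
	// Assumption: every site should carry the www. prefix.
	if (strpos($host, 'www.') !== 0) {
		$host = 'www.' . $host;
	}
	// Assumption: the scheme is forced to http, path and query are dropped.
	return 'http://' . $host;
}

echo normalize_site_root('acdrifter.com/'), "\n";           // http://www.acdrifter.com
echo normalize_site_root('http://www.acdrifter.com'), "\n"; // http://www.acdrifter.com
```

Anything after the host, such as gallery/ or ?page=this page, is discarded here and would need to be kept separately if it matters.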
Second step:
Convert the links above into:

Code: Select all

http://www.acdrifter.com/?page=Review&movie=Happy Together
mailto:Tuna@acdrifter.com
http://www.acdrifter.com/index.php?start_from=6&ucat=&archive=&subaction=&id=&
http://www.mozilla.org/products/firefox/start/
Thanks in advance!

Posted: Thu May 26, 2005 1:43 pm
by Chris Corbyn
Lucky for you I had to do something very similar a while back.

I've rewritten some code I had for that project into a function for you here. It should get you started :)

Code: Select all

<?php

function parse_absolute_url($base_url, $link) {

	$base_path = $base_url;
	
	//Format the base url given to be a *valid* url
	if (!preg_match('/^\w+\:\/\//', $base_path)) {
		$base_path = 'http://'.$base_path; //Assume it's http
	} //End if
	
	if (preg_match('/^\w+\:\/\/[^\/]*$/', $base_path)) {
		$base_path .= '/'; //Append / as the root path
	} //End if
	
	//Decide what the link looks like and parse it to the domain URL
	if (preg_match('/^\w+\:\/\//', $link)) {						//Absolute URL
		$url = $link;
	} elseif (preg_match('/^\//', $link)) {							//Root path
		preg_match('/^(\w+\:\/\/[^\/]+)/', $base_path, $url_match);
		$url = $url_match[1].$link;
	} elseif (preg_match('/^\?/', $link)) {							//Query string
		preg_match('/^(\w+\:\/\/[^\?]+)/', $base_path, $url_match);
		$url = $url_match[1].$link;
	} elseif (preg_match('/^\#/', $link)) {							//Anchor
		preg_match('/^(\w+\:\/\/[^\#]+)/', $base_path, $url_match);
		$url = $url_match[1].$link;
	} elseif (preg_match('/(?:\.\/)?(.*)$/', $link, $link_bits)) {
		preg_match('/(\w+\:\/\/.*)\//', $base_path, $url_match);		//Relative Path
		$url = $url_match[1].'/'.$link_bits[1];
	} //End if
	
	return $url;
	
}

/* EXAMPLES */

echo parse_absolute_url('google.com/foobar/file.php', '#foobar'); // http://google.com/foobar/file.php#foobar

echo parse_absolute_url('localhost', '?foo=bar&jack=bill'); // http://localhost/?foo=bar&jack=bill

echo parse_absolute_url('http://www.yoursite.com/somefolder/somefile.php?query=values#anchor_here', '/index.php'); // http://www.yoursite.com/index.php

echo parse_absolute_url('https://acb123.co.uk/foobar/', './nextdir/file.php'); // https://acb123.co.uk/foobar/nextdir/file.php

echo parse_absolute_url('localhost/dir/file.html', '#top'); //  http://localhost/dir/file.html#top
				
?>
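One gap relative to the original question: a link such as mailto:Tuna@acdrifter.com has a scheme but no //, so the absolute-URL test in the function does not catch it and it falls through to the relative-path branch. Below is a minimal sketch of a guard for that case. The names has_scheme and resolve_link are hypothetical, and the resolver is passed in as a callable so the snippet stands on its own; in practice you would pass 'parse_absolute_url'.

```php
<?php
// Hypothetical helper: true when the link starts with a URI scheme
// (http:, https:, mailto:, ftp:, ...), with or without a following //.
function has_scheme($link) {
	return preg_match('/^[a-z][a-z0-9+.\-]*:/i', $link) === 1;
}

// Hypothetical wrapper: leave links that already carry a scheme untouched
// and hand everything else to the supplied resolver.
function resolve_link($base_url, $link, $resolver) {
	if (has_scheme($link)) {
		return $link;
	}
	return call_user_func($resolver, $base_url, $link);
}

var_dump(has_scheme('mailto:Tuna@acdrifter.com'));         // bool(true)
var_dump(has_scheme('/index.php?start_from=6'));           // bool(false)
var_dump(has_scheme('?page=Review&movie=Happy Together')); // bool(false)
```

With the function above loaded, resolve_link('www.acdrifter.com', $link, 'parse_absolute_url') should return mailto: links unchanged and resolve the rest as before.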

Posted: Sun Sep 10, 2006 6:59 pm
by Shendemiar
Hey! What a cool util !