Conversion of any relative path to the full URL

dibyendrah
Forum Contributor
Posts: 491
Joined: Wed Oct 19, 2005 5:14 am
Location: Nepal

Post by dibyendrah »

Dear all,
I'm working on PDF indexing and came across a small problem with relative paths.

Suppose the URL I am scanning is http://xyz.com/docs/manaul/main.html.
When scanning this file, it may contain links like

Code: Select all

<a href="../pdfs/manual_en.pdf">English Manual </a>
I want to convert this relative href to a full URL like http://xyz.com/docs/pdfs/manual_en.pdf

Please suggest. If this is not clear, I'll try to write more.

With Best Regards,
Dibyendra
Last edited by dibyendrah on Mon Nov 20, 2006 1:28 am, edited 1 time in total.
John Cartwright
Site Admin
Posts: 11470
Joined: Tue Dec 23, 2003 2:10 am
Location: Toronto

Post by John Cartwright »

A simple str_replace(), or preferably preg_replace() to be safe in case there is a random ".." somewhere, would replace the relative path with the full path.

Here's a start. I'm pretty bad at regex, so hopefully this will inspire you.

Code: Select all

// Handles only a single leading "./" or "../"; preg_replace() returns the new string
$file = preg_replace('/a href="[\.]{1,2}\/([^"]*)/i', 'a href="http://domain.com/\\1', $file);
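Before rewriting anything, it can help to see which links a page actually contains. A minimal sketch, assuming the page's HTML is already in a string and that href values are double-quoted as in the example above. The function name extract_hrefs() is made up for illustration, and a regex like this is not a full HTML parser:

```php
<?php
// Hypothetical helper: collect every double-quoted href value from an HTML string.
function extract_hrefs($html)
{
    // Group 1 captures everything between the quotes of each href attribute.
    preg_match_all('/<a\s[^>]*href="([^"]*)"/i', $html, $matches);
    return $matches[1];
}

$html = '<a href="../pdfs/manual_en.pdf">English Manual </a> <a href="./faq.html">FAQ</a>';
print_r(extract_hrefs($html));
```

Each extracted value can then be passed to whatever function converts a relative path into a full URL.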

Post by dibyendrah »

Thank you, Jcart, for your reply. The main problem is that if the URL is http://domain.com/1/2/3/4/main.html and this file contains the relative path ../../somefile.pdf, the output should be http://domain.com/1/2/somefile.pdf. Is there another similar solution, Jcart?

Thank you for your quick response.

Dibyendra

Post by dibyendrah »

I'm thinking of writing a function that takes a full URL and the relative path of a file found at that URL, walks up the URL's directories according to the relative path, and returns the actual full URL of the file.

The proposed function would be:

Code: Select all

// e.g. get_actual_url('http://xyz.com/a/b/c/d/d.html', '../../pdfs/b.pdf')
function get_actual_url($url, $relative_path){
	// .....
	return $actual_url_of_file;
}
Any help will be appreciated.

Dibyendra

Post by dibyendrah »

Dear All,
Finally, I arrived at this solution the long, hard way. I hope somebody can post an easier way to solve this problem.

Code: Select all

<?php

function get_actual_URL($URL, $relative_path_to_URL){

	$URL_array = parse_url($URL);
	$scheme = $URL_array["scheme"];
	$domain = $URL_array["host"];

	// Directory part of the path: everything up to and including the last "/".
	// (Stripping basename($URL) with str_replace() could also remove a
	// directory that happens to share the file's name.)
	$path = isset($URL_array['path']) ? $URL_array['path'] : '/';
	$directory_structure = substr($path, 0, strrpos($path, '/') + 1);
	$directory_structure_parts = explode("/", $directory_structure);

	// Drop the empty strings left by leading and trailing slashes
	foreach ($directory_structure_parts as $key=>$value) {
		if($value === ''){
			unset($directory_structure_parts[$key]);
		}
	}
	$directory_structure_parts = array_values($directory_structure_parts); // reindex from 0
	$count_main_URL_dirs = count($directory_structure_parts);

	// Count how many levels ("..") the relative path climbs
	$file_URL_array = explode("/", $relative_path_to_URL);
	$array_value_count = array_count_values($file_URL_array);
	$count_relative_path_dirs = isset($array_value_count[".."]) ? $array_value_count[".."] : 0;

	// Remove that many directories from the end of the base path
	for($i=1; $i<=$count_relative_path_dirs; $i++){
		unset($directory_structure_parts[$count_main_URL_dirs-$i]);
	}

	$directory_structure_remake = implode("/", $directory_structure_parts);
	if ($directory_structure_remake !== '') {
		$directory_structure_remake .= "/"; // avoid "//" when no directories remain
	}
	$append = str_replace("../", "", $relative_path_to_URL);
	$full_url = $scheme."://".$domain."/".$directory_structure_remake.$append;

	return ($full_url);

}

$URL = "http://xyz.com/a/b/c/d/d.html";
$relative_path_to_URL = "../../pdfs/b.pdf";

print get_actual_URL($URL, $relative_path_to_URL);
?>
Output:

Code: Select all

http://xyz.com/a/b/pdfs/b.pdf
With Regards,
Dibyendra

Post by dibyendrah »

This solution applies when you are reading a page from a different domain. If the files are on your own server, realpath() will help you out, since it resolves paths against the local filesystem.

Hope this will help somebody.

With Best Regards,
Dibyendra
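
For the same-server case, here is a short sketch of what realpath() does: it resolves "." and ".." against the local filesystem and returns false when the target does not exist. The temporary directory layout is only an assumption made so that the example can run anywhere:

```php
<?php
// Build a throwaway directory tree: <tmp>/relurl_demo_<id>/docs/manual and .../docs/pdfs
$base = sys_get_temp_dir() . '/relurl_demo_' . uniqid();
mkdir($base . '/docs/manual', 0777, true);
mkdir($base . '/docs/pdfs');

// ".." is resolved on disk, so this points at the docs/pdfs directory
$resolved = realpath($base . '/docs/manual/../pdfs');
echo $resolved, "\n";

// A path that does not exist yields false rather than a guessed string
$missing = realpath($base . '/docs/manual/../missing');
var_dump($missing);

// Clean up the demo directories
rmdir($base . '/docs/pdfs');
rmdir($base . '/docs/manual');
rmdir($base . '/docs');
rmdir($base);
```

Because realpath() consults the disk, it cannot resolve links found on a remote site; for those, the URL has to be worked out as a string.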
Ollie Saunders
DevNet Master
Posts: 3179
Joined: Tue May 24, 2005 6:01 pm
Location: UK

Post by Ollie Saunders »

I haven't examined that code you just posted dibyendrah, but this is something I wrote to do pretty much exactly what you are after:

You might want to start reading from the row of asterisks I've put in for you:

Code: Select all

/**
 * Perform a HTTP(S) redirect. Supports relative or absolute path.
 *
 * Portability features untested.
 *
 * @param string $input
 * @todo Modify this so that it is testable
 */
function OSIS_Redirect($input)
{
    static $protocol = null;
    if ($protocol === null) {
        if (empty($_SERVER['HTTPS']) || strtolower($_SERVER['HTTPS']) == 'off') {
            $protocol = 'http';
        } else {
            $protocol = 'https';
        }
    }
    static $port = null;
    if ($port === null) {
        $ports = array('https' => 443, 'http' => 80);
        if ($_SERVER['SERVER_PORT'] != $ports[$protocol]) {
            $port = ':' . $_SERVER['SERVER_PORT'];
        } else {
            $port = '';
        }
    }
    /* **************************** */
    if ($input[0] == '/') { // absolute
        $path = array('');
    } else { // relative
        // rtrim() keeps a script in the web root from yielding an extra empty segment
        $path = explode('/', rtrim(dirname($_SERVER['SCRIPT_NAME']), '/\\'));
    }
    $input = explode('/', $input);

    foreach ($input as $part) { // resolve to absolute
        if ($part == '.' || $part == '') {
            continue;
        }
        if ($part == '..') {
            array_pop($path);
            continue;
        }
        $path[] = $part;
    }

    $path = implode('/', $path);
    header("Location: $protocol://{$_SERVER['HTTP_HOST']}$port$path");
    exit;
}
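
The same segment walk can be written without $_SERVER, so that it takes the page's URL as an argument and can be tested on its own. This is only a sketch under assumptions: resolve_url() is a made-up name, query strings and fragments are ignored, a trailing slash on the link is dropped, and links that already start with a scheme are returned unchanged:

```php
<?php
// Hypothetical helper: resolve $link against the URL of the page it was found on.
function resolve_url($base_url, $link)
{
    // Already a full URL (starts with "scheme://"): nothing to resolve.
    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $link)) {
        return $link;
    }

    $base = parse_url($base_url);
    $root = $base['scheme'] . '://' . $base['host']
          . (isset($base['port']) ? ':' . $base['port'] : '');
    $base_path = isset($base['path']) ? $base['path'] : '/';

    if ($link !== '' && $link[0] === '/') {
        $path = array();           // root-relative link: start from the site root
    } else {
        // relative link: start from the directory of the base page
        $dir = substr($base_path, 0, strrpos($base_path, '/') + 1);
        $path = array_values(array_filter(explode('/', $dir), 'strlen'));
    }

    foreach (explode('/', $link) as $part) {
        if ($part === '' || $part === '.') {
            continue;              // empty and "." segments change nothing
        }
        if ($part === '..') {
            array_pop($path);      // climb one level; a no-op at the root
            continue;
        }
        $path[] = $part;
    }

    return $root . '/' . implode('/', $path);
}

echo resolve_url('http://xyz.com/a/b/c/d/d.html', '../../pdfs/b.pdf'), "\n";
echo resolve_url('http://domain.com/1/2/3/4/main.html', '../../somefile.pdf'), "\n";
```

With the two URLs from earlier in the thread this prints http://xyz.com/a/b/pdfs/b.pdf and http://domain.com/1/2/somefile.pdf.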

Post by dibyendrah »

Thank you so much, ole, for your response. I have tested your script, and it helped me with a similar purpose.

I have written several functions to convert a relative path to an absolute path, an absolute path to a URL, and a URL to an absolute path. They may not work perfectly, as they are not very robust and may fail when given different parameters. I hope they will also be useful to others who are looking for similar solutions.

Code: Select all

//string function conv_relative_path_to_absoulute_path(string $relative_path)
//to return the absolute path (relative to the document root) from the relative path provided as parameter
function conv_relative_path_to_absoulute_path($relative_path)
{
	$realpath = realpath($relative_path); // false if the path does not exist

	if($realpath !== false && file_exists($realpath)){
		$absolute_path = str_replace($_SERVER['DOCUMENT_ROOT'], '', $realpath);
		if ($absolute_path == $realpath) {
			// Nothing was replaced: retry with Windows-style backslashes
			$server_root = str_replace('/', '\\', $_SERVER['DOCUMENT_ROOT']);
			$absolute_path = str_replace($server_root, '', $realpath);
			$absolute_path = str_replace("\\", "/", $absolute_path);
		}
		return $absolute_path;
	}

	return null; // path does not exist
}

//string function conv_absoulute_path_to_url(string $absolute_path)
//to return the full url of the absolute path provided as parameter
function conv_absoulute_path_to_url($absolute_path){

	$server_name = $_SERVER['SERVER_NAME'];
	// Collapse duplicate slashes before adding the scheme, so that "http://" stays intact
	$full_url = $server_name . "/" . $absolute_path;
	$full_url = str_replace("//", "/", $full_url);
	$full_url = "http://" . $full_url;
	return($full_url);
}

//string function conv_url_to_absolute_path(string $url)
//to return the local file path for a URL on this server, or null if no such file exists
function conv_url_to_absolute_path($url){

	$url_parts = parse_url($url);
	$domain_url = $url_parts["host"];
	$path = $url_parts["path"];
	$OS = getUserOS(); // not a PHP built-in; assumed to be defined elsewhere
	if ($OS == "Windows") {
		$real_path = str_replace("\\", "/", $_SERVER['DOCUMENT_ROOT'] . $path);
	} elseif ($OS == "Linux") {
		$real_path = $_SERVER['DOCUMENT_ROOT'] . $path;
	} else {
		return(null); // unrecognised OS
	}
	$real_path = str_replace("//", "/", $real_path);

	if(file_exists($real_path)){
		return($real_path);
	}else{
		return(null);
	}
}

Cheers,
Dibyendra

Post by Ollie Saunders »

Cool, I'm glad it helped.

To avoid conditions like if (windows) and if (linux) use the constant DIRECTORY_SEPARATOR which always contains the correct slash for the current OS.

Post by dibyendrah »

ole wrote: Cool, I'm glad it helped.

To avoid conditions like if (windows) and if (linux) use the constant DIRECTORY_SEPARATOR which always contains the correct slash for the current OS.
Please provide some samples to make your comments clearer.

Thank you.

With Best Regards,
Dibyendra
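
As a sample of the DIRECTORY_SEPARATOR suggestion, here is a minimal sketch. The function name url_path_to_local_path() and the document-root argument are assumptions for illustration (in a web request the root would typically come from $_SERVER['DOCUMENT_ROOT']); the point is only that no OS check is needed:

```php
<?php
// Hypothetical helper: map a URL path such as "/docs/pdfs/b.pdf" to a local
// file path under $document_root, using the separator of the current OS.
function url_path_to_local_path($document_root, $url_path)
{
    // Strip any trailing slash or backslash from the root so exactly one is added.
    $root = rtrim($document_root, '/\\');

    // Split on "/" (URLs always use forward slashes) and drop empty segments.
    $segments = array_filter(explode('/', $url_path), 'strlen');

    // Join with DIRECTORY_SEPARATOR: "\" on Windows, "/" on Linux and macOS.
    return $root . DIRECTORY_SEPARATOR . implode(DIRECTORY_SEPARATOR, $segments);
}

echo url_path_to_local_path('/var/www/html', '/docs/pdfs/b.pdf'), "\n";
```

PHP on Windows also accepts forward slashes in most filesystem functions, so the separator matters mainly when paths are compared as strings or shown to the user.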