dibyendrah
Forum Contributor
Posts: 491 Joined: Wed Oct 19, 2005 5:14 am
Location: Nepal
by dibyendrah » Mon Nov 20, 2006 12:02 am
Dear all,
I'm working on PDF indexing and came across a small problem with relative paths.
Suppose the URL I am scanning is
http://xyz.com/docs/manaul/main.html .
When scanning this file, if it has links like
Code: Select all
<a href="../pdfs/manual_en.pdf">English Manual </a>
I want to convert this relative href to a full URL like
http://xyz.com/docs/pdfs/manual_en.pdf
Please suggest a way to do this. If this is not clear, I'll try to explain further.
With Best Regards,
Dibyendra
Last edited by
dibyendrah on Mon Nov 20, 2006 1:28 am, edited 1 time in total.
John Cartwright
Site Admin
Posts: 11470 Joined: Tue Dec 23, 2003 2:10 am
Location: Toronto
by John Cartwright » Mon Nov 20, 2006 12:12 am
A simple str_replace(), or preferably preg_replace() to be safe in case there is a stray ".." somewhere, can replace the relative path with the full path.
Here's a start. I suck very much at regex, but hopefully this will inspire you.
Code: Select all
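// Prefix hrefs that start with "./" or "../" with the domain (nested "../" segments are not resolved)
$file = preg_replace('/a href="\.{1,2}\/([^"]*)/i', 'a href="http://domain.com/\\1', $file);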
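To see what a pattern along these lines does, here is a small sketch run against the link from the first post. The variable names are only illustrative, and http://xyz.com/ is the example domain used in this thread. Note that it simply swaps the leading "../" for a fixed prefix, so the /docs/ directory of the scanned page is lost; resolving that properly needs the URL of the page as well.

```php
<?php
// Sample markup taken from the first post.
$html = '<a href="../pdfs/manual_en.pdf">English Manual </a>';

// Replace a leading "./" or "../" in an href with a fixed site prefix.
$result = preg_replace(
    '/(<a\s+href=")\.{1,2}\/([^"]*)/i',
    '${1}http://xyz.com/$2',
    $html
);

echo $result, "\n";
// <a href="http://xyz.com/pdfs/manual_en.pdf">English Manual </a>
```

The wanted result was http://xyz.com/docs/pdfs/manual_en.pdf, so a plain search and replace is only enough when every page sits directly below the site root.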
by dibyendrah » Mon Nov 20, 2006 1:37 am
I'm thinking of writing a function that takes a full URL and the relative path of a file linked from that URL as parameters, walks up the directories of the full URL according to the relative path, and returns the actual full URL of the file.
The proposed function would be:
Code: Select all
// e.g. get_actual_url('http://xyz.com/a/b/c/d/d.html', '../../pdfs/b.pdf')
function get_actual_url($url, $relative_path){
    // .....
    return $actual_url_of_file;
}
Any help will be appreciated.
Dibyendra
by dibyendrah » Mon Nov 20, 2006 3:54 am
Dear All,
Finally, I arrived at this solution the long, hard way. I hope somebody can post an easier way to solve this problem.
Code: Select all
<?php
// Resolve a relative link (with leading "../" segments) against the URL of the page it was found on.
function get_actual_URL($URL, $relative_path_to_URL){
    $URL_array = parse_url($URL);
    $scheme = $URL_array["scheme"];
    $domain = $URL_array["host"];
    // Keep only the directory part of the path (drop the file name after the last "/")
    $path = isset($URL_array["path"]) ? $URL_array["path"] : "/";
    $directory_structure = substr($path, 0, strrpos($path, "/") + 1);
    // Split into directory names, drop the empty parts and reindex the array
    $directory_structure_parts = array_values(array_filter(explode("/", $directory_structure), "strlen"));
    // Count how many levels the relative path climbs
    $array_value_count = array_count_values(explode("/", $relative_path_to_URL));
    $count_relative_path_dirs = isset($array_value_count[".."]) ? $array_value_count[".."] : 0;
    // Remove one trailing directory for each ".."
    for($i=1; $i<=$count_relative_path_dirs; $i++){
        array_pop($directory_structure_parts);
    }
    // Append the rest of the relative path and rebuild the URL
    $directory_structure_parts[] = str_replace("../", "", $relative_path_to_URL);
    $full_url = $scheme."://".$domain."/".implode("/", $directory_structure_parts);
    return ($full_url);
}
$URL = "http://xyz.com/a/b/c/d/d.html";
$relative_path_to_URL = "../../pdfs/b.pdf";
print get_actual_URL($URL, $relative_path_to_URL);
?>
Output :
http://xyz.com/a/b/pdfs/b.pdf
With Regards,
Dibyendra
by dibyendrah » Mon Nov 20, 2006 6:06 am
This solution applies when you are reading a page from a different domain. If you are staying on the same domain (that is, the files are on your own server's file system), realpath() will help you out.
I hope this will help somebody.
With Best Regards,
Dibyendra
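For the same-server case, here is a minimal sketch of what realpath() does with ".." segments. The directory names are made up for the example and are created under the system temp directory. realpath() returns false for paths that do not exist, which is why the directories are created first.

```php
<?php
// Build a throwaway tree: <tmp>/relpath_demo_xxx/docs/manual and .../docs/pdfs
$base = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'relpath_demo_' . uniqid();
mkdir($base . '/docs/manual', 0777, true);
mkdir($base . '/docs/pdfs', 0777, true);

// realpath() collapses ".." and returns the canonical absolute path,
// or false if the target does not exist.
$resolved = realpath($base . '/docs/manual/../pdfs');
$expected = realpath($base . '/docs/pdfs');
$missing  = realpath($base . '/docs/manual/../nothing_here');

var_dump($resolved === $expected); // bool(true)
var_dump($missing);                // bool(false)

// Clean up the demo directories.
rmdir($base . '/docs/manual');
rmdir($base . '/docs/pdfs');
rmdir($base . '/docs');
rmdir($base);
```

Because it works on the file system, this only helps when the linked files can be read locally; it cannot resolve links on a remote site.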
Ollie Saunders
DevNet Master
Posts: 3179 Joined: Tue May 24, 2005 6:01 pm
Location: UK
by Ollie Saunders » Mon Nov 20, 2006 7:18 am
I haven't examined that code you just posted dibyendrah, but this is something I wrote to do pretty much exactly what you are after:
You might want to start reading from the row of asterisks I've put in for you:
Code: Select all
/**
 * Perform an HTTP(S) redirect. Supports relative or absolute path.
 *
 * Portability features untested.
 *
 * @param string $input
 * @todo Modify this so that it is testable
 */
function OSIS_Redirect($input)
{
    static $protocol = null;
    if ($protocol === null) {
        if (empty($_SERVER['HTTPS']) || strtolower($_SERVER['HTTPS']) == 'off') {
            $protocol = 'http';
        } else {
            $protocol = 'https';
        }
    }
    static $port = null;
    if ($port === null) {
        $ports = array('https' => 443, 'http' => 80);
        if ($_SERVER['SERVER_PORT'] != $ports[$protocol]) {
            $port = ':' . $_SERVER['SERVER_PORT'];
        } else {
            $port = '';
        }
    }
    /* **************************** */
    if ($input[0] == '/') { // absolute
        $path = array('');
    } else { // relative
        $path = explode('/', dirname($_SERVER['SCRIPT_NAME']));
    }
    $input = explode('/', $input);
    foreach ($input as $part) { // resolve to absolute
        if ($part == '.' || $part == '') {
            continue;
        }
        if ($part == '..') {
            array_pop($path);
            continue;
        }
        $path[] = $part;
    }
    $path = implode('/', $path);
    header("Location: $protocol://{$_SERVER['HTTP_HOST']}$port$path");
    exit;
}
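The segment loop after the row of asterisks can also be applied to the URL of a scanned page instead of the $_SERVER values. The following is a rough sketch under that assumption. The function name resolve_relative_url is hypothetical, and it ignores query strings, fragments, ports and links that are already full URLs.

```php
<?php
// Hypothetical helper: resolve $link against the URL of the page it appeared on.
// Only plain paths are handled; query strings, fragments, ports and links that
// already carry a scheme are out of scope for this sketch.
function resolve_relative_url($pageUrl, $link)
{
    $parts    = parse_url($pageUrl);
    $pagePath = isset($parts['path']) ? $parts['path'] : '/';

    if ($link !== '' && $link[0] === '/') { // root-relative link
        $path = array();
    } else {                                // relative to the page's directory
        $dir  = substr($pagePath, 0, strrpos($pagePath, '/') + 1);
        $path = array_values(array_filter(explode('/', $dir), 'strlen'));
    }

    foreach (explode('/', $link) as $part) {
        if ($part === '.' || $part === '') {
            continue;                       // nothing to do
        }
        if ($part === '..') {
            array_pop($path);               // climb one directory
            continue;
        }
        $path[] = $part;
    }

    return $parts['scheme'] . '://' . $parts['host'] . '/' . implode('/', $path);
}

echo resolve_relative_url('http://xyz.com/a/b/c/d/d.html', '../../pdfs/b.pdf'), "\n";
// http://xyz.com/a/b/pdfs/b.pdf
echo resolve_relative_url('http://xyz.com/docs/manaul/main.html', '../pdfs/manual_en.pdf'), "\n";
// http://xyz.com/docs/pdfs/manual_en.pdf
```

Walking the link one segment at a time means "." and ".." are handled wherever they appear in the link, not only at the start.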
by dibyendrah » Tue Nov 21, 2006 12:35 am
Thank you so much, ole, for your response. I have tested your script and it helped me with a similar purpose.
I have written several functions to convert a relative path to an absolute path, an absolute path to a URL, and a URL to an absolute path. They may not work perfectly, as they are not very robust and may fail when given different parameters. I hope they will also be useful to others who are looking for similar solutions.
Code: Select all
//string function conv_relative_path_to_absoulute_path(string $relative_path)
//to return the absolute path (relative to the document root) from the relative path provided as parameter
function conv_relative_path_to_absoulute_path($relative_path)
{
    $realpath = realpath($relative_path); // false if the path does not exist
    if($realpath !== false && file_exists($realpath)){
        $absolute_path = str_replace($_SERVER['DOCUMENT_ROOT'], '', $realpath);
        if ($absolute_path == $realpath) {
            // Nothing was stripped: retry with Windows-style backslashes
            $server_root = str_replace('/', '\\', $_SERVER['DOCUMENT_ROOT']);
            $absolute_path = str_replace($server_root, '', $realpath);
            $absolute_path = str_replace("\\", "/", $absolute_path);
        }
        return $absolute_path;
    }
    return null; // the path does not exist
}
//string function conv_absoulute_path_to_url(string $absolute_path)
//to return the full url of the absolute path provided as parameter
function conv_absoulute_path_to_url($absolute_path){
    $server_name = $_SERVER['SERVER_NAME'];
    // Trim the leading slash instead of collapsing "//", which would also break "http://"
    $full_url = "http://" . $server_name . "/" . ltrim($absolute_path, "/");
    return($full_url);
}
//string function conv_url_to_absolute_path(string $url)
//to return the file system path of the url provided as parameter, or null if the file does not exist
function conv_url_to_absolute_path($url){
    $url_parts = parse_url($url);
    $path = $url_parts["path"];
    $OS = getUserOS(); // helper that is not shown in this post
    if ($OS == "Windows") {
        $real_path = str_replace("\\", "/", $_SERVER['DOCUMENT_ROOT'] . $path);
    } else { // Linux and other systems
        $real_path = $_SERVER['DOCUMENT_ROOT'] . $path;
    }
    $real_path = str_replace("//", "/", $real_path);
    if(file_exists($real_path)){
        return($real_path);
    }else{
        return(null);
    }
}
Cheers,
Dibyendra
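One fragile spot in helpers like these is collapsing "//" with str_replace(), because that also shortens the "//" after the scheme. Below is a small sketch of an alternative that trims the slashes on each piece before joining. The function name path_to_url and the explicit $document_root and $host parameters are assumptions, made so the example runs without $_SERVER.

```php
<?php
// Hypothetical helper: map a file path under the document root to a URL.
// $document_root and $host are passed in rather than read from $_SERVER
// so the sketch can be run from the command line.
function path_to_url($file_path, $document_root, $host, $scheme = 'http')
{
    // Normalise Windows backslashes so both paths compare equal.
    $file_path     = str_replace('\\', '/', $file_path);
    $document_root = rtrim(str_replace('\\', '/', $document_root), '/');

    // The file must live under the document root.
    if (strpos($file_path, $document_root . '/') !== 0) {
        return null;
    }

    // Everything after the document root is the URL path.
    $url_path = ltrim(substr($file_path, strlen($document_root)), '/');

    return $scheme . '://' . $host . '/' . $url_path;
}

echo path_to_url('/var/www/docs/pdfs/b.pdf', '/var/www/', 'xyz.com'), "\n";
// http://xyz.com/docs/pdfs/b.pdf
echo path_to_url('C:\\www\\docs\\b.pdf', 'C:/www', 'xyz.com'), "\n";
// http://xyz.com/docs/b.pdf
var_dump(path_to_url('/etc/passwd', '/var/www', 'xyz.com'));
// NULL
```

Returning null for files outside the document root avoids building a URL for something the web server does not actually serve.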
by Ollie Saunders » Tue Nov 21, 2006 2:47 am
Cool, I'm glad it helped.
To avoid conditions like if (windows) and if (linux) use the constant DIRECTORY_SEPARATOR which always contains the correct slash for the current OS.
by dibyendrah » Tue Nov 21, 2006 3:34 am
ole wrote: Cool, I'm glad it helped.
To avoid conditions like if (windows) and if (linux) use the constant DIRECTORY_SEPARATOR which always contains the correct slash for the current OS.
Please provide some samples that would make your comments clearer.
Thank you.
With Best Regards,
Dibyendra
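A minimal sketch of the DIRECTORY_SEPARATOR suggestion, as one possible way to drop the OS check in conv_url_to_absolute_path(). The function name url_path_to_local_path and the explicit $document_root parameter are hypothetical; in a web request the root would normally come from $_SERVER['DOCUMENT_ROOT'].

```php
<?php
// Hypothetical helper: turn the path part of a URL into a local file path
// using the separator of the OS the script runs on, with no OS check.
function url_path_to_local_path($url, $document_root)
{
    $url_path = parse_url($url, PHP_URL_PATH);
    if (!is_string($url_path)) {
        return null; // the URL has no path component or could not be parsed
    }

    // Drop any trailing slash or backslash from the root.
    $root = rtrim($document_root, '/\\');

    // Split the URL path on "/" and re-join it with the OS separator.
    $segments = array_filter(explode('/', $url_path), 'strlen');

    return $root . DIRECTORY_SEPARATOR . implode(DIRECTORY_SEPARATOR, $segments);
}

// Prints /var/www/docs/pdfs/b.pdf on Linux; on Windows the joins use "\".
echo url_path_to_local_path('http://xyz.com/docs/pdfs/b.pdf', '/var/www/'), "\n";
```

Since the constant is set by PHP for the platform it runs on, the same code works on both systems without asking which OS is in use.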