Sitemap PHP problem

philcheese
Forum Newbie
Posts: 2
Joined: Sat Oct 13, 2007 2:12 am

Post by philcheese »

Hi guys,

My first post here :) !

First of all, I'm not the greatest programmer in the world; I usually take already-written code and modify it for my needs. I'm a lone freelance designer (but don't hold that against me!), so I mostly don't have the time to learn, but I find I'm pretty good at working out how things work and altering them to suit. However...

I am building a site for a client and implementing a PHP sitemap for it, so that no one (me or otherwise) has to mess about adding to it by hand at any time. The code I'm using is this (I've taken out the comments and copyright notice to make it more readable; I never use other people's code without a credit):

Code:

<?php

$imgpath="";

$types=array(
	".html",
	".htm",
	".shtm",
	".shtml"
);

$htmltypes=array(
	".php",
	".html",
	".htm",
	".shtm",
	".shtml",
);

$ignore=array(
	".htaccess",
	"cgi-bin",
	"images",
	"index.htm",
	"index.html",
	"index.php",
	"robots.txt",
	"search",
	"css",
	"media",
	"documents"
);

/*
==============================================
You should not need to make changes below here
==============================================
*/
$id=0;
echo "<div id=\"sitemap\"><ul>\n";
$id++;
$divs="";
if(substr($startin,strlen($startin)-1,1)=="/")
	$startin=rtrim($startin,"/");
foreach($types as $type){
	if (file_exists($_SERVER['DOCUMENT_ROOT']."$startin/index$type")){
		$index=$_SERVER['DOCUMENT_ROOT']."$startin"."/index$type";
		break;
	}
}
$types=join("|",$types);
$types="($types)";
if(!is_array($htmltypes))
	$htmltypes=array();
if(count($htmltypes)==0)
	$htmltypes=$types;
if(!$imgpath)
	$imgpath=".";
echo "<li><a href=\"$startin/\">".getTitle($index)."</a>\n";
showlist($_SERVER['DOCUMENT_ROOT']."$startin");
echo "</li></ul></div>\n";
if (is_array($divs)){
	$divs="'".join("','",$divs)."'";
}


function showlist($path){
	global $ignore, $id, $divs, $imgpath, $types, $startin;
	$dirs=array();
	$files=array();
	if(is_dir($path)){
		if ($dir = @opendir($path)) {
			while (($file = readdir($dir)) !== false) {
				if ($file!="." && $file!=".." && !in_array($file,$ignore)){
					if (is_dir("$path/$file")){
						if (file_exists("$path/$file/index.php"))
							$dirs[$file]=getTitle("$path/$file/index.php");
						elseif (file_exists("$path/$file/index.html"))
							$dirs[$file]=getTitle("$path/$file/index.html");
						elseif (file_exists("$path/$file/index.htm"))
							$dirs[$file]=getTitle("$path/$file/index.htm");
						else
							$dirs[$file]=$file;
					} else {
						// ereg() was removed in PHP 7; preg_match() with delimiters is the equivalent
						if (preg_match("/$types$/", $file)){
							$files[$file]=getTitle("$path/$file");
							if (strlen($files[$file])==0)
								$files[$file]=$file;
						}
					}
				}
			}
			closedir($dir);
		}
		natcasesort($dirs);
		$url=str_replace($_SERVER['DOCUMENT_ROOT'],"",$path);
		$n=substr_count("$url/$","/");
		$base=substr_count($startin,"/")+1;
		$indent=str_pad("",$n-1,"\t");
		if ($n>$base)
			echo "<ul>\n";
		foreach($dirs as $d=>$t){
			echo "<li>";
			echo "<a href=\"$url/$d/\">$t</a>\n";
			showlist("$path/$d");
			echo "</li>\n";
		}
		natcasesort($files);
		foreach($files as $f=>$t){
			echo "<li><a href=\"$url/$f\">$t</a></li>\n";
		}
		echo "</ul>\n";
	}
}

function getTitle($file){
	global $htmltypes;
	$title="";
	$p=pathinfo($file);
	if(!in_array(strtolower($p['extension']),$htmltypes)){
		$f=file_get_contents($file);
		if(preg_match("'<title>(.+)</title>'i", $f, $matches)){
			$title=$matches[1];
		}
	}
	$title=$title?$title:basename($file);
	return htmlentities(trim(strip_tags($title)));
}
?>
The output of the code is fine until the showlist() function encounters a folder that contains an index.html file and nothing else; it then prints an empty "<ul></ul>". I need some folders to hold just a single index file to futureproof the site, as the client will be adding to it. Is there a simple solution, or should I just ditch the code and use something more suitable? Any suggestions? I can only seem to find XML sitemap scripts for submitting to Google.

You can view the output of the code here: http://www.intercambio.info/new_site/sitemap.php

I realise this is probably a newbie question, so sorry about that. Any help is greatly appreciated.

Thanks a lot,

Phil.
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

showList() is set to ignore the index files via $ignore.

Also, it would appear you have some odd logic in your <ul> generation code. Specifically, you output the opening <ul> conditionally, but output the closing </ul> unconditionally.
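
For illustration, one way to keep the tags balanced is to let a single condition control both the opening and the closing tag. The sketch below is not taken from the posted script; wrapItems() and its $wrap flag are made-up names, and the link in the usage line is sample data:

```php
<?php
// Hypothetical helper: emits <ul> and </ul> together or not at all,
// so the two tags can never get out of step.
function wrapItems(array $items, bool $wrap): string
{
	$html = $wrap ? "<ul>\n" : "";
	foreach ($items as $href => $title) {
		$html .= "<li><a href=\"$href\">$title</a></li>\n";
	}
	// Same condition as the opening tag above.
	$html .= $wrap ? "</ul>\n" : "";
	return $html;
}

echo wrapItems(array("/about.html" => "About"), true);
```

In showlist() the same idea would presumably mean wrapping the final echo "</ul>\n" in the same test that guards the opening tag.
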
philcheese
Forum Newbie
Posts: 2
Joined: Sat Oct 13, 2007 2:12 am

Post by philcheese »

Thanks for your response. I know about the $ignore setting. I've now set it not to ignore the index files (which I suppose is a rather inelegant halfway house: http://www.intercambio.info/new_site/sitemap.php), but what I would really like is for it to ignore the index files and just put the link on the folder name when there are no other files in the folder, without printing that extra <ul>, which obviously won't pass W3C validation. Any suggestions as to how I can stop the code printing that extra <ul> tag?

Thanks in advance. :)
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

I suspect that if you have the function check the length of the $dirs and $files arrays with count(), you will be able to detect when a folder contains only the index file. (In that case I would expect both arrays to be empty.)
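
To make the count() idea concrete, here is a minimal sketch rather than a drop-in patch. renderList() and the sample $tree are hypothetical stand-ins for what showlist() gathers in $dirs and $files; the point is only that nothing at all is printed when there is nothing to list:

```php
<?php
// Sketch only: each entry is url => array('title' => ..., 'children' => ...).
// The data is invented for the example, not read from disk.
function renderList(array $entries): string
{
	// A folder holding only ignored index files yields no entries,
	// so return nothing instead of an empty <ul></ul>.
	if (count($entries) === 0) {
		return "";
	}
	$html = "<ul>\n";
	foreach ($entries as $url => $entry) {
		$html .= "<li><a href=\"$url\">" . $entry['title'] . "</a>\n";
		$html .= renderList($entry['children']);
		$html .= "</li>\n";
	}
	return $html . "</ul>\n";
}

$tree = array(
	"/about/" => array('title' => "About us", 'children' => array()),
	"/courses/" => array('title' => "Courses", 'children' => array(
		"/courses/spanish.html" => array('title' => "Spanish", 'children' => array()),
	)),
);
echo renderList($tree);
```

In the posted script the equivalent change would presumably be to test count($dirs)+count($files) before echoing both the opening and the closing tag. As far as I can tell that also changes how the top level nests under the home link, so the output is worth re-validating afterwards.
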