[SOLVED] take folder list in a host to an array

Posted: Thu Oct 07, 2004 3:09 am
by mudkicker

Code:

<?php
class mudSearch {
	public $q;
	public $files;
	public $folders;

	public function __construct() {
		// Check if query is posted via GET method.
		$this->openDirectory($_SERVER['DOCUMENT_ROOT']);
		// Create an array for whole files.
		$this->files = array();
		$this->folders = array();
		$config = new Config();
		// Connect to database
		@mysql_connect($config->hostname, $config->sqluser, $config->sqlpass) or die("Cannot connect to database host\n");
		@mysql_select_db($config->db) or die("Cannot select the database\n");
	}

	public function openDirectory($fld) {
		if ($handle = opendir($fld)) {
			while ($file = readdir($handle)) {
				if ($file != "." && $file != "..") {
					if (is_dir($file)) {
						$this->folders[] = $file;
					}
					else {
						$this->files[] = $file;
					}
				}
			}
			closedir($handle);
		}
	}

	public function openAllDirectories() {
		foreach ($this->folders as $folder) {
			$this->openDirectory($folder);
		}
	}

}
?>
This doesn't read the whole folder structure into an array. I'm making a logic mistake somewhere, but where?

I just want to get the whole folder structure into an array (with each folder's path relative to the root).

Posted: Thu Oct 07, 2004 3:37 am
by twigletmac
What kind of result do you get?

Mac

Posted: Thu Oct 07, 2004 3:57 am
by mudkicker
Actually, I think this code is totally wrong and I want to start over. How can I get the whole folder/file structure of a host into two arrays:
if it's a folder: into a folder array
if it's a file: into a file array

I'm getting confused, and a little tip or hint would help me a lot!

Posted: Thu Oct 07, 2004 4:02 am
by twigletmac
When you do is_dir() etc. you need to give the full path to the file, that's probably where your function is falling down. FYI, here's my version dug out of our function library at work:

Code:

/**
 * Create an array of all the contents of a directory.
 *
 * Files and folders are saved into separate elements in the array.
 * @since     1.0 <13/10/2003>
 * @version   1.0 <13/10/2003>
 * @param     string   $directory   The directory to gather the list of
 *                                  contents from.
 * @param     int      $types       What to return: 0 (default) = files 
 *                                  and folders; 1 = files only; and 2 =
 *                                  folders only.
 * @return    array    The files and folders within the directory.
 */
function dir_get_contents($directory, $types=0)
{
	$types   = (int)$types;
	$folders = array();
	$files   = array();

	if ($dir = @opendir($directory)) {
		while (($file = readdir($dir)) !== false) {
			
			if ($file != '.' && $file != '..') {
				if (is_dir($directory.'/'.$file)) {
					$folders[] = $file;
				} elseif (is_file($directory.'/'.$file)) {
					$files[] = $file;
				}
			}

		}

		closedir($dir);
	}

	sort($folders);
	sort($files);

	if ($types == 0) {
		$result = array(
			'dirs'  => $folders,
			'files' => $files,
		);
	} elseif ($types == 1) {
		$result = $files;
	} elseif ($types == 2) {
		$result = $folders;
	} else {
		$result = array();
	}

	return $result;
} // end func
Mac
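
To see why the full path matters: is_dir() resolves a bare name against the script's current working directory, not against the directory being read. Below is a minimal sketch of that behaviour, not part of the library function above. The helper name classify_entries() and the scratch directory under sys_get_temp_dir() are assumptions made up for the illustration.

```php
<?php
// Hypothetical helper (not from the thread): test every entry of $directory
// twice, once by bare name and once by full path.
function classify_entries($directory)
{
	$result = array('bare' => array(), 'full' => array());

	$handle = opendir($directory);
	if ($handle === false) {
		return $result;
	}

	while (($entry = readdir($handle)) !== false) {
		if ($entry == '.' || $entry == '..') {
			continue;
		}
		// Bare name: looked up relative to the current working directory.
		$result['bare'][$entry] = is_dir($entry);
		// Full path: looked up inside the directory being read.
		$result['full'][$entry] = is_dir($directory.'/'.$entry);
	}
	closedir($handle);

	return $result;
}

// Assumed scratch layout: one uniquely named sub-folder and one file.
$base = sys_get_temp_dir().'/fullpath_demo_'.uniqid();
$sub  = 'sub_'.uniqid();
mkdir($base);
mkdir($base.'/'.$sub);
file_put_contents($base.'/note.txt', 'hello');

$checks = classify_entries($base);

// Clean up the scratch directory again.
unlink($base.'/note.txt');
rmdir($base.'/'.$sub);
rmdir($base);
```

With the bare name the sub-folder is not recognised as a directory (unless the script happens to run from inside the scratch directory), while the full path gives the expected answer.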

Posted: Thu Oct 07, 2004 4:19 am
by mudkicker
Well, thank you, but how can I automatically get all the folders from within a function? I mean:

I can use this function for one folder, but how can I automatically apply it to the folders within that folder, and to the folders inside those second-level folders? ;)
I hope you understand my problem. Thanks for your help.

Posted: Thu Oct 07, 2004 4:26 am
by twigletmac
Aah, make it recursive - get it to call itself, pass the arrays by reference and watch the fun commence - basically each time the function finds a folder run the function again on that folder. My brain's fuzzy this morning and thus I am seeing problems that don't exist and missing ones that do :oops:

Mac

Posted: Thu Oct 07, 2004 4:27 am
by mudkicker
How can I do this (I mean making it recursive)? Is there a simple example?

Posted: Thu Oct 07, 2004 5:01 am
by twigletmac
Quick example of a recursive function:

Code:

<?php

function recursive(&$text)
{
	$text .= strlen($text);

	if (strlen($text) < 10) {
		$text = recursive($text);
	}

	return $text;
} // end func

$text = 0;
echo recursive($text);
?>
Mac
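
Applied to the folder problem, the same idea might look like the sketch below. It assumes a hypothetical function name, list_tree(), and a small scratch tree under sys_get_temp_dir(). The two result arrays are passed by reference, as suggested earlier in the thread, so that every level of the recursion appends to the same arrays.

```php
<?php
// Hypothetical helper (not from the thread): walk $directory recursively and
// collect full paths. $folders and $files are passed by reference, so each
// nested call writes into the caller's arrays.
function list_tree($directory, array &$folders, array &$files)
{
	$handle = @opendir($directory);
	if ($handle === false) {
		return; // unreadable directory: skip it
	}

	while (($entry = readdir($handle)) !== false) {
		if ($entry == '.' || $entry == '..') {
			continue;
		}
		$path = $directory.'/'.$entry;
		if (is_dir($path)) {
			$folders[] = $path;
			list_tree($path, $folders, $files); // recurse into the sub-folder
		} elseif (is_file($path)) {
			$files[] = $path;
		}
	}
	closedir($handle);
}

// Assumed scratch tree: <root>/top.txt and <root>/a/b/deep.txt
$root = sys_get_temp_dir().'/tree_demo_'.uniqid();
mkdir($root.'/a/b', 0777, true);
file_put_contents($root.'/top.txt', 'x');
file_put_contents($root.'/a/b/deep.txt', 'y');

$folders = array();
$files   = array();
list_tree($root, $folders, $files);
sort($folders);
sort($files);

// Clean up the scratch tree again, deepest entries first.
unlink($root.'/a/b/deep.txt');
unlink($root.'/top.txt');
rmdir($root.'/a/b');
rmdir($root.'/a');
rmdir($root);
```

Because $path is built from the directory currently being read, no shared "current directory" state has to be tracked between calls.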

Posted: Thu Oct 07, 2004 5:51 am
by mudkicker

Code:

<?php
error_reporting(E_ALL);
require("config.php");

class mudSearch {
	public $q;
	public $dizin;
	public $filepath;

	public function __construct() {
		// Check if query is posted via GET method.
		$this->dizin = "..";
		$this->openDirectory($this->dizin);
		// Get configuration from config.php
		$config = new Config();
		// Connect to database
		@mysql_connect($config->hostname, $config->sqluser, $config->sqlpass) or die("Cannot connect to database host\n");
		@mysql_select_db($config->db) or die("Cannot select the database\n");
	}

	public function openDirectory($fld) {
		if ($handle = opendir($fld)) {
			while ($file = readdir($handle)) {
				// except "." and ".." open whole stuff
				if ($file != "." && $file != "..") {
					// if the "one" is a folder open it and run this function again for it...
					if (is_dir($file)) {
						$this->dizin .= "/".$file;
						$this->openDirectory($this->dizin);
					}
					// ...or pass the file on for inserting into the database in a later step.
					else {
						$folderpath = getcwd();
						$this->getFileList($this->dizin."/".$file);
					}
				}
			}
			closedir($handle);
		}
	}

	public function getFileList($fl) {
		if ($this->crawlPage($fl)) {
			echo "File Crawled: ".$fl;
			echo "<br>";
		}
		else {
			echo "File <span style=\"color: red;\">NOT</span> Crawled: ".$fl;
			echo "<br>";
		}
	}


	public function crawlPage($filename) {
		// Crawl pages and keywords in it.
		$keyword_string = "";
		$file_keywords = array_unique(str_word_count(strip_tags(file_get_contents($filename)), 1));
		foreach ($file_keywords as $keyword) {
			$keyword_string .= $keyword." ";
		}
		// Insert the crawled pages to the database.
		$crawl_insert = @mysql_query("INSERT INTO mudcrawls(filename, fileurl, keywords) VALUES ('".basename($filename)."', '".$filename."', '".$keyword_string."')");
		if($crawl_insert) {
			return true;
		}
		else {
			return false;
		}
	}
}

// test
$arif = new mudSearch();
?>
this is my code and this is the output :

Code:

File NOT Crawled: ../dossier.gif

Warning: file_get_contents(../exemples) [function.file-get-contents]: failed to open stream: Permission denied in c:\wamp\www\mudsearch\mudsearch.php on line 57
File NOT Crawled: ../exemples

Warning: file_get_contents(../eylul) [function.file-get-contents]: failed to open stream: Permission denied in c:\wamp\www\mudsearch\mudsearch.php on line 57
File NOT Crawled: ../eylul

Warning: file_get_contents(../eylul_new) [function.file-get-contents]: failed to open stream: Permission denied in c:\wamp\www\mudsearch\mudsearch.php on line 57
File NOT Crawled: ../eylul_new
File NOT Crawled: ../eylul_new.zip
File NOT Crawled: ../index.php
File NOT Crawled: ../logo.gif

Warning: file_get_contents(../magma) [function.file-get-contents]: failed to open stream: Permission denied in c:\wamp\www\mudsearch\mudsearch.php on line 57
File NOT Crawled: ../magma

Warning: file_get_contents(../mud) [function.file-get-contents]: failed to open stream: Permission denied in c:\wamp\www\mudsearch\mudsearch.php on line 57
File NOT Crawled: ../mud

Warning: file_get_contents(../mudsearch) [function.file-get-contents]: failed to open stream: Permission denied in c:\wamp\www\mudsearch\mudsearch.php on line 57
File NOT Crawled: ../mudsearch

Warning: file_get_contents(../phpdocumentor) [function.file-get-contents]: failed to open stream: Permission denied in c:\wamp\www\mudsearch\mudsearch.php on line 57
File NOT Crawled: ../phpdocumentor

Warning: file_get_contents(../phpmyadmin) [function.file-get-contents]: failed to open stream: Permission denied in c:\wamp\www\mudsearch\mudsearch.php on line 57
File NOT Crawled: ../phpmyadmin

Warning: file_get_contents(../remote_files) [function.file-get-contents]: failed to open stream: Permission denied in c:\wamp\www\mudsearch\mudsearch.php on line 57
File NOT Crawled: ../remote_files

Warning: file_get_contents(../rgallery) [function.file-get-contents]: failed to open stream: Permission denied in c:\wamp\www\mudsearch\mudsearch.php on line 57
File NOT Crawled: ../rgallery

Warning: file_get_contents(../sqlitemanager) [function.file-get-contents]: failed to open stream: Permission denied in c:\wamp\www\mudsearch\mudsearch.php on line 57
File NOT Crawled: ../sqlitemanager

Warning: file_get_contents(../test) [function.file-get-contents]: failed to open stream: Permission denied in c:\wamp\www\mudsearch\mudsearch.php on line 57
File NOT Crawled: ../test
what the hell is wrong with this??? :( :( :(

Posted: Thu Oct 07, 2004 6:03 am
by twigletmac
Just tested a slightly hacked version of your code on my server - you can't run file_get_contents() on a folder so you have to check for that first.

Mac
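
A minimal sketch of that check, assuming a hypothetical wrapper named read_if_file(): it refuses anything that is not a regular, readable file instead of letting file_get_contents() fail with a warning. The scratch folder under sys_get_temp_dir() is only there to make the example self-contained.

```php
<?php
// Hypothetical wrapper (not from the thread): return the contents of $path,
// or false when $path is not a regular readable file (e.g. it is a folder).
function read_if_file($path)
{
	if (!is_file($path) || !is_readable($path)) {
		return false;
	}
	return file_get_contents($path);
}

// Assumed scratch layout: one folder containing one small HTML file.
$dir  = sys_get_temp_dir().'/crawl_demo_'.uniqid();
$file = $dir.'/page.html';
mkdir($dir);
file_put_contents($file, '<p>hello</p>');

$from_folder = read_if_file($dir);  // a folder, so no read is attempted
$from_file   = read_if_file($file); // a regular file, so its contents come back

// Clean up the scratch folder again.
unlink($file);
rmdir($dir);
```

In the crawler, the same test could sit in front of the crawlPage() call so that folders are only ever recursed into, never read.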

Posted: Thu Oct 07, 2004 6:27 am
by mudkicker
Well, I did some more coding and got this far. It opens the folders fine, but it doesn't crawl anything!
It always reports "File NOT Crawled".

This is the code:

Code:

<?php
error_reporting(E_ALL);
ini_set("max_execution_time",600);
require("config.php");

class mudSearch {
	public $q;
	public $dizin;
	public $filepath;
	public $foldersinthisfolder;

	public function __construct() {
		// Check if query is posted via GET method.
		$this->dizin = "..";
		$this->openDirectory($this->dizin);
		// Get configuration from config.php
		$config = new Config();
		// Connect to database
		mysql_connect($config->hostname, $config->sqluser, $config->sqlpass) or die("Cannot connect to database host\n");
		mysql_select_db($config->db) or die("Cannot select the database\n");
	}

	public function openDirectory($fld) {
		$this->dizin = $fld;
		if ($handle = opendir($this->dizin)) {
			while ($file = readdir($handle)) {
				// except "." and ".." open whole stuff
				if ($file != "." && $file != "..") {
					$path = $this->dizin."/".$file;
					// if the "one" is a folder open it and run this function again for it...
					if (is_dir($path)) {
						$this->openDirectory($path);
						$tmp2 = explode("/",$path);
						$tmp1 = array_pop($tmp2);
						$path = implode("/",$tmp2);
						$this->dizin = $path;
					}
					// ...or pass the file on for inserting into the database in a later step.
					else {
						$path_parts = pathinfo($path);
						if($path_parts["extension"] == "php" || $path_parts["extension"] == "html" || $path_parts["extension"] == "htm") {
							$this->getFileList($path);
						}
					}
				}
			}
			closedir($handle);
		}
	}

	public function getFileList($fl) {
		$a = $this->crawlPage($fl);
		if ($a) {
			echo "File Crawled: ".$fl;
			echo "<br>";
		}
		else {
			echo "File <span style=\"color: red;\">NOT</span> Crawled: ".$fl;
			echo "<br>";
		}
	}


	public function crawlPage($filename) {
		// Crawl pages and keywords in it.
		$keyword_string = "";
		$file_keywords = array_unique(str_word_count(strip_tags(file_get_contents($filename)), 1));
		foreach ($file_keywords as $keyword) {
			$keyword_string .= $keyword." ";
		}
		// Insert the crawled pages to the database.
		$crawl_insert = mysql_query("INSERT INTO mudcrawls(filename, fileurl, keywords) VALUES ('".basename($filename)."', '".$filename."', '".$keyword_string."')");
		if($crawl_insert) {
			return true;
		}
		else {
			return false;
		}
	}
}

// test
$arif = new mudSearch();
?>

Posted: Thu Oct 07, 2004 6:48 am
by mudkicker
OK, OK, I solved my problem ;) Just stupid me: two little mistakes, and I've been fighting with my monitor and keyboard for two hours ;)

Posted: Thu Oct 07, 2004 6:52 am
by twigletmac
Out of interest, what was the solution - you've got a handy piece of code there that'll probably be copied a few times :) .

Mac

Posted: Thu Oct 07, 2004 8:17 am
by mudkicker
It was in __construct(): I was connecting to the database after calling openDirectory(), so crawlPage() ran its INSERT before any connection existed ;)
OK, here's my latest code:

Code:

<?php
class mudSearchSiteCrawler {
	// class variables
	public $dizin;

	public function __construct() {
		// Get configuration from config.php
		$config = new Config();
		// Connect to database
		@mysql_connect($config->hostname, $config->sqluser, $config->sqlpass) or die("Cannot connect to database host\n");
		@mysql_select_db($config->db) or die("Cannot select the database\n");
		// Start crawling from the parent directory.
		$this->dizin = "..";
		$this->openDirectory($this->dizin);
	}

	public function openDirectory($fld) {
		$this->dizin = $fld;
		$handle = opendir($this->dizin);
		if ($handle) {
			while (($file = readdir($handle)) !== false) {
				// except "." and ".." open whole stuff
				if ($file != "." && $file != "..") {
					$path = $this->dizin."/".$file;
					// if the "one" is a folder open it and run this function again for it...
					if (is_dir($path)) {
						$this->openDirectory($path);
						$tmp2 = explode("/",$path);
						array_pop($tmp2);
						$path = implode("/",$tmp2);
						$this->dizin = $path;
					}
					// ...or, if it is a crawlable file, pass it on for inserting into the database.
					else {
						// we can check the file type here. this could be done later with a switch statement.
						$path_parts = pathinfo($path);
						if(isset($path_parts["extension"]) && in_array($path_parts["extension"], array("php", "html", "htm"))) {
							$this->getFileList($path);
						}
					}
				}
			}
			closedir($handle);
		}
	}

	public function getFileList($fl) {
		// Check if page is crawled and print output...
		if ($this->crawlPage($fl)) {
			echo "File Crawled: ".$fl;
			echo "<br>";
		}
		else {
			echo "File <span style=\"color: red;\">NOT</span> Crawled: ".$fl;
			echo "<br>";
		}
	}


	public function crawlPage($filename) {
		// Crawl pages and keywords in it.
		$keyword_string = "";
		$file_keywords = array_unique(str_word_count(strip_tags(file_get_contents($filename)), 1));
		foreach ($file_keywords as $keyword) {
			$keyword_string .= $keyword." ";
		}
		// Insert the crawled page into the database; every value is escaped before it goes into the query.
		$crawl_insert = mysql_query("INSERT INTO mudcrawl(filename, fileurl, keywords) VALUES ('".mysql_real_escape_string(basename($filename))."', '".mysql_real_escape_string($filename)."', '".mysql_real_escape_string($keyword_string)."')") or die("Cannot insert information to database for: ".$filename);
		// set return
		if($crawl_insert) {
			return true;
		}
		else {
			return false;
		}
	}
}
?>
This is for my search script. I will first crawl all the pages and then search from the database. I've only just started writing it.
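
The keyword step that crawlPage() performs can be tried on its own, without a database. The sketch below wraps the same three calls used in the thread (strip_tags(), str_word_count() with format 1, and array_unique()) in a hypothetical function named extract_keywords(). As an assumption of this sketch, it joins the words with implode() rather than a loop, so the result has no trailing space.

```php
<?php
// Hypothetical helper (not from the thread): reduce an HTML page to a
// space-separated list of its distinct words.
function extract_keywords($html)
{
	$text  = strip_tags($html);        // drop the markup
	$words = str_word_count($text, 1); // format 1: return the words as an array
	return implode(' ', array_unique($words));
}

$keywords = extract_keywords('<p>search the <b>search</b> index</p>');
```

The comparison is case-sensitive, so "Search" and "search" would be kept as two separate keywords.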