Slow search code

Posted: Fri Jul 10, 2009 1:57 am
by jvnane
I'm trying to build a search function for an intranet site we have at work. It's supposed to search all the file names in our databases for whatever the user types in. The code I have works, but it takes forever to execute. I haven't seen it finish when searching the actual database, but I think it would take at least 10 minutes. Any suggestions on speeding this up would be much appreciated. Also, I am pretty new to PHP (I just started using it a few weeks ago), so I might be unfamiliar with a lot of terms.

<?php
class Search
{
var $original_path;
var $original_dir;
var $match_files = array();
var $dirs = array();
var $files = array();
var $file_search;

function start($file_search,$path)
{
$this->file_search = $file_search; //string that the user put in to search for
$this->original_path = $path; //directory to start searching in (string)
$this->original_dir = opendir($this->original_path);
$paths = $this->getNames($this->original_dir,$this->original_path);
$this->scanPaths($this->original_dir,$paths);
for($x=0;$x<count($this->files);$x++)
{
$this->doesItMatch($file_search,$this->files[$x]);
}
return $this->match_files;
}

///////////////////////////////////
//This function gets the path names of all files and folders in the current directory
//then stores the names to an array.
///////////////////////////////////
function getNames($dir,$path)
{
$paths = array();
for($file = readdir($dir); $file !== false; $file = readdir($dir)) //strict check so an entry named "0" does not end the loop
{
if(!($this->isItDot($file)))
{
$paths[count($paths)] = $path."/".$file;
}
}
return $paths;
}

///////////////////////////////////
//For every file/folder in the current directory, evaluate it. If it is a directory,
//open it up and get the names again and keep evaluating.
///////////////////////////////////
function scanPaths($dir,$paths)
{
for($x = 0; $x < count($paths); $x++)
{
$it_is_dir = $this->isItDir($paths[$x]);
if($it_is_dir)
{
$nextfolder = $paths[$x]; //$paths[$x] is already the full path of the subfolder
$nextdir = opendir($nextfolder);
$nextpaths = $this->getNames($nextdir,$nextfolder);
closedir($nextdir); //release the handle; scanPaths never reads its first argument
$this->scanPaths($nextdir,$nextpaths);
}
else
{
$this->files[count($this->files)] = $paths[$x];
}
}
}

///////////////////////////////////
//Is $file_search in the file name? if so store it to the matched files array.
///////////////////////////////////
function doesItMatch($file_search,$path)
{
$pos = strrpos($path,"/");
$file = substr($path,$pos);
if(stripos($file,$file_search) !== false) //case-insensitive substring test; eregi is deprecated
{
$this->match_files[count($this->match_files)] = $path;
}
}

function isItDir($path)
{
if(filetype($path) == 'dir')
{
return true;
}
else {return false;}
}

function isItDot($file)
{
if(strcasecmp($file,".") == 0 || strcasecmp($file,"..") == 0)
{return true;}
else {return false;}
}
}
?>

Re: Slow search code

Posted: Fri Jul 10, 2009 3:24 am
by VladSun
First, you might be interested in some issues that come up when iterating directories recursively:
viewtopic.php?f=1&t=73749

You may try to make a system call to the OS -

Linux

Code:

$matches = shell_exec('ls -R '.escapeshellarg($path).' | grep -i '.escapeshellarg($file_search));
Windows

Code:

$files = shell_exec('dir /S '.escapeshellarg($path));
then search $files for matches.
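
As a minimal sketch of that last step, the command output can be split into lines and filtered on the file name. The helper name filterMatches is made up for illustration; stripos does a plain case-insensitive substring test, so the user's input is not treated as a regular expression.

```php
<?php
// Hypothetical helper (not from the original post): keep only the paths whose
// file name contains $file_search, ignoring case.
function filterMatches($output, $file_search)
{
    $matches = array();
    if ($file_search === '') {
        return $matches; // nothing to search for
    }
    // dir on Windows ends lines with \r\n, ls on Linux with \n; handle both.
    $lines = preg_split('/\r\n|\r|\n/', (string) $output);
    foreach ($lines as $line) {
        $line = trim($line);
        if ($line === '') {
            continue;
        }
        // Compare against the last path component only.
        $name = basename(str_replace('\\', '/', $line));
        if (stripos($name, $file_search) !== false) {
            $matches[] = $line;
        }
    }
    return $matches;
}
```

It would be called as filterMatches($files, $file_search) on whatever shell_exec returned.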

There is also another option: scan your dirs with a cron job and insert/update the results into a database. This way you will have a very fast DB search, although it is not a real-time one, i.e. files may have been deleted/added since the last cron job ran.
You may also use a flat file to store the scan results.
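
A rough sketch of the flat-file variant, under a few assumptions: the function names buildIndex and searchIndex are hypothetical, the index holds one path per line, and the index file lives outside the scanned tree. The scheduled job would call buildIndex; the search page would only call searchIndex.

```php
<?php
// Hypothetical function names; a sketch of "scan on a schedule, search the
// saved results" using a flat file with one path per line.

// Run this part from the cron job / scheduled task.
function buildIndex($root, $index_file)
{
    $it = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS),
        RecursiveIteratorIterator::LEAVES_ONLY,
        RecursiveIteratorIterator::CATCH_GET_CHILD // skip unreadable subfolders
    );
    $tmp = $index_file . '.tmp';
    $out = fopen($tmp, 'wb');
    foreach ($it as $file) {
        if ($file->isFile()) {
            fwrite($out, $file->getPathname() . "\n");
        }
    }
    fclose($out);
    // Swap the finished file in so a search does not read a half-written index.
    rename($tmp, $index_file);
}

// Run this part when the user submits a search.
function searchIndex($index_file, $file_search)
{
    $matches = array();
    if ($file_search === '') {
        return $matches;
    }
    $in = fopen($index_file, 'rb');
    while (($line = fgets($in)) !== false) {
        $path = rtrim($line, "\r\n");
        // Match on the file name only, not on the folders above it.
        $name = basename(str_replace('\\', '/', $path));
        if (stripos($name, $file_search) !== false) {
            $matches[] = $path;
        }
    }
    fclose($in);
    return $matches;
}
```

The search then only reads one local file instead of walking the whole share, at the cost of results being as old as the last scan.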

PS:
You can simplify some of your functions and make them more readable like this:

Code:

function isItDir($path)
{
    return (filetype($path) == 'dir');
}
 
function isItDot($file)
{
    return (strcasecmp($file,".") == 0 || strcasecmp($file,"..") == 0);
}

Re: Slow search code

Posted: Fri Jul 10, 2009 3:30 am
by VladSun
If it is a Linux OS, you can use the inotify filesystem feature:
http://www.ibm.com/developerworks/linux ... otify.html

to update your DB in realtime.

Re: Slow search code

Posted: Fri Jul 10, 2009 4:03 am
by jvnane
VladSun wrote:First, you might be interested in some issues that come up when iterating directories recursively:
viewtopic.php?f=1&t=73749

You may try to make a system call to the OS -

Linux

Code:

$matches = shell_exec('ls -R '.escapeshellarg($path).' | grep -i '.escapeshellarg($file_search));
Windows

Code:

$files = shell_exec('dir /S '.escapeshellarg($path));
then search $files for matches.

There is also another option: scan your dirs with a cron job and insert/update the results into a database. This way you will have a very fast DB search, although it is not a real-time one, i.e. files may have been deleted/added since the last cron job ran.
You may also use a flat file to store the scan results.

PS:
You can simplify some of your functions and make them more readable like this:

Code:

function isItDir($path)
{
    return (filetype($path) == 'dir');
}
 
function isItDot($file)
{
    return (strcasecmp($file,".") == 0 || strcasecmp($file,"..") == 0);
}
Thanks for the tips. However, I don't know what a cron job is... Would this be like making a database table with all the file and path names that could be searched for, and updating this table every so often?

Also what does it mean to make a system call to the OS? How would this help?

Re: Slow search code

Posted: Fri Jul 10, 2009 4:20 am
by VladSun
jvnane wrote:however I don't know what a cron job is... Would this be like making a database table with all the file and path names that could be searched for, and updating this table every so often?
A cron job (Linux) is the equivalent of a scheduled task (Windows).
jvnane wrote:Also what does it mean to make a system call to the OS? How would this help?
Try running the commands I gave you in the command prompt and you will see what I mean - instead of implementing them in PHP, use the already existing tools.

Re: Slow search code

Posted: Fri Jul 10, 2009 4:59 am
by jvnane
VladSun wrote:Try running the commands I gave you in the command prompt and you will see what I mean - instead of implementing them in PHP, use the already existing tools.
I ran the command in the cmd prompt and it worked. However, when I tried it in PHP it returned null. Do you know why? Also, how will the result be returned in PHP once it works?

Re: Slow search code

Posted: Fri Jul 10, 2009 5:08 am
by VladSun
Linux or Windows?

Re: Slow search code

Posted: Fri Jul 10, 2009 5:16 am
by jvnane
Windows

Re: Slow search code

Posted: Fri Jul 10, 2009 5:41 am
by VladSun

Code:

<?php
 
$path = 'C:\\php';
 
$files = shell_exec('dir /S /A:A /O:N /B '.$path);
var_dump($files);
 
$files = explode("\n", $files);
var_dump($files);
PS: It is Windows OS on the *SERVER*, right?

Re: Slow search code

Posted: Fri Jul 10, 2009 6:28 am
by jvnane
VladSun wrote:PS: It is Windows OS on the *SERVER*, right?
The server is running on the machine I am working on, just for testing purposes, and yes, it is a Windows machine. I also tried the command you gave me in the cmd prompt and it worked perfectly: the path of each file was listed on a new line. The PHP code was still returning null, though. This time it was a null string array instead of just a null variable type.

Re: Slow search code

Posted: Fri Jul 10, 2009 6:37 am
by VladSun
The last code snippet I've posted works fine for me ...
Did you try it?

Re: Slow search code

Posted: Fri Jul 10, 2009 6:59 am
by jvnane

Code:

<?php
 
$files = shell_exec('dir /S /A:A /O:N /B \\Odcsrv\share1\ODC\FMS_FMF_IMET');
var_dump($files);
 
$files = explode("\n", $files);
var_dump($files);
 
That's exactly what I put in. If I type dir /S /A:A /O:N /B \\Odcsrv\share1\ODC\FMS_FMF_IMET into the command prompt, it works fine, just not in the PHP code. The path is for a database, by the way.

Re: Slow search code

Posted: Fri Jul 10, 2009 7:21 am
by VladSun

Code:

echo 'dir /S /A:A /O:N /B \\Odcsrv\share1\ODC\FMS_FMF_IMET';
;)

Re: Slow search code

Posted: Fri Jul 10, 2009 7:30 am
by jvnane
VladSun wrote:

Code:

echo 'dir /S /A:A /O:N /B \\Odcsrv\share1\ODC\FMS_FMF_IMET';
;)
oooh sneaky... didn't notice that... changed it to \\\Odcsrv\share1\ODC\FMS_FMF_IMET and now it works
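
For anyone following along, a short sketch of why the echo gave it away: inside a single-quoted PHP string, \\ collapses to a single backslash, so the leading pair of a UNC path has to be doubled. The shortened path below is only for illustration.

```php
<?php
// In single quotes, a doubled backslash is an escape for ONE backslash; a
// backslash in front of any other character (such as the one before "share1")
// is kept as-is.
$broken  = '\\Odcsrv\share1';    // PHP sees: \Odcsrv\share1
$working = '\\\\Odcsrv\\share1'; // PHP sees: \\Odcsrv\share1

echo $broken, "\n";
echo $working, "\n";

// Three leading backslashes (the fix above) give the same string as four:
// the first two collapse to one, the third is kept because "O" follows it.
var_dump($working === '\\\Odcsrv\share1');
```

Escaping every backslash, as in $working, is probably the less surprising habit, since it does not depend on which character happens to follow.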

Re: Slow search code

Posted: Fri Jul 10, 2009 7:48 am
by VladSun
Is it faster than your PHP tool?