Slow search code
I'm trying to build a search function for an intranet site we have at work. It's supposed to search all the file names in our databases for whatever the user types in. The code I have written works, but it takes forever to fully execute... I haven't seen it finish when searching the actual database, but I think it would take at least 10 minutes. Any suggestions on speeding this up would be much appreciated. Also, I am pretty new to PHP (I just started using it a few weeks ago), so I might be unfamiliar with a lot of terms.
<?php
class Search
{
var $original_path;
var $original_dir;
var $match_files = array();
var $dirs = array();
var $files = array();
var $file_search;
function start($file_search,$path)
{
$this->file_search = $file_search; //string that the user put in to search for
$this->original_path = $path; //directory to start searching in (string)
$this->original_dir = opendir($this->original_path);
$paths = $this->getNames($this->original_dir,$this->original_path);
$this->scanPaths($this->original_dir,$paths);
for($x=0;$x<count($this->files);$x++)
{
$this->doesItMatch($file_search,$this->files[$x]);
}
return $this->match_files;
}
///////////////////////////////////
//This function gets the path names of all files and folders in the current directory
//then stores the names to an array.
///////////////////////////////////
function getNames($dir,$path)
{
$paths = array();
for($file = readdir($dir); $file !== false; $file = readdir($dir)) //strict check so a file named "0" does not end the loop
{
if(!($this->isItDot($file)))
{
$paths[count($paths)] = $path."/".$file;
}
}
return $paths;
}
///////////////////////////////////
//For every file/folder in the current directory, evaluate it. If it is a directory,
//open it up and get the names again and keep evaluating.
///////////////////////////////////
function scanPaths($dir,$paths)
{
for($x = 0; $x < count($paths); $x++)
{
$it_is_dir = $this->isItDir($paths[$x]);
if($it_is_dir)
{
$nextfolder = $paths[$x]; //$file is not defined in this function, so nothing is appended
$nextdir = opendir($nextfolder);
$nextpaths = $this->getNames($nextdir,$nextfolder);
$this->scanPaths($nextdir,$nextpaths);
closedir($nextdir); //release the directory handle once this branch is done
}
else
{
$this->files[count($this->files)] = $paths[$x];
}
}
}
///////////////////////////////////
//Is $file_search in the file name? if so store it to the matched files array.
///////////////////////////////////
function doesItMatch($file_search,$path)
{
$pos = strrpos($path,"/");
$file = substr($path,$pos);
if(eregi($file_search,$file))
{
$this->match_files[count($this->match_files)] = $path;
}
}
function isItDir($path)
{
if(filetype($path) == 'dir') //filetype() returns a string, so compare against 'dir'
{
return true;
}
else {return false;}
}
function isItDot($file)
{
if(strcasecmp($file,".") == 0 || strcasecmp($file,"..") == 0)
{return true;}
else {return false;}
}
}
?>
Re: Slow search code
First, you might be interested in some issues that come up when iterating directories recursively:
viewtopic.php?f=1&t=73749
You may try to make a system call to the OS.
Linux:
$matches = shell_exec('ls -R '.escapeshellcmd($path).' | grep -i "'.escapeshellcmd($file_search).'"');
Windows:
$files = shell_exec('dir /S '.escapeshellcmd($path));
then search $files for matches.
There is also another option: scan your dirs with a cron job and insert/update the results into a database. This way you will have a very fast DB search, although it is not a real-time one, i.e. files might be deleted or added after the last cron job ran.
You may also use a flat file to store the scan results.
PS:
You can simplify some of your functions and make them more readable like this:
function isItDir($path)
{
return (filetype($path) == 'dir');
}
function isItDot($file)
{
return (strcasecmp($file,".") == 0 || strcasecmp($file,"..") == 0);
}
Last edited by VladSun on Fri Jul 10, 2009 4:32 am, edited 2 times in total.
There are 10 types of people in this world, those who understand binary and those who don't
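For comparison with the hand-written recursion in the first post, here is a minimal sketch of the same file-name search using PHP's built-in SPL directory iterators. It assumes a modern PHP (7 or later); the function name `findFilesByName` is made up for this example and is not from the thread. Whether it is faster than the shell call on a large network share is something you would have to measure.

```php
<?php
// Hypothetical helper (not from the thread): return every file below $root
// whose file name contains $needle, compared case-insensitively.
function findFilesByName(string $root, string $needle): array
{
    $matches = [];
    // SKIP_DOTS leaves out "." and "..", so no isItDot() check is needed.
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $fileInfo) {
        // Only the file name is tested, not the directories above it.
        if ($fileInfo->isFile() && stripos($fileInfo->getFilename(), $needle) !== false) {
            $matches[] = $fileInfo->getPathname();
        }
    }
    sort($matches); // stable order for display
    return $matches;
}
```

Note that the iterator throws an `UnexpectedValueException` when a directory cannot be opened, so a real intranet tool would probably want a try/catch around the call.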
Re: Slow search code
If the server runs Linux, you can use the inotify filesystem feature:
http://www.ibm.com/developerworks/linux ... otify.html
to update your DB in realtime.
Re: Slow search code
Thanks for the tips, however I don't know what a cron job is... Would this be like making a database table with all the file and path names that could be searched for, and updating this table every so often?
Also, what does it mean to make a system call to the OS? How would this help?
Re: Slow search code
jvnane wrote: however I don't know what a cron job is... Would this be like making a database table with all the file and path names that could be searched for, and updating this table every so often?
A cron job (Linux) is the equivalent of a scheduled task (Windows).
jvnane wrote: Also what does it mean to make a system call to the OS? How would this help?
Try running the commands I gave you in the command prompt and you will see what I mean: instead of implementing them in PHP, use the already existing tools.
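To make the scheduled-scan idea from this thread more concrete, here is a rough sketch of the flat-file variant: one function that a cron job or scheduled task would run to write every file path into an index file, and one that the search page would call to look names up in that index. The names `buildIndex` and `searchIndex`, and the one-path-per-line file layout, are assumptions for illustration only; a database table would follow the same split.

```php
<?php
// Hypothetical: run periodically (cron job / scheduled task) to rebuild the index.
// Writes one file path per line and returns the number of files indexed.
function buildIndex(string $root, string $indexFile): int
{
    $paths = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $fileInfo) {
        if ($fileInfo->isFile()) {
            $paths[] = $fileInfo->getPathname();
        }
    }
    // Write to a temporary file first, then swap it in, so a search that runs
    // during the rebuild does not read a half-written index.
    $tmp = $indexFile . '.tmp';
    file_put_contents($tmp, implode("\n", $paths));
    rename($tmp, $indexFile);
    return count($paths);
}

// Hypothetical: called by the search page. Reads the index instead of the disk.
function searchIndex(string $indexFile, string $needle): array
{
    $lines = file($indexFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) ?: [];
    $matches = [];
    foreach ($lines as $path) {
        // Match against the file name only, case-insensitively.
        if (stripos(basename($path), $needle) !== false) {
            $matches[] = $path;
        }
    }
    return $matches;
}
```

The trade-off is the one already mentioned above: searches only read one file, but results can be stale until the next scheduled rebuild.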
Re: Slow search code
VladSun wrote: Try running the commands I gave you in the command prompt and you will see what I mean - instead of implementing them in PHP, use the already existing tools.
I ran the command in the cmd prompt and it worked. However, when I tried it in PHP it returned null. Do you know why? Also, how will the result be returned in PHP once it works?
Re: Slow search code
Linux or Windows?
Re: Slow search code
<?php
$path = 'C:\\php';
$files = shell_exec('dir /S /A:A /O:N /B '.$path);
var_dump($files);
$files = explode("\n", $files);
var_dump($files);
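Once the listing comes back, the "then search $files for matches" step could be sketched like this. The helper name `matchListing` is hypothetical, and the sketch assumes the bare `/B` output format used above (one full path per line). It also treats a non-string result from shell_exec() as "no files" rather than as an error.

```php
<?php
// Hypothetical helper: $listing is the raw text returned by shell_exec().
// Returns the paths whose file name contains $needle (case-insensitive).
function matchListing($listing, string $needle): array
{
    // shell_exec() does not return a string when the command fails or prints nothing.
    if (!is_string($listing) || $listing === '') {
        return [];
    }
    // \R matches \n, \r\n and \r, so Windows line endings are handled too.
    $lines = preg_split('/\R/', $listing, -1, PREG_SPLIT_NO_EMPTY);
    $matches = [];
    foreach ($lines as $line) {
        $path = trim($line);
        // Normalise backslashes so basename() finds the file name on any OS.
        $name = basename(str_replace('\\', '/', $path));
        if ($path !== '' && stripos($name, $needle) !== false) {
            $matches[] = $path;
        }
    }
    return $matches;
}
```

Usage would be along the lines of `matchListing(shell_exec('dir /S /A:A /O:N /B '.$path), $file_search)`.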
Re: Slow search code
VladSun wrote: PS: It is Windows OS on the *SERVER*, right?
The server is running on the machine I am working on, just for testing purposes, and yes, it is a Windows machine. Also, I tried the command you gave me in the cmd prompt and it worked perfectly. The path of each file was listed on a new line, but the PHP code was still returning null. This time it was an array with a null string instead of just a null variable.
Re: Slow search code
The last code snippet I've posted works fine for me ...
Did you try it?
Re: Slow search code
<?php
$files = shell_exec('dir /S /A:A /O:N /B \\Odcsrv\share1\ODC\FMS_FMF_IMET');
var_dump($files);
$files = explode("\n", $files);
var_dump($files);
Re: Slow search code
echo 'dir /S /A:A /O:N /B \\Odcsrv\share1\ODC\FMS_FMF_IMET';
Re: Slow search code
VladSun wrote:
echo 'dir /S /A:A /O:N /B \\Odcsrv\share1\ODC\FMS_FMF_IMET';
Oooh, sneaky... I didn't notice that. I changed it to \\\Odcsrv\share1\ODC\FMS_FMF_IMET and now it works.
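For anyone who hits the same thing: inside a single-quoted PHP string, `\\` collapses to one backslash, while a backslash in front of any other character is kept as-is. A short illustration follows; the share name is shortened and the variable names are only for this example.

```php
<?php
// Two backslashes in the source collapse to ONE, so the UNC prefix is lost.
$twoInSource = '\\Odcsrv\share1';

// Three in the source: "\\" becomes "\", then "\O" is kept literally,
// which gives the two leading backslashes a UNC path needs.
$threeInSource = '\\\Odcsrv\share1';

// Four in the source is the unambiguous spelling of the same string.
$fourInSource = '\\\\Odcsrv\\share1';

echo $twoInSource, "\n";   // prints \Odcsrv\share1
echo $threeInSource, "\n"; // prints \\Odcsrv\share1
echo $fourInSource, "\n";  // prints \\Odcsrv\share1
```

Doubling every backslash, as in the last variant, is probably the safer habit, since it does not depend on which character happens to follow.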
Re: Slow search code
Is it faster than your PHP tool?