general directory crawling problems

Posted: Tue Apr 28, 2009 4:36 pm
by drumking88
I am trying to get a searcher working as part of a project and am a little confused. I need to get the HTML content from all the files in a directory and its sub-directories and dump it into a MySQL database. I have no problem connecting to the database, but I think this code is causing trouble at the start, when doing the initial database dump.

Code: Select all

 
$dir = '.';
$files = scandir($dir);

// Drop the "." and ".." entries instead of assuming they are the first two
$files = array_diff($files, array('.', '..'));

// Clear out the old index before re-crawling
mysql_query("DELETE FROM spider");

foreach ($files as $page) {
    $path = $dir . '/' . $page;

    // scandir() also returns sub-directories, and file_get_contents()
    // fails on those, so only read regular files here
    if (!is_file($path)) {
        continue;
    }

    $content  = file_get_contents($path);
    $scontent = strip_tags($content);
}
 
This works fine for some files when I dump the findings into the MySQL database, but for other files it fails badly.
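For reference, here is a recursive version I have been sketching, since the snippet above never descends into sub-directories. The `spider` columns `url` and `content` are just my guess at the schema, and I am escaping the values before the INSERT because quotes in a page would otherwise break the query:

Code: Select all

function crawl($dir) {
    // Skip the "." and ".." entries scandir() always returns
    foreach (array_diff(scandir($dir), array('.', '..')) as $entry) {
        $path = $dir . '/' . $entry;
        if (is_dir($path)) {
            // Descend into sub-directories
            crawl($path);
        } elseif (is_file($path)) {
            $text = strip_tags(file_get_contents($path));
            // Escape before inserting, otherwise quotes in the page break the SQL
            $url  = mysql_real_escape_string($path);
            $body = mysql_real_escape_string($text);
            mysql_query("INSERT INTO spider (url, content) VALUES ('$url', '$body')");
        }
    }
}

crawl('.');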

Any help would be much appreciated.

thanks