Indexing Sites

Posted: Sun Oct 26, 2003 9:52 pm
by Mr. Tech
Hi!

I want to index the sites that people submit. This is my code:

<?php
$url = "http://www.google.com";
$fp = @fopen( $url, "r" ); // Open the URL (needs allow_url_fopen enabled)
$content = "";
if( $fp )
{
  // Read in the URL
  while( !feof( $fp ) )
    $content .= fread( $fp, 10240 );
  fclose( $fp );

  $fields = array( );

  // All content (the key must be quoted; a bare word is treated as a constant)
  $fields['content'] = $content;

  // Flip the table so it maps entities back to characters, e.g. "&amp;" => "&"
  $trans = get_html_translation_table( HTML_ENTITIES );
  $trans = array_flip( $trans );

  $fields_remove = array( "script", "style" ); // Tags to remove from content

  // foreach instead of each(), which was removed in PHP 8
  foreach( array_keys( $fields ) as $key )
  {
    // Add a space after each tag so adjacent words do not run together
    $fields[$key] = str_replace( ">", "> ", $fields[$key] );

    // Remove the unwanted elements together with their contents
    foreach( $fields_remove as $tag )
      $fields[$key] = preg_replace( "/<{$tag}.*?<\/{$tag}>/is", "", $fields[$key] );

    $fields[$key] = strip_tags( $fields[$key] ); // Strip out tags

    $fields[$key] = strtr( $fields[$key], $trans ); // Decode HTML entities

    $fields[$key] = preg_replace( "/[\s ]+/", " ", $fields[$key] ); // Delete extra whitespace

    $fields[$key] = addslashes( $fields[$key] ); // Escape quotes before storing
  }
}
?>
Is that the quickest way to do it or is there a better way?

Thanks,

Ben
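
For comparison, here is a minimal sketch of the same clean-up steps wrapped in one function. The name extract_index_text and its parameters are made up for illustration and are not part of the code above. It assumes the page is UTF-8 and swaps the flipped translation table for PHP's built-in html_entity_decode(). It also leaves out addslashes(), on the assumption that escaping happens wherever the text is stored.

```php
<?php
// Hypothetical helper (not from the original post): turns raw HTML into
// one line of plain text for indexing. Assumes the input is valid UTF-8.
function extract_index_text( $html, $remove_tags = array( "script", "style" ) )
{
  // Put a space after every tag so words in adjacent elements do not run together
  $text = str_replace( ">", "> ", $html );

  // Drop whole elements (tag and contents) that should not be indexed
  foreach( $remove_tags as $tag )
  {
    $tag = preg_quote( $tag, "/" );
    $text = preg_replace( "/<{$tag}\\b.*?<\\/{$tag}\\s*>/is", "", $text );
  }

  $text = strip_tags( $text );                              // Remove remaining tags
  $text = html_entity_decode( $text, ENT_QUOTES, "UTF-8" ); // "&amp;" => "&"

  // Collapse runs of whitespace, including non-breaking spaces
  $text = preg_replace( "/[\\s\\x{00A0}]+/u", " ", $text );

  return trim( $text );
}

// Prints: Fish & chips Open late
echo extract_index_text( "<p>Fish &amp; chips</p><script>var x = 1;</script><p>Open&nbsp;late</p>" ), "\n";
```

Whether this is any faster than the loop above is untested. The main difference is that the steps can be reused for several fields without going through the $fields array.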