I want to index the sites that people submit to my site. This is my code:
<?php
$url = "http://www.google.com";
$fp = @fopen( $url, "r" ); // Open the URL (@ hides warnings; $fp is false on failure)
$content = "";
if( $fp )
{
    // Read in the URL
    while( !feof( $fp ) )
        $content .= fread( $fp, 10240 );
    fclose( $fp );

    $fields = array( );
    // All content
    $fields['content'] = $content;

    // Flip the entity table so it maps entities back to their characters
    $trans = get_html_translation_table( HTML_ENTITIES );
    $trans = array_flip( $trans );

    $fields_remove = array( "script", "style" ); // Tags to remove from content
    foreach( array_keys( $fields ) as $key )
    {
        $fields[$key] = str_replace( ">", "> ", $fields[$key] ); // Keep words apart once tags are stripped
        foreach( $fields_remove as $tag )
            $fields[$key] = preg_replace( "/<{$tag}.*?<\/{$tag}>/is", "", $fields[$key] ); // Remove the tag and its contents
        $fields[$key] = strip_tags( $fields[$key] ); // Strip out tags
        $fields[$key] = strtr( $fields[$key], $trans ); // Decode HTML entities
        $fields[$key] = preg_replace( "/\s+/", " ", $fields[$key] ); // Delete extra whitespace
        $fields[$key] = addslashes( $fields[$key] ); // Escape quotes and backslashes
    }
}
?>
Thanks,
Ben