Google Bot

agriz
Forum Contributor
Posts: 106
Joined: Sun Nov 23, 2008 9:29 pm

Google Bot

Post by agriz »

Hi,

I tried to trace Googlebot.
Googlebot-Image/1.0 is visiting my site, and the URL it visits is always 404.php.

What could be the reason?

I used PHP's REQUEST_URI to find out which page Google is on...
Eric!
DevNet Resident
Posts: 1146
Joined: Sun Jun 14, 2009 3:13 pm

Re: Google Bot

Post by Eric! »

Googlebot-Image is probably requesting a JPG or some other image through a bad link. Your host then redirects the request to 404.php, as configured in your Apache settings or via cPanel.

Check your server logs to see what caused the redirection or try checking $_SERVER['HTTP_REFERER'] in your 404.php file to see where all your broken link requests are coming from.
agriz
Forum Contributor
Posts: 106
Joined: Sun Nov 23, 2008 9:29 pm

Re: Google Bot

Post by agriz »

Perfect idea for tracing the error. I will use the HTTP referrer in my 404.php file.
agriz
Forum Contributor
Posts: 106
Joined: Sun Nov 23, 2008 9:29 pm

Re: Google Bot

Post by agriz »

The 404 page is set in .htaccess:

ErrorDocument 404 http://www.justlinkexchange.com/404.php

I tried to receive an email for each 404 error.
http://www.justlinkexchange.com/gd

But I didn't get the referrer page in the mail. :(

This is the mail I get:
Error page :

And this is how the mail content is built:
"Error Page ".$_SERVER['HTTP_REFERER'];
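
One possible reason for the empty referrer (an assumption, the thread does not confirm it): Apache's documentation notes that when ErrorDocument points at a full URL, the server sends the client a redirect to that URL instead of serving the error page internally. The client then never receives the original 404 status, and 404.php sees a fresh request. A local path keeps the handling internal, which is the case in which Apache sets the REDIRECT_* variables. A minimal sketch of that form, assuming 404.php sits in the document root:

```apache
# .htaccess: serve the error page through an internal redirect,
# so the response keeps its 404 status
ErrorDocument 404 /404.php
```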
agriz
Forum Contributor
Posts: 106
Joined: Sun Nov 23, 2008 9:29 pm

Re: Google Bot

Post by agriz »

REDIRECT_URL works.
Eric!
DevNet Resident
Posts: 1146
Joined: Sun Jun 14, 2009 3:13 pm

Re: Google Bot

Post by Eric! »

Seriously? Your 404.php file sends you emails? That's going to suck when you get 1000 emails due to one bad link.

I would modify your 404.php file to just open a file, write the info into the file (or database) so you can trace it later.

For example, here is something primitive.

Code:

//[normal stuff here like telling the user, sorry that page doesn't exist, go suck it]
//[then your debug stuff]

// Fall back to an empty string when a server variable is not set.
$requested = isset($_SERVER["REQUEST_URI"]) ? filter_var($_SERVER["REQUEST_URI"], FILTER_SANITIZE_URL) : "";
// REDIRECT_STATUS is a numeric status code. FILTER_SANITIZE_STRING is deprecated as of PHP 8.1.
$status = isset($_SERVER["REDIRECT_STATUS"]) ? filter_var($_SERVER["REDIRECT_STATUS"], FILTER_SANITIZE_NUMBER_INT) : "";
$referer = isset($_SERVER["HTTP_REFERER"]) ? filter_var($_SERVER["HTTP_REFERER"], FILTER_SANITIZE_URL) : "";
$ip = isset($_SERVER["REMOTE_ADDR"]) ? substr($_SERVER["REMOTE_ADDR"], 0, 20) : "";

//create 404_errors.txt and set file permissions so php can write to it
$file = fopen("404_errors.txt", "a");
if ($file === FALSE)
{
    echo "can not open debug file"; // you don't have it configured right do you?
    exit();
}
// One line per 404 hit.
$data = "Requested page=".$requested.", Status Code=".$status.", Referring Site=".$referer.", IP=".$ip."\n";
fwrite($file, $data);
fclose($file);
Ideally you'd put this into a database, perhaps build a table of links to redirect the user to, perform a search based on the info extracted from the bad link they requested and offer them choices, etc....
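
The database idea could be sketched roughly as follows. This is only an illustration, assuming the pdo_sqlite extension is available and PHP 7.1 or later; the function name log_404(), the table not_found_log, its columns, and the file 404_errors.sqlite are all hypothetical names, not anything from the posts above. It prefers REDIRECT_URL over REQUEST_URI, since that is the variable reported to work earlier in the thread.

```php
<?php
// Hypothetical helper: store one 404 hit in a SQLite table via PDO.
// Table name, column names, and database path are assumptions.
function log_404(PDO $db, array $server): void
{
    // Create the log table on first use.
    $db->exec(
        'CREATE TABLE IF NOT EXISTS not_found_log (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            requested TEXT NOT NULL,
            referer TEXT NOT NULL,
            ip TEXT NOT NULL,
            logged_at TEXT NOT NULL
        )'
    );

    // A prepared statement keeps the untrusted request data out of the SQL text.
    $stmt = $db->prepare(
        'INSERT INTO not_found_log (requested, referer, ip, logged_at)
         VALUES (:requested, :referer, :ip, :logged_at)'
    );
    $stmt->execute([
        ':requested' => $server['REDIRECT_URL'] ?? $server['REQUEST_URI'] ?? '',
        ':referer'   => $server['HTTP_REFERER'] ?? '',
        ':ip'        => $server['REMOTE_ADDR'] ?? '',
        ':logged_at' => gmdate('Y-m-d H:i:s'), // UTC timestamp
    ]);
}

// In 404.php: open (or create) the database file next to the script and log the hit.
$db = new PDO('sqlite:' . __DIR__ . '/404_errors.sqlite');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
log_404($db, $_SERVER);
```

With the hits in a table, a query such as SELECT requested, COUNT(*) FROM not_found_log GROUP BY requested shows which broken links are requested most often.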
Jonah Bron
DevNet Master
Posts: 2764
Joined: Thu Mar 15, 2007 6:28 pm
Location: Redding, California

Re: Google Bot

Post by Jonah Bron »

Eric! wrote:

Code:

//create 404_errors.txt and set file permissions so php can write to it
$file=fopen("404_errors.txt","a");
if($file===FALSE)
{
    echo "can not open debug file"; // you don't have it configured right do you?
    exit();
}
$data="Requested page=".$requested.", Status Code=".$status.", Referring Site=".$referer.", IP=".$ip."\n";
fwrite($file,$data);
fclose($file);
file_put_contents() anyone?

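Code:

// Append one line per 404 hit; file_put_contents() opens, writes, and closes in one call.
$success = file_put_contents('404_errors.txt',
    sprintf(
        // Double quotes so that \n is written as a newline
        "Requested page=%s, Status Code=%d, Referring Site=%s, IP=%s\n",
        $requested,
        $status,
        $referer,
        $ip
    ),
    FILE_APPEND | LOCK_EX // LOCK_EX guards against interleaved writes from concurrent requests
);
// file_put_contents() returns the number of bytes written, or false on failure
if ($success === false) {
    // Fatal error, contact admin by email
}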