Googlebot detection

JayVee
Forum Newbie
Posts: 4
Joined: Thu Sep 18, 2008 5:19 am

Googlebot detection

Post by JayVee »

I'm having trouble getting pages indexed in Google because I restrict access to them based on a cookie value. I need to do this because it's an alcohol-related website and has to validate the user's age before displaying a page. All my cookie and age validation code works fine, but when I try to detect Googlebot in an 'include' file and let it view the page, the pages still fail to get indexed.
Here's my include code:

Code:

<?php

if ( strstr($_SERVER['HTTP_USER_AGENT'], "Googlebot" ) == true ){  // condition (1)
    //User has the GoogleBot user agent, but is it a real google bot?
    $host = gethostbyaddr($_SERVER['REMOTE_ADDR']);
    if ( substr($host, (strlen($host)-13)) == 'googlebot.com' ){  // condition (2)
        //real bot
    }
}
else if (!isset($_COOKIE['legal'])) {     //condition (3)
    header("Location: /index.php");
}
else if ($_COOKIE['legal'] == "no") { //condition (4)
    header("Location: /index.php");
}

?>

Any ideas why?
andyhoneycutt
Forum Contributor
Posts: 468
Joined: Wed Aug 27, 2008 10:02 am
Location: Idaho Falls

Re: Googlebot detection

Post by andyhoneycutt »

I modified your code slightly so it would run in the console on my system, and the following checks out cleanly for me:

Code:

$host = gethostbyaddr("66.249.66.1");
if ( substr($host, (strlen($host)-13)) == 'googlebot.com' )
{
  echo "It's a googlebot!\n";
}

I can confirm that your condition (2) check is correct, and I don't see your first condition failing: strstr() returns false when the needle is not found and the rest of the string starting at the match otherwise, and that non-empty string satisfies "== true". Can you verify in your logs that Googlebot is actually requesting the page in question? It may be getting turned away by a robots.txt file before it ever reaches this page. I'm not sure, just throwing a couple of possibilities your way.

This resource may help you: Verifying Googlebot

-Andy
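
As a rough illustration of the verification idea discussed above, here is a minimal sketch, assuming PHP 7.1+ and assuming the check is a reverse DNS lookup, a domain suffix test, and a forward lookup back to the same IP. The function names hostHasAllowedSuffix() and isVerifiedGooglebot() are hypothetical, as is the choice of accepted suffixes; the resolvers are injectable so the logic can be tried without network access. Note that the suffix test includes the leading dot, since comparing only the last 13 characters against 'googlebot.com' would also accept a host such as "evilgooglebot.com".

```php
<?php
// Hypothetical helper: true when $host ends with one of the given suffixes.
// Suffixes carry a leading dot so "evilgooglebot.com" is rejected.
function hostHasAllowedSuffix(string $host, array $suffixes): bool
{
    $host = strtolower(rtrim($host, '.'));
    foreach ($suffixes as $suffix) {
        if (substr($host, -strlen($suffix)) === $suffix) {
            return true;
        }
    }
    return false;
}

// Hypothetical helper: reverse lookup, suffix check, then forward lookup.
// $reverse and $forward default to PHP's gethostbyaddr()/gethostbyname();
// gethostbyname() resolves IPv4 only, so this sketch ignores IPv6.
function isVerifiedGooglebot(string $ip, ?callable $reverse = null, ?callable $forward = null): bool
{
    $reverse = $reverse ?? 'gethostbyaddr';
    $forward = $forward ?? 'gethostbyname';

    $host = $reverse($ip);
    // gethostbyaddr() returns the unmodified IP on failure, false on bad input.
    if (!is_string($host) || $host === $ip) {
        return false;
    }
    // Assumed list of accepted domains; adjust to whatever you trust.
    if (!hostHasAllowedSuffix($host, ['.googlebot.com', '.google.com'])) {
        return false;
    }
    // The forward lookup must map the host name back to the same address.
    return $forward($host) === $ip;
}
```

In a page this might be called as isVerifiedGooglebot($_SERVER['REMOTE_ADDR']), ideally with the result cached, since DNS lookups on every request can be slow.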
JayVee
Forum Newbie
Posts: 4
Joined: Thu Sep 18, 2008 5:19 am

Re: Googlebot detection

Post by JayVee »

Thanks for the response. I'm relatively new to this "user agent/Googlebot detection", but I have uploaded your correction to my code to see if it works. One article I read covers this (I'm not sure how it affects my code; do I need to validate a series of IP addresses?):

http://googlewebmastercentral.blogspot. ... lebot.html

I'm using Google Webmaster Tools and a sitemap.xml file to request indexing of my site. My URLs are being submitted but nothing is being indexed.

If I were to change my code to this:

Code:

<?php

if ( strstr($_SERVER['HTTP_USER_AGENT'], "Googlebot" ) !== FALSE ){  // condition (1)
    //User has the GoogleBot user agent and I don't want to verify
}
else if (!isset($_COOKIE['legal'])) {     //condition (2)
    header("Location: /index.php");
}
else if ($_COOKIE['legal'] == "no") { //condition (3)
    header("Location: /index.php");
}
?>

Would this be more likely to allow Googlebot access?
Any help here would be appreciated.
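
For comparison, the branches above can be folded into one decision function. This is only a sketch under stated assumptions: needsAgeGate() and its $trustBot flag are hypothetical names, and nothing here is known to fix the indexing problem. One detail worth noting is that header() only queues the redirect header and the script keeps running afterwards, so the usage example calls exit right after it.

```php
<?php
// Hypothetical helper: decides whether the visitor should be sent to the
// age-check page. $trustBot says whether a Googlebot user agent should be
// let through (for example, the result of a separate DNS verification).
function needsAgeGate(string $userAgent, array $cookies, bool $trustBot): bool
{
    // Case-insensitive search; !== false because a match at offset 0 is valid.
    $claimsGooglebot = stripos($userAgent, 'Googlebot') !== false;
    if ($claimsGooglebot && $trustBot) {
        return false;
    }
    // Same rule as the original conditions: no cookie, or cookie set to "no".
    return !isset($cookies['legal']) || $cookies['legal'] === 'no';
}

// Usage inside the include file (commented out so the sketch has no side effects):
// if (needsAgeGate($_SERVER['HTTP_USER_AGENT'] ?? '', $_COOKIE, true)) {
//     header('Location: /index.php');
//     exit; // stop here so the protected page is not rendered
// }
```

Passing the user agent with "?? ''" avoids a notice when a client sends no User-Agent header at all.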
andyhoneycutt
Forum Contributor
Posts: 468
Joined: Wed Aug 27, 2008 10:02 am
Location: Idaho Falls

Re: Googlebot detection

Post by andyhoneycutt »

In fact, this:

Code:

...
if ( strstr($_SERVER['HTTP_USER_AGENT'], "Googlebot" ) !== FALSE ){  // condition (1)
...
Can simply be rewritten as:

Code:

...
if( strstr($_SERVER['HTTP_USER_AGENT'], "Googlebot") ) {
...
VladSun
DevNet Master
Posts: 4313
Joined: Wed Jun 27, 2007 9:44 am
Location: Sofia, Bulgaria

Re: Googlebot detection

Post by VladSun »

[wrong]No, it can't. Read the manual.
Also, read the manual about PHP operators - there is a big difference between == and === operators.[/wrong]
Last edited by VladSun on Thu Nov 20, 2008 2:25 pm, edited 2 times in total.
There are 10 types of people in this world, those who understand binary and those who don't
andyhoneycutt
Forum Contributor
Posts: 468
Joined: Wed Aug 27, 2008 10:02 am
Location: Idaho Falls

Re: Googlebot detection

Post by andyhoneycutt »

Correct, VladSun. My mistake for not clarifying: in this case it makes no difference to the outcome of his code whether it uses !== FALSE or just checks for any truthy value. I do not mean to imply that !== is the same as != (in the same way that == is not the same as ===).

-Andy
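
To make the distinction concrete, here is a small sketch comparing the truthiness test with the strict !== false test. The user-agent string is just an illustrative example, and the variable names are made up for this snippet. For strstr() with the needle "Googlebot" the two tests agree; they only diverge in corner cases, and with strpos() the strict test really is required.

```php
<?php
// Example user-agent string; treat it as illustrative, not authoritative.
$ua = 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)';

// strstr() returns the tail of the haystack from the first match, or false.
$tail = strstr($ua, 'Googlebot');
$miss = strstr('Mozilla/5.0 (Windows NT 10.0)', 'Googlebot');

// For this needle the truthiness test and the strict test agree.
$truthyHit  = (bool) $tail;        // true
$strictHit  = ($tail !== false);   // true
$truthyMiss = (bool) $miss;        // false
$strictMiss = ($miss !== false);   // false

// They can disagree when the returned tail is the string "0".
$zeroTail   = strstr('a0', '0');       // "0"
$zeroTruthy = (bool) $zeroTail;        // false
$zeroStrict = ($zeroTail !== false);   // true

// strpos() is where the strict test is essential: a match at offset 0
// is falsy, so only !== false detects it.
$pos       = strpos('Googlebot/2.1', 'Googlebot'); // 0
$posTruthy = (bool) $pos;                          // false
$posStrict = ($pos !== false);                     // true
```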
VladSun
DevNet Master
Posts: 4313
Joined: Wed Jun 27, 2007 9:44 am
Location: Sofia, Bulgaria

Re: Googlebot detection

Post by VladSun »

Ooops, sorry!
I thought it was substr() ...