I was wondering if anyone out there has experience detecting search engine crawlers and bots.
We have a site where we charge our members per click, but we don't want to charge them when it's just a crawler or bot accessing a link.
Keeping a good list of IP addresses to check against would require a lot of maintenance.
I could track how fast a visitor is crawling (links clicked per minute, or something similar); once a threshold is reached, we would credit back the clicks that had already been charged.
However, that assumes the bot sends the session ID, and I doubt many do.
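For illustration only, the threshold idea might be sketched like this. The function name clicksToCredit() and the choice of a 60-second sliding window are assumptions, and the click times would have to be grouped by something the bot cannot omit (the IP address, say) rather than by session ID.

```php
<?php
// Hypothetical helper: $clickTimes holds Unix timestamps (seconds) of the
// clicks recorded for ONE visitor. If the visitor ever made more than
// $maxPerMinute clicks inside any 60-second window, every click is returned
// so that it can be credited back; otherwise nothing is returned.
function clicksToCredit(array $clickTimes, int $maxPerMinute): array
{
    sort($clickTimes);
    $start = 0;
    $count = count($clickTimes);

    for ($end = 0; $end < $count; $end++) {
        // Shrink the window until it spans less than 60 seconds.
        while ($clickTimes[$end] - $clickTimes[$start] >= 60) {
            $start++;
        }
        // Number of clicks inside the current window.
        if ($end - $start + 1 > $maxPerMinute) {
            return $clickTimes; // flagged as a bot: credit everything back
        }
    }

    return []; // never exceeded the threshold
}
```

A slow, polite crawler would stay under any such threshold, so this would only catch the aggressive ones.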
I wonder what the pay-per-click ad companies do?
Any ideas would be greatly appreciated! Thanks!!!
Detecting search engine bots and crawlers
- feyd
While it is easy (for the more technically minded) to spoof, you could use the results of get_browser().
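A rough sketch of the get_browser() approach, assuming the browscap directive in php.ini points at a browscap.ini file (get_browser() does not work without it). The helper name looksLikeCrawler() and the fallback pattern are assumptions for illustration, not a complete bot list.

```php
<?php
// Hypothetical helper: guesses from the user-agent string alone whether the
// visitor is a crawler. The user-agent is trivially spoofable, so treat the
// answer as a hint, not as proof.
function looksLikeCrawler(string $userAgent): bool
{
    // Use browscap data when it is configured.
    if (ini_get('browscap')) {
        $info = get_browser($userAgent, true); // true = return an array
        if (is_array($info) && isset($info['crawler'])) {
            return filter_var($info['crawler'], FILTER_VALIDATE_BOOLEAN);
        }
    }

    // Fallback (assumption): a few substrings common in crawler user-agents.
    return (bool) preg_match('/bot|crawl|spider|slurp/i', $userAgent);
}
```

In a request handler this could be called as looksLikeCrawler($_SERVER['HTTP_USER_AGENT'] ?? '') before the click is charged.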
edit: You can also do a DNS lookup on the visitor's IP with gethostbyaddr() (and confirm the result with gethostbyname()), but that can be spoofed too, although it is a bit more technically challenging for many. It's certainly higher-hanging fruit than the user-agent.
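The DNS check could look something like the sketch below: a reverse lookup on the visitor's IP with gethostbyaddr(), then a forward lookup with gethostbynamel() to confirm that the hostname really resolves back to the same IP. The function name isVerifiedCrawlerIp(), the suffix list, and the injectable lookup callbacks (there so the logic can be tried without network access) are assumptions.

```php
<?php
// Hypothetical helper: true only if $ip reverse-resolves to a host under one
// of $suffixes AND that host forward-resolves back to $ip.
function isVerifiedCrawlerIp(
    string $ip,
    array $suffixes,
    ?callable $reverse = null,
    ?callable $forward = null
): bool {
    $reverse = $reverse ?? 'gethostbyaddr';   // IP -> hostname
    $forward = $forward ?? 'gethostbynamel';  // hostname -> list of IPs

    $host = $reverse($ip);
    // gethostbyaddr() returns the unmodified IP when the lookup fails
    // and false on malformed input.
    if ($host === false || $host === $ip) {
        return false;
    }
    $host = strtolower(rtrim($host, '.'));

    $matches = false;
    foreach ($suffixes as $suffix) {
        $suffix = strtolower($suffix);
        // Suffixes start with a dot so "evilgooglebot.com" does not match.
        if (substr($host, -strlen($suffix)) === $suffix) {
            $matches = true;
            break;
        }
    }
    if (!$matches) {
        return false;
    }

    // Forward confirmation: the claimed hostname must map back to the IP.
    $ips = $forward($host);
    return is_array($ips) && in_array($ip, $ips, true);
}
```

For example, isVerifiedCrawlerIp($_SERVER['REMOTE_ADDR'], ['.googlebot.com', '.google.com']). DNS lookups are slow, so the result would probably need to be cached per IP.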