use iptables to prevent webscraping?
Posted: Thu Jun 18, 2015 11:24 am
Hello,
I am just starting out with a php-based site which will (hopefully!) have a large database of professionals. The site is hosted on iPage. I want to make sure that no one can web-scrape the info in my database (by walking through all the indexed php pages: index.php?id=1,2,3,4 etc.). I do want search engines (e.g. Google) to be able to access/index my site. Does anyone have any good suggestions on how to accomplish this? I see that iptables can limit connections to n per minute, but I don't see how to use it for a site hosted on iPage, nor how to let the big search engines through without being blocked.
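For reference, the kind of iptables rate limit I've been reading about looks roughly like this. This is only a sketch, and it assumes root access on a dedicated server or VPS, which shared hosting like iPage does not give you; the port and the 30-hits-per-minute threshold are made-up example values:

```shell
# Sketch: throttle rapid scraping of port 80 with the "recent" match.
# Track every new connection per source IP under the name HTTP...
iptables -A INPUT -p tcp --dport 80 -m state --state NEW \
         -m recent --set --name HTTP
# ...and drop a source IP that opens more than 30 new connections
# within a 60-second window.
iptables -A INPUT -p tcp --dport 80 -m state --state NEW \
         -m recent --update --seconds 60 --hitcount 30 --name HTTP -j DROP
```

Even on a server where this is possible, whitelisting search engines at the firewall is awkward: Google recommends verifying its crawlers by reverse-DNS lookup rather than by fixed IP ranges, and iptables can't do that check, which is part of why people usually handle this at the application layer instead.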
Thanks for any help,
Dan