Using HTACCESS to block variable IP bots...

Posted: Thu Feb 12, 2015 7:33 am
by Wolf_22
I'm trying to cut down on spammers who keep making junk requests to my site, using a different IP for each request. The basic access log entry pattern I'm seeing from these requests is as follows:
<IP ADDRESS> - - [<DATE / TIMESTAMP>] "GET /?q=node/add HTTP/1.1" 403 5507 "<WEBSITE>" "Mozilla/5.0 (Windows NT 6.3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36"
Nine times out of ten, these requests use the above flavor of WebKit / Safari user agent, and they almost always come from a different IP address, which makes them difficult to block by IP.

What I tried to do is as follows:
RewriteCond %{HTTP_COOKIE} !cookievar
RewriteCond %{REQUEST_FILENAME} \.(gif|jpe?g|png|js|css|swf|php|ico|txt|pdf|xml)$ [NC]
RewriteRule .* - [L,co=cookievar:true:%{HTTP:Host}:86400]
RewriteCond %{HTTP_COOKIE} !cookievar
RewriteCond %{THE_REQUEST} (user\/register|node\/add)
RewriteRule .* - [F]
I'm not very good with HTACCESS code (as you may be able to tell from the above), but my intention here was to force any browser coming to the site to store a cookie once it requests one of my site assets, and then use that cookie to validate whether the visitor is an actual user. If they pass that check, I let them through to the website. Otherwise, I stop them before they can use any server resources. It's my understanding that blocking a user at the HTACCESS level is akin to stopping them at the server level (rather than in the app itself). So the benefit here would be that these requests stop leeching CPU / RAM from the server, and the spammers get stopped as well.

Unfortunately, my logs indicate that it's not working the way I hoped. Either the code above doesn't work, or the requests come from automated but otherwise legitimate browsers that do store cookies. I'm hoping someone here might have some suggestions or ideas. What I'd love to do is block every request that can't present my cookie, so that GET requests to the relative locations user/register or node/add are completely inaccessible without it. This won't block people who automate real browsers, but I would tackle that later on.

Insights would be appreciated.

Re: Using HTACCESS to block variable IP bots...

Posted: Thu Feb 12, 2015 1:46 pm
by requinix
It should be easy to check if they're hitting other pages/resources: look for other requests from the same IP address around that time.
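
A rough sketch of that check in Python, in case it helps. It assumes the access log is in Apache's combined format like the entry quoted above; the function names, the 5-minute window, and the choice of user/register and node/add as the "suspect" paths are my own assumptions for illustration, not anything special about your setup.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Combined Log Format prefix: IP, identity, user, [timestamp], "request", status
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "(?P<request>[^"]*)" \d{3} '
)
# Paths the spam requests are aimed at (assumed from the rules in the first post)
SUSPECT = re.compile(r'user/register|node/add')


def parse(line):
    """Return (ip, timestamp, request line) or None if the line doesn't match."""
    m = LOG_LINE.match(line)
    if m is None:
        return None
    when = datetime.strptime(m.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
    return m.group("ip"), when, m.group("request")


def companions(lines, window=timedelta(minutes=5)):
    """For each IP that hit a suspect path, list its other requests near that time."""
    by_ip = defaultdict(list)
    for line in lines:
        parsed = parse(line)
        if parsed:
            ip, when, request = parsed
            by_ip[ip].append((when, request))

    result = {}
    for ip, hits in by_ip.items():
        suspect_times = [when for when, req in hits if SUSPECT.search(req)]
        if not suspect_times:
            continue  # this IP never touched the suspect paths
        result[ip] = [
            req for when, req in hits
            if not SUSPECT.search(req)
            and any(abs(when - t) <= window for t in suspect_times)
        ]
    return result
```

Calling companions(open("access.log")) (the file name is a placeholder) gives a dict keyed by IP. An IP with an empty list only ever requested the suspect paths in that window, which would suggest it never fetched an asset that could have set the cookie. An IP with images, CSS or JS in its list looks more like a real or automated browser.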