robots hacking and posting form data resulting in spam


timvw
DevNet Master
Posts: 4897
Joined: Mon Jan 19, 2004 11:11 pm
Location: Leuven, Belgium

Post by timvw »

Maugrim_The_Reaper wrote: Generally, from filtering spam on my own blog without resorting to desperate measures like CAPTCHAs (unless the post is a certain age), a few filters watching URL counts (how many URLs per comment), author and body terms, etc. work well.
Some spambots simply post something like "Hey, I like your site..." and then (ab)use the author URL. That might make sense, since I don't expect search-engine crawlers to see a difference between a URL in my "post content" and a URL in the "author div".
Maugrim_The_Reaper wrote: So too does having some form of mechanism for forcing a delay between individual comments - spambots generally try posting dozens of comments per second, if not more.
Some are smart enough to wait a while. But they do come back, day after day (well, until I redirect them to http://{$_SERVER['REMOTE_ADDR']} ;))
Maugrim_The_Reaper wrote: Relying on IPs is not going to be very reliable - a spammer can switch proxies as often as you ban IPs. Many will never even post from the same IP to the same site if they can help it.
In my experience they do re-use IPs from the same netblock. And too bad for open proxies, they're unwelcome ;)
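
To illustrate the netblock observation above, here is a minimal sketch of an IPv4 range check. The function names `ip_in_netblock` and `is_banned_ip` and the example ranges (reserved documentation blocks) are assumptions for illustration only, not anything from the posts above. IPv6 is not handled and a 64-bit PHP build is assumed.

```php
<?php
// Hypothetical helper: true when $ip lies inside the IPv4 block $cidr,
// e.g. "192.0.2.0/24". A bare address is treated as a /32.
function ip_in_netblock(string $ip, string $cidr): bool
{
    [$network, $bits] = explode('/', $cidr, 2) + [1 => '32'];
    $ipLong  = ip2long($ip);
    $netLong = ip2long($network);
    if ($ipLong === false || $netLong === false) {
        return false; // not a valid IPv4 address
    }
    if (!preg_match('/^\d{1,2}$/', $bits) || (int) $bits > 32) {
        return false; // malformed prefix length
    }
    $bits = (int) $bits;
    // Build the netmask; a /0 block matches every address.
    $mask = $bits === 0 ? 0 : ((~0 << (32 - $bits)) & 0xFFFFFFFF);
    return ($ipLong & $mask) === ($netLong & $mask);
}

// Hypothetical helper: true when $ip falls in any of the banned blocks.
function is_banned_ip(string $ip, array $bannedBlocks): bool
{
    foreach ($bannedBlocks as $cidr) {
        if (ip_in_netblock($ip, $cidr)) {
            return true;
        }
    }
    return false;
}
```

In a comment handler this could be called as `is_banned_ip($_SERVER['REMOTE_ADDR'], $blocks)` before accepting the post. Banning whole blocks can also lock out legitimate users, so it is probably best kept to ranges that have misbehaved repeatedly.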
deeppak
Forum Commoner
Posts: 27
Joined: Thu Apr 06, 2006 6:31 am

at last detected crawler

Post by deeppak »

Hi all, at last I have identified this spammer. The spammer's hostname is:
sv-crawlfw4.looksmart.com

Can anyone please tell me how to stop it from crawling my site? Should I disallow it in robots.txt? Please also tell me any other method to stop it from spamming.

Please be quick.
I am already worn out fighting this dreaded daemon. The fight is still on, and it will continue until I stop this stupid crawler from spamming my site. I thank all of you for contributing to this; I really appreciate all of your guidance.

Thanks in advance
AKA Panama Jack
Forum Regular
Posts: 878
Joined: Mon Nov 14, 2005 4:21 pm

Re: at last detected crawler

Post by AKA Panama Jack »

Actually, most robot spammers ignore the robots.txt file.
deeppak
Forum Commoner
Posts: 27
Joined: Thu Apr 06, 2006 6:31 am

hey then what is the solution

Post by deeppak »

Come on, is there not even a single way to stop it, even after getting its URL?

Thanks in advance
Maugrim_The_Reaper
DevNet Master
Posts: 2704
Joined: Tue Nov 02, 2004 5:43 am
Location: Ireland

Post by Maugrim_The_Reaper »

Create a list of banned hosts, check for them on every comment post, and refuse to post the comment when one matches.
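
A minimal sketch of such a banned-host check, assuming the list stores domain suffixes. The function name `is_banned_host` is hypothetical, and the only list entry used here is the domain of the hostname reported earlier in the thread.

```php
<?php
// Hypothetical helper: true when $hostname equals a banned domain
// or is a subdomain of one. Matching is case-insensitive.
function is_banned_host(string $hostname, array $bannedSuffixes): bool
{
    $hostname = strtolower(rtrim($hostname, '.'));
    foreach ($bannedSuffixes as $suffix) {
        $suffix = strtolower($suffix);
        $dotted = '.' . $suffix;
        if ($hostname === $suffix || substr($hostname, -strlen($dotted)) === $dotted) {
            return true;
        }
    }
    return false;
}

// Possible use in a comment handler (kept as a comment so the sketch
// runs without a web request):
//   $host = gethostbyaddr($_SERVER['REMOTE_ADDR']);
//   if ($host !== false && is_banned_host($host, ['looksmart.com'])) {
//       // refuse to store the comment
//   }
```

Reverse lookups can be slow and the result is controlled by whoever owns the IP range, so this is probably better treated as one signal among several than as a hard guarantee.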
timvw wrote: Some spambots simply post something like "Hey, I like your site..." and then (ab)use the author URL. That might make sense, since I don't expect search-engine crawlers to see a difference between a URL in my "post content" and a URL in the "author div".
So check the author URL... it's already done on my own blog.
timvw wrote: Some are smart enough to wait a while. But they do come back, day after day
True, but the key word in both your quotes is "some". The majority don't care about such deliberate measures; they're out to get 1 in every 1000 or more comments actually onto a blog which either isn't fully filtered or fails to flag their comment as spam.
timvw wrote: In my experience they do re-use IPs from the same netblock. And too bad for open proxies, they're unwelcome.
The problem here is that the last time I blocked open proxies, I got 6 emails inside a day from legitimate users who thought comments had been disabled or were broken. I think of it like blocking IPs - you end up alienating legitimate users. Which is also why I refuse to ever use CAPTCHAs on anything, full stop - there are at least two blind people who read my blog that I know of.

Speaking from personal experience (and not even remotely saying it reflects anyone else's, or even reality for that matter ;)), spammers are unimaginative folk. They re-use the same or similar tactics and messages over and over again. Some inevitably find a way past filters, but usually it's a simple matter to adapt filters, or at least get the most suspicious comments listed for review. I get maybe 200-1500 spam attempts during a week - last weekend saw a massive torrent of 850, for example - and only had 3 potential spams make it through. 2 were listed for review, and the 3rd appeared to be nothing but an "I like your blog." linking to Google of all places...

Does Google spam blogs? ;)
deeppak
Forum Commoner
Posts: 27
Joined: Thu Apr 06, 2006 6:31 am

at last wanted to implement CAPTCHA

Post by deeppak »

I want to implement a CAPTCHA on my site. Please let me know where to start, how to check whether GD is installed on the server, and what I should look up to start from the very beginning. Let me know the best solution, since I want to look professional like Yahoo and others that implement CAPTCHAs, and keep in mind that I am a newbie.

Please be quick in answering, because I have already wasted so much time fighting this spammer.

Cheers,
Deeppak Gupta
Maugrim_The_Reaper
DevNet Master
Posts: 2704
Joined: Tue Nov 02, 2004 5:43 am
Location: Ireland

Post by Maugrim_The_Reaper »

Check PEAR - I believe it has a CAPTCHA class.
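
As for checking whether GD is installed, a small script along these lines should tell you. The helper name `gd_is_available` is made up for this sketch; `extension_loaded`, `function_exists` and `gd_info` are standard PHP functions.

```php
<?php
// Hypothetical helper: true when the GD extension is loaded and
// PNG output (commonly used for CAPTCHA images) is supported.
function gd_is_available(): bool
{
    return extension_loaded('gd') && function_exists('imagepng');
}

if (gd_is_available()) {
    $info = gd_info(); // array describing the installed GD library
    echo 'GD is installed, version: ', $info['GD Version'], PHP_EOL;
} else {
    echo 'GD is not available on this server.', PHP_EOL;
}
```

Running `phpinfo()` and looking for a "gd" section gives the same answer without writing any code.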
aerodromoi
Forum Contributor
Posts: 230
Joined: Sun May 07, 2006 5:21 am

Re: at last wanted to implement CAPTCHA

Post by aerodromoi »

In case you're talking about those infamous link lists, why don't you use

substr_count(strtolower($string), "http://");
to decide whether an entry needs review or not.

Joe Blog won't post more than two or three links in his entry, so anything above that should be spam.
It's not 100% foolproof, but combined with a time-out of two or three minutes and a blacklist of undesired words it helps
to keep spam at bay.
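
Put together, the three checks could look roughly like the sketch below. The function names, the limit of 3 links, the 120-second time-out and the word list are all assumptions picked to match the numbers above; tune them for your own site. Unlike the one-liner, this version also counts https:// links.

```php
<?php
// Illustrative thresholds, not fixed rules.
const MAX_LINKS = 3;                    // more links than this triggers review
const MIN_SECONDS_BETWEEN_POSTS = 120;  // time-out between two posts

// Count http:// and https:// links, case-insensitively.
function count_links(string $text): int
{
    return (int) preg_match_all('#https?://#i', $text);
}

// Hypothetical helper: true when an entry should be held for review.
// $lastPostTime and $now are Unix timestamps for the same poster.
function needs_review(string $text, int $lastPostTime, int $now, array $badWords): bool
{
    if (count_links($text) > MAX_LINKS) {
        return true; // too many links
    }
    if ($now - $lastPostTime < MIN_SECONDS_BETWEEN_POSTS) {
        return true; // posted again too quickly
    }
    foreach ($badWords as $word) {
        if (stripos($text, $word) !== false) {
            return true; // contains a blacklisted word
        }
    }
    return false;
}
```

Holding suspicious entries for review rather than rejecting them outright keeps false positives recoverable.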

aerodromoi

PS: I hope you can settle your spam problem.