Posted: Tue Jan 16, 2007 1:12 am
by matthijs
With a blacklist like that you are going to lose the battle. You'll have a hard time updating the list fast enough and manually deleting the messages that get through.

If I were you I would take a look at existing solutions. For example, the WordPress system mentioned before has many anti-spam plugins.

One central service is Akismet. If you sign up (free) you get an API key that gives you access to the service. There are several PHP classes you can use in your scripts to talk to Akismet; it's very easy. Akismet checks every message against a huge database of filters and rules. That database is self-learning and is updated by the tens of thousands of people using WordPress. When someone sees that a spam message got through, they mark it as spam, and when enough people do that the Akismet rules are improved.
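
To give an idea of what such a check involves, here is a minimal sketch that talks to Akismet's REST `comment-check` call directly instead of going through one of the existing PHP classes. The function names `akismetBuildRequest()` and `akismetIsSpam()`, the key and the site URL are made up for illustration, and the endpoint and field names should be checked against Akismet's current API documentation before relying on them.

```php
<?php
// Sketch only: function names, key and site URL are hypothetical placeholders.

// Build the URL and form-encoded body for a comment-check request.
function akismetBuildRequest(string $apiKey, string $siteUrl, string $ip, string $userAgent, string $message): array
{
    return [
        'url'  => "https://{$apiKey}.rest.akismet.com/1.1/comment-check",
        'body' => http_build_query([
            'blog'            => $siteUrl,    // the site the message was posted on
            'user_ip'         => $ip,         // poster's IP address
            'user_agent'      => $userAgent,  // poster's browser user agent
            'comment_type'    => 'comment',
            'comment_content' => $message,
        ]),
    ];
}

// Send the request. Returns true for spam, false for ham,
// or null when the answer is unusable (network error, invalid key).
function akismetIsSpam(array $request): ?bool
{
    $context = stream_context_create([
        'http' => [
            'method'  => 'POST',
            'header'  => "Content-Type: application/x-www-form-urlencoded\r\n",
            'content' => $request['body'],
            'timeout' => 5,
        ],
    ]);
    $response = @file_get_contents($request['url'], false, $context);
    if ($response === false) {
        return null;
    }
    $response = trim($response);
    if ($response === 'true') {
        return true;
    }
    if ($response === 'false') {
        return false;
    }
    return null;
}
```

In a shoutbox script you would build the request from `$_SERVER['REMOTE_ADDR']`, `$_SERVER['HTTP_USER_AGENT']` and the posted text, and only store the message when `akismetIsSpam()` returns `false`. What to do on `null` (post it anyway, or hold it for review) is up to you.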

I have used it on a couple of sites now and the results are amazing. Maybe a few spam posts a month get through. No false positives so far. From one of my sites:
Akismet has caught 9,432 spam for you since you first installed it.

You have no spam currently in the queue. Must be your lucky day. :)

Posted: Tue Jan 16, 2007 1:20 am
by Nodda4me
Updating what?? My script is automatic. Once it finds a message that is spam, it adds it to the ban list (SQL) with date, time, message, and IP. The message is not posted; the script calls die() right after the IP is added.
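
The ban-list step described above could look roughly like this. It is only a sketch: the table name `spam_bans`, its columns and the functions `recordBan()` and `isBanned()` are hypothetical, date and time are stored as one timestamp, and an in-memory SQLite database stands in for whatever SQL server the real script uses.

```php
<?php
// Sketch only: table, columns and function names are hypothetical.

// Store the offending message together with a timestamp and the poster's IP.
function recordBan(PDO $db, string $ip, string $message): void
{
    $stmt = $db->prepare(
        'INSERT INTO spam_bans (banned_at, ip, message) VALUES (:banned_at, :ip, :message)'
    );
    $stmt->execute([
        ':banned_at' => date('Y-m-d H:i:s'),
        ':ip'        => $ip,
        ':message'   => $message,
    ]);
}

// Check whether an IP already appears in the ban list.
function isBanned(PDO $db, string $ip): bool
{
    $stmt = $db->prepare('SELECT COUNT(*) FROM spam_bans WHERE ip = :ip');
    $stmt->execute([':ip' => $ip]);
    return (int) $stmt->fetchColumn() > 0;
}

// In-memory SQLite database so the sketch runs without a database server.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE spam_bans (banned_at TEXT, ip TEXT, message TEXT)');
```

In the actual script, `recordBan()` would run just before the `die()` call, and `isBanned()` at the top of the page that accepts new messages. The prepared statements keep the spam text itself from being interpreted as SQL.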

Posted: Tue Jan 16, 2007 1:33 am
by matthijs
As far as I know, IPs aren't reliable. What if a legitimate user has an IP that was previously used by a spammer?

Posted: Tue Jan 16, 2007 4:36 am
by Nodda4me
OK, someone messaged me a tip about checking whether the person/bot has JavaScript enabled. I tried it and it didn't work at all.

I'm not banning anymore, but I did add a message telling the person to rewrite their message.

Code:

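// $ShoutInfo holds the submitted shout text (set earlier in the script).
// Note: substr_count() is case-sensitive, so "HTTP" or "thank" will not match.
$httpCount  = substr_count($ShoutInfo, "http");
$wwwCount   = substr_count($ShoutInfo, "www.");
$linkCount  = substr_count($ShoutInfo, "<a");
$phpCount   = substr_count($ShoutInfo, "<?");
$niceCount  = substr_count($ShoutInfo, "Nice site. Thank you");
$coolCount  = substr_count($ShoutInfo, "Cool site. Thank you");
$thankCount = substr_count($ShoutInfo, "Thank");

// Later checks overwrite earlier ones, so the last matching rule sets the reason.
$Reason = "Unknown";
if ($httpCount > 2 || $wwwCount > 2) {
	// More than two URLs in one message
	$Reason = "URL Spammer";
}
if ($linkCount > 1) {
	// More than one HTML link tag
	$Reason = "URL Syntax";
}
if ($phpCount > 0) {
	// Any PHP opening tag
	$Reason = "PHP Syntax";
}
if ($niceCount > 0 || $coolCount > 0) {
	// Stock phrases used by spam bots
	$Reason = "spammer or bot";
}
if ($thankCount > 0 && $httpCount > 0 && $linkCount > 0) {
	// "Thank" on its own is fine; combined with a URL and a link tag it is not
	$Reason = "spammer or bot";
}

// Reject the message and tell the poster why.
if ($Reason != "Unknown") {
	die("Your message contains spam content. Please rewrite your message.<br><br>$Reason");
}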
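
The same rules can also be wrapped in a function that returns the reason instead of calling die(), which makes it easier to try them against sample messages. This is just a sketch of that idea; the name `spamReason()` is made up, and the thresholds are copied from the code above.

```php
<?php
// Sketch only: spamReason() is a hypothetical wrapper around the rules above.
// Returns the rejection reason, or null when no rule rejects the message.
function spamReason(string $text): ?string
{
    $http   = substr_count($text, 'http');
    $www    = substr_count($text, 'www.');
    $links  = substr_count($text, '<a');
    $php    = substr_count($text, '<?');
    $canned = substr_count($text, 'Nice site. Thank you')
            + substr_count($text, 'Cool site. Thank you');
    $thank  = substr_count($text, 'Thank');

    // Same order as the original script: the last matching rule wins.
    $reason = null;
    if ($http > 2 || $www > 2) {
        $reason = 'URL Spammer';
    }
    if ($links > 1) {
        $reason = 'URL Syntax';
    }
    if ($php > 0) {
        $reason = 'PHP Syntax';
    }
    if ($canned > 0) {
        $reason = 'spammer or bot';
    }
    if ($thank > 0 && $http > 0 && $links > 0) {
        $reason = 'spammer or bot';
    }
    return $reason;
}
```

In the script itself this would become `$Reason = spamReason($ShoutInfo);`, followed by the die() call whenever the result is not null. Keep in mind that the matching is still case-sensitive, so a bot writing "nice site. thank you" in lowercase would slip through.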