simple word filter for comment system

Sindarin
Forum Regular
Posts: 521
Joined: Tue Sep 25, 2007 8:36 am
Location: Greece

simple word filter for comment system

Post by Sindarin »

I am trying to implement a simple word filter for common bad words in my comment system, but I can't block variations that mix uppercase and lowercase letters.
For example, I have 'badword1' listed, but the user can input 'BADWORD1' or 'bAdwOrd1' and bypass it. How can I make the code below case insensitive?

Code:

/* REPLACE BAD WORDS WITH ASTERISKS */
function word_filter($str)
{
    $bad_words = array(
        "badword1", "badword2", "badword3", "badword4"
    );

    $replacements = array(
        "**********"
    );

    for ($i = 0; $i < sizeof($bad_words); $i++) {
        srand((double)microtime() * 1000000);
        $rand_key = (rand() % sizeof($replacements));
        $str = eregi_replace($bad_words[$i], $replacements[$rand_key], $str);
    }
    return $str;
}
watson516
Forum Contributor
Posts: 198
Joined: Mon Mar 20, 2006 9:19 pm
Location: Hamilton, Ontario

Re: simple word filter for comment system

Post by watson516 »

How about converting all of the text to lowercase before you check?
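One way to read this suggestion, as a minimal sketch: lowercase a copy of the comment and test that copy, so the text the user typed is left untouched. The function name contains_bad_word() is hypothetical and not from the thread, and strtolower() is only reliable here for ASCII letters.

```php
<?php
// Hypothetical helper (not from the thread): reports whether any listed
// word occurs in the text, ignoring ASCII letter case.
function contains_bad_word($str, array $bad_words)
{
    // Lowercase a copy only; the caller's original text is not modified.
    $lower = strtolower($str);
    foreach ($bad_words as $word) {
        // strpos() returns false when the word is absent, and may return 0
        // for a match at the start, hence the strict comparison.
        if (strpos($lower, strtolower($word)) !== false) {
            return true;
        }
    }
    return false;
}

var_dump(contains_bad_word("This has bAdwOrd1 inside", array("badword1")));
```

This only detects a word. Replacing it while keeping the rest of the comment's original casing needs a case-insensitive replace function instead.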
nor0101
Forum Commoner
Posts: 53
Joined: Thu Jan 15, 2009 12:06 pm
Location: Wisconsin

Re: simple word filter for comment system

Post by nor0101 »

You might also want to check out str_ireplace(). It's less computationally expensive than using a regex. See http://us3.php.net/manual/en/function.str-ireplace.php for details.
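A minimal sketch of how the original word_filter() might look with str_ireplace(), assuming a single replacement string is enough. Because str_ireplace() accepts an array of search strings, no loop is required.

```php
<?php
// Sketch of the thread's word_filter() rewritten around str_ireplace().
function word_filter($str)
{
    $bad_words = array("badword1", "badword2", "badword3", "badword4");
    $replacement = "**********";

    // An array of search strings with one replacement string replaces
    // every listed word in a single call, ignoring ASCII letter case.
    return str_ireplace($bad_words, $replacement, $str);
}

echo word_filter("BADWORD1 and bAdwOrd2"), "\n"; // prints: ********** and **********
```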
Sindarin
Forum Regular
Posts: 521
Joined: Tue Sep 25, 2007 8:36 am
Location: Greece

Re: simple word filter for comment system

Post by Sindarin »

nor0101 wrote:You might also want to check out str_ireplace(). It's less computationally expensive than using a regex. See http://us3.php.net/manual/en/function.str-ireplace.php for details.
Good, this worked nicely. Thanks.

Code:

for ($i = 0; $i < sizeof($bad_words); $i++) {
    // Replace the current bad word, ignoring case.
    $str = str_ireplace($bad_words[$i], $replacements[0], $str);
}
return $str;
}
Sindarin
Forum Regular
Posts: 521
Joined: Tue Sep 25, 2007 8:36 am
Location: Greece

Re: simple word filter for comment system

Post by Sindarin »

I just noticed, it doesn't work correctly for non-English characters. Why is that?
It can detect e.g. "Κακή" but not "Κακη", "ΚΑΚΗ" or variations like "κΑΚη".
mattpointblank
Forum Contributor
Posts: 304
Joined: Tue Dec 23, 2008 6:29 am

Re: simple word filter for comment system

Post by mattpointblank »

Because they're not the same characters? They might have the same meaning linguistically, but their ASCII/Unicode/UTF-8 encodings are different.
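As far as I know, str_ireplace() does not fold case for multibyte UTF-8 letters such as Greek, which would explain the uppercase variants slipping through. A possible workaround, sketched below, is preg_replace() with the "i" and "u" flags. The function name word_filter_utf8() is hypothetical, and the sketch assumes both the comment and the word list are valid UTF-8. An accented letter such as "ή" and the plain "η" are still distinct characters, so both spellings are listed.

```php
<?php
// Hypothetical helper (not from the thread): Unicode-aware,
// case-insensitive replacement. Assumes all strings are valid UTF-8.
function word_filter_utf8($str, array $bad_words, $replacement = '**********')
{
    foreach ($bad_words as $word) {
        // preg_quote() escapes regex metacharacters inside the word.
        // "i" ignores case; "u" makes PCRE treat pattern and subject as
        // UTF-8, so case folding also covers non-ASCII letters.
        $pattern = '/' . preg_quote($word, '/') . '/iu';
        $result = preg_replace($pattern, $replacement, $str);
        // preg_replace() returns null on error (for example invalid UTF-8);
        // in that case the text is left as it was.
        if ($result !== null) {
            $str = $result;
        }
    }
    return $str;
}

// Accented and unaccented spellings are listed separately.
$words = array('κακή', 'κακη');
echo word_filter_utf8('ΚΑΚΗ and κΑΚη and Κακή', $words), "\n";
```

Listing every accent variant by hand does not scale well. Stripping accents before matching is another option, but that needs extra normalization that this sketch leaves out.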
mickeyunderscore
Forum Contributor
Posts: 129
Joined: Sat Jan 31, 2009 9:00 am
Location: UK

Re: simple word filter for comment system

Post by mickeyunderscore »

Just thought I'd add a note on for loops. Your for loop at the moment:

Code:

for ($i = 0; $i < sizeof($bad_words); $i++) {

This sets $i to 0, then at the beginning of each iteration calls sizeof() on your array of bad words and compares $i to the result; at the end of each iteration it increments $i.

It would be better for you to use:

Code:

for ($i = 0, $size = sizeof($bad_words); $i < $size; $i++) {

This way sizeof() is called only once, rather than once per iteration.
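Another option, shown here only as a sketch with a made-up input string, is foreach, which walks the array directly and needs neither a counter nor a sizeof() call.

```php
<?php
$bad_words = array("badword1", "badword2", "badword3", "badword4");
$str = "BADWORD1 and badword3"; // sample input, not from the thread

// foreach hands over each element in turn, so there is no index to
// manage and no array size to look up.
foreach ($bad_words as $bad_word) {
    $str = str_ireplace($bad_word, "**********", $str);
}

echo $str, "\n"; // prints: ********** and **********
```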
Sindarin
Forum Regular
Posts: 521
Joined: Tue Sep 25, 2007 8:36 am
Location: Greece

Re: simple word filter for comment system

Post by Sindarin »

mattpointblank wrote:Because they're not the same characters..? They might have the same meaning linguistically, but their ascii/unicode/UTF-8 etc symbol will be different.
So there is no way to check for those combinations easily?
mickeyunderscore
Forum Contributor
Posts: 129
Joined: Sat Jan 31, 2009 9:00 am
Location: UK

Re: simple word filter for comment system

Post by mickeyunderscore »

Sindarin wrote:So there is no way to check for those combinations easily?
Writing your own word filter would be very difficult if you wanted it to be effective. I'd look into an open-source solution; there are bound to be some around.
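One example of why this gets difficult: plain substring replacement also hits harmless words that happen to contain a listed word. The sketch below matches whole words only, using regex word boundaries. The function name word_filter_whole_words() is hypothetical, and the sketch assumes ASCII words. Deliberate misspellings or inserted spaces would still get through.

```php
<?php
// Hypothetical helper (not from the thread): replaces a listed word only
// when it stands alone, not when it is part of a longer word.
function word_filter_whole_words($str, array $bad_words, $replacement = '**********')
{
    foreach ($bad_words as $word) {
        // \b matches at a word boundary; "i" ignores ASCII letter case.
        $pattern = '/\b' . preg_quote($word, '/') . '\b/i';
        $str = preg_replace($pattern, $replacement, $str);
    }
    return $str;
}

echo word_filter_whole_words("A BadWord1 is not badword1234", array("badword1")), "\n";
// prints: A ********** is not badword1234
```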