Hi,
I'm trying to filter out spam on my forum. I added two tables to my database: one called spamwords and another called spamAlphabet.
The spamAlphabet contains records like:
a aàáâåãäæ\@
b b6
c cç6
d db
e eéèêë3
f f4
g qgp9
h h4
i iìíîï¡1\|l\!
j ji1
which I use to construct a regular expression in a SQL stored procedure.
Using the above substitution, the word "tour" becomes:
/[t7\+][\\W]*[o0óòôøõö(\(\))\*\.][\\W]*[\\W]*[r]/gi
Essentially, I would like to find any form or shape of the word "tour", even if there is punctuation or white space between the letters.
Is the resulting regex in the correct syntax to trap this sort of occurrence? If not, please tell me how I should change it.
Thanks in advance
Joshua
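
To make the substitution idea concrete, here is a minimal PHP sketch of building such a pattern. It is only an illustration under stated assumptions: the `$spamAlphabet` array stands in for the spamAlphabet table, its `u` row is invented because the excerpt above stops at `j`, and `buildSpamPattern` is a hypothetical helper, not the stored procedure from the post. Three things differ from the pattern as posted. It has a class for every letter (the posted pattern has none for `u`, which is why two separators sit back to back). The separator is written with a single backslash, because if the doubled backslash reaches the regex engine unchanged, `[\\W]` matches a literal backslash or the letter W rather than a non-word character. PHP's PCRE functions also have no `g` modifier, so only `i` and `u` are used.

```php
<?php
// Hypothetical stand-in for the spamAlphabet table: letter => look-alike characters.
// The 'u' row is an assumption; it does not appear in the excerpt above.
$spamAlphabet = [
    't' => 't7+',
    'o' => 'o0óòôøõö()*.',
    'u' => 'uùúûü',
    'r' => 'r',
];

// Build a PCRE pattern that matches $word with any look-alike substitution
// and any run of non-word characters (or underscores) between letters.
// Assumes $word is lowercase UTF-8, matching the keys of $alphabet.
function buildSpamPattern(string $word, array $alphabet): string
{
    $classes = [];
    // Split the word into UTF-8 characters.
    foreach (preg_split('//u', $word, -1, PREG_SPLIT_NO_EMPTY) as $letter) {
        // Fall back to the letter itself when the table has no row for it.
        $variants = $alphabet[$letter] ?? $letter;
        // preg_quote escapes characters such as + * . ( ) so they are literal.
        $classes[] = '[' . preg_quote($variants, '/') . ']';
    }
    // i = case-insensitive, u = treat pattern and subject as UTF-8.
    return '/' . implode('[\W_]*', $classes) . '/iu';
}

$pattern = buildSpamPattern('tour', $spamAlphabet);
var_dump(preg_match($pattern, 'Cheap T.0-u r deals') === 1); // bool(true)
```

A pattern built this way also matches inside longer words such as "detour" or "tourist", so it may need word-boundary checks or a review step before posts are rejected outright.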
forum post validation regex
You could strip every invalid character from the post first and then run your word check against the cleaned string. For example, this function removes all characters that are not letters or numbers. I'm not sure whether this would work in your application.
Untested...
function LettersAndNumbersOnly($String) {
    // Characters that are allowed to stay in the string.
    $WhiteList = array('0','1','2','3','4','5','6','7','8','9',
        'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z',
        'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z');
    // Start from an empty string so the function always returns a string.
    $NewString = '';
    $LengthOfString = strlen($String);
    // Copy each whitelisted character across; everything else is dropped.
    for ($CharacterToCheck = 0; $CharacterToCheck < $LengthOfString; $CharacterToCheck++) {
        if (in_array($String[$CharacterToCheck], $WhiteList, true)) {
            $NewString .= $String[$CharacterToCheck];
        }
    }
    return $NewString;
}
// Run the regex for the word or phrase against the cleaned string here.
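
The strip-then-match idea above could be used roughly as follows. This is a sketch, not tested against a real forum: `stripToAlphanumerics` is a hypothetical one-line equivalent of the whitelist loop, and `$tourPattern` is an assumed hand-written pattern covering only the digit look-alikes for "tour".

```php
<?php
// Hypothetical one-line equivalent of the whitelist loop:
// remove every byte that is not 0-9, a-z or A-Z.
function stripToAlphanumerics(string $text): string
{
    return preg_replace('/[^0-9a-zA-Z]/', '', $text) ?? '';
}

// Assumed pattern for "tour"; no separators are needed once the
// punctuation and white space have been stripped.
$tourPattern = '/[t7][o0]u[r]/i';

$post = 'Book your T.0.U.R today!!';
$normalized = stripToAlphanumerics($post);              // "BookyourT0URtoday"
$isSpam = preg_match($tourPattern, $normalized) === 1;  // true
var_dump($isSpam);
```

Two trade-offs are worth keeping in mind. Stripping also removes symbol and accented look-alikes such as `+`, `@` or `ó`, so those substitutions would have to be mapped back to plain letters before stripping. Removing spaces also joins neighbouring words, so "got our" becomes "gotour" and would match as well.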