
Needing to clean a string/array of BAD Characters.

Posted: Fri Feb 17, 2006 8:28 am
by jclarkkent2003
How can I clean a string and/or an array of any invalid characters?

[quote]å

Posted: Fri Feb 17, 2006 11:54 am
by alix
Take a look at the str_replace function.
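
For a short, known list of unwanted characters, str_replace accepts an array of search strings. The following is only a sketch for PHP 7 or later: the helper name strip_listed_chars and the contents of $bad are made up for illustration, and a blacklist like this only removes the characters you remember to list.

```php
<?php
// Hypothetical helper: remove every character listed in $bad from $text.
function strip_listed_chars(string $text, array $bad): string
{
    // An array of search strings with an empty replacement deletes each one.
    return str_replace($bad, '', $text);
}

// Assumed example list: two Latin-1 bytes written as escapes.
$bad = ["\xE5", "\xA9"];
echo strip_listed_chars("caf\xE5 \xA9 2006", $bad), "\n";
```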

Posted: Fri Feb 17, 2006 11:59 am
by josh
It's probably easier to write a whitelist, since it looks like you're counting a lot of characters as "bad". The Japanese character set alone has thousands of characters that you might consider "bad".

Posted: Fri Feb 17, 2006 1:09 pm
by jclarkkent2003
alix wrote: Take a look at this function. str_replace
LOL is all I'm gonna say, not gonna go deeper than that.
jshpro2 wrote: It's probably easier to write a whitelist, since it looks like you're counting a lot of stuff as "Bad", the Japanese character set alone has thousands of characters in it which you might consider "bad"
OK, yeah, a whitelist was specifically what I did not want to do, kind of, but I guess I can do it as a last resort.

Is there a chart or something for bad characters in English, like the invalid ones?

Here I'm going into uncharted waters, so I have no clue what this stuff is called.

There are like 255 characters in the ASCII markup (dunno if that's what it's called), but how would I check against that and block everything but [a-zA-Z0-9_-] and all the punctuation on the keyboard?

I want to allow any and all keyboard input, but only the characters that you see on the keyboard, none of those Alt + 4-digit characters (but is there a chart of those as well? That would be cool).

Posted: Fri Feb 17, 2006 1:36 pm
by feyd
There's no chart needed. ASCII characters have decimal values from 0 through 127; every character above 127 is an extended ("ANSI") character. You'll need a function that calls ord() on each character and filters the string.
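
A minimal sketch of that ord() approach, assuming PHP 7 or later. The function name keep_ascii is hypothetical, not something from the thread. It works byte by byte, so a multibyte UTF-8 character is dropped entirely, and control characters below 32 are kept because they are still ASCII.

```php
<?php
// Hypothetical helper: keep only bytes with a value of 127 or less.
function keep_ascii(string $text): string
{
    $clean = '';
    $length = strlen($text);
    for ($i = 0; $i < $length; $i++) {
        // ord() returns the byte value (0-255) of a one-character string.
        if (ord($text[$i]) <= 127) {
            $clean .= $text[$i];
        }
    }
    return $clean;
}

echo keep_ascii("na\xEFve caf\xE9"), "\n"; // prints "nave caf"
```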

Posted: Fri Feb 17, 2006 2:56 pm
by josh
If you want a-zA-Z and !@#$%^&*()-_+=/?., etc., it might be easier to check the ASCII value like feyd said, but anyway, what you described IS a whitelist.

Code:

// The pattern needs delimiters (here /.../); the class matches any non-alphanumeric character.
if (preg_match('/[^a-zA-Z0-9]/', $str)) {
    echo 'string is not alpha-numeric';
}
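
The same whitelist idea can be stretched to "everything printed on the keyboard" and applied to arrays as well as strings. This is only a sketch: the helper name clean_input is hypothetical, and the range \x20-\x7E (space through tilde) is an assumption about what counts as keyboard input on a US layout.

```php
<?php
// Hypothetical helper: strip everything except printable ASCII (space through tilde).
// Accepts a string or a (possibly nested) array of strings.
function clean_input($value)
{
    if (is_array($value)) {
        // Recurse so every element of the array is cleaned the same way.
        return array_map('clean_input', $value);
    }
    // \x20-\x7E covers letters, digits, space and the punctuation on a US keyboard.
    return preg_replace('/[^\x20-\x7E]/', '', (string) $value);
}

print_r(clean_input(["h\xE9llo", "tab\there", "ok_123-!?"]));
```

Because the pattern has no /u modifier, it works on bytes, so tabs, newlines and every byte above 126 are removed along with the accented characters.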