Needing to clean a string/array of BAD Characters.

jclarkkent2003
Forum Contributor
Posts: 123
Joined: Sat Dec 04, 2004 9:14 pm

Post by jclarkkent2003 »

How can I clean a string and/or an array of any invalid characters?

[quote]å
alix
Forum Commoner
Posts: 42
Joined: Thu Nov 18, 2004 8:41 am

Post by alix »

Take a look at this function. str_replace
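
For illustration only, a minimal sketch of what the str_replace approach might look like. The list in $bad is a made-up example, not a complete list of "bad" characters, and this blacklist only removes what you explicitly name. str_replace also accepts an array as the subject, so the same call can clean an array of strings.

```php
<?php
// Hypothetical blacklist: these three bytes are examples only.
$bad = ["\xE5", "\xE9", "\x00"];

// A single string: every listed byte is replaced with an empty string.
$clean = str_replace($bad, '', "h\xE5llo\x00");
echo $clean, "\n"; // prints "hllo"

// An array of strings: each element is cleaned the same way.
$cleanList = str_replace($bad, '', ["caf\xE9", "plain"]);
print_r($cleanList); // elements become "caf" and "plain"
```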
josh
DevNet Master
Posts: 4872
Joined: Wed Feb 11, 2004 3:23 pm
Location: Palm beach, Florida

Post by josh »

It's probably easier to write a whitelist, since it looks like you're counting a lot of stuff as "bad". The Japanese character set alone has thousands of characters in it which you might consider "bad".
jclarkkent2003
Forum Contributor
Posts: 123
Joined: Sat Dec 04, 2004 9:14 pm

Post by jclarkkent2003 »

alix wrote: Take a look at this function. str_replace
LoL is all I'm gonna say, not gonna go deeper than that.
jshpro2 wrote: It's probably easier to write a whitelist, since it looks like you're counting a lot of stuff as "bad". The Japanese character set alone has thousands of characters in it which you might consider "bad".
K, yeah, a whitelist was specifically what I did NOT want to do, kinda, but I guess I can do it as a last resort.

Is there a CHART or something for BAD characters in English, like the invalid ones?

Here I'm going into uncharted waters, so I have no clue what this stuff is called.

There's like 255 characters in the ASCII markup (dunno if that's what it's called), but how would I check AGAINST that and block everything except a-zA-Z0-9_- and ALL the punctuation on the keyboard?

I want to allow ANY and ALL keyboard input, but only the characters that you see on the keyboard, none of those Alt + 4-digit characters (but is there a chart of those as well? That would be cool).
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

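There's no chart needed. All characters at or below decimal value 127 are ASCII characters, and the printable ones you see on a keyboard sit in the range 32 (space) through 126 (~). Anything above 127 is not ASCII; those are the extended characters often loosely called "ANSI", and what they represent depends on the encoding. You'll need a function that uses ord() on each character and filters the string.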
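
A rough sketch of that ord() filter, under a few assumptions: the function name keep_printable_ascii is made up for this example, the type declarations assume PHP 7 or later, and the string is treated as raw bytes. It keeps only values 32 through 126, so tabs and newlines are dropped too; widen the range check if you want to keep them.

```php
<?php
// Hypothetical helper (not from the thread): keep only printable ASCII bytes.
function keep_printable_ascii(string $input): string
{
    $clean = '';
    $length = strlen($input);
    for ($i = 0; $i < $length; $i++) {
        $code = ord($input[$i]);
        // 32 (space) through 126 (~) covers letters, digits and keyboard punctuation.
        if ($code >= 32 && $code <= 126) {
            $clean .= $input[$i];
        }
    }
    return $clean;
}

// The UTF-8 bytes \xC3\xA9 and the Latin-1 byte \xE9 are all above 127, so they are removed.
echo keep_printable_ascii("caf\xC3\xA9 r\xE9sum\xE9!"), "\n"; // prints "caf rsum!"
```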
josh
DevNet Master
Posts: 4872
Joined: Wed Feb 11, 2004 3:23 pm
Location: Palm beach, Florida

Post by josh »

If you want a-zA-Z and !@#$%^&*()-_+=/?., etc., it might be easier to check the ASCII value like feyd said. But anyway, what you described IS a whitelist.

Code:

// The pattern needs delimiters; [^a-zA-Z0-9] matches any non-alphanumeric character.
if (preg_match('/[^a-zA-Z0-9]/', $str)) {
    echo 'string is not alpha-numeric';
}
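
The check above only detects bad input. If the goal is to strip it from a string or an array, something along these lines might work. This is a sketch, not a definitive implementation: clean_input is a made-up name, and the range \x20-\x7E (space through tilde) is an assumption about what counts as "keyboard" characters.

```php
<?php
// Hypothetical helper (not from the thread): strip everything outside printable ASCII.
function clean_input($value)
{
    if (is_array($value)) {
        // Recurse so nested arrays are cleaned too; array keys are preserved.
        return array_map('clean_input', $value);
    }
    // \x20-\x7E is space through tilde; any other byte is removed.
    return preg_replace('/[^\x20-\x7E]/', '', (string) $value);
}

$result = clean_input(['name' => "J\xF6rg", 'tags' => ["a\x00b", 'ok!']]);
print_r($result); // 'name' becomes "Jrg", 'tags' becomes ["ab", "ok!"]
```

Because the pattern has no /u modifier, it works byte by byte, so multi-byte UTF-8 characters are removed entirely rather than being half-kept.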