UTF8 - Foreign characters not matching

Apollo
Forum Regular
Posts: 794
Joined: Wed Apr 30, 2008 2:34 am

Re: UTF8 - Foreign characters not matching

Post by Apollo »

Oh, my example was only meant to detect obvious crap so that you can refuse to insert it into your database at all, rather than to filter anything out of it. preg_match returns 1 if the input string matched the pattern and 0 if it didn't (or false on error), which you can interpret here as true or false.

You could apply it like this:

Code:

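$input = $_POST['input'];
// Just in case you're on an old server with magic quotes enabled; probably not necessary.
// (get_magic_quotes_gpc() no longer exists as of PHP 8, so drop this line on current versions.)
if (get_magic_quotes_gpc()) $input = stripslashes($input);
// Reject input that looks like SQL or contains an opening HTML tag
if (preg_match('/((select|delete).+from|update.+set|(alter|truncate|drop).+table|<[a-z])/i', $input))
{
  print("Sorry, your input seems invalid so I won't store it.");
}
else
{
  // Needs an open MySQL connection; the mysql_* functions were removed in PHP 7
  $input = mysql_real_escape_string($input);
  // now use $input in any SQL query
  print("Sanitized: $input");
}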
Note that the "Sanitized: (...)" output may still look wrong (i.e. not the same as what ends up inside the query string) if it contains HTML entities such as &amp;amp; or &amp;quot;.
That is only because you're printing $input as HTML here, which you wouldn't do if you were merely using it inside SQL queries. To output (as HTML) any string that comes from user input, either directly (e.g. from a form) or indirectly (retrieved from the database, e.g. previously stored form input), use htmlspecialchars.

So just for debugging purposes, you could change that last line to:

Code:

print("Sanitized: ".htmlspecialchars($input));
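If you want to try the pattern check and the HTML escaping without a database, here is a minimal sketch. The function name looks_suspicious and the sample strings are made up for illustration; the pattern is the one from the code above, and passing ENT_QUOTES and 'UTF-8' to htmlspecialchars is my assumption about what suits a UTF-8 page.

```php
<?php
// Hypothetical helper (not part of PHP): wraps the pattern check from the post.
function looks_suspicious(string $input): bool
{
    $pattern = '/((select|delete).+from|update.+set|(alter|truncate|drop).+table|<[a-z])/i';
    // preg_match() returns 1 on a match, 0 on no match and false on error.
    // Anything other than 0 is treated as suspicious, so an error also rejects.
    return preg_match($pattern, $input) !== 0;
}

// Made-up sample inputs
$samples = [
    'Grüße aus Köln',
    'x; DROP TABLE users',
    '<script>alert(1)</script>',
    'Fish & "chips"',
    'Please select a size from the list',
];

foreach ($samples as $s) {
    $verdict = looks_suspicious($s) ? 'rejected' : 'accepted';
    // Escape for HTML output only; this is separate from escaping for SQL.
    print(htmlspecialchars($s, ENT_QUOTES, 'UTF-8') . " => $verdict\n");
}
```

The last sample is ordinary text, but it is rejected as well because it contains "select ... from". So a check like this only catches the obvious stuff and can produce false positives; it does not replace escaping the value before it goes into the query.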
rhecker
Forum Contributor
Posts: 178
Joined: Fri Jul 11, 2008 5:49 pm

Re: UTF8 - Foreign characters not matching

Post by rhecker »

Thanks, and thanks to Pytrin for bringing up strip_tags a couple of posts ago. I will use that.