Code:
/^[a-zA-Z0-9\s\;\.\-\,\']*$/u
Code:
/^[a-zA-Z0-9\s\;\.\-\,\']*$/u
Quote: ...Japanese is not in the A-Z range.
Good point! I don't understand how to define a range that will encompass Japanese, Chinese, Latin and Vietnamese.
Quote: Exactly what are you trying to accomplish?
On my forms, which are course application forms, applicants may enter Chinese, Vietnamese or Latin characters in the name field, for instance. The forms themselves are in multiple languages.
Quote: There's a good reason you didn't find a regex or function like this out of the box
Yes, but I'm sure this is a problem many other developers have faced.
rhecker wrote: Yes, but I'm sure this is a problem many other developers have faced.
Sure, but due to the nature of all these ambiguous language differences and exceptions, by definition there simply won't be a satisfactory, one-size-fits-all solution for this.
Quote: Fact is, I have to build a form that allows for Chinese, Vietnamese and English input, and I have to filter it.
How would you like to distinguish English and Vietnamese from, for example, Italian? Or Welsh (Cymraeg), or Afrikaans?
Quote: I don't have any idea how to write a regex that allows for Chinese and Vietnamese.
You'll probably be able to find the character (codepoint) ranges for these languages on unicode.org. You can make a regexp allowing only characters from those ranges, plus some basic punctuation symbols, but you will:
rhecker wrote: I'd rather that those bad records never even got into the database.
Well perhaps you could use something very simplistic like
Code:
preg_match('/((select|delete).+from|update.+set|(alter|truncate|drop).+table|<[a-z])/i',$input)
Code:
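<form action="<?php echo htmlspecialchars($_SERVER['PHP_SELF']); ?>" method="post">
input: <input type="text" name="input" size="50"><br/>
<input value="submit" name="submit" type="submit"/>
</form>
<?php if (isset($_POST['submit'])) {
    // Array keys must be quoted; bare words are treated as undefined constants.
    $input = $_POST['input'];
    // 1 if the input looks like an SQL statement or an HTML tag, 0 otherwise.
    $input2 = preg_match('/((select|delete).+from|update.+set|(alter|truncate|drop).+table|<[a-z])/i', $input);
    // 1 if the input contains only the whitelisted ASCII characters, 0 otherwise.
    $input3 = preg_match('/^[a-zA-Z0-9\s\;\.\-\,\']*$/u', $input);
    // Escape the raw input before echoing it back into the page.
    echo "Unfiltered: " . htmlspecialchars($input) . " <br/>";
    echo "Apollo's script: $input2 <br/>";
    echo "Original script: $input3 <br/>";
}?>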
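Going back to the earlier suggestion of allowing only characters from the relevant Unicode ranges: here is a minimal sketch of one way that might look. Instead of hand-written codepoint ranges it uses PCRE's Unicode script properties, which is a swapped-in shortcut rather than what was proposed in the thread. The function name is_allowed_name is hypothetical, and the punctuation set is simply carried over from the original pattern. As already pointed out above, this cannot tell Vietnamese from Italian or Welsh, since they are all Latin script.

```php
<?php
// Hypothetical helper (not from the thread): returns true when $name uses
// only Latin or Han letters, combining marks, digits, whitespace and the
// punctuation allowed by the original pattern.
function is_allowed_name(string $name): bool
{
    // \p{Latin} covers ASCII letters and precomposed Vietnamese letters,
    // \p{Han} covers Chinese ideographs, \p{M} covers combining marks
    // (Vietnamese tone marks typed in decomposed form).
    $pattern = '/^[\p{Latin}\p{Han}\p{M}0-9\s;.,\'\-]*$/u';

    // preg_match() returns 1 on a match, 0 on no match and false on error
    // (for example invalid UTF-8), so compare strictly against 1.
    return preg_match($pattern, $name) === 1;
}

var_dump(is_allowed_name("Nguyễn Văn A")); // Vietnamese, Latin script
var_dump(is_allowed_name("王小明"));        // Chinese, Han script
var_dump(is_allowed_name("<script>"));     // rejected: "<" and ">" are not in the class
```

If Japanese kana also need to be accepted, \p{Hiragana} and \p{Katakana} could be added to the character class in the same way. The file has to be saved as UTF-8 for the literal test strings to work.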