UTF-8 secure registration form help
Posted: Tue Apr 13, 2010 5:07 am
Hi
I thought I would dive straight into learning PHP by creating a registration form that is secure. I have come to the conclusion that it's confusing, and I could do with some help and advice.
1) Is this code OK to use in the form, or is it vulnerable?
2) Never trust users! - So when the form is submitted I follow this process:
3) I am also worried about multibyte vulnerabilities, so I was going to run the UTF-8 fields through the following:
4) As a last step I was going to run the data through mysql_real_escape_string(trim($variable)); before entering it into my DB using a prepared statement.
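On point 4, a prepared statement sends the values separately from the SQL text, so escaping them first with mysql_real_escape_string() would most likely just store the added backslashes in the table. Below is a minimal sketch using PDO. The function name insertUser(), the table users and its columns are hypothetical, since the post does not show a schema, and an in-memory SQLite database stands in for MySQL so the example runs on its own.

```php
<?php
// Hypothetical helper: table and column names are assumptions, not from the post.
function insertUser(PDO $db, string $userName, string $email): void
{
    // Placeholders keep the values out of the SQL text, so no manual escaping is needed.
    $stmt = $db->prepare('INSERT INTO users (user_name, email) VALUES (:user_name, :email)');
    $stmt->execute([
        ':user_name' => trim($userName),
        ':email'     => trim($email),
    ]);
}

// In-memory SQLite keeps the sketch self-contained; a MySQL DSN and
// credentials would be used instead in the real form.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE users (user_name TEXT, email TEXT)');

// Quotes and SQL fragments are stored verbatim rather than executed.
insertUser($db, "O'Brien", 'ob@example.com');
insertUser($db, "x'); DROP TABLE users; --", 'x@example.com');
```

Reading the rows back should return the names exactly as they were submitted, apostrophes included.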
Or am I going over the top and doing things that don't need to be done for user data validation? My form also uses a form_token and a CAPTCHA, and will validate the MX record of the email domain supplied.
Kind Regards
Stephen
Code for 1):
<input name="UserName" type="text" size="12" value="<?php if(isset($_POST['UserName'])){ echo $_POST['UserName']; } ?>">
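On question 1, echoing $_POST['UserName'] straight into the value attribute looks open to reflected XSS: a submitted double quote closes the attribute and anything after it is treated as markup. A minimal sketch of one common fix is to escape at output time with htmlspecialchars(). The helper name e() and the $submitted stand-in are hypothetical, not part of the original form, and the page is assumed to be served as UTF-8.

```php
<?php
// Hypothetical helper: escape a value for use inside an HTML attribute or text node.
function e(string $value): string
{
    // ENT_QUOTES escapes both single and double quotes;
    // the charset should match the page encoding (assumed UTF-8 here).
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// Stand-in for $_POST['UserName'] so the sketch runs on its own.
$submitted = '"><script>alert(1)</script>';

// The escaped value can no longer break out of the attribute.
$field = '<input name="UserName" type="text" size="12" value="' . e($submitted) . '">';
echo $field, PHP_EOL;
```

With this approach the raw value stays untouched in $_POST and is only encoded at the point where it is written into HTML.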
Code for 2):
// Trim input data
foreach ($_POST as $key => $value) { $_POST[$key] = trim($value); }
// Strip tags
foreach ($_POST as $key => $value) { $_POST[$key] = strip_tags($value); }
// Encode data with htmlentities
foreach ($_POST as $key => $value) { $_POST[$key] = htmlentities($value, ENT_QUOTES,"UTF-8" ); }
// Correct case; not sure if strtolower/ucwords work with UTF-8
$_POST["UserName"] = strtolower($_POST["UserName"]);
$_POST["FirstName"] = ucwords(strtolower($_POST["FirstName"]));
$_POST["LastName"] = ucwords(strtolower($_POST["LastName"]));
$_POST["Email"] = strtolower($_POST["Email"]);
// Test FirstName: no numbers, between 3 and 30 chars, encoded for other languages, first letter capitalised, others lower case
// THIS IS UTF-8
if (!filter_has_var(INPUT_POST, 'FirstName')){ $msg = "Please fill ALL the fields in the Registration Form - FirstName"; }
if (mb_strlen( $_POST["FirstName"]) > 30 || mb_strlen($_POST["FirstName"]) < 3) { $msg = "Oops... It looks like your first name is too short or too long for our system."; }
if (filter_var($_POST["FirstName"], FILTER_VALIDATE_REGEXP,array("options"=>array("regexp"=>"/[0-9<>-_`¬@!£$%^]/")))){ $msg = "Your FirstName must only use letters; UTF-8 is fine."; }
// This is to encode HTML... NOT NEEDED
// $_POST["FirstName"] = filter_input(INPUT_POST, "FirstName" , FILTER_SANITIZE_SPECIAL_CHARS);
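Two details in the process for question 2 may be worth a second look. Running htmlentities() before the length check means a name such as "é" is measured as "&eacute;", and the `>-_` inside the character class is read as a range that includes A to Z, so capitalised names would match the "bad characters" pattern. A minimal sketch of an alternative is to validate the raw value against a whitelist of Unicode letters and leave HTML encoding to output time. The function names validateFirstName() and normaliseFirstName() are hypothetical, and the mbstring extension already used in the post is assumed.

```php
<?php
// Hypothetical helper: returns an error message, or null when the name is acceptable.
function validateFirstName(string $raw): ?string
{
    $name = trim($raw);

    // Count characters, not bytes, on the raw (un-encoded) value.
    $length = mb_strlen($name, 'UTF-8');
    if ($length < 3 || $length > 30) {
        return 'Your first name must be between 3 and 30 characters.';
    }

    // Whitelist: Unicode letters, optionally joined by single spaces or hyphens.
    // The /u modifier makes PCRE treat pattern and subject as UTF-8.
    if (preg_match('/^\p{L}+(?:[ -]\p{L}+)*$/u', $name) !== 1) {
        return 'Your first name may only contain letters.';
    }

    return null;
}

// Hypothetical helper: multibyte-aware "first letter upper, rest lower".
function normaliseFirstName(string $raw): string
{
    return mb_convert_case(trim($raw), MB_CASE_TITLE, 'UTF-8');
}

var_dump(validateFirstName('Élodie'));      // accepted
var_dump(validateFirstName('<b>Bob</b>'));  // rejected by the whitelist
```

A whitelist also makes the strip_tags() pass unnecessary for this field, since angle brackets can never pass the pattern.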
Code for 3):
$convert = array();
setlocale(LC_CTYPE, 'en_US.UTF-8');
foreach ($strings as $string) {
    $convert[] = iconv('UTF-8', 'UTF-8//IGNORE', $string);
}
/*
In the algorithm below, the first preg_replace() only allows well-formed Unicode
(and rejects overly long 2 byte sequences, as well as characters above U+10000).
http://webcollab.sourceforge.net/unicode.html
U-00000000 - U-0000007F: 0xxxxxxx
U-00000080 - U-000007FF: 110xxxxx 10xxxxxx
U-00000800 - U-0000FFFF: 1110xxxx 10xxxxxx 10xxxxxx
U-00010000 - U-001FFFFF: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
U-00200000 - U-03FFFFFF: 111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
U-04000000 - U-7FFFFFFF: 1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
*/
$strings = preg_replace('/[\x00-\x08\x10\x0B\x0C\x0E-\x19\x7F]'.
    '|[\x00-\x7F][\x80-\xBF]+'.
    '|([\xC0\xC1]|[\xF0-\xFF])[\x80-\xBF]*'.
    '|[\xC2-\xDF]((?![\x80-\xBF])|[\x80-\xBF]{2,})'.
    '|[\xE0-\xEF](([\x80-\xBF](?![\x80-\xBF]))|(?![\x80-\xBF]{2})|[\x80-\xBF]{3,})/S',
    '?', $strings );
// The second preg_replace() removes overly long 3-byte sequences and UTF-16 surrogates.
$strings = preg_replace('/\xE0[\x80-\x9F][\x80-\xBF]'.
    '|\xED[\xA0-\xBF][\x80-\xBF]/S',
    '?', $strings );
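For question 3, a simpler option than repairing malformed input may be to reject it outright. The sketch below is an assumption about what would suit the form, not a drop-in replacement for the regular expressions above: it uses mb_check_encoding() from the mbstring extension for well-formedness and a short byte pattern for ASCII control characters. The function name isCleanUtf8() is hypothetical. Unlike the filter above, it accepts 4-byte characters, so the database column would need to be able to store them.

```php
<?php
// Hypothetical helper: true when $value is well-formed UTF-8
// and contains no ASCII control characters other than tab, LF and CR.
function isCleanUtf8(string $value): bool
{
    // Rejects truncated sequences, stray continuation bytes and
    // invalid lead bytes such as 0xC0.
    if (!mb_check_encoding($value, 'UTF-8')) {
        return false;
    }

    // Control bytes only ever appear as single-byte ASCII in valid UTF-8,
    // so a plain byte-level pattern is sufficient here.
    return preg_match('/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]/', $value) === 0;
}

var_dump(isCleanUtf8('Zoë'));        // well-formed
var_dump(isCleanUtf8("\xC0\xAF"));   // overlong encoding of "/"
```

A field that fails the check can then trigger the same "please re-enter" message as any other validation error, instead of being silently altered.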