Below is what I currently do. Some of these steps I only do because someone told me to, and I'm not really sure what they're for.
So even if you could just tell me which steps need commenting and what each comment should say, I'd be grateful.
What do you people do before data insertion and before display? Care to share your code, comments, or thought process?
Besides security, my other question is just what the heck to do about the strange characters people paste in from Microsoft Word (see the FixMicrosoftWordPastes function below).
Before going into the database:
$string = $_POST[$field];
$string = FixMicrosoftWordPastes($string);
if(mb_detect_encoding($string) == 'UTF-8')
{ $string = mb_convert_encoding($string, "ISO-8859-1"); }
$string = html_entity_decode($string);
$string = strip_tags($string);
$string = mysql_real_escape_string($string);
Before being displayed: ... nothing apparently ... occasionally I use Markdown on some projects but that's about it.
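For comparison, here is a minimal sketch of the "store the raw text, escape on output" approach. It assumes PHP 7 or later (where `mysql_real_escape_string` no longer exists), a PDO connection, and UTF-8 throughout. The function names `saveComment` and `escapeForHtml`, and the `comments` table with its `body` column, are hypothetical and not taken from the code above.

```php
<?php
// Hypothetical helper: store the text as submitted, using a bound parameter
// so that no manual SQL escaping is needed.
// The table name "comments" and column "body" are assumptions for illustration.
function saveComment(PDO $pdo, string $text): void
{
    $stmt = $pdo->prepare('INSERT INTO comments (body) VALUES (:body)');
    $stmt->execute([':body' => $text]);
}

// Hypothetical helper: escape at display time, for an HTML context.
// ENT_QUOTES also escapes single quotes, which matters inside attributes.
function escapeForHtml(string $text): string
{
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

echo escapeForHtml('<b>Tom & "Jerry"</b>');
// prints: &lt;b&gt;Tom &amp; &quot;Jerry&quot;&lt;/b&gt;
```

The idea behind this split is that SQL escaping and HTML escaping guard against different problems, so each is applied at the point where the text enters that context rather than all at once before insertion.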
// Here's the FixMicrosoftWordPastes function.
// It maps Windows-1252 bytes (as pasted from Word) to HTML entities.
function FixMicrosoftWordPastes($body)
{
    $trans_tbl = array();
    $trans_tbl[chr(11)] = "\n"; // vertical tab (some sort of new line character)
    //$trans_tbl[chr(34)] = '&quot;' ; // quote
    //$trans_tbl[chr(38)] = '&amp;' ; // ampersand
    // Don't search for an ampersand by itself in this way.
    // It complicates things when the text contains &bull; &mdash; &eacute; etc....
    $trans_tbl[chr(128)] = '&euro;' ; // euro
    $trans_tbl[chr(129)] = '&euro;' ; // euro (note: byte 129 is unassigned in Windows-1252)
    $trans_tbl[chr(130)] = '&sbquo;' ; // low quote
    $trans_tbl[chr(131)] = '&fnof;' ; // florin
    $trans_tbl[chr(132)] = '&bdquo;' ; // double low quote
    $trans_tbl[chr(133)] = '&hellip;' ; // ellipsis
    $trans_tbl[chr(134)] = '&dagger;' ; // dagger
    $trans_tbl[chr(135)] = '&Dagger;' ; // double dagger
    $trans_tbl[chr(136)] = '&circ;' ; // circumflex
    $trans_tbl[chr(137)] = '&permil;' ; // per thousand
    $trans_tbl[chr(138)] = '&Scaron;' ; // S caron
    $trans_tbl[chr(139)] = '&lsaquo;' ; // left angle quote
    $trans_tbl[chr(140)] = '&OElig;' ; // OE ligature
    $trans_tbl[chr(142)] = '&#381;' ; // Z caron
    $trans_tbl[chr(145)] = '&lsquo;' ; // left single quote
    $trans_tbl[chr(146)] = '&rsquo;' ; // right single quote
    $trans_tbl[chr(147)] = '&ldquo;' ; // left double quote
    $trans_tbl[chr(148)] = '&rdquo;' ; // right double quote
    $trans_tbl[chr(149)] = '&bull;' ; // bullet
    $trans_tbl[chr(150)] = '&ndash;' ; // en dash
    $trans_tbl[chr(151)] = '&mdash;' ; // em dash
    $trans_tbl[chr(152)] = '&tilde;' ; // small tilde
    $trans_tbl[chr(153)] = '&trade;' ; // trademark
    $trans_tbl[chr(154)] = '&scaron;' ; // small s caron
    $trans_tbl[chr(155)] = '&rsaquo;' ; // right angle quote
    $trans_tbl[chr(156)] = '&oelig;' ; // oe ligature
    $trans_tbl[chr(158)] = '&#382;' ; // small z caron
    $trans_tbl[chr(159)] = '&Yuml;' ; // Y with diaeresis
    // Bytes 160-255 match their Latin-1 code points, so a numeric entity works.
    for ( $i=160; $i<=255; $i++ ) {
        $trans_tbl[chr($i)] = '&#' . $i . ';' ;
    }
    //$trans_tbl[chr(8216)] = "'" ; // single smart quote left
    //$trans_tbl[chr(8217)] = "'" ; // single smart quote right
    //$to_return = str_replace('?', "'", $to_return);
    $to_return = strtr ( $body , $trans_tbl );
    // Now handle ampersands where they occur by themselves (with a space on each side).
    $to_return = preg_replace('/ & /', ' &amp; ', $to_return);
    return $to_return;
}
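As a point of comparison with the hand-built table, here is a minimal sketch of converting the same Word bytes to UTF-8 with mbstring instead of to HTML entities. It assumes the mbstring extension is loaded and that anything which is not valid UTF-8 was pasted as Windows-1252; that second assumption is a guess, not real detection. The function name `toUtf8` is hypothetical.

```php
<?php
// Hypothetical helper: normalise submitted text to UTF-8.
function toUtf8(string $text): string
{
    // Already valid UTF-8 (this includes plain ASCII): leave it alone.
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }
    // Otherwise assume the bytes are Windows-1252, the encoding that the
    // chr(128)..chr(159) entries in the table above correspond to.
    return mb_convert_encoding($text, 'UTF-8', 'Windows-1252');
}

$curly = "\x93quoted\x94"; // Windows-1252 curly double quotes (bytes 147 and 148)
echo toUtf8($curly);       // the same text, now encoded as UTF-8
```

With text kept as UTF-8 end to end, the curly quotes and dashes are ordinary characters, so the only HTML step left would be escaping at display time.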