I'm building a small experimental CMS that needs to support non-Latin languages.
The front end correctly uses UTF-8 as its charset.
The database collation is utf8_general_ci.
The table collation is utf8_general_ci.
The field collation is utf8_general_ci.
Should I use utf8_unicode_ci instead of utf8_general_ci?
The platform is:
phpMyAdmin 2.6.1
MySQL 4.1.9
Apache 1.3.33
PHP 4.3.10
Firefox 2 / IE 7.0.5730.11
But when I post content from my CMS, the characters are stored incorrectly. When I enter the same content through phpMyAdmin, it is saved correctly.
A summary of the HTML front end:
Code:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<form name="THENAME" action="" method="POST">
<input type="hidden" name="rowid" value="MY VALUE">
<input type="hidden" name="parent" value="MY VALUE">
<textarea name="my_content" rows="20" style="width:98%;" title="USE HTML TAGS HERE"></textarea>
<input name="edit_confirm" value="Confirm edition" type="submit">
</form>
</body>
</html>

Code:
$edit_query = "UPDATE `my_table` SET
parent = '".$_POST['parent']."',
my_content = '".process_html_text($_POST['my_content'])."'
WHERE rowid='".$_POST['rowid']."'";
// sql_select() is my own abstraction: it uses mysql_query() to store the data
// (it also checks the connection and handles errors)
if(sql_select($edit_query,$edit_results)){
echo "SUCCESS MESSAGE";
}
else{
echo "MY ERROR MESSAGE";
// AND MY LOG ERROR SCRIPT
}
############################################### PROCESS HTML text for db query
function process_html_text($string){
// process_textarea_text
// <textarea> tags were rewritten as [textarea ...] when the content was rendered
// inside the edit textarea, to prevent nested textareas; they are restored here
// so that they are stored correctly
$patterns = array (
"#'#",
'#\[\s{0,}textarea#is', //OPEN TEXTAREA
'#\[\s{0,}/\s{0,}textarea\s{0,}\]#is'//CLOSE
);
$substitutions = array(
"\'",
'<textarea',
'</textarea>'
);
$output_string = preg_replace($patterns,$substitutions,trim(stripslashes($string)));
return $output_string;
}

Of course, if I use a Latin charset for both the front end and the DB collation, the characters are automatically converted into decimal references matching /&#(\d+);/. But then the stored data has to be rendered as HTML to be understandable.
For example:
* 网页
* 资讯
* 知识
* 音乐
* 图片
* 影视
* 酷帖
* 更多
is the rendered form of:
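Code:
* &#32593;&#39029;
* &#36164;&#35759;
* &#30693;&#35782;
* &#38899;&#20048;
* &#22270;&#29255;
* &#24433;&#35270;
* &#37239;&#24086;
* &#26356;&#22810;

Can anyone help me with this?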
Is there an error in the query, or is it something else?
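
To make the /&#(\d+);/ form concrete, here is a minimal sketch of the same conversion done in PHP instead of by the browser. It assumes the mbstring extension is loaded and that the script file itself is saved as UTF-8. The variable names and the conversion map are only illustrative and are not part of my CMS.

```php
<?php
// Assumption: mbstring is available and this file is saved as UTF-8.
// Conversion map: [first code point, last code point, offset, mask].
// Here every code point from U+0080 to U+10FFFF is converted, unchanged.
$convmap = array(0x80, 0x10FFFF, 0, 0x1FFFFF);

$text = '网页'; // UTF-8 input, as typed into the textarea

// UTF-8 characters -> decimal numeric character references
$encoded = mb_encode_numericentity($text, $convmap, 'UTF-8');

// decimal numeric character references -> UTF-8 characters
$decoded = mb_decode_numericentity($encoded, $convmap, 'UTF-8');

echo $encoded, "\n"; // prints &#32593;&#39029;
echo $decoded, "\n"; // prints 网页
```

This only illustrates the entity format. It does not explain why the UTF-8 version is stored incorrectly.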