Vietnamese charset not working?

azhan
Forum Commoner
Posts: 68
Joined: Fri Jun 27, 2008 6:05 am

Vietnamese charset not working?

Post by azhan »

Hello guys,

I'm adding a language option to my website so that users can switch to the language they prefer. I have one language file that defines the Vietnamese strings.

My problem is that some Vietnamese words are not supported by the charset I have set, so not every Vietnamese word is displayed correctly. How do I tackle this problem?

I have already tried all the related charsets, such as UTF-8, Western (ISO-8859-1), TCVN, VISCII, VPS and windows-1258, but none of them displays every word.

Help me....
Apollo
Forum Regular
Posts: 794
Joined: Wed Apr 30, 2008 2:34 am

Re: Vietnamese charset not working?

Post by Apollo »

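iso-8859-1 obviously isn't gonna work: it has a handful of accented Latin letters, but most Vietnamese characters (ă, đ, ơ, ư, and anything with stacked diacritics like ậ) aren't in there at all.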

When you say you tried all kinds of different charsets, what did you do exactly? Simply changing the content-type header in the HTML is not going to help of course, as that doesn't change the actual encoding of the content you're outputting (it only tells the browser "I'm gonna send you utf-8 encoded content now", but then if you send content which is actually windows-1258 encoded, you'll get garbage).

Generally I would recommend using utf-8 in any case. If Vietnamese text is still being displayed incorrectly, the html you're outputting is not actually utf-8 encoded, and you need to convert it. If your content comes from a database, you need to convert that too (and make sure the tables and the connection use a utf-8 character set and collation, e.g. utf8_general_ci).
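
To make the "declare utf-8 only when the bytes really are utf-8" point concrete, here is a minimal sketch. It assumes the mbstring extension is loaded; to_utf8() is a made-up helper name, and the ISO-8859-1 sample string is only an illustration (your source encoding may differ).

```php
<?php
// Hypothetical helper (not from this thread): returns $text as UTF-8,
// converting from $sourceEncoding only if the bytes are not already valid UTF-8.
function to_utf8(string $text, string $sourceEncoding): string
{
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text; // already valid UTF-8, leave untouched
    }
    return mb_convert_encoding($text, 'UTF-8', $sourceEncoding);
}

// "é" stored as the single ISO-8859-1 byte 0xE9, which is not valid UTF-8.
$legacy = "caf\xE9";
$utf8 = to_utf8($legacy, 'ISO-8859-1');

// Tell the browser the content is UTF-8; this only helps because $utf8 now really is.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

echo bin2hex($utf8), "\n"; // hex dump of the converted bytes
```

Note that the validity check is only a heuristic: plain ASCII passes it under any of these encodings, so it cannot tell you which legacy encoding a string came from.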
azhan
Forum Commoner
Posts: 68
Joined: Fri Jun 27, 2008 6:05 am

Re: Vietnamese charset not working?

Post by azhan »

Thanks for replying

I did not store the words in a database. What I do is create a PHP file that uses define(), like below.
Let's say I want to translate the word "login" into Vietnamese:

English -> Vietnamese
Login -> Truy nhập

If I put the word "Truy nhập" directly into the define like below,

define("LOGIN","Truy nhập");

this does not work: the letter 'ậ' is displayed as '<?>' in the output.

BUT if I first convert the Vietnamese word to HTML entities in PHP, with $word = htmlentities($word);
and then put the converted word into the define like below, it works!

define("LOGIN","Truy nhập");

i use charset utf-8.
Apollo
Forum Regular
Posts: 794
Joined: Wed Apr 30, 2008 2:34 am

Re: Vietnamese charset not working?

Post by Apollo »

azhan wrote: i use charset utf-8.
Where, in the editor which you're using to edit your .php files?


It's a bad idea to store non-ascii chars explicitly in php files, because they don't have one universally defined encoding (or at least not one commonly respected by all editors and IDEs). PHP files can be considered binary, and whatever bytes your (or your OS/2 using uncle's) editor decides to write in there to represent some exotic character, are being output as such.

Try this instead:

define("LOGIN","Truy nh".chr(0xe1).chr(0xba).chr(0xad)."p"); // the utf-8 representation of the string in encoding-independent PHP
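
If the chr() chain is too clunky, the same bytes can be written with string escapes. This is just a sketch to show that the spellings are equivalent; the variable names are made up, and the \u{...} form assumes PHP 7.0 or later.

```php
<?php
// Three ASCII-only spellings of the same UTF-8 bytes for "ậ" (U+1EAD).
$viaChr     = "Truy nh" . chr(0xe1) . chr(0xba) . chr(0xad) . "p";
$viaHex     = "Truy nh\xE1\xBA\xADp";   // hex escapes, one per byte
$viaUnicode = "Truy nh\u{1EAD}p";       // Unicode code point escape, PHP 7.0+ only

// All three produce byte-identical strings.
var_dump($viaChr === $viaHex, $viaHex === $viaUnicode);

// The code point escape expands to the same three UTF-8 bytes.
echo bin2hex("\u{1EAD}"), "\n"; // e1baad
```

Whichever form you pick, the source file itself stays pure ASCII, so it no longer matters what encoding your editor saves it in.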
newhope
Forum Newbie
Posts: 2
Joined: Mon Jan 12, 2009 4:47 am

Re: Vietnamese charset not working?

Post by newhope »

The script containing the line
define("LOGIN","Truy nhập");
should be saved in UTF-8 encoding (without a BOM) and you'll be fine.

And don't forget
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
in your HTML head.
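
The "without BOM" part matters because a BOM is three bytes (EF BB BF) sitting before the opening PHP tag, so PHP sends them as output before your script runs, which can also break later header() calls. A rough sketch for checking a file's contents, with made-up helper names:

```php
<?php
// Hypothetical helper: true if $bytes begins with the UTF-8 byte order mark (EF BB BF).
function has_utf8_bom(string $bytes): bool
{
    return strncmp($bytes, "\xEF\xBB\xBF", 3) === 0;
}

// Hypothetical helper: returns $bytes with a leading UTF-8 BOM removed, if present.
function strip_utf8_bom(string $bytes): string
{
    return has_utf8_bom($bytes) ? substr($bytes, 3) : $bytes;
}

// Sample data standing in for the contents of a language file.
$withBom = "\xEF\xBB\xBF" . "Truy nh\xE1\xBA\xADp";

var_dump(has_utf8_bom($withBom));                 // BOM present
var_dump(has_utf8_bom(strip_utf8_bom($withBom))); // BOM gone after stripping
```

To check a real file you would pass in file_get_contents() of your language file, whatever it is called on your setup.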