I'm new to PHP and currently learning it by myself.
I need to build a site for posting news updates. The site has to support multiple languages, which means the posts on one page could be in different languages (including Chinese, Japanese, Korean, ...). As far as I've learned, I need to store the data in my database as utf8, but to display a language the page needs to be encoded in that language, or else it'll show junk characters. With more than one language on a page, it's impossible to switch the encoding each time just to read it.
I wonder if I could get some help here.
Thanks in advance.
multi language page
Re: multi language page
There's no such thing as 'encoding in a language'. A language is not (or does not define) a way to encode (binary represent) text.
Regarding utf8, this is a unicode encoding, meaning it can hold text in ANY language, including Japanese, Chinese, Korean, and Klingon. Multiple languages combined in one page are also no problem, as long as you stick to utf8.
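As a rough illustration of that point, here is a minimal sketch. It assumes the script file itself is saved as UTF-8 and that PHP's mbstring extension is loaded; the sample strings and variable names are made up for the example. It puts Chinese, Japanese and Korean text on one page and checks that each string is valid UTF-8. Sending the charset in the HTTP header as well as in the meta tag is an extra precaution added here, since a conflicting server header would otherwise win over the meta tag.

```php
<?php
// Assumption: this file is saved as UTF-8 and the mbstring extension is loaded.
// The sample posts below are illustrative only.
$posts = [
    'Chinese'  => '你好，世界',
    'Japanese' => 'こんにちは世界',
    'Korean'   => '안녕하세요',
];

// Declare the encoding in the HTTP header too (only possible before any output).
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

$html = "<html><head>\n"
      . "<meta http-equiv='Content-Type' content='text/html; charset=utf-8'>\n"
      . "</head><body>\n";

foreach ($posts as $language => $text) {
    // mb_check_encoding() returns true when the bytes form valid UTF-8.
    $status = mb_check_encoding($text, 'UTF-8') ? 'valid UTF-8' : 'NOT valid';
    $html .= "<p><b>$language:</b> $text ($status)</p>\n";
}

$html .= "</body></html>\n";
print($html);
```

If one of the lines reports NOT valid, the bytes in the file (or coming from the database) are in some other encoding, and no charset declaration will make them display correctly.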
Re: multi language page
That's not what I meant.
For example, I have 2 posts on a page: 1 is Chinese, 1 is Japanese.
When I open it in a browser, it does not automatically recognize the language of each post, so what it displays is junk characters. In order to read it, I need to tell the browser to read the page in a specific language (via View -> Encoding in IE).
I don't know how to make the browser recognize the languages within.
Re: multi language page
donki wrote:For example, I have 2 posts on a page: 1 is Chinese, 1 is Japanese.
When I open it in a browser, it does not automatically recognize the language of each post, so what it displays is junk characters.
This only happens if you encode the text using an encoding suited to a single language (for example iso-2022-jp or euc-kr), which means the other language can't even be expressed in that encoding (and thus becomes junk if you try it anyway).
If you use unicode (where utf8 is the most obvious encoding choice) this problem does not occur.
donki wrote:In order to read it, I need to tell the browser to read the page in a specific language (via View -> Encoding in IE).
I don't know how to make the browser recognize the languages within.
Make sure to clearly specify the encoding you're using in your HTML header, e.g. use
Code: Select all
<meta http-equiv='Content-Type' content='text/html; charset=utf-8'>
Re: multi language page
I tried to make a sample file with Korean characters, like this.
The page is already encoded in Unicode, but somehow all the Korean characters become junk. Unless I force the browser to read the page as Korean, these characters stay junk.
Code: Select all
<head>
<meta http-equiv='Content-Type' content='text/html; charset=utf-8'>
</head>
<body>
<br>
<b>Name: </b>ㄹㅈㄷㄱㅈㄷ<br>
<b>Location: </b>마음을<br>
<b>Email: </b>ㅈㄷㄱㅁㅈㄷㄻ<br>
<b>URL: </b>ㅊ<br>
<b>Comments: </b>ㅁㄹㅈㄷㄹㄴㅇㄹㄴㅇㄹㅈ<br>
<br>
<br>
<h2><a href="sign.php">Sign in my Guestbook</a></h2>
</body>
Re: multi language page
donki wrote:I tried to make a sample file with Korean characters, like this.
The page is already encoded in Unicode, but somehow all the Korean characters become junk. Unless I force the browser to read the page as Korean, these characters stay junk.
Your page does indeed contain a header that specifies unicode (utf8), but is the data itself actually utf8-encoded? That is, do you know how the editor in which you edited that particular .html file saves its content?
(The fact that the characters are displayed correctly here on this forum page doesn't say much about that.)
To verify, can you rename it to .php and change it into this:
Code: Select all
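<?php
// Build the test string, then append its raw bytes as hex.
$s = "ㄹㅈㄷㄱㅈㄷ";
// If this file is saved as UTF-8, each character shows up as three bytes
// starting with e3 (e.g. e384b9 for the first one).
$s .= " = ".bin2hex($s);
print("<html><head>
<meta http-equiv='Content-Type' content='text/html; charset=utf-8'>
</head><body>
Name: $s
</body></html>");
?>
Re: multi language page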
If you are going to store Chinese, Japanese and Korean texts, I would recommend UTF-16. UTF-8 is space-consuming for these languages: most of their characters take three bytes in UTF-8 but only two in UTF-16.
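A quick way to check the difference on your own data is to measure it. The following is a minimal sketch, assuming the file is saved as UTF-8 and the mbstring extension is loaded; the two sample strings (taken from the guestbook markup earlier in the thread) and the variable names are only for illustration.

```php
<?php
// Assumption: this file is saved as UTF-8 and the mbstring extension is loaded.
$samples = [
    'Korean text' => '마음을',
    'HTML markup' => '<b>Location: </b>',
];

$sizes = [];
foreach ($samples as $label => $utf8) {
    // Convert a copy to UTF-16 (big-endian, no byte order mark) for comparison.
    $utf16 = mb_convert_encoding($utf8, 'UTF-16BE', 'UTF-8');
    $sizes[$label] = [
        'chars' => mb_strlen($utf8, 'UTF-8'), // number of characters
        'utf8'  => strlen($utf8),             // strlen() counts bytes
        'utf16' => strlen($utf16),
    ];
    printf(
        "%s: %d chars, %d bytes in UTF-8, %d bytes in UTF-16\n",
        $label,
        $sizes[$label]['chars'],
        $sizes[$label]['utf8'],
        $sizes[$label]['utf16']
    );
}
```

The Korean sample shrinks from 9 bytes to 6 in UTF-16, while the ASCII markup grows from 17 bytes to 34, so the overall saving depends on how much of the stored text is CJK and how much is markup or Latin text.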