Language : ISO / UTF ?

anjanesh
DevNet Resident
Posts: 1679
Joined: Sat Dec 06, 2003 9:52 pm
Location: Mumbai, India

Post by anjanesh »

I'm trying to get a site of mine to display in Indian languages.
I'm new to this UTF / ISO language concept.
I understand that for Hindi it's UTF-8, but what font is specified? If I send the content as UTF-8, will the font be detected automatically, or do we have to specify one?
BTW, I checked W3C and Unicode and couldn't find anything for Tamil and Malayalam characters. Are there any specs on these? Or is it just a simple font change in HTML tags?
Thanks
n00b Saibot
DevNet Resident
Posts: 1452
Joined: Fri Dec 24, 2004 2:59 am
Location: Lucknow, UP, India
Contact:

Post by n00b Saibot »

See, UTF has definitions for the characters of many languages.
If you simply start the charmap program in Windows, you can see them in the font Arial Unicode MS, in the Devanagari section. :)
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

You have to specify that the page is of a certain charset if you want proper submissions and reading. In your case, you need to specify the content as being UTF-8 encoded.
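A minimal sketch of what declaring the charset could look like from PHP, assuming the page is generated by a PHP script. The helper name `contentTypeHeader` is made up for illustration; `header()` and `headers_sent()` are standard PHP functions.

```php
<?php
// Hypothetical helper: builds the Content-Type header line for an HTML page.
function contentTypeHeader(string $charset = 'utf-8'): string
{
    return 'Content-Type: text/html; charset=' . $charset;
}

// Headers must go out before any page output.
// Skipped on the command line, where there is no HTTP response.
if (PHP_SAPI !== 'cli' && !headers_sent()) {
    header(contentTypeHeader());
}
```

The same charset should also appear in the page's META tag, so the two declarations agree.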
anjanesh
DevNet Resident
Posts: 1679
Joined: Sat Dec 06, 2003 9:52 pm
Location: Mumbai, India

Post by anjanesh »

OK. So the font doesn't have to be specified.
In this example, why is Test being shown as Test in English? Shouldn't it be shown in Hindi characters?

Code:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
<HTML XMLNS="http://www.w3.org/1999/xhtml" XML:LANG="hi" LANG="hi" DIR="ltr">
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html;charset=hindi-utf-8" />
<TITLE>Hindi</TITLE>
</HEAD>
<BODY>Test</BODY>
</HTML>
Maugrim_The_Reaper
DevNet Master
Posts: 2704
Joined: Tue Nov 02, 2004 5:43 am
Location: Ireland

Post by Maugrim_The_Reaper »

Not completely certain, but the encoding specifies the character set you used. So assuming you wrote using UTF-8 compatible characters, the browser should show them correctly.

However, Test is not made of Hindi characters, is it? So why would your encoding alter how the letters are viewed? Western/Roman characters are part of UTF-8 too, so they'll be displayed as is. You need to literally write in Hindi characters, with a Unicode font set up in your editor, I assume...
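To make that point concrete, here is a small sketch that looks at the raw bytes. It assumes PHP 7 or later for the `\u{...}` escape; the helper name `utf8Hex` is only for illustration.

```php
<?php
// Return the raw bytes of a string as lowercase hex, to make the encoding visible.
function utf8Hex(string $s): string
{
    return bin2hex($s);
}

$latin = 'Test';       // plain ASCII letters
$hindi = "\u{0906}";   // DEVANAGARI LETTER AA, written as a code point escape

// ASCII letters keep their one-byte values in UTF-8, so "Test" stays "Test".
$latinHex = utf8Hex($latin);   // "54657374"

// A Devanagari letter is a different character and takes three bytes in UTF-8.
$hindiHex = utf8Hex($hindi);   // "e0a486"
```

Declaring the charset only tells the browser how to read the bytes; it does not translate one script into another.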
anjanesh
DevNet Resident
Posts: 1679
Joined: Sat Dec 06, 2003 9:52 pm
Location: Mumbai, India

Post by anjanesh »

I'm using a plain text editor.
I'm trying to understand this using the script in phpMyAdmin 2.6.0 pl3, in the file lang/hindi-utf-8.inc.php.
I opened it in the editor and saw the chars. Take, for example, line 30:
$strAction = ' कारà¥
n00b Saibot
DevNet Resident
Posts: 1452
Joined: Fri Dec 24, 2004 2:59 am
Location: Lucknow, UP, India
Contact:

Post by n00b Saibot »

anjanesh wrote: But I don't expect to type the entire sentence using the dec/hex codes.
That's the way it is, baybee.
If you wanna type normally from the keyboard, you gotta use one of the readily available Hindi fonts. :wink:
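For reference, the "dec/hex codes" are presumably HTML numeric character references such as `&#2310;` or `&#x906;` (an assumption on my part). A rough sketch of how they relate to the actual character, with `numericReference` as a made-up helper name:

```php
<?php
// Hypothetical helper: build a decimal numeric character reference for a code point.
function numericReference(int $codePoint): string
{
    return '&#' . $codePoint . ';';
}

// U+0906 is DEVANAGARI LETTER AA; 0x0906 is 2310 in decimal.
$aa = numericReference(0x0906);   // "&#2310;"

// html_entity_decode() turns the reference back into the UTF-8 character.
$decoded = html_entity_decode($aa, ENT_QUOTES, 'UTF-8');
```

A browser renders the reference as the character itself, so it works even in a file saved as plain ASCII.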
anjanesh
DevNet Resident
Posts: 1679
Joined: Sat Dec 06, 2003 9:52 pm
Location: Mumbai, India

Post by anjanesh »

But according to this: http://www.alanwood.net/unicode/devanagari.html
A stands for अ
AA stands for आ
So is there any way to type AA in a plain text editor and get the output आ in the browser?
Last edited by anjanesh on Sat Feb 19, 2005 9:02 am, edited 1 time in total.
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

If the editor supports Unicode, you can do the alt-keypad typing. As is, typing AA in a Unicode editor will typically produce four bytes instead of two.
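The "four bytes instead of two" figure presumably assumes an editor that saves as UTF-16, where each of these letters takes two bytes; in UTF-8, AA is still two bytes. A small sketch of the difference, done by hand with `pack()` so it needs no extensions (`asciiToUtf16le` is a made-up helper and only handles ASCII input):

```php
<?php
// Encode an ASCII-only string as UTF-16LE by hand: each character becomes
// one 16-bit little-endian code unit (pack format 'v').
function asciiToUtf16le(string $ascii): string
{
    $out = '';
    $length = strlen($ascii);
    for ($i = 0; $i < $length; $i++) {
        $out .= pack('v', ord($ascii[$i]));
    }
    return $out;
}

$utf8Bytes  = strlen('AA');                  // byte count when saved as ASCII/UTF-8
$utf16Bytes = strlen(asciiToUtf16le('AA'));  // byte count when saved as UTF-16LE
```

Either way the letters are still Latin A's; the byte count changes, not the script.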
anjanesh
DevNet Resident
Posts: 1679
Joined: Sat Dec 06, 2003 9:52 pm
Location: Mumbai, India

Post by anjanesh »

I'm not looking to type आ in a text editor. Something like this (say in Notepad or ConTEXT):

Code:

<META HTTP-EQUIV="Content-Type" CONTENT="text/html;charset=utf-8" />... and all that
<?php
$strLetter="AA";
echo $strLetter;
?>
The output in any browser (IE / FF) should be आ
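PHP has no built-in mapping from those short names to characters, as far as I know, so getting this effect would need a lookup table of your own. A minimal sketch, assuming PHP 7 or later; the table and the function name `devanagariLetter` are hypothetical, and only the two letters mentioned here are filled in:

```php
<?php
// Hypothetical lookup table: Unicode short names -> Devanagari characters.
const DEVANAGARI_BY_NAME = [
    'A'  => "\u{0905}", // DEVANAGARI LETTER A
    'AA' => "\u{0906}", // DEVANAGARI LETTER AA
];

// Translate a short name to its character; unknown names are returned unchanged.
function devanagariLetter(string $name): string
{
    return DEVANAGARI_BY_NAME[$name] ?? $name;
}

$strLetter = 'AA';
$output = devanagariLetter($strLetter);   // the UTF-8 bytes for U+0906
```

Echoing `$output` on a page declared as UTF-8 should then show the Devanagari letter rather than the two Latin A's.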
n00b Saibot
DevNet Resident
Posts: 1452
Joined: Fri Dec 24, 2004 2:59 am
Location: Lucknow, UP, India
Contact:

Post by n00b Saibot »

anjanesh wrote: But according to this: http://www.alanwood.net/unicode/devanagari.html
A stands for अ
AA stands for आ
So is there any way to type AA in a plain text editor and get the output आ in the browser?
Hey, where does it say that A 'stands for' अ, huh?
All it says is that A 'is the Devanagari name for' अ. Okay :!:
Not the character for typing अ. :roll: