dealing with multilingual data



Post by newmember

hi

(Windows XP with Apache 1.3 and PHP 4.3.5)

I have a form with a single textbox.
I enter non-English characters from two languages (Russian and Hebrew) into this textbox at the same time.
I submit the form,
then dump the received string with print_r() just to see what I received,
and it looks OK.
Then I save the string to a file with fwrite(),
but one of the alphabets is lost.
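
To check whether the write step itself changes anything, a minimal sketch like the following could help. It assumes the string is UTF-8 and a reasonably recent PHP (sys_get_temp_dir() is not available in 4.3.x); the sample text and the temp-file prefix are made up for illustration. It writes the string in binary mode, reads it back, and compares the bytes.

```php
<?php
// Sample standing in for the submitted form value (hypothetical; assumes UTF-8).
$text = "Привет שלום";

// Write the raw bytes to a temporary file.
$path = tempnam(sys_get_temp_dir(), 'ml_');
$handle = fopen($path, 'wb');   // binary mode: no newline translation on Windows
fwrite($handle, $text);
fclose($handle);

// Read the file back and clean up.
$readBack = file_get_contents($path);
unlink($path);

// Hex dump of the bytes, independent of how any viewer renders them.
echo bin2hex($text), "\n";
echo ($readBack === $text) ? "bytes preserved\n" : "bytes changed\n";
```

If the bytes come back identical, the loss is more likely in how the form data arrives or in how the file is viewed than in fwrite().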

I'm guessing it is related to the codepage that fwrite() uses.
My Windows OEM codepage is set to Russian, and in the resulting file I see that the Hebrew characters have become Russian characters.
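
One way to see why a single Russian codepage cannot hold both alphabets is to round-trip the text through it. This is only a sketch under assumptions: the input is taken to be UTF-8, the mbstring extension is assumed to be loaded, and roundTrip() is a hypothetical helper, not a built-in.

```php
<?php
// Hypothetical samples (assumed UTF-8); requires the mbstring extension.
$russian = "Привет";
$mixed   = "Привет שלום";

// Convert a UTF-8 string to a single-byte codepage and back again.
function roundTrip($utf8, $codepage)
{
    $narrow = mb_convert_encoding($utf8, $codepage, 'UTF-8');
    return mb_convert_encoding($narrow, 'UTF-8', $codepage);
}

// Is the input valid UTF-8 at all?
var_dump(mb_check_encoding($mixed, 'UTF-8'));

// Russian text fits in Windows-1251, so it comes back unchanged.
var_dump(roundTrip($russian, 'Windows-1251') === $russian);

// Hebrew has no slot in Windows-1251, so the mixed string does not come back intact.
var_dump(roundTrip($mixed, 'Windows-1251') === $mixed);
```

A Unicode encoding such as UTF-8 can carry both alphabets in one string, which a single 8-bit codepage cannot.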

I experimented with htmlentities().
The manual says it should encode characters from the supported charsets as entities like &#number;.
I tried it on my string with Hebrew and Russian characters together. The Russian characters were encoded correctly, but the Hebrew characters were not.
The manual page for this function says that it supports:
...
Following character sets are supported in PHP 4.3.0 and later.
(my PHP version is 4.3.5)
UTF-8: ASCII compatible multi-byte 8-bit
...
So Hebrew characters fall under this category (they can be mapped to a 256-character charset), yet htmlentities() doesn't encode them.
(Anyway, I'm not sure that htmlentities() is a solution for multilingual data.)
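
For comparison, here is a rough sketch of forcing every non-ASCII character into the &#number; form. It assumes the input is UTF-8 and that the mbstring extension is available; the sample text and the $convmap values are illustrative choices, not something taken from the manual excerpt above.

```php
<?php
// Hypothetical sample (assumed UTF-8); requires the mbstring extension.
$text = "Привет שלום";

// With an explicit UTF-8 charset, htmlentities() only rewrites characters that
// have a named entity in its table; it is not a general numeric encoder.
$named = htmlentities($text, ENT_QUOTES, 'UTF-8');

// Map every code point from U+0080 upward to a decimal reference (&#NNNN;).
// Layout of the map: start, end, offset, mask.
$convmap = array(0x80, 0x10FFFF, 0, 0x1FFFFF);
$numeric = mb_encode_numericentity($text, $convmap, 'UTF-8');

// Decode again to confirm nothing was lost along the way.
$decoded = mb_decode_numericentity($numeric, $convmap, 'UTF-8');

echo $named, "\n";
echo $numeric, "\n";
var_dump($decoded === $text);
```

The numeric form is plain ASCII, so it survives any codepage, but it is meant for HTML output rather than as a storage format. Keeping the data as UTF-8 throughout may be the simpler route.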

Would appreciate any hints on the subject... any links to relevant resources...

thanks...