dealing with multilingual data
Posted: Fri Sep 17, 2004 6:32 pm
hi
(Windows XP with Apache 1.3 and PHP version 4.3.5)
i have a form with a single textbox...
i enter non-english characters from two languages (russian and hebrew) into this textbox at the same time...
i submit the form...
and then dump the received string with print_r() just to see what i received...
and it looks ok...
then i save the string to file with fwrite()...
but one of the alphabets is lost...
i'm guessing it is related to the codepage that fwrite() uses...
my windows OEM codepage is set to russian, and in the resulting file i see that the hebrew characters have become russian characters...
I experimented with htmlentities()...
the manual says it should encode characters from supported charsets as entities like &#number;
I tried it on my string with hebrew and russian characters together... russian characters were encoded correctly but not hebrew characters...
the manual page for this function says:
Following character sets are supported in PHP 4.3.0 and later.
(my php version is 4.3.5...)
UTF-8: ASCII compatible multi-byte 8-bit
...
so hebrew characters fall under this category (they can be mapped to a 256-character charset), yet htmlentities() doesn't encode them...
(Anyway i'm not sure that htmlentities() is a solution for multilingual data...)
Would appreciate any hints on the subject... any links to relevant resources...
thanks...