parsing a UTF-8 string
Posted: Wed Apr 05, 2006 10:59 am
Hello people
I have a question about the correct way to parse a UTF-8 string.
I have a UTF-8 string of the format:
<ch1>,<ch2>,<ch3>,<ch4>
where <ch1>...<ch4> are characters in a given language.
These characters could be in:
- English
- French
- Hebrew
- Spanish
and so on.
Any Latin language should work.
I need to be able to parse this string into an array.
I thought about using the explode function with , as the separator,
but I was wondering whether that function is safe for a UTF-8 string,
because I cannot assume that each character takes a single byte.
Any ideas?
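
For reference, here is a minimal sketch of why a byte-level split on the comma should be safe. It relies on one property of UTF-8: every byte of a multi-byte character is 0x80 or higher, so the ASCII comma (0x2C) can never appear inside one. The sketch is in Python rather than PHP and only mimics what a byte-oriented split such as explode does; the function name split_utf8_on_comma and the sample characters are made up for illustration.

```python
# Sketch: split a UTF-8 byte string on the ASCII comma (0x2C).
# Function name and sample data are hypothetical, for illustration only.

def split_utf8_on_comma(raw):
    # Split at the byte level, the way a byte-oriented split function would.
    parts = raw.split(b",")
    # Each piece is still valid UTF-8, because 0x2C never occurs inside
    # a multi-byte sequence (all of those bytes are >= 0x80).
    return [p.decode("utf-8") for p in parts]

# One English, one French, one Hebrew and one Spanish character.
sample = "a,\u00e9,\u05e9,\u00f1".encode("utf-8")
chars = split_utf8_on_comma(sample)

print(chars)
# Byte length of each character: the non-ASCII ones take more than one byte.
print([len(c.encode("utf-8")) for c in chars])
```

The split itself never needs to know how many bytes each character takes. That only matters later, for example when measuring or truncating the individual pieces.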