parsing a UTF-8 string

Posted: Wed Apr 05, 2006 10:59 am
by jasongr
Hello people,

I have a question about the correct way to parse a UTF-8 string.
I have a UTF-8 string of the format:
<ch1>,<ch2>,<ch3>,<ch4>
where <ch1>...<ch4> are characters in a given language.
These characters could be in:
- English
- French
- Hebrew
- Spanish
and so on.
Any Latin language should work.
I need to be able to parse this string into an array.

I thought about using explode() with "," as the separator,
but I was wondering whether that function is safe for a UTF-8 string,
because I cannot assume that each character takes a single byte.

Any ideas?

Posted: Wed Apr 05, 2006 12:18 pm
by feyd
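explode() works on raw bytes and doesn't care about the number of bytes in each element it generates. The comma is a single ASCII byte (0x2C), and in UTF-8 every byte of a multi-byte character is 0x80 or higher, so a comma can never appear inside one. Splitting on it is safe.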
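
A minimal sketch of the point above, assuming the input really is valid UTF-8. The function name split_chars() and the sample string are made up for illustration; the sample is written as byte escapes so the encoding of the script file doesn't matter.

```php
<?php
// Hypothetical helper: split a comma-separated UTF-8 string into an array.
// explode() compares raw bytes; the separator "," is the single byte 0x2C,
// which never occurs inside a multi-byte UTF-8 sequence.
function split_chars($input)
{
    return explode(',', $input);
}

// Sample input (an assumption, not from the original post):
// "a", "é" (C3 A9), Hebrew shin (D7 A9), "ñ" (C3 B1)
$input = "a,\xC3\xA9,\xD7\xA9,\xC3\xB1";
$parts = split_chars($input);

foreach ($parts as $part) {
    // strlen() counts bytes, not characters, so multi-byte characters
    // report a length greater than 1 even though each is one character.
    echo bin2hex($part) . ' (' . strlen($part) . " bytes)\n";
}
```

Each element comes back with its bytes intact, whether it is one byte or several. This only covers splitting on an ASCII separator; counting or slicing characters inside the elements needs multi-byte aware functions such as the mbstring extension.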

Posted: Wed Apr 05, 2006 12:30 pm
by jasongr
Thanks for the tip.