Help with Simple UTF-8 Solution
Posted: Sat Oct 11, 2008 8:11 pm
by Eidetic
I've read quite a bit about UTF-8 so far, and to be honest, every member in every forum seems to have a different opinion and a vague implementation, and none of them have worked well for me. For the sake of simplicity, can someone help me get the following code working, so that there is one clear statement of a Unicode workaround in PHP 5?
Code:
$uniWord = "înăbuși"; //hopefully this appears with unicode characters in your browsers
for ($i = strlen($uniWord); $i > 0; $i--)
{
    switch (substr($uniWord, $i, 1))
    {
        case "î":
        case "ă":
        case "ș":
            return true;
        default:
            return false;
    }
}
Re: Help with Simple UTF-8 Solution
Posted: Wed Oct 15, 2008 11:20 am
by Eidetic
Can it not be done as such? Should I just replace Unicode characters in PHP as they come in from the HTML form and then use HTML numeric character references (&#nnn;) to print them back to the client?
Re: Help with Simple UTF-8 Solution
Posted: Wed Oct 15, 2008 2:26 pm
by requinix
Have you looked into utf8_encode and utf8_decode?
Re: Help with Simple UTF-8 Solution
Posted: Wed Oct 15, 2008 3:39 pm
by Ambush Commander
You are going to need a multibyte string iterator, which leaves you with two options: the mbstring extension, or a pure-PHP UTF-8 library.
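For illustration, here is a minimal sketch of what iterating by character rather than by byte could look like, assuming the input is valid UTF-8. The helper names utf8Chars and containsAnyChar are made up for this example and do not come from any library. It uses mb_strlen and mb_substr when mbstring is loaded, and otherwise falls back to preg_split with the "u" modifier, which assumes PCRE was built with UTF-8 support.

```php
<?php
// Hypothetical helper: split a UTF-8 string into characters, not bytes.
function utf8Chars($s)
{
    if (function_exists('mb_substr')) {
        $chars = array();
        $length = mb_strlen($s, 'UTF-8');            // length in characters
        for ($i = 0; $i < $length; $i++) {           // valid indexes: 0 .. length-1
            $chars[] = mb_substr($s, $i, 1, 'UTF-8');
        }
        return $chars;
    }
    // Fallback without mbstring: the "u" modifier makes PCRE treat the
    // subject as UTF-8, so the empty pattern matches between characters.
    return preg_split('//u', $s, -1, PREG_SPLIT_NO_EMPTY);
}

// Hypothetical helper: true if $word contains any character in $needles.
function containsAnyChar($word, array $needles)
{
    foreach (utf8Chars($word) as $char) {
        if (in_array($char, $needles, true)) {
            return true;
        }
    }
    return false;   // reached only after every character was checked
}

// "\xC3\xAE" is the UTF-8 byte sequence for "î" (U+00EE).
var_dump(containsAnyChar("b\xC3\xAEn", array("\xC3\xAE")));   // bool(true)
var_dump(containsAnyChar("bin", array("\xC3\xAE")));          // bool(false)
```

Two things differ from a byte-wise strlen/substr loop: each step yields a whole character even when it spans several bytes, and false is returned only once the loop has finished instead of on the first non-matching character. Note that the preg_split fallback returns false for input that is not valid UTF-8, so real code would want to check for that.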
Re: Help with Simple UTF-8 Solution
Posted: Thu Oct 16, 2008 2:02 pm
by Eidetic
Thank you for the replies.
Tasairis, I had looked into those methods but had also run into some problems when utilizing simple code such as the one posted. But there are some people on that board with some great ideas.
Ambush Commander, I like those solutions. The Pure PHP UTF-8 Library looks perfect. The server I'm running through is the university's and so I can't implement anything additional or make any ini changes (or even turn on error checking), which is unfortunate. But at least I know that I only have two options if I want to iterate over unicode characters.
As a workaround, what do you guys think of the following:
Replace Unicode characters with non-Unicode characters, using them as placeholders for string manipulation, and when finished, print back to the browser using HTML numeric character references (&#nnn;)? It's what I had originally planned on doing but wasn't (and still am not) excited about.
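To make the output half of that idea concrete, here is a rough sketch that converts every non-ASCII character in a UTF-8 string to a decimal numeric character reference without needing mbstring. The function names utf8CharToEntity and utf8ToEntities are hypothetical, and the sketch assumes the input is valid UTF-8 and that PCRE supports the "u" modifier on the server.

```php
<?php
// Hypothetical callback: turn one multibyte UTF-8 character into "&#nnn;".
function utf8CharToEntity($match)
{
    $bytes = array_values(unpack('C*', $match[0]));   // byte values 0..255
    $count = count($bytes);
    // The lead byte of an N-byte sequence keeps its payload in the low 7 - N bits.
    $codePoint = $bytes[0] & (0xFF >> ($count + 1));
    for ($i = 1; $i < $count; $i++) {
        // Each continuation byte contributes its low 6 bits.
        $codePoint = ($codePoint << 6) | ($bytes[$i] & 0x3F);
    }
    return '&#' . $codePoint . ';';
}

// Hypothetical helper: replace each non-ASCII character, leave ASCII alone.
function utf8ToEntities($s)
{
    return preg_replace_callback('/[^\x00-\x7F]/u', 'utf8CharToEntity', $s);
}

// "\xC3\xAE" is the UTF-8 byte sequence for "î" (U+00EE, decimal 238).
echo utf8ToEntities("b\xC3\xAEn"), "\n";   // prints b&#238;n
```

This only covers printing. Any string manipulation done before this step still sees raw bytes unless it is character-aware, which is why the placeholder approach tends to get awkward. preg_replace_callback returns null if the input is not valid UTF-8.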
Re: Help with Simple UTF-8 Solution
Posted: Thu Oct 16, 2008 2:04 pm
by Ambush Commander
Try the PHP UTF-8 library. It'll be a lot nicer.