How do I use string manipulation functions with UTF-8 characters? Here's a little function I wrote which truncates the passed string to the indicated length without chopping off part of the last word:
Of course, that's because, as feyd pointed out, it only works with single byte chars (the above works just fine with standard chars). How do you manipulate multibyte character strings then? It's extremely important to me, since ALL of the sites I'm going to develop in the future (and the one I'm doing atm as well) will use UTF-8. Maybe there's some simple solution I do not know.
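To make the single-byte problem concrete, here is a minimal sketch of what the standard functions do with a two-byte character. The sample word and variable names are only illustrative, and the string is written with byte escapes so the result doesn't depend on how the file itself is saved.

```php
<?php
// "ačiū" spelled out as UTF-8 bytes: a, č (2 bytes), i, ū (2 bytes).
$word = "a\xC4\x8Di\xC5\xAB";

// strlen() counts bytes, not characters: 4 characters, 6 bytes.
$byteLength = strlen($word);

// substr() cuts at a byte offset, so asking for 2 "characters"
// returns "a" plus only the first byte of "č".
$cut = substr($word, 0, 2);

// The last byte of $cut is 0xC4, a lead byte left without its
// continuation byte, so $cut is no longer valid UTF-8.
$danglingByte = ord($cut[1]);
```

That dangling lead byte is what shows up as a broken character at the end of a truncated string.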
Not all languages are EN... In my country, almost all websites are multilingual - the same content is usually presented in Lithuanian, Russian and English. So if I can truncate EN news items, I need to be able to do the same with RU and LT ones.
Is there another way you could suggest to make the function I posted work with language-specific chars? Maybe storing strings in the db as htmlentities?
Ree wrote: Not all languages are EN... In my country, almost all websites are multilingual - the same content is usually presented in Lithuanian, Russian and English. So if I can truncate EN news items, I need to be able to do the same with RU and LT ones.
If that's the case then it's quite likely your hosting company will have enabled mbstring. The best way to find out is to try some of the functions and see if you get an error or not.
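Rather than waiting for a fatal error, you could also test for the extension up front. A rough sketch: `extension_loaded()`, `function_exists()` and `mb_substr()` are real PHP functions, while `has_mbstring()` and `utf8_substr_if_available()` are just names made up for this example.

```php
<?php
// Returns true when the mbstring extension is loaded on this host.
function has_mbstring()
{
    return extension_loaded('mbstring') && function_exists('mb_substr');
}

// Hypothetical helper: character-aware substring when mbstring exists,
// otherwise null so the caller can choose its own fallback.
function utf8_substr_if_available($text, $start, $length)
{
    if (!has_mbstring()) {
        return null;
    }
    // The encoding is passed explicitly instead of relying on host settings.
    return mb_substr($text, $start, $length, 'UTF-8');
}
```

If `has_mbstring()` returns false on the target host, the code has to fall back to something that doesn't need the extension.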
I very much need to make it work on all standard hosts. Local hosts in my country aren't cheap, so they usually host sites of bigger companies. The cheap hosts over here are mere resellers (the physical host is usually in the US).
It's not all that hard to create your own UTF-8 parser..
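For instance, here is one possible sketch of a word-safe truncate that needs no extensions. It relies only on the fact that UTF-8 continuation bytes always look like 10xxxxxx, so any other byte starts a new character. The function name `utf8_truncate_words()` is made up for this example, and it assumes the input is valid UTF-8 with words separated by plain ASCII spaces.

```php
<?php
// Hypothetical helper: truncate $text to at most $maxChars UTF-8 characters
// without splitting a multibyte character or the last word.
// Assumes valid UTF-8 input and ASCII spaces between words.
function utf8_truncate_words($text, $maxChars)
{
    $maxChars  = (int) $maxChars;
    $byteCount = strlen($text);
    $chars     = 0;   // characters counted so far
    $lastSpace = -1;  // byte offset of the most recent space

    for ($i = 0; $i < $byteCount; $i++) {
        $byte = ord($text[$i]);

        // Continuation bytes (10xxxxxx) never start a character.
        if (($byte & 0xC0) === 0x80) {
            continue;
        }

        // $i is the first byte of a new character.
        if ($chars === $maxChars) {
            if ($byte === 0x20) {
                // The last word ended exactly at the limit.
                return substr($text, 0, $i);
            }
            // Back up to the last space; with no space at all,
            // cut at the character boundary instead.
            $cutAt = ($lastSpace >= 0) ? $lastSpace : $i;
            return substr($text, 0, $cutAt);
        }

        if ($byte === 0x20) {
            $lastSpace = $i;
        }
        $chars++;
    }

    return $text; // already within the limit
}
```

The cut is still done with `substr()`, but only at byte offsets known to sit on a character boundary, so it should behave the same for LT, RU and EN text.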
Well.. there's this: viewtopic.php?t=36549 which, although not exactly what you need, has references to texts on UTF-8 encoding, among other details.. You could also reverse its logic to create a UTF-8 to HTML entity conversion..
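A UTF-8 to entity conversion could look roughly like the sketch below. This is not the code from the linked topic, just one way to do it: the lead byte of each character tells how many bytes follow, and each continuation byte contributes six bits to the code point. The name `utf8_to_entities()` is made up here, and the function assumes valid UTF-8 input.

```php
<?php
// Hypothetical helper: replace every non-ASCII UTF-8 character in $text
// with a numeric HTML entity such as &#269;. Assumes valid UTF-8 input.
function utf8_to_entities($text)
{
    $out       = '';
    $byteCount = strlen($text);
    $i         = 0;

    while ($i < $byteCount) {
        $byte = ord($text[$i]);

        if ($byte < 0x80) {                    // 0xxxxxxx: ASCII, copy as-is
            $out .= $text[$i];
            $i++;
            continue;
        }

        // The lead byte says how many bytes the character occupies.
        if (($byte & 0xE0) === 0xC0) {         // 110xxxxx: 2 bytes
            $length = 2;
            $code   = $byte & 0x1F;
        } elseif (($byte & 0xF0) === 0xE0) {   // 1110xxxx: 3 bytes
            $length = 3;
            $code   = $byte & 0x0F;
        } elseif (($byte & 0xF8) === 0xF0) {   // 11110xxx: 4 bytes
            $length = 4;
            $code   = $byte & 0x07;
        } else {                               // stray or invalid byte: skip it
            $i++;
            continue;
        }

        // Each continuation byte adds its low six bits to the code point.
        for ($k = 1; $k < $length && $i + $k < $byteCount; $k++) {
            $code = ($code << 6) | (ord($text[$i + $k]) & 0x3F);
        }

        $out .= '&#' . $code . ';';
        $i   += $length;
    }

    return $out;
}
```

Keep in mind that entity-encoded text is plain ASCII, so a byte-based truncate won't split a character any more, but it can still cut an entity like `&#269;` in half unless the truncation is done before the conversion.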