Unicode in PHP

sinasalek
Forum Newbie
Posts: 6
Joined: Tue Jun 29, 2004 4:39 pm

unicode in php

Post by sinasalek »

Hi,
I have a problem with Unicode in PHP.
preg_split can convert a string into an array of characters, which is what I need, but I can't get it to work with Unicode strings.
My code is:

-----------------------
<html>
<head>
<meta http-equiv="Content-Language" content="fa">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>


<?php
$str='لیست جدول ها'; // UTF-8, the language is Farsi
$t=preg_split ( '//' , $str); 
foreach ($t as $key => $value)
{
 echo "<p> $value </p>";	
}
?>
-------------------------
I can't use this function as it is, because some Unicode characters take two bytes in UTF-8, for example the Farsi character "Le".
When I use <? echo "<p> $t[1] </p>"; ?> I can't see the "Le" character, because it is two bytes long, but when I use <? echo "<p> $t[1]$t[2] </p>"; ?> I can see it.
Please help me; I don't know how to solve this problem.
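
For readers following along, here is a small sketch of the behaviour described above. It is only an illustration, not part of the original post: the two-letter string and the PREG_SPLIT_NO_EMPTY flag are assumptions (with that flag the pieces start at index 0, whereas the code above has an empty string at index 0). Without the u modifier, preg_split works on bytes, so each two-byte UTF-8 character comes back as two separate pieces.

```php
<?php
// Two Farsi letters written as raw UTF-8 bytes (two bytes each),
// so the example does not depend on how this file is saved.
$str = "\xD9\x84\xDB\x8C";

// Without the /u modifier, PCRE treats the subject as a sequence of bytes.
$bytes = preg_split('//', $str, -1, PREG_SPLIT_NO_EMPTY);

echo strlen($str), "\n";   // 4 (bytes, not characters)
echo count($bytes), "\n";  // 4 (one piece per byte)

// The first letter is spread over the first two pieces.
echo bin2hex($bytes[0]), ' ', bin2hex($bytes[1]), "\n"; // d9 84
```

Neither piece is valid UTF-8 on its own, which is why a single element does not display, while two adjacent elements printed together do.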
Weirdan
Moderator
Posts: 5978
Joined: Mon Nov 03, 2003 6:13 pm
Location: Odessa, Ukraine

Post by Weirdan »

mb_split?
sinasalek
Forum Newbie
Posts: 6
Joined: Tue Jun 29, 2004 4:39 pm

Post by sinasalek »

I checked the split function, but I found this in the PHP manual:
"For users looking for a way to emulate Perl's @chars = split('', $str) behaviour, please see the examples for preg_split()."
I could not find an mb_preg_split function for Unicode strings.

I tested the mb_split function:


<?php
  $str='لیست جدول ها';
  $t=mb_split('[]' , $str); //string to array of characters
  print_r($t);
?>
After running the above code I get this:
"Warning: mb_split(): mbregex compile err: invalid regular expression; empty character class"
I need a function like preg_split for Unicode strings.
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

The error tells you exactly what's wrong: [] is an empty character class, which fails to compile as a regex.
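
For later readers, one possible way to get the array of characters the original post asks for is to keep preg_split and add the u pattern modifier, which makes PCRE treat both the pattern and the subject as UTF-8. This is a minimal sketch, not something posted in the thread. It assumes the file is saved as UTF-8 and that the string is valid UTF-8; PREG_SPLIT_NO_EMPTY is added here to drop the empty first and last elements.

```php
<?php
// Assumes this source file is saved as UTF-8.
$str = 'لیست جدول ها';

// With /u, the empty pattern matches between characters, not between bytes.
$chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);

// Each element is now one complete character and displays on its own.
foreach ($chars as $char) {
    echo "<p>$char</p>\n";
}
```

On PHP 7.4 and later, mb_str_split($str, 1, 'UTF-8') should give the same array without a regular expression.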