utf 8 characters problem

martinpmf
Forum Newbie
Posts: 5
Joined: Sat Oct 17, 2009 6:01 pm

utf 8 characters problem

Post by martinpmf »

I'm new to this forum, so I'm sorry if this topic has been discussed before; I couldn't find it. (I'll ask the administrators to move this post if it's misplaced.)
Anyway, I'm trying to read from a .txt file which contains UTF-8 encoded strings, one per line. I then take the last character of each string and, if it equals a particular character, I want to echo that string. But the comparison seems to go wrong and my "if" is not working properly. I guess some extra invisible characters are being added because of the encoding. Here is the code:

Code:

$tekst=fopen("text.txt","r");
while(!feof($tekst)):
    $str=fgets($tekst);
    $tmp1=UTF8::strlen($str); //gets the length of a string
    $tmp=UTF8::substr($str, $tmp1-3, $tmp1-2); //gets the last character of the string
    if ($tmp == "?"){
        echo $str;
    }
endwhile;
fclose($tekst);

When I echo $tmp it shows "?", but the comparison in the "if" doesn't behave as I want. Any idea why PHP is not handling the encoding as it should? Are the line breaks the problem? Or the bytes of the encoding?

Thanks in advance for any answers.
Mark Baker
Forum Regular
Posts: 710
Joined: Thu Oct 30, 2008 6:24 pm

Re: utf 8 characters problem

Post by Mark Baker »

Not without seeing your UTF8 class and its methods, such as substr.
Why can't you simply use PHP's own functions?
guosheng1987
Forum Newbie
Posts: 24
Joined: Thu Oct 15, 2009 3:03 am

Re: utf 8 characters problem

Post by guosheng1987 »

Maybe you can use the iconv functions.
martinpmf
Forum Newbie
Posts: 5
Joined: Sat Oct 17, 2009 6:01 pm

Re: utf 8 characters problem

Post by martinpmf »

Because the normal functions of PHP 5 don't work correctly for UTF-8 the way they do for the plain alphabet :S
Mark Baker
Forum Regular
Posts: 710
Joined: Thu Oct 30, 2008 6:24 pm

Re: utf 8 characters problem

Post by Mark Baker »

martinpmf wrote: Because the normal functions of PHP 5 don't work correctly for UTF-8 the way they do for the plain alphabet :S
The correct built-in functions do work, e.g. mb_strlen() and mb_substr(), or iconv_strlen() and iconv_substr().
We can't see what your UTF8 class does, so it's very difficult for us to offer any other help.
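
As a rough sketch of the built-in approach mentioned above, the last character of each line can be taken with mb_substr(). The helper name last_char() and the sample lines are hypothetical, and the source file is assumed to be saved as UTF-8. Note that fgets() keeps the line ending, so it is stripped before looking at the last character.

```php
<?php
// Hypothetical helper: returns the last character of a line,
// ignoring the trailing line ending that fgets() leaves in place.
function last_char($line)
{
    $line = rtrim($line, "\r\n");
    // A negative start counts from the end of the string; passing 'UTF-8'
    // explicitly avoids depending on the mbstring internal encoding.
    return mb_substr($line, -1, 1, 'UTF-8');
}

// Sample data standing in for the lines fgets() would return.
$lines = array("мартин\r\n", "martin\n", "горги");

foreach ($lines as $line) {
    // Cyrillic "н", not Latin "n".
    if (last_char($line) === "н") {
        echo rtrim($line, "\r\n"), "\n";
    }
}
```

With these sample lines only the first one should match, because "martin" ends in the Latin letter n, which is a different character from the Cyrillic н.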
martinpmf
Forum Newbie
Posts: 5
Joined: Sat Oct 17, 2009 6:01 pm

Re: utf 8 characters problem

Post by martinpmf »

The UTF8 class is one I found on the net. It works with the mb functions and does some of the things I need. I can't post it because it's 800 lines :s
martinpmf
Forum Newbie
Posts: 5
Joined: Sat Oct 17, 2009 6:01 pm

Re: utf 8 characters problem

Post by martinpmf »

Here is a simplified version:
<?php
$tekst=fopen("text.txt","r");
while(!feof($tekst)){
    $tmp=fgets($tekst);
    $dolzina=mb_strlen($tmp);
    echo $dolzina.$tmp."<br />";
    if ($tmp == "мартин"){
        echo "ok"."<br />";
    } else {
        echo "tapa"."<br />";
    }
}
fclose($tekst);
?>
And the output:
17мартин // where 17 is the length, and it should be 6
tapa
8martin
tapa
10горги
tapa
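
One possible reading of these numbers, assuming (the thread does not confirm this) that text.txt was saved as UTF-8 with a byte order mark and Windows line endings: 17 would be 3 BOM bytes + 12 bytes for the six Cyrillic letters + 2 bytes for "\r\n", 8 would be "martin" plus "\r\n", and 10 would be the five two-byte letters of the last line with no line ending. The sketch below rebuilds the first line under that assumption and counts it in different ways.

```php
<?php
// Assumption: the first line as fgets() would return it from a UTF-8 file
// saved with a BOM and "\r\n" line endings.
$line = "\xEF\xBB\xBF" . "мартин" . "\r\n";

// Raw byte count: 3 (BOM) + 12 (six two-byte letters) + 2 ("\r\n").
echo strlen($line), "\n";

// With a single-byte encoding every byte counts as one character.
echo mb_strlen($line, 'ISO-8859-1'), "\n";

// With UTF-8 the BOM counts as one character, plus 6 letters, plus "\r\n".
echo mb_strlen($line, 'UTF-8'), "\n";
```

Under the same assumption, the comparison with "мартин" fails for a second reason besides the length: the string read from the file still carries the BOM and the line ending.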
Mark Baker
Forum Regular
Posts: 710
Joined: Thu Oct 30, 2008 6:24 pm

Re: utf 8 characters problem

Post by Mark Baker »

And what do you get if you use

Code:

$dolzina=mb_strlen($tmp,'UTF-8');
martinpmf
Forum Newbie
Posts: 5
Joined: Sat Oct 17, 2009 6:01 pm

Re: utf 8 characters problem

Post by martinpmf »

Thanks Mark,
I just needed to add mb_internal_encoding("UTF-8") at the beginning and everything works well :)
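
A minimal sketch of that fix, for anyone landing here later. The helper clean_line() is hypothetical, and the BOM and "\r\n" handling is an assumption about the file rather than something shown in the thread; mb_internal_encoding() alone changes the reported length, while an exact comparison also needs the line stripped of any BOM and line ending.

```php
<?php
// Make the mb_* functions default to UTF-8 for the rest of the script.
mb_internal_encoding('UTF-8');

// Hypothetical helper: strips a leading UTF-8 BOM and the trailing line ending.
function clean_line($line)
{
    if (substr($line, 0, 3) === "\xEF\xBB\xBF") {
        $line = substr($line, 3);
    }
    return rtrim($line, "\r\n");
}

// Stand-in for the first line returned by fgets() (assumed BOM and "\r\n").
$raw = "\xEF\xBB\xBF" . "мартин" . "\r\n";
$line = clean_line($raw);

echo mb_strlen($line), "\n";
echo ($line === "мартин") ? "ok" : "tapa", "\n";
```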