can anyone convert this javascript to php

Posted: Fri Nov 14, 2008 12:41 am
by Pepsi599ml
Hi, I'm wondering if it's possible to convert this JavaScript code to PHP.
The code converts Chinese characters (gb2312 encoding) to Unicode numeric entities (&#x####;).
I know iconv can do this, but some special characters are lost during the conversion.
I found that this simple JavaScript does the job very well, so here it is, just one line:

str.value = str.value.replace(/[^\u0000-\u00FF]/g,function($0){return escape($0).replace(/(%u)(\w{4})/gi,"&#x$2;")});

Re: can anyone convert this javascript to php

Posted: Fri Nov 14, 2008 1:38 am
by requinix
It's kinda awkward with PHP since most string functions treat strings as binary data (bytes, not characters).

What code are you using for iconv?
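
To illustrate the point about binary strings, here is a minimal sketch. It assumes PHP 7 or later with the iconv extension available; the variable names are only for illustration. It compares byte-oriented functions with their character-aware iconv counterparts on a two-character UTF-8 string.

```php
<?php
// "中文" written with code point escapes: two characters, six bytes in UTF-8.
$s = "\u{4E2D}\u{6587}";

$byteCount = strlen($s);                        // counts bytes
$charCount = iconv_strlen($s, 'UTF-8');         // counts characters

$firstByte = substr($s, 0, 1);                  // cuts in the middle of a character
$firstChar = iconv_substr($s, 0, 1, 'UTF-8');   // returns the whole first character

printf("%d bytes, %d characters\n", $byteCount, $charCount);
```

The same mismatch is why a byte-level regular expression has to know the lead-byte range of the encoding it is splitting.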

Re: can anyone convert this javascript to php

Posted: Fri Nov 14, 2008 2:51 am
by Pepsi599ml
Here is the PHP (iconv) code:

// Convert a gb2312 string to decimal numeric entities (&#N;), one character at a time.
function gb2un($str) {
    preg_match_all("/[\x80-\xff]?./", $str, $m);
    $str = '';
    foreach($m[0] as $v)
        $str = $str.'&#'.utf8_unicode(iconv('gb2312', 'utf-8', $v)).';';
    return $str;
}
 
// Decode a single UTF-8 encoded character into its Unicode code point.
function utf8_unicode($c) {
    switch(strlen($c)) {
        case 1:
            return ord($c);
        case 2:
            $n = (ord($c[0]) & 0x3f) << 6;
            $n += ord($c[1]) & 0x3f;
            return $n;
        case 3:
            $n = (ord($c[0]) & 0x1f) << 12;
            $n += (ord($c[1]) & 0x3f) << 6;
            $n += ord($c[2]) & 0x3f;
            return $n;
        case 4:
            $n = (ord($c[0]) & 0x0f) << 18;
            $n += (ord($c[1]) & 0x3f) << 12;
            $n += (ord($c[2]) & 0x3f) << 6;
            $n += ord($c[3]) & 0x3f;
            return $n;
    }
}
It works, but not for all of those special, fancy characters, so I'm hoping someone who's good at both languages could lend me a hand. Thanks.

Re: can anyone convert this javascript to php

Posted: Fri Nov 14, 2008 3:34 am
by requinix
Why don't you just give the entire string over to iconv rather than selecting what parts it affects?

Are you sure you have the right encoding? Have you tried gb18030?
Otherwise (and more likely to be the problem), try:

preg_match_all("/[\xa1-\xff]?./", $str, $m);
In gb2312 the lead bytes start at 0xA1, not 0x80. (And the range end should actually be 0xF7, but if it ain't broke...)
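
The first suggestion, handing the whole string to iconv, could look roughly like the sketch below. This is only one way to do it under stated assumptions: `gb_to_entities` is a made-up name, the GB18030 default is an assumption taken from the question above, and the iconv extension with GB18030 and UTF-32BE support is assumed to be installed.

```php
<?php
// Hypothetical helper: convert a GB-encoded byte string to text in which every
// non-ASCII character is written as a decimal numeric entity (&#N;).
function gb_to_entities(string $gb, string $from = 'GB18030'): string
{
    // Convert the whole string in one call; iconv returns false on bytes
    // that are not valid in the source encoding.
    $utf8 = @iconv($from, 'UTF-8', $gb);
    if ($utf8 === false) {
        throw new InvalidArgumentException("Input is not valid $from");
    }

    // The /u modifier makes the pattern match whole UTF-8 characters, not bytes.
    $out = preg_replace_callback('/[^\x00-\x7F]/u', function (array $m): string {
        // UTF-32BE is exactly four bytes per character, so 'N' yields the code point.
        $codePoint = unpack('N', iconv('UTF-8', 'UTF-32BE', $m[0]))[1];
        return '&#' . $codePoint . ';';
    }, $utf8);

    if ($out === null) {
        throw new RuntimeException('preg_replace_callback failed');
    }
    return $out;
}

// "中文" in gb2312 is the byte sequence D6 D0 CE C4.
echo gb_to_entities("\xD6\xD0\xCE\xC4 ok"), "\n";
```

Unlike the per-character loop, this leaves ASCII characters (including newlines) untouched and fails loudly instead of silently producing an empty entity.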

Re: can anyone convert this javascript to php

Posted: Mon Nov 24, 2008 3:15 am
by Pepsi599ml
Yes, it's gb2312.

Anyone else? Please...

[edit]

OK, after looking at the code carefully, I realized what I'm looking for:

a PHP equivalent of JavaScript's escape function. I tried urlencode, but the result isn't what I expected.
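
For what it's worth, the effect of the JavaScript one-liner (rather than escape itself) might be sketched in PHP as follows. Assumptions: the input is already UTF-8, the iconv extension is available, and `escape_non_latin1` is a made-up name, not a built-in. One known difference: for characters outside the Basic Multilingual Plane, JavaScript's escape emits two surrogate halves, while this sketch emits a single entity for the full code point.

```php
<?php
// Hypothetical helper mirroring
//   str.replace(/[^\u0000-\u00FF]/g, ...)  ->  "&#xHHHH;"
// for a UTF-8 encoded PHP string.
function escape_non_latin1(string $utf8): string
{
    // With /u, \x{...} refers to code points, so this matches every
    // character above U+00FF and leaves Latin-1 characters alone.
    $out = preg_replace_callback('/[^\x{0000}-\x{00FF}]/u', function (array $m): string {
        // Four bytes per character in UTF-32BE, unpacked as one big-endian integer.
        $codePoint = unpack('N', iconv('UTF-8', 'UTF-32BE', $m[0]))[1];
        // Uppercase hex, at least four digits, like escape()'s %uXXXX form.
        return sprintf('&#x%04X;', $codePoint);
    }, $utf8);

    if ($out === null) {
        // preg_replace_callback returns null when the input is not valid UTF-8.
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    return $out;
}

echo escape_non_latin1("caf\u{E9} \u{4E2D}\u{6587}"), "\n";
```

If the data starts out as gb2312, it would need to go through iconv to UTF-8 first, since the pattern above only understands UTF-8.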