PHP Charset

Posted: Thu Dec 03, 2009 2:22 am
by kirank
I need to implement an SMPP connection for sending SMS.
I need to send the details with Unicode support.
What I have to do is encode the codepoint (hex code) of each Unicode character into 2 bytes,
i.e. convert it to 16 bits.

How can I do this? Is there any function to convert the hex code to 16 bits?

Thanks

Re: PHP Charset

Posted: Thu Dec 03, 2009 3:36 am
by Apollo

Code:

$c = '20AC';
$x = hexdec($c); // $x is now 0x20AC = 8364, the unicode codepoint for '€' (the euro sign)
If you need the bytes separately:

Code:

$x1 = $x & 255; // $x1 is now 0xAC = 172 
$x2 = ($x>>8) & 255; // $x2 is now 0x20 = 32

Re: PHP Charset

Posted: Thu Dec 03, 2009 4:14 am
by kirank
I need to convert this hex code to a binary string, i.e.
like the output of the pack() function.

The output will look like this: �

Re: PHP Charset

Posted: Thu Dec 03, 2009 4:23 am
by kirank
In simple words, what I need is to pack one hex code into 16 bits (a binary string).

Re: PHP Charset

Posted: Thu Dec 03, 2009 4:59 am
by Apollo
In that case:

Code:

$c = '20AC';
$x = hexdec($c);
print(chr($x & 255).chr(($x>>8) & 255));
Or, of course, simply print the output of pack:

Code:

$c = '20AC';
$x = hexdec($c);
print(pack('v',$x));
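
To check that the two snippets above really produce the same bytes, one option is to compare them and dump the result as hex with bin2hex() instead of printing the raw bytes. This is only a sketch; the variable names $manual and $packed are made up for the illustration.

```php
<?php
$c = '20AC';
$x = hexdec($c);                                  // 0x20AC = 8364

$manual = chr($x & 255) . chr(($x >> 8) & 255);   // low byte first, then high byte
$packed = pack('v', $x);                          // 'v' = unsigned 16-bit, little-endian

var_dump($manual === $packed);                    // bool(true)
echo bin2hex($packed), "\n";                      // ac20
```

Both approaches emit the low byte (0xAC) before the high byte (0x20), i.e. little-endian order.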

Re: PHP Charset

Posted: Thu Dec 03, 2009 5:26 am
by kirank
Sorry, please check with this hex code: 0x0D21

I got the output "a3". What does this mean?

I have tried many times, and it fails for some hex codes.

What I need is to convert a Unicode codepoint (hex) to 16-bit binary format (a binary string), like ��

I can do this by converting the string to UCS-2 using iconv(), but it fails for some languages.

So a better way is to take the hex code of the Unicode character as input and convert it to UCS-2 (16 bits, 2 bytes).

Re: PHP Charset

Posted: Thu Dec 03, 2009 5:35 am
by Apollo
Unicode codepoint 0x0D21 represents the Malayalam character 'dda' (this one: ഡ)

If you are outputting two bytes (0x21, 0x0D), how exactly are you viewing or displaying them?

You say you get "output as a3". Exactly how or where do you get or see this output?

Re: PHP Charset

Posted: Thu Dec 03, 2009 5:41 am
by kirank
I just passed 0x0D23 to the code you gave.

Let me know how I can convert this code to UCS, as you said. That's a Malayalam character.

So I need to convert 0D23 to a UCS 16-bit stream.

Do you mean it has to be split into 0D and 23 and then converted? How?

I am really stuck on this issue...

Re: PHP Charset

Posted: Thu Dec 03, 2009 6:12 am
by Apollo
Exactly why do you need a UCS-2 bit stream? (By the way, there's no such thing as UCS-16; there's UCS-2 or UTF-16, and they're not the same.)

Again, you say you get "output as a3", exactly how or where do you get or see this output?

Re: PHP Charset

Posted: Thu Dec 03, 2009 6:20 am
by kirank
I tried to use pack("V",hexdec(0x0D21)), and then got the issue.

I need to send Unicode characters over an SMPP connection, which accepts a UCS-2 bit stream.
As you said, I need to convert the hex code of the Unicode character to 16 bits (i.e. UCS-2).

I need this: "Unicode character 0D21 would get encoded as two bytes ... 0D 23 ..."

This is what I need. The user will input the Unicode hex code, and I need to convert it to a UCS-2 binary stream and then send it.

Re: PHP Charset

Posted: Thu Dec 03, 2009 6:35 am
by Apollo
kirank wrote: I tried to use pack("V",hexdec(0x0D21)), then got the issue.
Beware, hexdec expects a hex string!

hexdec(0x0D21) != hexdec('0D21') (only the latter results in 0x0D21)

If you already have the codepoint as an integer value (i.e. 0x0D21) rather than a hex string (i.e. '0D21'), you can simply do:

Code:

pack('v',0x0D21)

By the way:
I need this "Unicode character 0D21 would get encoded as two bytes ... 0D 23 ..."
I assume you mean ... 0D 21 ... ;)
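
A plausible explanation for the "a3" output reported earlier, assuming the call was run exactly as quoted and without strict_types: PHP converts the integer 0x0D21 (3361) to the string "3361" before hexdec() sees it, and the 'V' format packs 32 bits rather than 16. The sketch below just prints the intermediate values; $wrong and $right are illustrative names.

```php
<?php
// Integer argument: 0x0D21 is 3361, which becomes the string "3361",
// and hexdec() then reads those characters as hex digits.
$wrong = hexdec(0x0D21);      // same as hexdec('3361') = 0x3361 = 13153
$right = hexdec('0D21');      // 0x0D21 = 3361

printf("wrong: 0x%04X, right: 0x%04X\n", $wrong, $right);

// 'V' is unsigned 32-bit little-endian, so the wrong value starts with
// the bytes 0x61 0x33, which are the ASCII characters "a" and "3".
echo bin2hex(pack('V', $wrong)), "\n";   // 61330000
echo bin2hex(pack('v', $right)), "\n";   // 210d
```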

Re: PHP Charset

Posted: Thu Dec 03, 2009 6:40 am
by kirank
OK, my issue is still not cleared.

See: print( pack('v',0x0D21)); returns " ! " as output.

Please read my issue carefully.

I need to convert Unicode 0D21 to UCS-2 (16-bit binary).

Is there any way to do this?

Re: PHP Charset

Posted: Thu Dec 03, 2009 6:54 am
by Apollo
kirank wrote: OK, my issue is still not cleared.

See: print( pack('v',0x0D21)); returns " ! " as output.
print( pack('v',0x0D21)) outputs two bytes: 0x21 0x0D

This is only "! " if you interpret it as UTF-8 or ANSI (which is incorrect, because it's supposed to be interpreted as UCS-2).

When you say it outputs "! ", exactly how or where do you see that? In your browser? That's wrong, because without an HTML header your browser will implicitly assume UTF-8 or some random ANSI encoding.
So looking at the output in your browser is NOT a correct way to check your result.
I need to convert unicode 0D21 to UCS2 (16 bit binary)
Really, pack('v',0x0D21) does exactly that, and printing it will output those two bytes.
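
One way to check the result without relying on how a browser renders it is to look at the raw bytes directly, for example with strlen() and bin2hex(). A minimal sketch, with $bytes as an illustrative variable name:

```php
<?php
$bytes = pack('v', 0x0D21);        // two bytes, little-endian

echo strlen($bytes), "\n";         // 2
echo bin2hex($bytes), "\n";        // 210d

// Read as single-byte text, 0x21 is "!" and 0x0D is a carriage return,
// which would explain seeing "!" followed by whitespace in a browser.
var_dump($bytes === "!\r");        // bool(true)
```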

Re: PHP Charset

Posted: Thu Dec 03, 2009 7:01 am
by kirank
OK, I accept it.

So I can pack to big-endian using the "n" option, right?
Let me check with the transfer.

Re: PHP Charset

Posted: Thu Dec 03, 2009 7:11 am
by Apollo
kirank wrote: So I can pack to big-endian using the "n" option, right?
Correct; see also the pack() format specifications in the PHP manual.
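
For a whole message rather than a single character, the same idea can be wrapped in a small function. This is only a sketch under some assumptions: the receiving side wants big-endian 16-bit units (as the question about "n" suggests), the input arrives as an array of hex strings, and every codepoint fits in 16 bits. The function name hexCodepointsToUcs2Be is hypothetical, not something from PHP or any SMPP library.

```php
<?php
/**
 * Hypothetical helper: turn a list of hex codepoint strings into
 * big-endian 16-bit units. Accepts 1 to 4 hex digits per entry, so
 * anything above 0xFFFF is rejected (no surrogate pairs are generated).
 */
function hexCodepointsToUcs2Be(array $hexCodes): string
{
    $out = '';
    foreach ($hexCodes as $hex) {
        if (!is_string($hex) || !preg_match('/^[0-9A-Fa-f]{1,4}$/', $hex)) {
            throw new InvalidArgumentException('Expected 1-4 hex digits as a string');
        }
        $out .= pack('n', hexdec($hex));   // 'n' = unsigned 16-bit, big-endian
    }
    return $out;
}

$message = hexCodepointsToUcs2Be(['0D21', '0D23', '20AC']);
echo bin2hex($message), "\n";              // 0d210d2320ac
```

Requiring string input also guards against the hexdec(0x0D21) mistake, because an integer argument is refused instead of being silently reinterpreted.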