PHP Charset
Posted: Thu Dec 03, 2009 2:22 am
by kirank
I need to implement an SMPP connection for sending SMS.
I need to send the details with Unicode support.
What I have to do is encode the code point (hex code) of the Unicode character as 2 bytes,
i.e. convert it to 16 bits.
How can I do this? Is there any function to convert the hex to 16 bits?
Thanks
Re: PHP Charset
Posted: Thu Dec 03, 2009 3:36 am
by Apollo
Code:
$c = '20AC';
$x = hexdec($c); // $x is now 0x20AC = 8364, the unicode codepoint for '€' (the euro sign)
If you need the bytes separately:
Code:
$x1 = $x & 255; // $x1 is now 0xAC = 172
$x2 = ($x>>8) & 255; // $x2 is now 0x20 = 32
Re: PHP Charset
Posted: Thu Dec 03, 2009 4:14 am
by kirank
I need to convert this hex to a binary string, i.e.
like the output of the pack() function.
Output will be like this �
Re: PHP Charset
Posted: Thu Dec 03, 2009 4:23 am
by kirank
In simple words, what I need is to pack one hex code into 16 bits (a binary string).
Re: PHP Charset
Posted: Thu Dec 03, 2009 4:59 am
by Apollo
In that case:
Code:
$c = '20AC';
$x = hexdec($c);
print(chr($x & 255).chr(($x>>8) & 255));
Or, of course, simply print the output of pack:
Code:
$c = '20AC';
$x = hexdec($c);
print(pack('v',$x));
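The two snippets above should produce the same two bytes, low byte first (little-endian). A minimal sketch to check that, assuming a command-line run; the variable names $manual and $packed are only there for illustration:

```php
<?php
// Code point of the euro sign as a hex string, as in the posts above.
$c = '20AC';
$x = hexdec($c); // 0x20AC = 8364

// Manual split: low byte first, then high byte.
$manual = chr($x & 255) . chr(($x >> 8) & 255);

// pack('v') writes an unsigned 16-bit value in little-endian byte order.
$packed = pack('v', $x);

var_dump($manual === $packed); // bool(true)
```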
Re: PHP Charset
Posted: Thu Dec 03, 2009 5:26 am
by kirank
Sorry, please check with this hex code: 0x0D21
I got the output as a3. What does this mean?
I have tried many times; it fails for some hex codes.
What I need is to convert a Unicode code point (hex) to a 16-bit binary format (binary string), like ��
I can do this by converting the string to UCS-2 using iconv(), but it fails for some languages.
So a better way is to take the hex code of the Unicode character as input and convert it to UCS-2 (16 bits, 2 bytes).
Re: PHP Charset
Posted: Thu Dec 03, 2009 5:35 am
by Apollo
Unicode code point 0x0D21 represents the Malayalam character 'dda' (this one: ഡ)
If you are outputting two bytes (0x21, 0x0D) exactly how are you viewing/displaying it?
You say you get "output as a3", exactly how or where do you get or see this output?
Re: PHP Charset
Posted: Thu Dec 03, 2009 5:41 am
by kirank
I just passed 0x0D23 to the code you have given.
Let me know how I can convert this code to UCS as you said. That's a Malayalam character.
So I need to convert 0D23 to a UCS 16 bit stream.
Do you mean it has to be split into 0D and 23 and then converted? How?
I am really stuck on this issue...
Re: PHP Charset
Posted: Thu Dec 03, 2009 6:12 am
by Apollo
Exactly why do you need a UCS-2 bit stream? (There's no such thing as UCS-16, by the way; there's UCS-2 and UTF-16, and they're not the same.)
Again, you say you get "output as a3", exactly how or where do you get or see this output?
Re: PHP Charset
Posted: Thu Dec 03, 2009 6:20 am
by kirank
I tried to use pack("V",hexdec(0x0D21)), then got the issue.
I need to send Unicode characters over an SMPP connection, and it accepts a UCS-2 bit stream.
As you said, I need to convert the hex code of the Unicode character to 16 bits (i.e. UCS-2).
I need this "Unicode character 0D21 would get encoded as two bytes ... 0D 23 ..."
This is what I need. The user will input the Unicode hex code; I need to convert it to a UCS-2 binary stream and then send it.
Re: PHP Charset
Posted: Thu Dec 03, 2009 6:35 am
by Apollo
kirank wrote:I tried to use pack("V",hexdec(0x0D21)), then got the issue.
Beware, hexdec expects a hex string!
hexdec(0x0D21) != hexdec('0D21') (only the latter results in 0x0D21)
If you already got the codepoint as integer value (i.e. 0x0D21) rather than a hex string (i.e. '0D21') you can simply do:
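The snippet that followed here is missing from this copy of the thread. Judging by the next reply, it was presumably something along these lines (a reconstruction, not necessarily Apollo's exact code):

```php
<?php
// Assumption: the code point is already an integer, so hexdec() is not needed.
$x = 0x0D21;

// 'v' = unsigned 16-bit, little-endian: emits the two bytes 0x21 0x0D.
print(pack('v', $x));
```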
By the way:
kirank wrote: I need this "Unicode character 0D21 would get encoded as two bytes ... 0D 23 ..."
I assume you mean ... 0D 21 ...
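The integer-versus-string mix-up would also account for the "a3" output mentioned earlier, assuming the integer literal really was passed to hexdec(): PHP turns 0x0D21 into the decimal string "3361", and hexdec() then reads those digits as hex. A sketch of that chain, using pack('v') for brevity (pack('V') yields the same two bytes followed by two zero bytes):

```php
<?php
// Passing an integer: 0x0D21 is 3361, which hexdec() sees as the string "3361".
$wrong = hexdec(0x0D21);  // 0x3361 = 13153
// Passing a hex string: the intended code point.
$right = hexdec('0D21');  // 0x0D21 = 3361

// Little-endian bytes of 0x3361 are 0x61 0x33, i.e. the ASCII text "a3".
var_dump(pack('v', $wrong) === 'a3');       // bool(true)
var_dump(pack('v', $right) === "\x21\x0D"); // bool(true)
```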

Re: PHP Charset
Posted: Thu Dec 03, 2009 6:40 am
by kirank
OK, but my issue is still not cleared up.
See: print( pack('v',0x0D21)); => returns " ! " as output.
Please read my issue carefully.
I need to convert Unicode 0D21 to UCS-2 (16-bit binary).
Is there any way to do this?
Re: PHP Charset
Posted: Thu Dec 03, 2009 6:54 am
by Apollo
kirank wrote: OK, but my issue is still not cleared up.
See: print( pack('v',0x0D21)); => returns " ! " as output.
print( pack('v',0x0D21)) outputs two bytes: 0x21 0x0D
This only looks like "! " if you interpret it as UTF-8 or ANSI (which is incorrect, because it's supposed to be interpreted as UCS-2).
When you say it outputs "! ", exactly how or where do you see that? In your browser? That's misleading, because without an HTML header your browser will implicitly assume UTF-8 or some arbitrary ANSI encoding.
So looking at the output in your browser is NOT a correct way to check your result.
kirank wrote: I need to convert Unicode 0D21 to UCS-2 (16-bit binary)
Really, pack('v',0x0D21) does exactly that, and printing it will output those two bytes.
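One way to check the result without depending on how a browser or terminal renders it is to dump the bytes as hex. A small sketch using bin2hex(); the variable name $bytes is just for illustration:

```php
<?php
$bytes = pack('v', 0x0D21);

// bin2hex() shows the raw bytes regardless of any display encoding.
echo bin2hex($bytes), "\n"; // prints 210d
echo strlen($bytes), "\n";  // prints 2
```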
Re: PHP Charset
Posted: Thu Dec 03, 2009 7:01 am
by kirank
OK, I accept that.
So I can pack to big-endian using the "n" option, right?
Let me check with the transfer.
Re: PHP Charset
Posted: Thu Dec 03, 2009 7:11 am
by Apollo
kirank wrote: So I can pack to big-endian using the "n" option, right?
Correct; see also the specifications in the PHP manual.
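As a sketch of where this leads: a helper that takes a list of hex code points and packs each one as a big-endian 16-bit value with 'n'. The function name hexCodepointsToUcs2be is hypothetical, and the sketch assumes every code point fits in 16 bits, since UCS-2 cannot represent anything above U+FFFF. Whether the SMPP peer actually expects big-endian byte order should be checked against its documentation.

```php
<?php
// Hypothetical helper: hex code point strings -> UCS-2 big-endian byte string.
function hexCodepointsToUcs2be(array $hexCodes): string
{
    $out = '';
    foreach ($hexCodes as $hex) {
        $cp = hexdec($hex);
        if ($cp > 0xFFFF) {
            throw new InvalidArgumentException("U+$hex does not fit in 16 bits");
        }
        // 'n' = unsigned 16-bit, big-endian (network byte order).
        $out .= pack('n', $cp);
    }
    return $out;
}

echo bin2hex(hexCodepointsToUcs2be(['0D21', '0D23'])), "\n"; // prints 0d210d23
```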