MB Encoding to UTF-8 in PHP 5.0.5
Posted: Mon Jan 09, 2006 12:44 pm
Hey,
I'm having to convert emails in several different Japanese character sets to UTF-8, and the mb_convert_encoding() func is failing to convert two thirds of them.
I'm parsing the email to get the original character set and then doing this:
$utf_contents = mb_convert_encoding($raw_contents, "UTF-8", $encoding);
To check if the conversion was successful, I'm just making sure that $utf_contents != $raw_contents.
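A minimal sketch of the conversion and check described above, for anyone who wants to reproduce it. The helper name convert_to_utf8() and its return shape are hypothetical and chosen only for illustration. One thing the sketch makes visible: input that is plain ASCII comes out byte-for-byte identical after conversion, so a "did the string change?" check counts it as a failure even though nothing went wrong.

```php
<?php
// Hypothetical helper: the name and the array return shape are illustrative only.
function convert_to_utf8($raw_contents, $encoding)
{
    // Convert from the charset parsed out of the email to UTF-8.
    $utf_contents = mb_convert_encoding($raw_contents, 'UTF-8', $encoding);

    // The check described above: "output differs from input" is treated as success.
    // Strict comparison (!==) is used here to avoid PHP's numeric-string juggling.
    $changed = ($utf_contents !== $raw_contents);

    return array($utf_contents, $changed);
}

// Shift_JIS bytes for the hiragana character "a" (0x82 0xA0): the bytes change.
list($utf, $changed) = convert_to_utf8("\x82\xA0", 'SJIS');

// ASCII-only text: the bytes are unchanged, so the same check reports a "failure".
list($utf_ascii, $changed_ascii) = convert_to_utf8("hello", 'SJIS');
```

On PHP versions that ship mb_check_encoding() (it is not in 5.0.5, as far as I know), validating the result with mb_check_encoding($utf_contents, 'UTF-8') might be a more direct test than comparing input and output.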
ISO-2022-JP converts most of the time (513 out of 596 were successful), but SHIFT_JIS fails most of the time (30 out of 461 were successful), and the auto option fails the most (only 8 out of 620 converted successfully)...
To make sure I wasn't dealing with a bug in the MB funcs, I ran a second test against only the emails that converted successfully and they all converted fine. So, the MB funcs seem to be working as designed...
Has anyone dealt with this before? Can you offer any advice to increase the success rate?
Thanks,
~Scott