[SOLVED] Help with using &#codes in new Option
- Pyrite
- Forum Regular
- Posts: 769
- Joined: Tue Sep 23, 2003 11:07 pm
- Location: The Republic of Texas
So yea, I have like utf-8 data in a database, but since it's MySQL 4.0, and NOT 4.1, some data gets encoded and stored as &#somenumber; (dunno what you call those, first question?). If I document.write it, it works fine. But if I use the new Option object to insert it into a select box, the select just shows the &#number and not the utf-8 character. Any ideas?
Last edited by Pyrite on Sun Aug 07, 2005 9:22 am, edited 1 time in total.
- feyd
- Neighborhood Spidermoddy
- Posts: 31559
- Joined: Mon Mar 29, 2004 3:24 pm
- Location: Bothell, Washington, USA
Re: Help with using &#codes in new Option
Pyrite wrote:So yea, I have like utf-8 data in a database, but since it's MySQL 4.0, and NOT 4.1, some data gets encoded and stored as &#somenumber; (dunno what you call those, first question?). If I document.write it, it works fine. But if I use the new Option object to insert it into a select box, the select just shows the &#number and not the utf-8 character. Any ideas?
- they are called entities (strictly speaking, numeric character references)
- it should be simple enough to write a mapping/algorithm that converts them back to UTF-8 binary data.
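For context on why the two code paths behave differently: document.write() hands the string to the HTML parser, which resolves &#NNN; references, while new Option(text, value) treats its first argument as plain text, so the reference has to be decoded before it reaches the JavaScript. A minimal sketch of the server-side conversion, assuming PHP 5 or later and the built-in html_entity_decode(); the helper name decodeEntities is hypothetical and not from the thread:

```php
<?php
// Hypothetical helper (not from the thread): decode HTML entities to UTF-8.
// Caveat: this also decodes named entities such as &amp; and &lt;.
function decodeEntities($text)
{
    return html_entity_decode($text, ENT_QUOTES, 'UTF-8');
}

// Sample input only: U+0430 and U+4E8C as decimal references.
echo bin2hex(decodeEntities('&#1072;&#20108;')), "\n"; // d0b0e4ba8c
```

If only the numeric references should be touched and named entities left alone, a regex-driven callback is probably the safer route.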
- Pyrite
- Forum Regular
- Posts: 769
- Joined: Tue Sep 23, 2003 11:07 pm
- Location: The Republic of Texas
Would love anything you'll throw my way ...
Actually, just found this function now that I knew what to search for. Works perfectly for me. Khap Khun Maak Khrap!
http://www.zend.com/codex.php?id=838&single=1
- feyd
- Neighborhood Spidermoddy
- Posts: 31559
- Joined: Mon Mar 29, 2004 3:24 pm
- Location: Bothell, Washington, USA
His is rather large compared to mine, which includes unit tests and conforms to Unicode 4.1.0. The following was tested on PHP 5.0.4.
To run the unit tests, uncomment the call to testUTF() near the bottom (line 196 in the original listing).
Code: Select all
<?php
/*******************************************************************************
* This code references:
*------------------------------------------------------------------------------
* The Unicode Consortium. The Unicode Standard, Version 4.1.0, defined by:
* The Unicode Standard, Version 4.0 (Boston, MA, Addison-Wesley, 2003.
* ISBN 0-321-18578-1), as amended by
* Unicode 4.0.1 (http://www.unicode.org/versions/Unicode4.0.1) and by
* Unicode 4.1.0 (http://www.unicode.org/versions/Unicode4.1.0).
*/
function makeUTF8($match)
{
    if(is_array($match))
    { // callback mode: preg_replace_callback() passes array(whole match, digits)
        $ret = makeUTF8($match[1]);
        if($ret === false)
        { // the value is not a valid unicode character.
            return $match[0];
        }
        else
        {
            return $ret;
        }
    }
    else
    { // direct mode: $match is a single code point (integer or digit string)
        $code = intval($match);
        // +-----------------+-----------------+-----------------+-----------------+
        // | 3 3 2 2 2 2 2 2 | 2 2 2 2 1 1 1 1 | 1 1 1 1 1 1     |                 |
        // | 1 0 9 8 7 6 5 4 | 3 2 1 0 9 8 7 6 | 5 4 3 2 1 0 9 8 | 7 6 5 4 3 2 1 0 | bit
        // +-----------------+-----------------+-----------------+-----------------+
        // |                 |                 |                 | 0 x x x x x x x | 1 byte 0x00000000..0x0000007F
        // |                 |                 | 1 1 0 y y y y y | 1 0 x x x x x x | 2 byte 0x00000080..0x000007FF
        // |                 | 1 1 1 0 z z z z | 1 0 y y y y y y | 1 0 x x x x x x | 3 byte 0x00000800..0x0000FFFF
        // | 1 1 1 1 0 w w w | 1 0 w w z z z z | 1 0 y y y y y y | 1 0 x x x x x x | 4 byte 0x00010000..0x0010FFFF
        // +-----------------+-----------------+-----------------+-----------------+
        // | 0 0 0 0 0 0 0 0 | 0 0 0 1 1 1 1 1 | 1 1 1 1 1 1 1 1 | 1 1 1 1 1 1 1 1 | Theoretical upper limit of legal scalars: 2097151 (0x001FFFFF)
        // | 0 0 0 0 0 0 0 0 | 0 0 0 1 0 0 0 0 | 1 1 1 1 1 1 1 1 | 1 1 1 1 1 1 1 1 | Defined upper limit of legal scalar codes: 1114111 (0x0010FFFF)
        // +-----------------+-----------------+-----------------+-----------------+
        if($code > 1114111 or $code < 0 or ($code >= 55296 and $code <= 57343))
        { // bits are set outside the "valid" range as defined by UNICODE 4.1.0
            return false;
        }
        else
        {
            // $x is always the last byte; $y, $z and $w are the bytes before it (0 = unused)
            $x = $y = $z = $w = 0;
            if($code < 128)
            {
                $x = $code;
            }
            else
            {
                $x = ($code & 63) | 128;
                if($code < 2048)
                {
                    $y = (($code & 2047) >> 6) | 192;
                }
                else
                {
                    $y = (($code & 4032) >> 6) | 128;
                    if($code < 65536)
                    {
                        $z = (($code >> 12) & 15) | 224;
                    }
                    else
                    {
                        $z = (($code >> 12) & 63) | 128;
                        $w = (($code >> 18) & 7) | 240;
                    }
                }
            }
            $ret = '';
            if($w)
            {
                $ret = chr($w).chr($z).chr($y);
            }
            elseif($z)
            {
                $ret = chr($z).chr($y);
            }
            elseif($y)
            {
                $ret = chr($y);
            }
            $ret .= chr($x);
            return $ret;
        }
    }
}
// test stuff from here on, pretty much...
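function hexerize($string)
{ // returns the bytes of $string as upper-case hex pairs
    $ret = '';
    for($i = 0, $j = strlen($string); $i < $j; $i++)
    {
        // square brackets: the $string{$i} offset syntax was removed in PHP 8
        $ret .= sprintf('%02X',ord($string[$i]));
    }
    return $ret;
}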
function utfTest($code, $expectedReturn, $expectedPass = true)
{
    $expect = ($expectedPass ? 'pass' : 'fail');
    $ret = 'Expecting '.$expect.': ';
    $utf = makeUTF8($code);
    if(is_string($expectedReturn))
    {
        $hex = hexerize($utf);
        $test = ($hex === $expectedReturn);
        $hex = '('.$hex.')';
    }
    else
    {
        $hex = '';
        $test = ($utf === $expectedReturn);
    }
    if($test)
    {
        $ret .= 'pass';
        if(!$expectedPass)
        {
            $ret .= "\n\t".var_export($utf,true).$hex.' == '.var_export($expectedReturn,true);
        }
    }
    else
    {
        $ret .= 'fail';
        if($expectedPass)
        { // the run failed, output the returns
            $ret .= "\n\t".var_export($utf,true).$hex.' != '.var_export($expectedReturn,true);
        }
    }
    return array($test === $expectedPass,$ret);
}
function testUTF()
{
    $results = array();
    $results['passed'] = array();
    $results['failed'] = array();
    $args = array();
    $args[] = array(1114112, false     );
    $args[] = array(1114111, 'F48FBFBF'); // 0x0010FFFF
    $args[] = array(1048576, 'F4808080'); // 0x00100000
    $args[] = array(1048575, 'F3BFBFBF'); // 0x000FFFFF
    $args[] = array(262144,  'F1808080'); // 0x00040000
    $args[] = array(262143,  'F0BFBFBF'); // 0x0003FFFF
    $args[] = array(65536,   'F0908080'); // 0x00010000
    $args[] = array(65535,   'EFBFBF'  ); // 0x0000FFFF
    $args[] = array(57344,   'EE8080'  ); // 0x0000E000
    $args[] = array(57343,   false     ); // 0x0000DFFF these are ill-formed
    $args[] = array(56040,   false     ); // 0x0000DAE8 these are ill-formed
    $args[] = array(55296,   false     ); // 0x0000D800 these are ill-formed
    $args[] = array(55295,   'ED9FBF'  ); // 0x0000D7FF
    $args[] = array(53248,   'ED8080'  ); // 0x0000D000
    $args[] = array(53247,   'ECBFBF'  ); // 0x0000CFFF
    $args[] = array(4096,    'E18080'  ); // 0x00001000
    $args[] = array(4095,    'E0BFBF'  ); // 0x00000FFF
    $args[] = array(2048,    'E0A080'  ); // 0x00000800
    $args[] = array(2047,    'DFBF'    ); // 0x000007FF
    $args[] = array(128,     'C280'    ); // 0x00000080
    $args[] = array(127,     '7F'      ); // 0x0000007F
    $args[] = array(0,       '00'      ); // 0x00000000
    $args[] = array(20108,   'E4BA8C'  ); // 0x00004E8C
    $args[] = array(77,      '4D'      ); // 0x0000004D
    $args[] = array(66306,   'F0908C82'); // 0x00010302
    $args[] = array(1072,    'D0B0'    ); // 0x00000430
    foreach($args as $argList)
    {
        list($pass,$ret) = call_user_func_array('utfTest',$argList);
        $results[$pass ? 'passed' : 'failed'][] = $ret;
    }
    if(count($results['failed']))
    {
        echo "One or more tests failed:\n";
        echo implode("\n",$results['failed']);
    }
    else
    {
        echo "All tests passed.\n";
    }
}
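//testUTF();
// $text must already hold the string containing the &#NNN; entities.
// The 'UTF-8' charset argument keeps htmlentities() from treating the
// converted multi-byte output as ISO-8859-1 on older PHP versions.
echo '<pre>Before:
'.htmlentities(var_export($text,true), ENT_QUOTES, 'UTF-8').'
</pre>';
$text = preg_replace_callback('/&#([0-9]+?);/','makeUTF8',$text);
echo '<pre>After:
'.htmlentities(var_export($text,true), ENT_QUOTES, 'UTF-8').'
</pre>';
?>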
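To make the bit table in the listing concrete, here is a worked sketch of the 3-byte case for one of the values from the unit tests, 20108 (0x4E8C). The variable names $lead, $mid and $trail are illustrative only; the shifts and masks follow the "3 byte" row of the table:

```php
<?php
$code = 20108; // 0x4E8C = 0100 1110 1000 1100

// 1110zzzz: top 4 bits of the 16-bit value
$lead  = (($code >> 12) & 15) | 224;
// 10yyyyyy: middle 6 bits
$mid   = (($code >> 6) & 63) | 128;
// 10xxxxxx: low 6 bits
$trail = ($code & 63) | 128;

$bytes = chr($lead) . chr($mid) . chr($trail);
echo strtoupper(bin2hex($bytes)), "\n"; // E4BA8C
```

The result agrees with the 'E4BA8C' expectation listed for 20108 in testUTF().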
- Pyrite
- Forum Regular
- Posts: 769
- Joined: Tue Sep 23, 2003 11:07 pm
- Location: The Republic of Texas
Hmmm, well I don't understand your code, but it doesn't work for me.
If I do this to populate my select (with his code) it works:
Code: Select all
$i = 0;
while (!$step2->EOF) {
    $lid = $step2->fields[0];
    $lnm = utf8Encode($step2->fields[1]);
?>
document.forms['frmEnroll'].step2.options[<?=$i;?>] = new Option('<?=$lnm;?>','<?=$lid;?>');
<?php
    $i++;
    $step2->MoveNext();
}
But with yours:
Code: Select all
$i = 0;
while (!$step2->EOF) {
    $lid = $step2->fields[0];
    $lnm = MakeUTF8($step2->fields[1]);
?>
document.forms['frmEnroll'].step2.options[<?=$i;?>] = new Option('<?=$lnm;?>','<?=$lid;?>');
<?php
    $i++;
    $step2->MoveNext();
}
All I get is blank lines in the select. Hmmm.
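A likely cause of the blank lines: makeUTF8() expects either a single code point or the match array that preg_replace_callback() supplies, not a whole field. Given a full string it runs intval() on it, gets 0, and returns a single NUL byte. The sketch below shows that effect and one way to call the converter per entity instead. Both standInCallback and decodeField are hypothetical names, and the stand-in delegates to html_entity_decode() only so that the snippet runs on its own:

```php
<?php
// Stand-in for feyd's makeUTF8() so this snippet is self-contained (assumption:
// it uses the built-in decoder); on the real page pass 'makeUTF8' instead.
function standInCallback($match)
{
    return html_entity_decode($match[0], ENT_QUOTES, 'UTF-8');
}

// Hypothetical wrapper: convert every &#NNN; inside a whole field.
function decodeField($text, $callback = 'standInCallback')
{
    return preg_replace_callback('/&#([0-9]+);/', $callback, $text);
}

// A string that does not start with digits converts to 0, so makeUTF8()
// called on the whole field returns chr(0), which shows up as a blank label.
var_dump(intval('&#3585;&#3586;')); // int(0)

$lnm = decodeField('&#3585;&#3586;'); // sample value, not data from the thread
echo bin2hex($lnm), "\n"; // e0b881e0b882
```

In the loop above, the assignment would then read something like $lnm = decodeField($step2->fields[1], 'makeUTF8');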