Summary of the problem:
I'm trying to find a way to convert extended characters (such as Russian or Japanese characters submitted via a GET search form) into their corresponding HTML numeric entities.
Example:
If you enter into the search box
Россия
The get string sent is
?search=%D0%A0%D0%BE%D1%81%D1%81%D0%B8%D1%8F
Within PHP it reports the string (after doing stripslashes, but that shouldn't make a difference) as:
Россия
I would like to convert that string to the respective HTML numeric entities, namely:
& #1056;& #1086;& #1089;& #1089;& #1080;& #1103;
(but without the spaces between the & and # - if I don't put the spaces in, this BB converts them to the actual letters)
...but I cannot figure out how.
Any help would be gratefully received!
Thanks,
C
PS. This BB seems to do exactly what I need when I post the message (hence the spaces in the entities above)!
Converting extended characters to html numeric entities
You might be interested in http://de2.php.net/mbstring
h4ppy wrote:
The get string sent is
?search=%D0%A0%D0%BE%D1%81%D1%81%D0%B8%D1%8F
That's the URL-encoded version of the UTF-8 representation of your string.
After you have enabled the mbstring extension, try the following and take a look at your browser's source view:
<?php ini_set('default_charset', "UTF-8"); ?>
<html>
<body>
<?php
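// Rebuild the UTF-8 byte sequence for "Россия" from the bytes in the query string.
$s = join('', array_map('chr', array(0xD0, 0xA0, 0xD0, 0xBE, 0xD1, 0x81, 0xD1, 0x81, 0xD0, 0xB8, 0xD1, 0x8F)));
echo 'utf-8: ', $s, "<br />\n";
// Convert each non-ASCII character to an HTML entity (requires mbstring).
echo 'htmlentities: ', mb_convert_encoding($s, 'HTML-ENTITIES', 'UTF-8');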
?>
</body>
</html>
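If you want to run the same conversion on the actual search term rather than a hard-coded byte list, something along these lines might work. It is only a sketch: the helper name to_numeric_entities() is made up for this example, and it assumes the input really is valid UTF-8. It uses mb_encode_numericentity() instead of the 'HTML-ENTITIES' pseudo-encoding, which is deprecated as of PHP 8.2, and it escapes &, < and > first so the result can be echoed straight into a page.

```php
<?php
// Hypothetical helper (not from the thread): escape HTML metacharacters,
// then turn every non-ASCII character into a decimal numeric entity.
function to_numeric_entities(string $utf8): string
{
    // Escape &, <, > and quotes first so the result is safe to echo into HTML.
    $escaped = htmlspecialchars($utf8, ENT_QUOTES, 'UTF-8');

    // Map format: [first code point, last code point, offset, mask].
    // 0x80..0x10FFFF covers every character outside plain ASCII.
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];

    return mb_encode_numericentity($escaped, $map, 'UTF-8');
}

// In a real page PHP has already URL-decoded $_GET['search'];
// rawurldecode() is used here only to reproduce the sample from the question.
$search = rawurldecode('%D0%A0%D0%BE%D1%81%D1%81%D0%B8%D1%8F');

echo to_numeric_entities($search), "\n";
```

In the search script itself you would pass $_GET['search'] to the helper directly. ASCII characters pass through unchanged apart from the HTML escaping, so mixed Latin and Cyrillic input should be fine.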