I need to loop through an XML file and remove curly quotes, em dashes, and other invalid characters from our XML. Where can I find a list of these characters and their hex codes, so I can replace them before the XML is fed to the outside world?
Any advice would be great.
Thanks
Dealing with curly quotes and em dashes
http://www.lookuptables.com/
a handy resource for me that has been.
also try googling "ASCII character map" you should.
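For the specific characters asked about (curly quotes, dashes and friends), here is a rough sketch of a lookup table. It assumes the file is saved as Windows-1252, which is where bytes such as 145-151 usually come from. The array name and the describe_byte() helper are made up for illustration, and the list is a subset rather than a complete table.

```php
<?php
// Windows-1252 byte => Unicode code point for punctuation that word
// processors commonly insert. Illustrative subset, not a complete table.
$cp1252_punctuation = array(
    0x80 => 0x20AC, // euro sign
    0x85 => 0x2026, // horizontal ellipsis
    0x91 => 0x2018, // left single quotation mark
    0x92 => 0x2019, // right single quotation mark
    0x93 => 0x201C, // left double quotation mark
    0x94 => 0x201D, // right double quotation mark
    0x95 => 0x2022, // bullet
    0x96 => 0x2013, // en dash
    0x97 => 0x2014, // em dash
    0x99 => 0x2122, // trade mark sign
);

// Hypothetical helper: format one row as decimal byte, hex byte,
// Unicode code point, and the XML numeric character reference.
function describe_byte($byte, $codePoint) {
    return sprintf('dec %3d  hex 0x%02X  U+%04X  &#%d;', $byte, $byte, $codePoint, $codePoint);
}

foreach ($cp1252_punctuation as $byte => $codePoint) {
    echo describe_byte($byte, $codePoint), "\n";
}
```

The last column is the form that is safe to put in XML, since numeric character references work without any DTD.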
This seems to have done it for me:
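<?php
// Set the header
header('Content-type: text/xml');

// Get the requested file name. basename() strips any directory part so a
// request such as ?file=../../secret cannot escape $rootPath.
$file = isset($_GET['file']) ? basename($_GET['file']) : '';
$rootPath = '/home/somesite/www/';

// If the file is not set or does not exist, fall back to a default file
$xmlfile = ($file !== '' && is_file($rootPath . $file)) ? $file : 'rss.xml';
$xml = file_get_contents($rootPath . $xmlfile);
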
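// Replace Windows-1252 bytes with numeric character references.
// Numeric references are used because XML only predefines &amp;, &lt;,
// &gt;, &quot; and &apos;; named HTML entities such as &ndash; are not
// valid in XML without a DTD.
// Assumes $string is single-byte Windows-1252. Running this on UTF-8 input
// would corrupt multibyte sequences.
function xml_specialchars($string) {
    $search = array(chr(145),   // left single quote
                    chr(146),   // right single quote
                    chr(147),   // left double quote
                    chr(148),   // right double quote
                    chr(151),   // em dash
                    chr(169),   // copyright sign
                    chr(174),   // registered sign
                    chr(150),   // en dash
                    chr(153),   // trade mark sign
                    chr(149),   // bullet
                    chr(183),   // middle dot
                    //chr(38),  // ampersand
                    chr(162),   // cent sign
                    chr(163),   // pound sign
                    chr(165),   // yen sign (byte 165 is yen, not euro)
                    chr(128));  // euro sign
    /* Plain ASCII alternative for the first five entries:
    $replace = array("'", "'", '"', '"', '-'); */
    $replace = array('&#8216;',
                     '&#8217;',
                     '&#8220;',
                     '&#8221;',
                     '&#8212;',
                     '&#169;',
                     '&#174;',
                     '&#8211;',
                     '&#8482;',
                     '&#8226;',
                     '&#183;',
                     // '&amp;',
                     '&#162;',
                     '&#163;',
                     '&#165;',
                     '&#8364;');
    return str_replace($search, $replace, $string);
}
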
$clean_xml = xml_specialchars($xml);
echo $clean_xml;
?>
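
If maintaining the list by hand gets tedious, one possible variation is to convert every non-ASCII byte instead of a fixed set. The sketch below is not part of the script above: cp1252_to_numeric_refs() is a hypothetical name, it assumes the input really is Windows-1252, and it assumes PHP's iconv extension is available and knows the 'Windows-1252' and 'UCS-4BE' encoding names (true for glibc builds, not guaranteed elsewhere).

```php
<?php
// Hypothetical helper (not from the script above): turn every non-ASCII
// Windows-1252 byte into an XML numeric character reference.
function cp1252_to_numeric_refs($string) {
    return preg_replace_callback('/[\x80-\xFF]/', function ($m) {
        // Convert the single byte to a 4-byte big-endian code point.
        // @ hides the notice iconv raises for bytes with no mapping.
        $ucs4 = @iconv('Windows-1252', 'UCS-4BE', $m[0]);
        if ($ucs4 === false || strlen($ucs4) !== 4) {
            return ''; // byte is undefined in Windows-1252, so drop it
        }
        $parts = unpack('N', $ucs4); // 'N' = unsigned 32-bit big-endian
        return '&#' . $parts[1] . ';';
    }, $string);
}

// Example input: bytes 0x93/0x94 are curly double quotes, 0x97 an em dash.
echo cp1252_to_numeric_refs("\x93quoted\x94 \x97 done"), "\n";
```

Plain ASCII passes through untouched, so &, < and > still need their own escaping if they can appear in text content.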