I am running into problems with special characters. I know XML data can't contain raw ampersands or less-than signs. I had problems when I just ran the data through htmlentities(). My copyright symbol would get written out as &copy;, but then I get PHP errors when I try to read the file back in. Running the data through htmlentities() twice seems to work well. Then I end up with &amp;copy; which I think gets turned into &copy; when the XML is read into the SimpleXML object.
Am I doing this right, or is there a better way? It even seems to work right if I put in an ampersand as input, surprisingly.
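For comparison, here is a minimal sketch of an alternative to the double htmlentities() call. XML predefines only five entities (amp, lt, gt, quot, apos), so a single pass of htmlentities() leaves names like &copy; that the parser does not know. The sketch assumes PHP 7 or later, UTF-8 data, and the SimpleXML extension; the helper name xml_escape and the sample string are made up for illustration.

```php
<?php
// Hypothetical helper: escapes only the characters XML itself reserves
// and leaves everything else (including the copyright sign) as UTF-8.
function xml_escape(string $text): string
{
    // ENT_XML1 restricts the output to XML's predefined entities;
    // ENT_QUOTES additionally escapes single and double quotes.
    return htmlspecialchars($text, ENT_XML1 | ENT_QUOTES, 'UTF-8');
}

// Sample input (invented): ampersand, copyright sign, angle brackets.
$input = "Tom & Jerry \u{00A9} 2009 <draft>";

// The document declares UTF-8, so the raw copyright sign is legal as-is.
$xml = '<?xml version="1.0" encoding="UTF-8"?>'
     . '<note>' . xml_escape($input) . '</note>';

// Read it back; the parser undoes the escaping exactly once.
$doc = simplexml_load_string($xml);
$roundTrip = (string) $doc;

echo $roundTrip, "\n";
```

With this approach the value read back from SimpleXML should already be the original text, so htmlentities() would only be needed at the point where the text is printed into an HTML page.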
I also came across this function in the php.net comments:
Code:
function htmlnumericentities($str){
    return preg_replace('/[^!-%\x27-;=?-~ ]/e', '"&#".ord("$0").chr(59)', $str);
}

But when I load that up in my edit page, I end up with a funny character (Â) just before the copyright symbol. If I manually edit the XML file and change it to &#169; then it works OK. Man, this is confusing! Maybe I should just put everything into CDATA, but that makes for an ugly file when you have to hand-edit it.
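The stray Â is consistent with the input being UTF-8: the copyright sign is stored as two bytes (0xC2 0xA9), the pattern matches one byte at a time, and ord() then emits &#194; (which is Â) followed by &#169;. Below is a rough sketch of a UTF-8-aware variant, assuming valid UTF-8 input and PHP 7 or later. It uses preg_replace_callback() because the /e modifier was removed in PHP 7. The names utf8_codepoint and htmlnumericentities_utf8 are hypothetical.

```php
<?php
// Hypothetical helper: Unicode code point of one UTF-8 encoded character.
// Where the mbstring extension is available, mb_ord() does the same job.
function utf8_codepoint(string $ch): int
{
    $bytes = array_values(unpack('C*', $ch));
    $n = count($bytes);
    if ($n === 1) {
        return $bytes[0]; // plain ASCII
    }
    // Strip the length-marker bits from the lead byte, then append
    // the low six bits of every continuation byte.
    $cp = $bytes[0] & (0xFF >> ($n + 1));
    for ($i = 1; $i < $n; $i++) {
        $cp = ($cp << 6) | ($bytes[$i] & 0x3F);
    }
    return $cp;
}

// Hypothetical UTF-8-aware variant of the php.net function.
function htmlnumericentities_utf8(string $str): string
{
    // The /u flag makes the pattern match whole characters, not bytes.
    $out = preg_replace_callback(
        '/[^!-%\x27-;=?-~ ]/u',
        function (array $m): string {
            return '&#' . utf8_codepoint($m[0]) . ';';
        },
        $str
    );
    if ($out === null) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    return $out;
}

$copyright = "\u{00A9}";

// Byte-wise behaviour of the original pattern, reproduced without /e.
$bytewise = preg_replace_callback(
    '/[^!-%\x27-;=?-~ ]/',
    function (array $m): string {
        return '&#' . ord($m[0]) . ';';
    },
    $copyright
);

echo $bytewise, "\n";                             // &#194;&#169;
echo htmlnumericentities_utf8($copyright), "\n";  // &#169;
```

Numeric references such as &#169; need no entity declaration, which would explain why the hand-edited file loads without errors.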
Thanks for any help.