I am trying to generate an RSS feed, and I use a feed from nytimes as the source. The feed has a few characters, such as â, that make the XML document invalid, and I do not know how to clean them. I thought that if I created a DOM text node the value would automatically be altered to make it valid, but that did not happen. How can I fix this?
<item>
<title>Lisbon Journal: A Song Form Is Updated, but Not in the Alleys of Its Originhttp://www.nytimes.com/2007/02/21/world/europe/21portugal.html?ex=1329714000&en=09de80043e8dc8cc&ei=5088&partner=rssnyt&emc=rss</title>
<link>http://www.nytimes.com/2007/02/21/world/europe/21portugal.html?ex=1329714000</link>
<description>The traditional music known as fado, which means fate, has been reinvented to become Portugalâ</description>
</item>
acirc is a named entity that has to be declared before it can be used in an XML document.
XHTML is implicitly attached to a DTD that imports the declaration of acirc.
But RSS is not XHTML, and your RSS parser does not import such a DTD and/or set of entities. You might change the XML document so that it imports the entities.
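One way to make the document bring in the entity can be sketched as follows: declare acirc in an internal DTD subset and let DOMDocument substitute it while parsing. This is only a minimal sketch. The sample feed string, the variable names and the use of LIBXML_NOENT are assumptions for illustration, not taken from the original feed code.

```php
<?php
// Hypothetical sample document: the internal DTD subset declares acirc
// so the parser knows what &acirc; stands for (U+00E2, decimal 226).
$xml = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE rss [
  <!ENTITY acirc "&#226;">
]>
<rss version="2.0"><channel><item>
<description>Portugal&acirc;</description>
</item></channel></rss>
XML;

$doc = new DOMDocument();

// LIBXML_NOENT asks libxml to replace entity references with their text.
$ok = $doc->loadXML($xml, LIBXML_NOENT);

// Read back the description with the entity resolved to a real character.
$description = $doc->getElementsByTagName('description')->item(0)->textContent;

echo $description, "\n";
```

Because LIBXML_NOENT turns on entity substitution in general, it is probably safest to use it only on documents you assemble yourself rather than on untrusted input.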
Wasn't he saying that HTML entities are not supported in XML and that we have to use numeric entities instead? When I googled, I found that numeric entities are written as character codes, for example in hex. feyd, are you saying HTML entities should work? I have not tried that yet.
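For the numeric-entity route, a rough sketch of the conversion might look like this. The function name named_to_numeric_entities is hypothetical, and the code assumes the mbstring extension is available (mb_ord and mb_str_split need PHP 7.4 or later). The five entities that XML predefines are left alone, and names PHP does not recognise are passed through unchanged.

```php
<?php
// Hypothetical helper: rewrite named HTML entities (e.g. &acirc;) as
// numeric character references (e.g. &#226;), which need no DTD in XML.
function named_to_numeric_entities(string $text): string
{
    // These five are predefined by XML itself and must stay as they are.
    $xmlPredefined = ['amp', 'lt', 'gt', 'quot', 'apos'];

    $result = preg_replace_callback(
        '/&([A-Za-z][A-Za-z0-9]*);/',
        function (array $m) use ($xmlPredefined): string {
            if (in_array($m[1], $xmlPredefined, true)) {
                return $m[0];
            }
            // Decode the single entity to its UTF-8 character(s).
            $decoded = html_entity_decode($m[0], ENT_QUOTES | ENT_HTML5, 'UTF-8');
            if ($decoded === $m[0]) {
                return $m[0]; // unknown name: leave untouched
            }
            // Emit one decimal reference per code point.
            $out = '';
            foreach (mb_str_split($decoded, 1, 'UTF-8') as $char) {
                $out .= '&#' . mb_ord($char, 'UTF-8') . ';';
            }
            return $out;
        },
        $text
    );

    return $result ?? $text; // fall back to the input if the regex fails
}

echo named_to_numeric_entities('Portugal&acirc; &amp; fado'), "\n";
```

The result can then be placed in the feed as markup. Note that any unknown entity name left in the text would still make the XML invalid, so this is a partial cleanup rather than a guarantee.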
Just a thought: why not simply convert the offending tag's contents from a text node to a CDATA node? It's intended for XHTML rendering after all, isn't it?
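The CDATA idea could be sketched roughly like this with the DOM extension. The element names follow the item shown earlier in the thread; the variable names and the shortened description text are made up for illustration.

```php
<?php
// Hypothetical description text that still contains the named entity.
$text = 'Portugal&acirc;';

$doc = new DOMDocument('1.0', 'UTF-8');
$item = $doc->createElement('item');
$description = $doc->createElement('description');

// Inside a CDATA section the parser does not interpret "&acirc;" at all,
// so the document stays well-formed without declaring the entity.
$description->appendChild($doc->createCDATASection($text));

$item->appendChild($description);
$doc->appendChild($item);

$xml = $doc->saveXML();
echo $xml;
```

The trade-off is that an XML parser will hand back the literal characters &acirc; rather than â, so this only helps if whatever consumes the feed renders the description as (X)HTML.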