Obviously when outputting anything to the browser in HTML format, it should be properly escaped so that the page displays/validates correctly.
When outputting HTML I use the following function:
function html_clean($var) {
    return htmlspecialchars($var, ENT_COMPAT, 'UTF-8');
}

However, for our intranet I've finally got round to building some RSS feeds of various data.
What characters do I need to escape for XML?
Obviously the HTML special characters (& < > " and ') have to be escaped for XML as well.
And I'm under the impression that apostrophe is optional for escaping in XHTML. We make sure that all our tags/attributes use " rather than ' anyway.
But do I need to convert other characters, e.g. non-English characters, to numeric entities for XML (RSS) output?
Or can I just leave them as native UTF-8 as they come out of our database?
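
For reference, this is roughly what I have in mind for the XML side. It is only a sketch: the name xml_clean is made up, it assumes PHP 5.4+ (for the ENT_XML1 flag) and that the strings coming out of the database are already valid UTF-8. As far as I understand it, htmlspecialchars only touches the five special characters and leaves multi-byte UTF-8 characters alone.

```php
<?php
// Hypothetical XML counterpart to html_clean(); the name xml_clean is an assumption.
// Assumes PHP 5.4+ (ENT_XML1) and that $var is already valid UTF-8.
function xml_clean($var) {
    // ENT_QUOTES escapes both " and '; ENT_XML1 emits &apos; for the apostrophe.
    return htmlspecialchars($var, ENT_QUOTES | ENT_XML1, 'UTF-8');
}

// Example: escaping a feed item title; the accented character is left as native UTF-8.
echo '<title>' . xml_clean('Fish & Chips <café>') . '</title>';
```

This would presumably only be safe if the feed itself is served as UTF-8, e.g. with <?xml version="1.0" encoding="UTF-8"?> at the top.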
Cheers, B