
XML Parsing - Tag contents are split at apostrophes

Posted: Wed Aug 15, 2007 8:50 am
by IMJackson
Hi,

I'm currently trying to write some code to parse an RSS news feed, using the standard XML Parser functions (the site I am working on must run on a PHP4 server, hence no SimpleXML, unfortunately).

I have got the parser and all its handlers essentially working. However, I have found that when the contents of a tag are passed to the character data handler, if there is an apostrophe then the contents will be split up, resulting in multiple calls to the handler. For example:

Code:

<title>House plans &apos;will hit green belt&apos;</title>

If I collect the individual pieces of data passed to the character data handler for the above snippet of XML in an array, the result is:

Code:

[0] => House plans
[1] => '
[2] => will hit green belt
[3] => '

This also appears to occur with other special characters, such as quotes (") or pound signs (£). I want the entire contents of the tag to be passed to the handler in one go, not piece-by-piece as above. I've tried searching on Google but I can't find any explanation for this behaviour. Could someone please help?

Posted: Wed Aug 15, 2007 8:53 am
by feyd
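Entities, such as &apos; in your example, are separate pieces of data as far as the parser is concerned. They sometimes need to be handled separately and specifically, so each one is passed to the character data handler in its own call. That is why the contents of your tag arrive split up rather than as a single string.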
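
One common way to cope with this is to treat each call to the character data handler as a fragment and join the fragments yourself, reading the result when the closing tag arrives. The following is only a minimal sketch of that idea, not necessarily how your handlers are structured: the function names (start_element, character_data, end_element), the $buffer and $titles variables and the sample <item> wrapper are all made up for illustration. It uses plain functions and globals so that it should also run on PHP4.

```php
<?php
// Sketch: buffer character data per element, because the parser may deliver
// the text of one element in several pieces (for example around entities).
// All function and variable names here are hypothetical.

$buffer = '';       // text collected since the most recent opening tag
$titles = array();  // completed <title> contents

function start_element($parser, $name, $attribs)
{
    global $buffer;
    $buffer = '';   // start collecting afresh at each opening tag
}

function character_data($parser, $data)
{
    global $buffer;
    $buffer .= $data;   // may be called several times for one element
}

function end_element($parser, $name)
{
    global $buffer, $titles;
    // Element names arrive upper-cased because case folding is on by default.
    if ($name == 'TITLE') {
        $titles[] = $buffer;
    }
    $buffer = '';
}

$xml = '<item><title>House plans &apos;will hit green belt&apos;</title></item>';

$parser = xml_parser_create();
xml_set_element_handler($parser, 'start_element', 'end_element');
xml_set_character_data_handler($parser, 'character_data');

if (!xml_parse($parser, $xml, true)) {
    die(xml_error_string(xml_get_error_code($parser)));
}

print_r($titles);
```

With this approach it no longer matters how many pieces the parser splits the text into; $titles should end up holding the whole title as a single string. Note that resetting the buffer at every opening tag assumes the elements you care about contain only text, which is normally true for an RSS <title>.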