I am having a problem with the "&amp;" character entity when parsing an XML document with PHP.
I want the script to parse a document for two items, so they can be loaded into a MySQL database.
The problem is that the document title is the name of a lawsuit, so it may (or may not) include the &amp; character entity. When the parser encounters this entity, my character data handler receives the title in several pieces instead of as one string. There may be an obvious solution, as I am not that familiar with either character encoding or XML parsing.
Here is the code (I left out the startTag and endTag functions, but they really don't do anything):
Code: Select all
function fetchData($file) {
    // function to find needed XML data for index update
    echo "<br /><font color='salmon'>Locating case data for $file</font>";
    $xmlparser = xml_parser_create();
    xml_set_element_handler($xmlparser, "startTag", "endTag");
    xml_set_character_data_handler($xmlparser, "getcontents");
    xml_parser_set_option($xmlparser, XML_OPTION_CASE_FOLDING, false);
    // $CURRENT must be the global that getcontents() reads, not a local
    global $casedata, $CURRENT;
    $CURRENT = "";
    if (!($fp = fopen($file, "r"))) {
        die("failed to open $file");
    } else {
        while ($data = fread($fp, 4096)) {
            // strip whitespace between tags; the character class was garbled in
            // the post, so whitespace is assumed here. preg_replace() stands in
            // for eregi_replace(), which was removed in PHP 7.
            $data = preg_replace('/>\s+</', '><', $data);
            if (!xml_parse($xmlparser, $data, feof($fp))) {
                // keep both the message and the line number instead of overwriting one
                $reason = xml_error_string(xml_get_error_code($xmlparser));
                $line = xml_get_current_line_number($xmlparser);
                die("XML error: $reason at line $line");
            }
        }
        fclose($fp);
    }
    xml_parser_free($xmlparser);
    return ($casedata);
}// End function
function getcontents($parser, $data) {
    global $CURRENT;
    switch ($CURRENT) {
        case "id":
            echo "<br />id is: ($data)";
            break;
        case "short":
            echo "<br />name is: ($data)<br />";
            break;
    }
}// End function
The result I get is something like:
Code: Select all
Locating case data for opinions/1978/F/006/1978-F006-04070001.xml
id is: (1978-F006-04070001)
name is: (A )
name is: (&)
name is: ( M Records, Inc. v. M.V.C. Distrib. Corp.)
The XML:
Code: Select all
<short>A &amp; M Records, Inc. v. M.V.C. Distrib. Corp.</short>
feyd | Please use code and [syntax="..."] tags where appropriate when posting code. Your post has been edited to reflect how we'd like it posted. Please read: [url=http://forums.devnetwork.net/viewtopic.php?t=21171]Posting Code in the Forums[/url] to learn how to do it too.
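The output above is consistent with how PHP's XML parser delivers text: the character data handler may be called several times for a single element, and an entity reference such as &amp; is a typical place for the text to be split. A minimal sketch of one common workaround, assuming the title can simply be buffered until the closing tag arrives, is shown here. The function name parseShortTitles(), the closures, and the `<case>` wrapper element are hypothetical and not part of the original script.

```php
<?php
// Sketch only: parseShortTitles() and the <case> wrapper are hypothetical
// names used for illustration; they do not come from the original post.
function parseShortTitles(string $xml): array
{
    $current = '';  // name of the element currently open
    $buffer  = '';  // text collected so far for that element
    $found   = [];  // element name => complete text

    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false);

    xml_set_element_handler(
        $parser,
        // Start tag: remember the element and begin with an empty buffer.
        function ($p, $name, $attrs) use (&$current, &$buffer) {
            $current = $name;
            $buffer  = '';
        },
        // End tag: only now is the text known to be complete.
        function ($p, $name) use (&$current, &$buffer, &$found) {
            if ($name === 'id' || $name === 'short') {
                $found[$name] = $buffer;
            }
            $current = '';
        }
    );

    // Character data: append each chunk instead of treating it as the whole value.
    xml_set_character_data_handler(
        $parser,
        function ($p, $data) use (&$current, &$buffer) {
            if ($current !== '') {
                $buffer .= $data;
            }
        }
    );

    if (!xml_parse($parser, $xml, true)) {
        throw new RuntimeException(sprintf(
            'XML error: %s at line %d',
            xml_error_string(xml_get_error_code($parser)),
            xml_get_current_line_number($parser)
        ));
    }

    return $found;
}

$xml = '<case><id>1978-F006-04070001</id>'
     . '<short>A &amp; M Records, Inc. v. M.V.C. Distrib. Corp.</short></case>';
$result = parseShortTitles($xml);
echo $result['short'], "\n"; // prints: A & M Records, Inc. v. M.V.C. Distrib. Corp.
```

The same idea should carry over to the global-variable style used in the post: append to a string in getcontents() and do the echo (or the database insert) in endTag instead.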