XML problem

Posted: Tue Mar 06, 2007 11:44 am
by pedrokas
Hi,

The goal of my script is to load the contents of an XML file into a MySQL table.
The script reads the file and recognizes the tags, but when I populate the variables the strings come out empty!

I don't think it is necessary to post all the code; the start and end element handlers are secondary (I think :? )

Code: Select all

(...)
function charData($parser, $data)
{
global $dentro, $title, $sint, $tail;
 if($dentro)
 {
   switch($tag)
  {
     case "HEADLINE":
       $title .= htmlspecialchars($data)
     case "LEADPAR":
       $sint .= htmlspecialchars($data)
     case "TAILPAR":
       $tail .= htmlspecialchars($data)
   }
 }
}
The variables are empty, all of them!
Now I know the reason, but I don't know how to solve it.
The tags in the XML file contain other, nested tags, and these are what cause the strings to be empty:
<paragraph display="proporcional" truncation="none">
How can I get the text inside the outer tag without this annoying nested tag getting in the way?

Thanks

Posted: Tue Mar 06, 2007 12:05 pm
by Begby
A. Where does $tag come from? It doesn't look like your switch is going to work.

B. You don't have any 'break;' statements in your switch. Without them, execution falls through from the matching case into every case below it, down to the last one.
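
To illustrate point B, here is a minimal sketch of the fall-through behaviour. The function names collectWithoutBreak and collectWithBreak are made up for this example and are not part of the posted script; the case labels are taken from the snippet above.

```php
<?php
// Hypothetical helpers, for demonstration only.
// Each returns array($title, $lead, $tail) after handling one chunk of text.

function collectWithoutBreak($tag, $data)
{
    $title = $lead = $tail = '';
    switch ($tag) {
        case 'HEADLINE':
            $title .= $data; // no break: execution continues into the next case
        case 'LEADPAR':
            $lead .= $data;  // no break here either
        case 'TAILPAR':
            $tail .= $data;
    }
    return array($title, $lead, $tail);
}

function collectWithBreak($tag, $data)
{
    $title = $lead = $tail = '';
    switch ($tag) {
        case 'HEADLINE':
            $title .= $data;
            break;           // stop here, later cases are skipped
        case 'LEADPAR':
            $lead .= $data;
            break;
        case 'TAILPAR':
            $tail .= $data;
            break;
    }
    return array($title, $lead, $tail);
}

var_dump(collectWithoutBreak('HEADLINE', 'x')); // all three buffers receive 'x'
var_dump(collectWithBreak('HEADLINE', 'x'));    // only the title buffer receives 'x'
```

So with a HEADLINE tag and no breaks, the same text would end up in the title, the lead and the tail.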

Posted: Tue Mar 06, 2007 12:28 pm
by pedrokas
Begby wrote:A. Where does $tag come from? It doesn't look like your switch is going to work.

B. You don't have any 'break;' statements in your switch. Without them, execution falls through from the matching case into every case below it, down to the last one.
Sorry, I made an error when copying the script...

The full script:

Code: Select all

<?php

$file = "myfile.xml";
// variable setup
$dentro= false;
$tag ="";
$title = "";
$lead ="";
$texto ="";

function startElement($parser, $tagName, $attrs)
	{
		global $dentro, $tag;
		if($dentro)
		{
			$tag = $tagName;
		}
		elseif ($tagName == "ARTICLE")
		{
			$dentro= true;
		}
	}
	
function characterData($parser, $data)
	{
		global $dentro, $tag, $title, $lead, $texto, $string;
		if($dentro)
		{
$string .= $tag."|".$data;

	
			switch($tag)
			{
				case "HEADLINE":
					$title .= htmlspecialchars($data);
					break;
				case "LEADPARAGRAPH":
					$lead .= htmlspecialchars(trim($data));
					break;
				case "TAILPARAGRAPHS":
					$texto .= htmlspecialchars(trim($data));
					break;
			}
		}
	}

function endElement($parser, $tagName)
	{
		global $dentro, $tag, $title, $lead, $texto, $string;
		if ($tagName == "ARTICLE")
		{
			// the values were already escaped in characterData(), so they are not escaped again here
			printf("<p><b>%s</b></p>%s", trim($title), trim($texto));
			printf("<p>%s</p>", trim($lead));

		        $title="";
			$lead="";
			$texto="";
			$dentro= false;
		}		
	}
	
$xml_parser = xml_parser_create();
xml_set_element_handler($xml_parser, "startElement", "endElement");

xml_set_character_data_handler($xml_parser, "characterData");

$fp = fopen($file,"r") or die ("error reading RSS");
 
while ($data = fread($fp, 4096))
xml_parse($xml_parser, $data, feof($fp)) or die (sprintf("XML error: %s at line %d", xml_error_string(xml_get_error_code($xml_parser)), xml_get_current_line_number($xml_parser)));

fclose($fp);
xml_parser_free($xml_parser);



?>

I think it is complete now, sorry!
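
One possible way around the nested-tag problem with the SAX functions is to keep a stack of the open elements instead of a single $tag, and to file each chunk of text under the nearest enclosing element of interest. The sketch below is only an assumption of how that could look: the sample XML is made up (only the element names and the paragraph tag come from this thread), and closures are used in place of the global variables of the posted script.

```php
<?php
// Made-up sample data; only the element names come from the posted script.
$xml = '<article>'
     . '<headline>Some title</headline>'
     . '<leadparagraph><paragraph display="proporcional" truncation="none">Lead text</paragraph></leadparagraph>'
     . '<tailparagraphs><paragraph>Tail one. </paragraph><paragraph>Tail two.</paragraph></tailparagraphs>'
     . '</article>';

$stack = array(); // names of the currently open elements, outermost first
// Buffers keyed by upper-case name, because the parser upper-cases names by default.
$fields = array('HEADLINE' => '', 'LEADPARAGRAPH' => '', 'TAILPARAGRAPHS' => '');

$start = function ($parser, $name, $attrs) use (&$stack) {
    $stack[] = $name; // remember every element that is opened
};
$end = function ($parser, $name) use (&$stack) {
    array_pop($stack); // and forget it again when it is closed
};
$text = function ($parser, $data) use (&$stack, &$fields) {
    // Walk from the innermost open element outwards and append the text
    // to the first element that has a buffer, skipping tags like PARAGRAPH.
    for ($i = count($stack) - 1; $i >= 0; $i--) {
        if (isset($fields[$stack[$i]])) {
            $fields[$stack[$i]] .= $data;
            return;
        }
    }
};

$parser = xml_parser_create();
xml_set_element_handler($parser, $start, $end);
xml_set_character_data_handler($parser, $text);
if (!xml_parse($parser, $xml, true)) {
    die(sprintf('XML error: %s at line %d',
        xml_error_string(xml_get_error_code($parser)),
        xml_get_current_line_number($parser)));
}

foreach ($fields as $name => $value) {
    echo $name, ': ', trim($value), "\n";
}
```

Text inside a nested paragraph tag is then collected under LEADPARAGRAPH or TAILPARAGRAPHS rather than being dropped.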

Posted: Tue Mar 06, 2007 10:19 pm
by volka
pedrokas wrote:case "HEADLINE":
$title .= htmlspecialchars($data);
break;
case "LEADPARAGRAPH":
$lead .= htmlspecialchars(trim($data));
break;
case "TAILPARAGRAPHS":
$texto .= htmlspecialchars(trim($data));
break;
[...]
$fp = fopen($file,"r") or die ("error reading RSS");
Doesn't look like RSS to me.
Is there a good reason for using a SAX parser like xml_parser_create? Nothing against SAX, but a tree-based API such as DOM is often easier,
e.g. PHP 5's SimpleXML:

Code: Select all

$doc = simplexml_load_file('http://www.nytimes.com/services/xml/rss/nyt/Europe.xml');

foreach($doc->channel->item as $item) {
	echo 'title: ', $item->title, "<br />\n";
}
That's all it takes to print the titles of the current items in the New York Times (Europe) RSS feed.
Are you restricted to the functions described at http://de2.php.net/xml ?
Which versions of PHP can you use, and which extensions?
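
For the article file discussed in this thread, a SimpleXML version might look like the sketch below. It rests on assumptions: the sample XML, the articles root element and the lower-case element names are invented for illustration (SimpleXML is case-sensitive, so the names must match the real file), and nestedText is a hypothetical helper. It uses dom_import_simplexml, so the DOM extension is needed as well.

```php
<?php
// Made-up sample; the real file's element names and case may differ.
$xml = '<articles><article>'
     . '<headline>Some title</headline>'
     . '<leadparagraph><paragraph display="proporcional" truncation="none">Lead text</paragraph></leadparagraph>'
     . '<tailparagraphs><paragraph>Tail one. </paragraph><paragraph>Tail two.</paragraph></tailparagraphs>'
     . '</article></articles>';

// Hypothetical helper: all text below an element, nested tags included.
// Casting the element to string would only return its direct text.
function nestedText($element)
{
    return trim(dom_import_simplexml($element)->textContent);
}

$doc = simplexml_load_string($xml);

$rows = array();
foreach ($doc->article as $article) {
    $rows[] = array(
        'title' => nestedText($article->headline),
        'lead'  => nestedText($article->leadparagraph),
        'tail'  => nestedText($article->tailparagraphs),
    );
}

print_r($rows); // one row per article, ready to be inserted into a table
```

Each entry of $rows could then be written to MySQL; the insert itself is left out here.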