
Problem parsing RSS feed

Posted: Mon Sep 08, 2008 2:55 pm
by cooperwd
Hi there!

I'm having trouble parsing an RSS file. Right now, my code is returning every item in the feed. I need it to stop after three posts are parsed.

I used this as a starting point: http://www.sitepoint.com/article/php-xm ... g-rss-1-0/

Here's my RSS parser class:

Code:

 
<?php
class RSSParser {
    var $insideitem = false;
    var $tag = "";
    var $title = "";
    var $description = "";
    var $link = "";
    
    function startElement($parser, $tagName, $attrs) {
        if ($this->insideitem) {
            $this->tag = $tagName;
        } elseif ($tagName == "ITEM") {
            $this->insideitem = true;
        }
    }
 
    function endElement($parser, $tagName) {
        if ($tagName == "ITEM") {
            printf("<a class='postheading' href='%s'>%s</a>", trim($this->link), htmlspecialchars(trim($this->title)));
            printf("<a class='postsummary' href='%s'>%s</a>", trim($this->link), htmlspecialchars(trim($this->description)));
            $this->title = "";
            $this->description = "";
            $this->link = "";
            $this->insideitem = false;
        }
    }
 
    function characterData($parser, $data) {
        if ($this->insideitem) {
            switch ($this->tag) {
                case "TITLE":
                    $this->title .= $data;
                    break;
                case "DESCRIPTION":
                    $this->description .= $data;
                    break;
                case "LINK":
                    $this->link .= $data;
                    break;
            }
        }
    }
}
?>
 
And here's the code that's using the class:

Code:

 
 
$xml_parser = xml_parser_create();
$rss_parser = new RSSParser();
// Pass the object directly; call-time pass-by-reference (&$rss_parser)
// is unnecessary here and is a fatal error in PHP 5.4 and later.
xml_set_object($xml_parser, $rss_parser);
xml_set_element_handler($xml_parser, "startElement", "endElement");
xml_set_character_data_handler($xml_parser, "characterData");
$fp = fopen("http://www.mysite.com/my-rss-feed", "r")
    or die("Error reading RSS data.");
while ($data = fread($fp, 4096)) {
    xml_parse($xml_parser, $data, feof($fp))
        or die(sprintf("XML error: %s at line %d",
            xml_error_string(xml_get_error_code($xml_parser)),
            xml_get_current_line_number($xml_parser)));
}
fclose($fp);
xml_parser_free($xml_parser);
 
 
 


Thanks for any advice you can provide! All the best,
Dave

Re: Problem parsing RSS feed

Posted: Mon Sep 08, 2008 2:58 pm
by marcth
I'm not sure if I understand what you are trying to do. Wouldn't something like this work?

Code:

 
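$recordsParsed = 0;
// Parenthesize the assignment: && binds tighter than =, so without the
// parentheses $data would receive a boolean instead of the chunk just read.
while (($data = fread($fp, 4096)) && $recordsParsed < 3) {
  // ....
  $recordsParsed++;
}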
 

Re: Problem parsing RSS feed

Posted: Mon Sep 08, 2008 3:20 pm
by cooperwd
Thanks for the quick reply, Marc!

I tried that approach (seems to make sense), but got some weird results. When I set it to <3, the while loop iterates fine but nothing gets returned (not even an error). When I set it to <4 or anything higher, I get "XML error: Empty document at line 1".

It seems the while loop isn't just iterating through posts -- it's climbing through the whole XML file, one 4096-byte chunk at a time (as it should). So I need to figure out how to capture which post I'm on (not just which chunk of the XML file I'm on). Or something like that...
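
One way to capture which post the parser is on is to count items inside the end-element handler, since that handler fires exactly once per closing item tag no matter how the file is chunked. The following is only a sketch under some assumptions: LimitedRSSParser, $maxItems, $items and done() are made-up names, the feed is an in-memory string instead of the real URL, and handlers are registered as callable arrays rather than through xml_set_object(). Items are collected into an array instead of being printed, so the caller decides how to render them.

```php
<?php
// Sketch: stop after a fixed number of <item> elements.
// All names below are illustrative, not part of the original class.
class LimitedRSSParser {
    public $items = array();   // collected items, in feed order
    private $maxItems;
    private $insideItem = false;
    private $tag = "";
    private $current = array("TITLE" => "", "DESCRIPTION" => "", "LINK" => "");

    public function __construct($maxItems) {
        $this->maxItems = $maxItems;
    }

    // True once enough items have been collected.
    public function done() {
        return count($this->items) >= $this->maxItems;
    }

    public function startElement($parser, $tagName, $attrs) {
        if ($this->insideItem) {
            $this->tag = $tagName;
        } elseif ($tagName == "ITEM") {
            $this->insideItem = true;
        }
    }

    public function endElement($parser, $tagName) {
        if ($tagName == "ITEM") {
            // Several items can end inside one chunk, so guard here too.
            if (!$this->done()) {
                $this->items[] = array_map('trim', $this->current);
            }
            $this->current = array("TITLE" => "", "DESCRIPTION" => "", "LINK" => "");
            $this->insideItem = false;
        }
        $this->tag = "";   // ignore text between child elements
    }

    public function characterData($parser, $data) {
        if ($this->insideItem && isset($this->current[$this->tag])) {
            $this->current[$this->tag] .= $data;
        }
    }
}

// Build a small five-item feed in memory (stand-in for the real feed).
$feed = '<?xml version="1.0"?><rss version="2.0"><channel><title>Demo</title>';
for ($i = 1; $i <= 5; $i++) {
    $feed .= "<item><title>Post $i</title><link>http://example.com/$i</link>"
           . "<description>Summary $i</description></item>";
}
$feed .= '</channel></rss>';

$rss = new LimitedRSSParser(3);
$xml = xml_parser_create();
xml_set_element_handler($xml, array($rss, "startElement"), array($rss, "endElement"));
xml_set_character_data_handler($xml, array($rss, "characterData"));

// Feed the document in small pieces, as the fread() loop would.
$chunks = str_split($feed, 64);
$last = count($chunks) - 1;
foreach ($chunks as $n => $chunk) {
    if (!xml_parse($xml, $chunk, $n === $last)) {
        die(sprintf("XML error: %s at line %d",
            xml_error_string(xml_get_error_code($xml)),
            xml_get_current_line_number($xml)));
    }
    if ($rss->done()) {
        break;   // enough items: stop reading the rest of the feed
    }
}

foreach ($rss->items as $item) {
    printf("<a class='postheading' href='%s'>%s</a>\n",
        htmlspecialchars($item["LINK"]), htmlspecialchars($item["TITLE"]));
}
```

The same done() check could go inside the original while loop next to the fread() call, so the download stops as soon as the third item has been seen.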

I've tried the PHP.net documentation, but it doesn't say anything beyond "A document may be parsed piece-wise by calling xml_parse() several times with new data, as long as the is_final parameter is set and TRUE when the last data is parsed." And I have no idea how to actually do that using my parser class.
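
The piece-wise parsing described in that quote can be illustrated roughly as follows. This is a minimal sketch, assuming a tiny hard-coded document and closures (PHP 5.3 or later) in place of the parser class; $doc, $names and the 16-byte piece size are arbitrary choices for the example. Every call passes FALSE for is_final except the call that delivers the last piece.

```php
<?php
// Sketch: feed one document to xml_parse() in several pieces.
$doc = '<?xml version="1.0"?><feed><item>a</item><item>b</item></feed>';

$names = array();   // element names seen, in document order
$parser = xml_parser_create();
xml_set_element_handler(
    $parser,
    function ($p, $name, $attrs) use (&$names) { $names[] = $name; },
    function ($p, $name) { /* nothing to do on end tags here */ }
);

$pieces = str_split($doc, 16);
$last = count($pieces) - 1;
foreach ($pieces as $n => $piece) {
    // is_final is true only for the final piece.
    if (!xml_parse($parser, $piece, $n === $last)) {
        die(sprintf("XML error: %s at line %d",
            xml_error_string(xml_get_error_code($parser)),
            xml_get_current_line_number($parser)));
    }
}

echo implode(", ", $names), "\n";
```

In the fread() loop from the first post, feof($fp) plays the role of the `$n === $last` test.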

I'm really stumped!
D