Reading and converting a Diigo RSS feed

marbuy
Forum Newbie
Posts: 4
Joined: Sat May 08, 2010 11:48 am

Reading and converting a Diigo RSS feed

Post by marbuy »

I want to load the contents of an RSS feed and convert it into a web page. In this case, it is the feed for a Diigo bookmarks list.

To do this, I use the DOM model and get the different elements of each entry using getElementsByTagName calls.
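For reference, a minimal sketch of that approach. The feed content, element values, and variable names here are made up for illustration; a real feed would more likely be read from its URL with DOMDocument::load() rather than from an inline string.

```php
<?php
// Hypothetical sample feed, inlined so the example is self-contained.
$rss = <<<'XML'
<rss version="2.0">
  <channel>
    <title>Sample bookmarks</title>
    <item>
      <title>First bookmark</title>
      <link>http://example.com/one</link>
      <description>Plain text description</description>
    </item>
  </channel>
</rss>
XML;

$doc = new DOMDocument();
$doc->loadXML($rss);

// Collect the title and link of every <item> in the feed.
$entries = [];
foreach ($doc->getElementsByTagName('item') as $item) {
    $entries[] = [
        'title' => $item->getElementsByTagName('title')->item(0)->textContent,
        'link'  => $item->getElementsByTagName('link')->item(0)->textContent,
    ];
}

// Escape the values before writing them into a web page.
foreach ($entries as $entry) {
    echo htmlspecialchars($entry['title']), ' - ', htmlspecialchars($entry['link']), "\n";
}
```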

This works fine. However, in this Diigo feed, the <description> tag contains three paragraph sections, delimited by <p> tags. For various reasons, I would like to handle the contents of each paragraph in a different way, so I want to retrieve each of these paragraphs separately. Unfortunately, for unknown reasons, these <p> tags are not present as real tags in the feed. Instead, they are coded in an escaped form (&lt;p&gt;), which apparently means that getElementsByTagName does not work on them.

Any ideas for handling this (in an easy way)?
cpetercarter
Forum Contributor
Posts: 474
Joined: Sat Jul 25, 2009 2:00 am

Re: Reading and converting a Diigo RSS feed

Post by cpetercarter »

Um...decode them? Can you not use html_entity_decode, or even just str_replace, on the relevant bits of the RSS feed?
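A rough sketch of what that could look like, assuming the escaped description is available as a plain string. The sample text and variable names are invented, and the regular expression only copes with flat, non-nested <p> blocks.

```php
<?php
// Hypothetical escaped description, as it might appear in the raw feed text.
$escaped = '&lt;p&gt;First paragraph&lt;/p&gt;'
         . '&lt;p&gt;Second paragraph&lt;/p&gt;'
         . '&lt;p&gt;Third paragraph&lt;/p&gt;';

// Turn the entities back into real markup.
$html = html_entity_decode($escaped, ENT_QUOTES, 'UTF-8');

// Pull out the text between each <p> and </p> pair.
preg_match_all('#<p>(.*?)</p>#s', $html, $matches);
$paragraphs = $matches[1];

print_r($paragraphs);
```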
marbuy
Forum Newbie
Posts: 4
Joined: Sat May 08, 2010 11:48 am

Re: Reading and converting a Diigo RSS feed

Post by marbuy »

Thanks for the reply, but I do not think this solves the issue.

When the feed is loaded, it is loaded into a new DOMDocument. I am not a specialist in these things, but my understanding is that such a DOMDocument not only contains the content of the feed, but also 'understands' the hierarchy of all the tags in it. Via various commands, you can then navigate to the relevant tag and retrieve its content. Given that the <p> tags are not written as real tags when the feed is loaded, these paragraphs are likely not present as individual elements in the tree, but just as one piece of text content of the <description> tag.

So, what I hope to find is something that converts these tags when the feed is loaded. Otherwise, the only thing that can be done is text parsing to retrieve the content of each paragraph.
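One possible way around this, sketched under assumptions: the XML parser already decodes the entities, so the text content of <description> comes back as a string holding real <p> markup, and that string can be loaded into a second DOMDocument with loadHTML(). The feed content and variable names below are invented for illustration and are not taken from an actual Diigo feed.

```php
<?php
// Hypothetical feed with an escaped three-paragraph description.
$rss = <<<'XML'
<rss version="2.0">
  <channel>
    <item>
      <title>First bookmark</title>
      <description>&lt;p&gt;Comment text&lt;/p&gt;&lt;p&gt;Tags: php rss&lt;/p&gt;&lt;p&gt;Posted by someone&lt;/p&gt;</description>
    </item>
  </channel>
</rss>
XML;

// Collect parser warnings instead of printing them (feed HTML is often imperfect).
libxml_use_internal_errors(true);

$feed = new DOMDocument();
$feed->loadXML($rss);

$paragraphs = [];
foreach ($feed->getElementsByTagName('description') as $description) {
    // textContent is already entity-decoded, so it holds "<p>...</p>" as plain text.
    $markup = trim($description->textContent);
    if ($markup === '') {
        continue; // loadHTML() rejects an empty string
    }

    // Parse that text as HTML in a second document to get real <p> elements.
    $fragment = new DOMDocument();
    $fragment->loadHTML($markup);

    foreach ($fragment->getElementsByTagName('p') as $p) {
        $paragraphs[] = trim($p->textContent);
    }
}

print_r($paragraphs);
```

Each entry of $paragraphs can then be handled separately. With several items in the feed, the inner array would need to be kept per item rather than flattened as it is here.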

All of this is of course pure guessing, since I am not at all a PHP expert.