Using loadHtmlFile

robbycraig
Forum Newbie
Posts: 2
Joined: Mon Oct 23, 2006 1:54 pm

Post by robbycraig »

I am trying to use loadHTMLFile() to do some work with RSS feeds, but I have encountered some difficulties.

The following code takes a news feed and prints out the titles in a list.

Code:

$dom = new domdocument; 

$url = 'http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/uk/rss.xml'; 

$dom->loadHTMLFile($url); 

echo '<h1>BBC UK News Headlines</h1>'; 
echo '<ul>'; 

$items = $dom->getElementsByTagName("item"); 
     
foreach($items as $item) { 
     
    $titles = $item->getElementsByTagName('title'); 
     
    foreach($titles as $title) 
    { 
        $titleText = $title->firstChild->data; 
    } 
     
    $links = $item->getElementsByTagName('link'); 
     
    foreach($links as $link) 
    { 
        $linkLoc = $link->firstChild->data; 
        echo $linkLoc; 
    } 
     
    echo '<li><a href="' . $linkLoc . '">'.$titleText.'</a></li>'; 

} 

echo '</ul><br/><br/>';
The problem is that it will not pull out the links. I can pull any other data out of the feed, just not the links.
To investigate further, I used the following command near the top of my code:

Code:

echo $dom->saveHtml();
When I look at the source of this output, the XML for the feed is intact except for one major error: each of the links is missing its closing </link> tag, which is why I cannot pull out the information.

Why is this? Is it something really simple that I am missing? I have tried this on three news feeds now: Yahoo, Google and the BBC.

Any help will be greatly appreciated. Thanks in advance.
volka
DevNet Evangelist
Posts: 8391
Joined: Tue May 07, 2002 9:48 am
Location: Berlin, ger

Post by volka »

Why do you use loadHTMLFile()? The feed is a valid XML file.
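
For anyone wondering why the closing </link> tags disappear: in HTML, <link> is an empty element, so the HTML parser most likely closes it straight away and leaves the URL outside of it. Here is a rough sketch comparing the two parsers. The fragment and the example.com URL are made up for illustration and are not taken from the BBC feed.

```php
<?php
// Hypothetical RSS fragment; the element names mirror the feed in the question.
$fragment = '<item><title>Headline</title><link>http://example.com/1</link></item>';

// Collect parser warnings (unknown tags, stray </link>) instead of printing them.
libxml_use_internal_errors(true);

// Parse the fragment with the HTML parser, as loadHTMLFile() would.
$asHtml = new DOMDocument();
$asHtml->loadHTML($fragment);
$htmlLink = $asHtml->getElementsByTagName('link')->item(0);

// Parse the same fragment with the XML parser.
$asXml = new DOMDocument();
$asXml->loadXML($fragment);
$xmlLink = $asXml->getElementsByTagName('link')->item(0);

// HTML parser: <link> is treated as an empty element, so the URL text
// should not end up inside it.
var_dump($htmlLink->textContent);

// XML parser: <link> is an ordinary element that contains the URL text.
var_dump($xmlLink->textContent);
```

That would also explain why $link->firstChild->data gives nothing in the original code: under the HTML parser the <link> node has no child to read.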
robbycraig
Forum Newbie
Posts: 2
Joined: Mon Oct 23, 2006 1:54 pm

Other suggestions?

Post by robbycraig »

I used loadHtmlFile as I was originally screen scraping.

What's a better way to go about this?
volka
DevNet Evangelist
Posts: 8391
Joined: Tue May 07, 2002 9:48 am
Location: Berlin, ger

Post by volka »

http://de2.php.net/manual/en/function.d ... t-load.php
DOMDocument->load() -- Load XML from a file
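
A minimal sketch of the original headline list rewritten around DOMDocument::load(). It assumes a small made-up feed saved to a temp file so it runs without network access; with the real feed you would pass the feed URL to load() instead, which needs allow_url_fopen to be enabled. The $headlines array is just a helper name chosen for this example.

```php
<?php
// Hypothetical sample feed, standing in for the real RSS document.
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Sample feed</title>
    <link>http://example.com/</link>
    <item>
      <title>First headline</title>
      <link>http://example.com/1</link>
    </item>
    <item>
      <title>Second &amp; last headline</title>
      <link>http://example.com/2</link>
    </item>
  </channel>
</rss>
XML;

// Write the sample to a temp file so load() has something to read.
$path = tempnam(sys_get_temp_dir(), 'rss');
file_put_contents($path, $xml);

$dom = new DOMDocument();
$dom->load($path);   // XML parser, so <link> keeps its text content
unlink($path);

// Collect link => title pairs from each <item>.
$headlines = array();
foreach ($dom->getElementsByTagName('item') as $item) {
    $title = $item->getElementsByTagName('title')->item(0)->textContent;
    $link  = $item->getElementsByTagName('link')->item(0)->textContent;
    $headlines[$link] = $title;
}

// Print the list, escaping values before they go into HTML.
echo "<ul>\n";
foreach ($headlines as $link => $title) {
    echo '<li><a href="' . htmlspecialchars($link) . '">'
        . htmlspecialchars($title) . "</a></li>\n";
}
echo "</ul>\n";
```

Reading textContent from item(0) replaces the inner foreach loops of the original, since each <item> is assumed here to carry exactly one <title> and one <link>.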
hawleyjr
BeerMod
Posts: 2170
Joined: Tue Jan 13, 2004 4:58 pm
Location: Jax FL & Spokane WA USA

Post by hawleyjr »

curl() or file_get_contents(), depending on which version of PHP you are using :)
volka
DevNet Evangelist
Posts: 8391
Joined: Tue May 07, 2002 9:48 am
Location: Berlin, ger

Post by volka »

Neither curl() nor file_get_contents() returns a DOM resource ;)
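
If the feed does get fetched as a plain string, it still has to be handed to the XML parser afterwards. A rough sketch, assuming a temp file stands in for the remote feed (the file contents and the $source name are invented for the example):

```php
<?php
// Hypothetical stand-in for the feed; in practice $source would be the feed URL.
$source = tempnam(sys_get_temp_dir(), 'rss');
file_put_contents(
    $source,
    '<rss><channel><item><title>Headline</title>'
    . '<link>http://example.com/1</link></item></channel></rss>'
);

// file_get_contents() returns a plain string, or false on failure.
$raw = file_get_contents($source);
unlink($source);

// loadXML() turns that string into a DOM tree and returns false if parsing fails.
$dom = new DOMDocument();
if ($raw === false || !$dom->loadXML($raw)) {
    throw new RuntimeException('Could not fetch or parse the feed');
}

$firstLink = $dom->getElementsByTagName('link')->item(0)->textContent;
echo $firstLink, "\n";
```

Fetching separately mainly helps when the request needs its own handling, such as timeouts or caching; otherwise load() does both steps in one call.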