
Extracting link title from malformed anchor tag

Posted: Mon May 09, 2005 12:19 pm
by carpeaqua
I am trying to create an RSS feed for my school's newspaper using PHP5's DOM support. I fetch the main page from their website around 1am via curl and then run HTML Tidy on it to clean it up.

Once clean, I use PHP to extract the href attribute as the <link> tag for the RSS feed. The problem I am having is setting a title. Unfortunately, no one on my school's paper knows what a "title" attribute is, so I can't use the same method I used for getting the URL.

How can I go about getting the text between the <a></a> tags? Here is the relevant code I have so far.


<?php
	foreach ($params as $param) {
		// substr_compare() returns 0 when the first $strLength characters of
		// the href match $strValue (case-insensitive), so !0 selects those links.
		if (!substr_compare($param->getAttribute('href'), $strValue, 0, $strLength, true)) {
			echo "<item>\n";
			echo "\t\t\t<title>" . "no clue how to do this" . "</title>\n";
			echo "\t\t\t<link><![CDATA[http://www.purdueexponent.com" . trim($param->getAttribute('href')) . "]]></link>\n";
			echo "</item>\n";
		}
	}
?>

Posted: Mon May 09, 2005 1:05 pm
by Burrito
You'll need a regex to do that. Fortunately, D11 just wrote a great little tutorial on regex, and I think there's even an example of exactly what you're looking for at the end.

here is the link:

viewtopic.php?t=33147

Burr
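
For anyone landing here later, this is a rough sketch of the regex approach suggested above, not the example from the linked tutorial. The sample markup, the `$pattern` and the `$items` array are made up for illustration. It assumes the page has already been through Tidy, so each anchor has a double-quoted href and a closing </a>.

```php
<?php
// Hypothetical sample markup; the real input would be the tidied page.
$html = '<p><a href="/news/story1.html">Council approves budget</a> '
      . '<a href="/sports/game.html"><b>Boilers</b> win opener</a></p>';

// Group 1 captures the href value, group 2 everything between <a ...> and </a>.
// Modifiers: i = case-insensitive, s = let "." match newlines.
// The ".*?" is non-greedy so each match stops at the nearest </a>.
$pattern = '#<a\s[^>]*href="([^"]*)"[^>]*>(.*?)</a>#is';

$items = array();
if (preg_match_all($pattern, $html, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $m) {
        // Drop nested tags such as <b> and trim surrounding whitespace.
        $title = trim(strip_tags($m[2]));
        $items[] = array('link' => trim($m[1]), 'title' => $title);
    }
}

foreach ($items as $item) {
    echo "<item>\n";
    // Escape the title so characters like & or < do not break the feed XML.
    echo "\t<title>" . htmlspecialchars($item['title']) . "</title>\n";
    echo "\t<link><![CDATA[http://www.purdueexponent.com" . $item['link'] . "]]></link>\n";
    echo "</item>\n";
}
```

Since the loop in the original post already has each anchor as a DOM element, reading `$param->textContent` (or `$param->nodeValue`) inside that loop might be a simpler route to the same text without a regex. I haven't tested that against the actual page.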