
Copying data between anchor tags

Posted: Sun Aug 30, 2009 9:25 pm
by booted
Here goes: I'm working on a project to scrape data from a build page. The build page contains data that I need for each build. I was thinking of using PHP to fopen the web page and preg_match_all to get the contents between two HTML anchors.

Here's a sample build page:

Code:

 
<html>
<body>
<a name="#bugs">Bug List</a>
<ul>
<li>Bug One</li>
<li>Bug Two</li>
</ul>
<br>
<a name="#notes">Bug Notes</a>
blah<br>
<ul>
<li>Bug Notes 1</li>
<li>Bug Notes 2</li>
</ul>
</body>
</html>
 
I tried using preg_match_all to grab all the content between the two unique anchors, but had no luck.

Code:

 
<?PHP
preg_match_all("/<a name=\"#bugs\">Bug List</a>(.+)<a name=\"#notes\">Bug Notes</a>/",$webpage, $results);
print_r($results);
?>
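
Looking at that pattern again, two things probably explain the failure. The pattern is delimited with "/", so the unescaped "/" in "</a>" ends it early and PHP tries to read the rest as modifiers. Also, "." does not match newlines unless the "s" modifier is set, and the sample page spreads the list over several lines. Below is a minimal sketch of a corrected pattern, assuming the page source looks exactly like the sample above. $webpage is hard-coded here only so the snippet runs on its own; in the real script it would hold the fetched page.

```php
<?php
// Sample markup from the build page, hard-coded so the sketch is self-contained.
$webpage = <<<HTML
<html>
<body>
<a name="#bugs">Bug List</a>
<ul>
<li>Bug One</li>
<li>Bug Two</li>
</ul>
<br>
<a name="#notes">Bug Notes</a>
blah<br>
<ul>
<li>Bug Notes 1</li>
<li>Bug Notes 2</li>
</ul>
</body>
</html>
HTML;

// "~" is the delimiter, so the "/" in "</a>" needs no escaping
// ("#" would clash with "#bugs"). The "s" modifier lets "." match newlines,
// and the lazy ".+?" stops at the first occurrence of the closing anchor.
$pattern = '~<a name="#bugs">Bug List</a>(.+?)<a name="#notes">Bug Notes</a>~s';

if (preg_match($pattern, $webpage, $results) === 1) {
    $section = $results[1]; // everything between the two anchors
    print_r($section);
} else {
    echo "Anchors not found\n";
}
```

preg_match is used instead of preg_match_all because only one such section is expected on the page.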
 
I ended up using this code to get the data between <li> and </li>:

Code:

 
<?PHP
preg_match_all("/<li>(.+)<\/li>/",$webpage, $results);
print_r($results);
?>
 
There are more <li> elements further down the page, but I only need the data between "<a name="#bugs">Bug List</a>" and "<a name="#notes">Bug Notes</a>". The second snippet grabs every <li> on the page.
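
What I think I need is a two-step approach: first cut out the section between the two anchors, then run the <li> match on that section only. Here is a rough sketch, assuming the anchors appear in the page exactly as written. items_between() is a made-up helper name, not a built-in PHP function, and $webpage is a shortened stand-in for the fetched page.

```php
<?php
/**
 * Hypothetical helper: return the text of every <li> found between two
 * literal markers in $html, or an empty array if the markers are missing.
 */
function items_between(string $html, string $start, string $end): array
{
    // preg_quote() escapes regex metacharacters in the literal markers,
    // including the "~" delimiter passed as its second argument.
    $pattern = '~' . preg_quote($start, '~') . '(.+?)' . preg_quote($end, '~') . '~s';
    if (preg_match($pattern, $html, $section) !== 1) {
        return [];
    }
    // The lazy ".+?" keeps each match inside a single <li>...</li> pair.
    preg_match_all('~<li>(.+?)</li>~s', $section[1], $items);
    return $items[1];
}

// Shortened stand-in for the build page (assumption: same anchors as the sample).
$webpage = '<a name="#bugs">Bug List</a><ul><li>Bug One</li><li>Bug Two</li></ul>'
         . '<a name="#notes">Bug Notes</a><ul><li>Bug Notes 1</li></ul>';

$bugs = items_between(
    $webpage,
    '<a name="#bugs">Bug List</a>',
    '<a name="#notes">Bug Notes</a>'
);
print_r($bugs);
```

Anything after the "Bug Notes" anchor never reaches the <li> pattern, so the later lists on the page should be left out.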

Am I on the right track?