Copying data between anchor tags
Posted: Sun Aug 30, 2009 9:25 pm
Here goes: I'm working on a project to scrape data from a build page. The build page contains data that I need for each build. I was thinking about using PHP to fopen the web site and then preg_match_all to get the contents between two HTML anchors.
Here's a sample build page
I tried using preg_match_all to grab all the content between the unique anchors, but had no luck.
I ended up using this code to get the data between <li> and </li>
There are multiple <li> elements further down the page, but I only need the data between "<a name="#bugs">Bug List</a>" and "<a name="#notes">Bug Notes</a>". If I use the second snippet of code, it grabs every <li> on the page.
Am I headed in the right direction?
Code (sample build page):
<html>
<body>
<a name="#bugs">Bug List</a>
<ul>
<li>Bug One</li>
<li>Bug Two</li>
</ul>
<br>
<a name="#notes">Bug Notes</a>
blah<br>
<ul>
<li>Bug Notes 1</li>
<li>Bug Notes 2</li>
</ul>
</body>
</html>
Code (first attempt, content between the anchors):
<?PHP
preg_match_all("/<a name=\"#bugs\">Bug List</a>(.+)<a name=\"#notes\">Bug Notes</a>/",$webpage, $results);
print_r($results);
?>
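One likely reason this attempt finds nothing: the pattern uses / as its delimiter, so the unescaped / in </a> ends the pattern early, and . does not match newlines unless the s modifier is set. Here is a minimal sketch of a corrected pattern, assuming the page markup matches the sample exactly. The ~ delimiter, the lazy .+? and the inline $webpage string are choices made for this illustration, not part of the original code.

```php
<?php
// Shortened stand-in for the fetched page (assumption: real markup matches the sample).
$webpage = "<a name=\"#bugs\">Bug List</a>\n<ul>\n<li>Bug One</li>\n<li>Bug Two</li>\n</ul>\n<br>\n"
         . "<a name=\"#notes\">Bug Notes</a>\nblah<br>\n<ul>\n<li>Bug Notes 1</li>\n</ul>\n";

// "~" as the delimiter means the "/" in </a> needs no escaping.
// The "s" modifier lets "." match newlines; ".+?" stops at the first "Bug Notes" anchor.
$pattern = '~<a name="#bugs">Bug List</a>(.+?)<a name="#notes">Bug Notes</a>~s';

preg_match_all($pattern, $webpage, $results);
print_r($results[1]); // the captured text between the two anchors
```

With this pattern, $results[1][0] should hold the whole <ul> block that sits between the two anchors, and nothing from the Bug Notes section.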
Code (second attempt, content between <li> and </li>):
<?PHP
preg_match_all("/<li>(.+)<\/li>/",$webpage, $results);
print_r($results);
?>
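To keep this <li> pattern from picking up every list on the page, one option is to work in two steps: first cut out the text between the two anchors, then run the <li> pattern on that piece only. This is a rough sketch under the assumption that the anchors appear exactly as in the sample page; extract_bug_list is a hypothetical helper name, not an existing function.

```php
<?php
/**
 * Hypothetical helper: returns the text of each <li> found between
 * the "Bug List" anchor and the "Bug Notes" anchor.
 */
function extract_bug_list(string $html): array
{
    // Step 1: isolate the section between the two anchors.
    $section = '~<a name="#bugs">Bug List</a>(.+?)<a name="#notes">Bug Notes</a>~s';
    if (!preg_match($section, $html, $m)) {
        return []; // anchors not found, nothing to extract
    }

    // Step 2: collect the <li> contents inside that section only.
    preg_match_all('~<li>(.+?)</li>~s', $m[1], $items);
    return array_map('trim', $items[1]);
}

// Sample markup copied from the build page in this post.
$webpage = <<<'HTML'
<html>
<body>
<a name="#bugs">Bug List</a>
<ul>
<li>Bug One</li>
<li>Bug Two</li>
</ul>
<br>
<a name="#notes">Bug Notes</a>
blah<br>
<ul>
<li>Bug Notes 1</li>
<li>Bug Notes 2</li>
</ul>
</body>
</html>
HTML;

print_r(extract_bug_list($webpage));
```

In real use, $webpage would be filled from the live page, for example with file_get_contents(), instead of the inline sample. If the real markup varies (extra attributes, different spacing), the anchor pattern would need loosening.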