I have the following code stored in an HTML file:
<div class="class1">
<a href="http://www.yahoo.com">Yahoo</a>
A search engine
<div class="category">search</div>
</div>
How can I use a regular expression to extract the link 'www.yahoo.com', the name 'Yahoo', the description, and the category 'search' into one array?
I know I should use preg_match and {}, but I just can't get this to work.
Please help,
Many many thanks.
CF K
Regular Expression - How to extract html tags and info
anthony88guy
Re: Regular Expression - How to extract html tags and info
Code:
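$link = 'http://www.blahblah.com/blah.html';
$pagecontents = file_get_contents($link);
// Capture link, name, description and category; the lazy (.*?) stops at the first closing tag.
$pattern = '#<a href="([^"]*)">(.*?)</a>\s*(.*?)\s*<div class="category">(.*?)</div>#s';
// PREG_SET_ORDER returns one array per entry: [full match, link, name, description, category].
if ($pagecontents !== false && preg_match_all($pattern, $pagecontents, $matches, PREG_SET_ORDER))
{
    foreach ($matches as $match)
    {
        echo 'Link: ' . $match[1] . '<br>';
        echo 'Name: ' . $match[2] . '<br>';
        echo 'Description: ' . $match[3] . '<br>';
        echo 'Category: ' . $match[4] . '<br>';
    }
}else{
    echo 'No matches...';
}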
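
A regex like the one above tends to break as soon as the markup changes (extra attributes on the link, different whitespace, nested tags). As an alternative, here is a minimal sketch that uses PHP's built-in DOMDocument and DOMXPath instead of a regular expression. It assumes every entry is a div with class "class1" laid out exactly like the sample in the question; extractEntries() is a made-up helper name, and the array keys are just illustrative.

```php
<?php
// Sample markup copied from the question; in practice this would come from file_get_contents().
$html = <<<HTML
<div class="class1">
<a href="http://www.yahoo.com">Yahoo</a>
A search engine
<div class="category">search</div>
</div>
HTML;

// extractEntries() is a hypothetical helper name, not part of PHP.
function extractEntries(string $html): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them for imperfect HTML.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $entries = [];

    // Assumption: each entry lives in <div class="class1">, as in the question.
    foreach ($xpath->query('//div[@class="class1"]') as $div) {
        $a = $xpath->query('./a', $div)->item(0);
        $category = $xpath->query('./div[@class="category"]', $div)->item(0);
        if ($a === null || $category === null) {
            continue; // skip blocks that do not have the expected shape
        }

        // The description is the loose text sitting directly inside the outer div.
        $description = '';
        foreach ($div->childNodes as $node) {
            if ($node->nodeType === XML_TEXT_NODE) {
                $description .= $node->nodeValue;
            }
        }

        $entries[] = [
            'link'        => $a->getAttribute('href'),
            'name'        => trim($a->textContent),
            'description' => trim($description),
            'category'    => trim($category->textContent),
        ];
    }

    return $entries;
}

print_r(extractEntries($html));
```

Each element of the returned array is one entry with the keys link, name, description and category, so a page with several class1 blocks gives you one sub-array per block. If you only want 'www.yahoo.com' without the scheme, you could run the link through parse_url($link, PHP_URL_HOST) afterwards.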