Regular Expression - How to extract html tags and info
Posted: Mon Apr 10, 2006 3:47 pm
by simonk
I have the following code stored in an HTML file:
<div class="class1">
<a href="http://www.yahoo.com">Yahoo</a>
A search engine
<div class="category">search</div>
</div>
How can I use a regular expression to extract the link 'www.yahoo.com', the name 'Yahoo', the description, and the category 'search' into one array?
I know I should use preg_match and {}, but I just can't get this to work.
Please help,
Many many thanks.
CF K
Re: Regular Expression - How to extract html tags and info
Posted: Mon Apr 10, 2006 4:01 pm
by anthony88guy
Code:
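$link = 'http://www.blahblah.com/blah.html';
$pagecontents = file_get_contents($link);
// Non-greedy (.*?) stops each capture at the first closing quote or tag; \s already covers \n.
$pattern = '#<a href="(.*?)">(.*?)</a>\s*A search engine\s*<div class="category">(.*?)</div>#';
// preg_match() returns 1 on a match; $match[1] to $match[3] then hold the captured strings.
if ($pagecontents !== false && preg_match($pattern, $pagecontents, $match) === 1)
{
    echo 'Match1: ' . $match[1] . '<br>';
    echo 'Match2: ' . $match[2] . '<br>';
    echo 'Match3: ' . $match[3] . '<br>';
}else{
    echo 'No matches...';
}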
It's not tested and probably has some errors, but that's the gist of it. BTW, I believe there is a forum specifically for regex.
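The snippet above hardcodes the description text, while the question asks for the description as well. Here is a minimal sketch of the same idea that captures all four values into one array. It assumes the markup always follows the layout shown in the question (link, then free text, then the category div). The function name extract_entry and the array keys are made up for illustration.

```php
<?php
// Hypothetical helper: returns link, name, description and category as one
// array, or null when the markup does not match the assumed layout.
function extract_entry(string $html): ?array
{
    // Named groups (?P<...>) make the result readable; the "s" modifier lets
    // "." match newlines, and ".*?" keeps every capture as short as possible.
    $pattern = '#<a\s+href="(?P<link>[^"]*)"[^>]*>(?P<name>.*?)</a>\s*'
             . '(?P<description>.*?)\s*'
             . '<div\s+class="category">(?P<category>.*?)</div>#s';

    // preg_match() returns 1 on a match, 0 on no match and false on error.
    if (preg_match($pattern, $html, $m) !== 1) {
        return null;
    }

    return [
        'link'        => $m['link'],
        'name'        => trim($m['name']),
        'description' => trim($m['description']),
        'category'    => trim($m['category']),
    ];
}

// Sample markup copied from the question.
$html = <<<HTML
<div class="class1">
<a href="http://www.yahoo.com">Yahoo</a>
A search engine
<div class="category">search</div>
</div>
HTML;

print_r(extract_entry($html));
```

For the sample markup the array holds the link http://www.yahoo.com, the name Yahoo, the description A search engine and the category search. If the real page orders the elements differently, the pattern will need adjusting.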
Posted: Mon Apr 10, 2006 4:05 pm
by simonk
Thanks

But if the link, name and description are variable (unknown before the regex is run), how do I do it? The only thing I know is the <div class="xxx"></div>.
I need to make an automatic update page that captures the information within this div tag.
Thank you so much.
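When only the class of the outer div is known, one option is to skip regex and walk the parsed document instead. The sketch below uses PHP's DOM extension (DOMDocument and DOMXPath) rather than preg_match, so it is a different technique from the one asked about. It assumes the DOM extension is enabled, that the class attribute contains exactly the given name, and that each block holds one link, some free text and one inner div with class "category". The function name extract_entries is hypothetical.

```php
<?php
// Hypothetical helper: returns one array per <div class="$class"> block.
// Assumes $class contains no double quotes, since it is placed in an XPath string.
function extract_entries(string $html, string $class): array
{
    // Collect parser warnings internally instead of printing them,
    // because real-world HTML is rarely perfectly valid.
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath   = new DOMXPath($doc);
    $entries = [];

    foreach ($xpath->query('//div[@class="' . $class . '"]') as $div) {
        // Look only at direct children of the outer div.
        $a   = $xpath->query('./a[@href]', $div)->item(0);
        $cat = $xpath->query('./div[@class="category"]', $div)->item(0);
        if ($a === null || $cat === null) {
            continue; // block does not follow the assumed layout
        }

        // The description is the loose text sitting directly inside the div.
        $description = '';
        foreach ($div->childNodes as $node) {
            if ($node->nodeType === XML_TEXT_NODE) {
                $description .= $node->nodeValue;
            }
        }

        $entries[] = [
            'link'        => $a->getAttribute('href'),
            'name'        => trim($a->textContent),
            'description' => trim($description),
            'category'    => trim($cat->textContent),
        ];
    }

    return $entries;
}

// Sample markup copied from the first post.
$sample = <<<HTML
<div class="class1">
<a href="http://www.yahoo.com">Yahoo</a>
A search engine
<div class="category">search</div>
</div>
HTML;

print_r(extract_entries($sample, 'class1'));
```

Because the values are read from the parsed tree, nothing about the link, name or description has to be known in advance, and a page with several such blocks yields one array per block.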