RegEx in PHP Assistance
Posted: Thu May 06, 2010 2:24 pm
I studied regular expressions for two days before looking for a community to ask, and I simply need help with this.
I have a simple table that I've scraped from a webpage that displays 10 records at a time. Only certain records show a small football image to the left of the displayed name, and those are the only records I need the script to display after processing each record's name into a database.
The required data sits in the following HTML, where "data I need is here" is written:
[text]//class="smallimage" alt="Football" title="Football"> data I need is here <img src="/images/football.gif[/text]
The pattern I've become so frustrated with is shown below, followed by the full script. I would really appreciate it if someone could show me how to write it properly. There is no DB processing yet, as I want to focus on the pattern. Thanks.
$pattern = '@title=["\']Football["\'][^>]*>([^>]+)<img [^>]*src=["\'][^/]/images[^/]/football.gif["\']@i';
<?php
function GetBetween($content,$start,$end){
    $r = explode($start, $content);
    if (isset($r[1])){
        $r = explode($end, $r[1]);
        return $r[0];
    }
    return '';
}
//class="smallimage" alt="Football" title="Football">, <img src="/images/football.gif
$scrape_table = GetBetween(file_get_contents('http://www.somepage.com'), '<tbody>', '</tbody>');
$pattern = '@title=["\']Football["\'][^>]*>([^>]+)<img [^>]*src=["\'][^/]/images[^/]/football.gif["\']@i';
// Run the match once; preg_match_all() returns the number of full matches.
$result = preg_match_all($pattern, $scrape_table, $matches);
// Use < rather than <= so the loop stops at the last captured name.
for ($counter = 0; $counter < $result; $counter++)
{
    echo $matches[1][$counter];
    echo "<br />";
}
?>
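A likely reason the pattern above matches nothing: `[^/]/images[^/]/football.gif` demands one non-slash character before `/images` and another between `images` and `/football.gif`, while the sample HTML has `src="/images/football.gif`, with nothing in either position. Below is a minimal sketch of an adjusted pattern, assuming the real markup looks like the sample line. The helper name `extractFootballNames` and the rows in `$sample` are invented for illustration, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper (not part of the original script): returns the text
// found between a title="Football" tag and the football.gif image.
function extractFootballNames(string $html): array
{
    // Changes from the original pattern:
    //  - the two [^/] classes are gone, so src="/images/football.gif" can match
    //  - the dot in football.gif is escaped so it only matches a literal dot
    //  - the capture uses [^<]+ so it stops at the next tag
    $pattern = '@title=["\']Football["\'][^>]*>([^<]+)<img\s[^>]*src=["\']/images/football\.gif["\']@i';

    // preg_match_all() returns 0 when nothing matches and false on error.
    if (!preg_match_all($pattern, $html, $matches)) {
        return [];
    }

    // Group 1 holds the captured names; trim the surrounding spaces.
    return array_map('trim', $matches[1]);
}

// Invented sample rows shaped like the snippet in the post.
$sample = '<tr><td><img class="smallimage" alt="Football" title="Football"> Smith, John <img src="/images/football.gif"></td></tr>'
        . '<tr><td><img class="smallimage" alt="Tennis" title="Tennis"> Doe, Jane <img src="/images/tennis.gif"></td></tr>'
        . '<tr><td><img class="smallimage" alt="Football" title="Football"> Lee, Ann <img src="/images/football.gif"></td></tr>';

foreach (extractFootballNames($sample) as $name) {
    echo htmlspecialchars($name), "<br />\n";
}
```

If the live page puts other tags between the name and the football image, the `[^<]+` capture will not reach across them, so the pattern would need adjusting against the actual HTML.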