Screen scraping weather data
Posted: Sat Feb 09, 2008 11:23 pm
Hi,
I'm new to this board and to PHP, but I'm finding both very cool!
I have started to learn PHP by modifying a screen scraper I found, so that it pulls my local weather from the bom.gov.au weather feed (http://www.bom.gov.au/catalogue/data-feeds.shtml).
/*
Note: they are OK with screen scraping as long as you cache the data, show where it comes from, and don't resell it.
*/
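For the caching part, something like the following might do. It is only a rough sketch: the helper name fetch_cached, the cache file location and the lifetime are placeholders of my own, not anything the Bureau specifies. The download is passed in as a callable so the function can be tried without touching the network.

```php
<?php
// Hypothetical helper: return the feed body, downloading it again only when
// the cached copy is missing or older than $maxAge seconds.
function fetch_cached(string $cacheFile, int $maxAge, callable $fetch): string
{
    // Serve the cached copy while it is still fresh.
    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $maxAge) {
        return (string) file_get_contents($cacheFile);
    }

    // Otherwise ask the caller-supplied function for a fresh copy.
    $body = $fetch();
    if ($body === false) {
        // Download failed: fall back to a stale copy if one exists.
        return is_file($cacheFile) ? (string) file_get_contents($cacheFile) : '';
    }

    file_put_contents($cacheFile, $body);
    return $body;
}
```

In the scraper this would replace the direct file_get_contents($url) call, with a closure that does the download as the third argument and, say, 1800 seconds as the lifetime.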
So here is the data feed:
ftp://ftp2.bom.gov.au/anon/gen/fwo/IDA00100.html
The data feed provides the following table:
Forecast for Monday
Sydney 24° Fine.
Melbourne 27° Fine.
Brisbane 28° A shower or two
Perth 34° Fine.
Adelaide 29° Fine. Sunny.
Hobart 25° Fine.
Canberra 25° Fine, partly cloudy.
Darwin 30° Monsoonal showers.
What I'm trying to do is have the code pull the forecast for a given location such as "Sydney" and then print "24° Fine".
However, what I get is the same row repeated once for every row in the table:
Sydney 24° Fine.
Sydney 24° Fine.
Sydney 24° Fine.
Sydney 24° Fine.
Sydney 24° Fine.
Sydney 24° Fine.
Sydney 24° Fine.
Sydney 24° Fine.
What I'm after is just:
Sydney 24° Fine.
I understand looping, but I'm having trouble applying it to this code, maybe because I'm new, so please be kind if it's a simple fix.
Here is the code:
<?php
// Fetch the forecast page from the BOM feed.
$url = "ftp://ftp2.bom.gov.au/anon/gen/fwo/IDA00100.html";
$raw = file_get_contents($url);

// Strip tabs, newlines, double spaces, NUL bytes and vertical tabs.
$newlines = array("\t","\n","\r","\x20\x20","\0","\x0B");
$content = str_replace($newlines, "", html_entity_decode($raw));

// Cut out the forecast table.
$start = strpos($content,'<table cellpadding="2" class="standard_table"');
$end = strpos($content,'</table>',$start) + 8;
$table = substr($content,$start,$end-$start);

// Split the table into rows, then each non-header row into cells.
preg_match_all("|<tr(.*)</tr>|U",$table,$rows);
foreach ($rows[0] as $row){
    if ((strpos($row,'<th')===false)){
        preg_match_all("|<td(.*)</td>|U",$row,$cells);
        $state = strip_tags($cells[0][0]);
        $temp = strip_tags($cells[0][1]);
        $forcast = strip_tags($cells[0][2]);
        // echo "$state - $temp - $forcast <br>\n";
        print($rows[0][0]);
        print "<br />";
    }
}
?>
Thanks in advance
Mark
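
One possible way to get a single location, offered as a sketch rather than a definitive fix: the loop in the posted code prints $rows[0][0], which is always the first matched row regardless of what $row currently holds, so it comes out once per row that has no <th. Comparing the first cell of each row against the wanted location and printing only the match avoids that. The function name forecast_for and the sample markup below are made up for illustration, and the real feed's HTML may differ.

```php
<?php
// Hypothetical helper: return "City temp forecast" for one location, or
// null if the location is not in the table. $table is the table's HTML.
function forecast_for(string $table, string $location): ?string
{
    preg_match_all('|<tr.*</tr>|Us', $table, $rows);
    foreach ($rows[0] as $row) {
        if (strpos($row, '<th') !== false) {
            continue; // header row
        }
        preg_match_all('|<td.*</td>|Us', $row, $cells);
        if (count($cells[0]) < 3) {
            continue; // not a city / temperature / forecast row
        }
        // Strip the tags from each cell and trim stray whitespace.
        $fields = array_map(function ($cell) {
            return trim(strip_tags($cell));
        }, $cells[0]);
        // Case-insensitive match on the first cell.
        if (strcasecmp($fields[0], $location) === 0) {
            return "$fields[0] $fields[1] $fields[2]";
        }
    }
    return null;
}

// Sample markup standing in for the real feed (assumed layout).
$sample = '<table><tr><th>City</th><th>Max</th><th>Forecast</th></tr>'
        . '<tr><td>Sydney</td><td>24°</td><td>Fine.</td></tr>'
        . '<tr><td>Melbourne</td><td>27°</td><td>Fine.</td></tr></table>';

echo forecast_for($sample, "Sydney"), "<br />\n";
```

With the real feed, $table from the posted code would be passed in place of $sample. Returning null for an unknown location lets the caller decide what to show when the city is missing.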