Screen scraper connectivity issue?
Posted: Thu Oct 28, 2010 4:53 pm
OK, I am at my wits' end with this one... I have a simple little web scraper that I'm using to gather some NBA boxscores. Very simple stuff, it would seem. I have XAMPP installed on my computer and am simply trying to run the script below.
Here's the thing... the script works fine if there are three or fewer boxscores to scrape. If there are more than three, Firefox shows me a "connection reset" page. Nothing shows up in the PHP error log or anywhere else. Sometimes it even skips the "connection reset" page and gives me a blank white page with the address changed from http://localhost to http://www.localhost.com. Any ideas?
<?php
include_once('/simple_html_dom.php');
set_time_limit(0);
ignore_user_abort(1);
ini_set('max_execution_time', 0);
$a = 0;
$htmlfirst = file_get_html('http://www.basketball-reference.com/box ... &year=2010');
$list = $htmlfirst->find('table[@id=games]',0)->find('a[href^=/boxscores/]');
$length = sizeof($list);
echo '<table>';
foreach($list as $lists) {
    // get the boxscore links
    $links = $htmlfirst->find('table[@id=games]',0)->find('a[href^=/boxscores/]', $a)->outertext;
    $pattern = '/<a href="/';
    $replace = 'http://www.basketball-reference.com';
    // extract the address from the html link
    $start = preg_replace($pattern, $replace, $links);
    $pattern = '/">.*/';
    $url = preg_replace($pattern, '', $start);
    // increment $a early since it's not used again
    $a++;
    // boxscore link
    $html = file_get_html($url);
    // counters
    $i = 0;
    $k = 2;
    echo '<tr>';
    // begin the scrape
    while ($i <= 1) {
        if ($html->find('h1 a[href^=/teams/]', $i)) {
            $team = $html->find('h1 a[href^=/teams/]', $i)->plaintext;
            echo '<td>'.$team.'</td>';
        }
        $i++;
    }
    echo '</tr>';
}
echo '</table>';
?>
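
In case it helps anyone reproduce this, here is the link-extraction step (the two preg_replace calls in the script) pulled out on its own so it can be run without simple_html_dom or a network connection. This is only a rough sketch: the function name boxscore_url and the sample path /boxscores/example.html are made up for illustration, and it uses a single preg_match in place of the two preg_replace calls.

```php
<?php
// Hypothetical helper (not part of the original script): turn the outer HTML
// of a relative <a href="..."> link into an absolute URL.
function boxscore_url($anchorHtml, $base = 'http://www.basketball-reference.com')
{
    // Capture the value of the href attribute of the anchor tag.
    if (preg_match('/<a\s[^>]*?href="([^"]*)"/i', $anchorHtml, $m)) {
        return $base . $m[1];
    }
    // No anchor with a double-quoted href was found.
    return null;
}
```

Calling boxscore_url('<a href="/boxscores/example.html">Final</a>') should give the site address followed by /boxscores/example.html, which is the same kind of string the script passes to file_get_html().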