Need help on scrapping script

Posted: Mon Aug 24, 2009 9:52 pm
by php_coder
Code: Select all

<?php

session_start();

// Remember the submitted value; fall back to an empty string if the field is missing.
$_SESSION['number'] = isset($_POST['number']) ? $_POST['number'] : '';

$userAgent = "Googlebot/2.1 (+http://www.google.com/bot.html)";

// Encode the value so it cannot break the URL.
$target_url = "http://example.com/" . rawurlencode($_SESSION['number']);

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $target_url);
curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
curl_setopt($ch, CURLOPT_FAILONERROR, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_COOKIEFILE, "cookiefile");
curl_setopt($ch, CURLOPT_COOKIEJAR, "cookiefile"); // same cookie file for reading and writing
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 0); // 0 = no time limit on the transfer
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);

$html = curl_exec($ch);
if ($html === false) {
    // Stop here if the request itself failed.
    die("cURL error: " . curl_error($ch));
}
curl_close($ch);

echo $html;

$dom = new DOMDocument();
@$dom->loadHTML($html); // @ hides warnings about malformed HTML

$all_div = $dom->getElementsByTagName('div');

// Look for the div class that marks each case.
foreach ($all_div as $div) {
    if ($div->getAttribute('class') == "d1-2") {
        echo "<center>";
        echo "Example doesn't exist";
        echo "</center>";
    } elseif ($div->getAttribute('class') == "uinf-2-2-4-2") {
        echo "<center>";
        echo "Example exist";
        echo "</center>";
    }
}

?>

This script runs perfectly on localhost, but when I run it on the server (Yahoo Small Business), it just echoes the HTML page. It doesn't output "Example doesn't exist" or "Example exist".

I think the script times out, but I'm not sure what the problem is. Any suggestions?

Re: Need help on scrapping script

Posted: Tue Aug 25, 2009 9:38 am
by Ambush Commander
You can find out if it's a timeout by adding the following lines to your code:

Code: Select all

error_reporting(E_ALL);
ini_set('display_errors', 1);
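
If the cause is still unclear after that, it may also help to look at what cURL itself reports, since the return value of curl_exec() can be false when the request fails. A minimal sketch: describe_fetch() is a made-up helper name, not something from the original script, and 28 is libcurl's error number for an operation timeout.

```php
<?php
// Hypothetical helper: turn the result of curl_exec() plus curl_errno()/curl_error()
// into a short diagnostic string.
function describe_fetch($html, $errno, $error)
{
    if ($html !== false) {
        return 'ok: received ' . strlen($html) . ' bytes';
    }
    if ($errno == 28) { // libcurl error 28 = operation timed out
        return 'timeout: ' . $error;
    }
    return 'curl error ' . $errno . ': ' . $error;
}

// Intended use, right after the curl_exec() call and before the handle is closed:
//   $html = curl_exec($ch);
//   echo describe_fetch($html, curl_errno($ch), curl_error($ch));
```

A "timeout" line would point at the transfer, while an "ok" line followed by a fatal error would point at the parsing step instead.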

Re: Need help on scrapping script

Posted: Wed Aug 26, 2009 1:47 am
by php_coder
Ambush Commander wrote:You can find out if it's a timeout by adding the following lines to your code:

Code: Select all

error_reporting(E_ALL);
ini_set('display_errors', 1);

Thanks for the code. It was not exactly a timeout problem. The problem was with the Yahoo server: they only support PHP 4, and the script worked perfectly on PHP 5. I think PHP 4 doesn't support DOM.
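
A quick way to confirm this kind of mismatch on a host is to print the PHP version and check whether the DOMDocument class is loaded before using it. A minimal sketch; the function names dom_available() and dom_status_line() are made up for illustration.

```php
<?php
// Hypothetical helper: true if the DOMDocument class used by the script is loaded.
function dom_available()
{
    return class_exists('DOMDocument');
}

// Hypothetical helper: one line summarising the PHP version and DOM availability.
function dom_status_line()
{
    return 'PHP ' . PHP_VERSION . ', DOMDocument '
        . (dom_available() ? 'available' : 'missing');
}

echo dom_status_line(), "\n";
```

Running this on both machines should show the difference directly, without waiting for the script to fail at the "new DOMDocument()" line.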

Re: Need help on scrapping script

Posted: Wed Aug 26, 2009 9:31 am
by Ambush Commander
Lies.

Re: Need help on scrapping script

Posted: Wed Aug 26, 2009 10:50 pm
by php_coder
Ambush Commander wrote:Lies.

I never said that I am sure about PHP 4 not supporting DOM. I just said I think so.

Re: Need help on scrapping script

Posted: Thu Aug 27, 2009 9:33 am
by Ambush Commander
It was meant in a humorous manner ;-) domxml does not actually work that well, and has given me loads of grief in the past.
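
For a check as simple as the one in the first post, one option that avoids both domxml and DOMDocument is a plain regular expression over the fetched HTML. This is only a rough sketch, assuming the class names d1-2 and uinf-2-2-4-2 from the original script mark the two cases. has_div_class() is a hypothetical helper, and a regex like this is a loose substitute for a real parser.

```php
<?php
// Hypothetical helper: true if some <div ...> tag has a class attribute whose
// value is exactly $class (same exact-string comparison as the original script).
function has_div_class($html, $class)
{
    $pattern = '/<div\b[^>]*\bclass\s*=\s*(["\'])' . preg_quote($class, '/') . '\1/i';
    return preg_match($pattern, $html) === 1;
}

// Sample input standing in for the page returned by curl_exec().
$html = '<div class="uinf-2-2-4-2">profile</div>';

if (has_div_class($html, 'd1-2')) {
    echo "Example doesn't exist";
} elseif (has_div_class($html, 'uinf-2-2-4-2')) {
    echo "Example exist";
}
```

It only uses preg_match() and preg_quote(), so it does not depend on which DOM extension the host provides.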