CURL and Proxies

Posted: Tue Jan 18, 2011 12:30 pm
by J0kerz
Hey there,

I am currently using a PHP script with the cURL library to extract results from Google. As some of you may know, Google doesn't like to be scraped, which is why I am using several private HTTP proxies to do it.

Here is the problem. After a while, the proxies get blocked by Google.

Here is what I did to find out the problem.

When I notice in my script that a proxy has been blocked by Google, I immediately open Google manually in a browser through the same proxy, and strangely I am not blocked at all.

Here is my simple CURL code:

Code:

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'GOOGLE QUERY HERE');
curl_setopt($ch, CURLOPT_POST, 0);
// $user_agent is randomly selected from a list which contains the most popular user agents
curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_COOKIEJAR, "my_cookies.txt");
curl_setopt($ch, CURLOPT_COOKIEFILE, "my_cookies.txt");
curl_setopt($ch, CURLOPT_COOKIESESSION, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, 1);
// $proxies is randomly selected from my proxies list
curl_setopt($ch, CURLOPT_PROXY, $proxies);
$source = curl_exec($ch);

Is there anything wrong in my code that could leave a footprint, create undesirable cookies, etc.?

What I really don't understand is why Google blocks me when I access its website with a script but not when I access it manually, even though I am sending the SAME query.

Re: CURL and Proxies

Posted: Thu Jan 20, 2011 3:45 pm
by John Cartwright
Locked.

Helping someone knowingly violate a TOS is against our rules here.