CURL / HTTPS Problem - I'm desperate at this point!

Posted: Wed Aug 16, 2006 5:27 pm
by Joe826
I've been having a very frustrating problem with one of my CURL scripts lately. It's supposed to log in to one of my affiliate programs and download the stats for me. The script was working fine until DirectTrack made some changes and went to HTTPS; now it only partially works. I can get the script to log in and navigate around some pages, but when it comes to downloading the actual stats file (a simple GET request), the whole process hangs and eventually returns a blank page.

Please let me know what other information I can give to help you. Here is my script:

$email_address = urlencode("xxx@xxxxxxxxx.com");
$password = "xxxxxxx";
$cookie_file_path = "cookie"; // cookie file (don't bother changing)

// 1 - Get the cookies required to log in from the welcome login page

$LOGINURL = "http://www.xxxxxx.com/";
$agent = "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 Netscape/7.1 (ax)";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $LOGINURL);
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file_path);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path);
$result = curl_exec($ch);
curl_close($ch);

// 2 - Post login cookies and login information to page

$LOGINURL = "https://login.xxxxxx.com/login.html";
$reffer = "http://www.xxxxxx.com";

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $LOGINURL);
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, "DL_AUTH_USERNAME=$email_address&DL_AUTH_PASSWORD=$password"); // add POST fields
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_REFERER, $reffer);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file_path);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path);
$result = curl_exec($ch);

Ok, the script works fine up to this point. When I print these results, I get the page displayed just as I should. However, when I go to download the stats file using the following code, it hangs:

// 4 - go to stats page

$LOGINURL = "https://login.xxxxxx.com/publishers/monthly_affiliate_stats.html?program_id=0&affiliate_stats_start_month=08&affiliate_stats_start_day=01...";
$reffer = "https://login.xxxxxx.com/partners/";

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $LOGINURL);
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_REFERER, $reffer);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file_path);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path);
$result = curl_exec($ch);

print $result;

I've verified that I can log in using Firefox, paste the above URL into the address bar, and the stats file downloads fine. I really don't know what to do here. Thanks!

Posted: Wed Aug 16, 2006 9:19 pm
by Buddha443556
Well, curl_error() can be helpful. Also, I only see one curl_close() in there; you probably need a couple more.

PS: Welcome to phpDN!
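
The curl_error() and curl_close() suggestions above could look roughly like the following. This is only a sketch: describe_curl_failure() and fetch_with_diagnostics() are hypothetical helper names, not part of the original script, and the option set is trimmed to the essentials (certificate verification is left at its default).

```php
<?php
// Hypothetical helper: turn a failed transfer into a one-line diagnostic.
function describe_curl_failure($errno, $error, $url)
{
    return sprintf('cURL error %d while fetching %s: %s', $errno, $url, $error);
}

// Hypothetical helper: run one GET request, report any failure,
// and always close the handle afterwards.
function fetch_with_diagnostics($url, $cookie_file_path)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file_path);
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path);

    $result = curl_exec($ch);

    // curl_exec() returns false on failure; read the error before closing.
    $failure = ($result === false)
        ? describe_curl_failure(curl_errno($ch), curl_error($ch), $url)
        : null;

    // libcurl writes the CURLOPT_COOKIEJAR file when the handle is cleaned up,
    // so close each handle before the next request needs those cookies.
    curl_close($ch);

    return array('body' => $result, 'failure' => $failure);
}

// Usage (URL elided as in the thread):
// $r = fetch_with_diagnostics($LOGINURL, $cookie_file_path);
// if ($r['failure'] !== null) { print $r['failure']; }
```

Closing every handle matters here mainly because of the shared cookie file: a request that starts before the previous handle is closed may not see the login cookies yet.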

Posted: Wed Aug 16, 2006 9:34 pm
by Joe826
Hey Buddha, thanks for the reply.

I tried adding in the appropriate number of curl_close() calls, but it didn't help.

curl_error gives me the following: "Connect failed; Operation now in progress", and that's after waiting about 5 minutes for the page to load.

Any other ideas?

Joe
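
Waiting five minutes for a "Connect failed" message suggests the request is running with no explicit timeout. A minimal sketch of bounding the wait and reading the error number is below; apply_timeouts() and explain_errno() are hypothetical helpers, and the 15 and 60 second values in the usage comment are arbitrary assumptions. The numeric codes 7 and 28 are libcurl's CURLE_COULDNT_CONNECT and CURLE_OPERATION_TIMEDOUT.

```php
<?php
// Hypothetical helper: bound how long one request may take, so a dead
// connection fails within seconds rather than minutes.
function apply_timeouts($ch, $connect_seconds, $total_seconds)
{
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $connect_seconds); // connection phase only
    curl_setopt($ch, CURLOPT_TIMEOUT, $total_seconds);          // the whole transfer
}

// Hypothetical helper: map the libcurl error numbers most relevant
// to this thread onto short hints.
function explain_errno($errno)
{
    switch ($errno) {
        case 0:
            return 'no error';
        case 7:
            return 'could not connect to host (CURLE_COULDNT_CONNECT)';
        case 28:
            return 'operation timed out (CURLE_OPERATION_TIMEDOUT)';
        default:
            return 'other libcurl error ' . $errno;
    }
}

// Usage, with $ch created as in the stats-page request:
// apply_timeouts($ch, 15, 60);
// $result = curl_exec($ch);
// print explain_errno(curl_errno($ch));
```

A connect failure (7) would point at the host or port being unreachable from the server running the script, which is a different problem from a page that connects but never finishes sending.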

Posted: Wed Aug 16, 2006 10:57 pm
by Stoker
Does browsing other protected pages with curl under your login work? Try browsing 3-4 other pages in a row to find out whether the script is stalling at some point because of a curl/PHP/script bug, or whether it is just something about that particular URL.

If it is that particular URL, try requesting it from the command line with the response headers turned on, just to see whether the server returns anything at all. It sounds like it does, since curl doesn't time out sooner; perhaps the headers will give a clue as to why.
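
The same header check can also be done from PHP without the command line. The sketch below is one way to do it, assuming the cookie file from the login step already exists; split_headers_and_body() and fetch_showing_headers() are hypothetical names, the 15 second connect timeout is an arbitrary choice, and redirects are deliberately not followed so the first response is visible.

```php
<?php
// Hypothetical helper: split a raw response (fetched with CURLOPT_HEADER on)
// into its header block and body at the first blank line.
function split_headers_and_body($raw)
{
    $parts = explode("\r\n\r\n", $raw, 2);
    return array(
        'headers' => $parts[0],
        'body'    => isset($parts[1]) ? $parts[1] : '',
    );
}

// Hypothetical helper: fetch one URL and return status, headers, body and error.
function fetch_showing_headers($url, $cookie_file_path)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_HEADER, 1);          // include response headers in the output
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 0);  // stop at the first response
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 15); // assumption: fail fast while debugging
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file_path);
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path);

    $raw    = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    $error  = curl_error($ch);
    curl_close($ch);

    if ($raw === false) {
        return array('status' => 0, 'headers' => '', 'body' => '', 'error' => $error);
    }

    $split = split_headers_and_body($raw);
    return array(
        'status'  => $status,
        'headers' => $split['headers'],
        'body'    => $split['body'],
        'error'   => '',
    );
}

// Usage:
// $r = fetch_showing_headers($LOGINURL, $cookie_file_path);
// print $r['status'] . "\n" . $r['headers'] . "\n" . $r['error'];
```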

Posted: Wed Aug 16, 2006 11:13 pm
by Joe826
Yeah, I'm sure it's just that page. I'm wondering if it has something to do with the fact that the stats are coming from DirectTrack, which is an offsite tracking entity, instead of coming directly from the host I'm connecting to. I know the stats are coming from off-site, but I can't really tell where, as the URL that's giving me problems seems to be an on-site URL. Anyway, it's a very tricky problem. I'm not sure how to use CURL at the command line, but I'll look it up and get back to you with those headers.

Thanks for the help,
Joe

BTW: If it makes any difference, that URL is supposed to download as a CSV file. Since it's not acting very friendly, I'm just trying to get the raw output with the print statement.

Posted: Thu Aug 17, 2006 2:00 am
by Joe826
Ok, looks like when it's done from the command line it spits out these headers:

HTTP/1.1 301 Moved Permanently
Date: Thu, 17 Aug 2006 06:43:33 GMT
Server:
Location: http://login.xxxxxx.com/partners/monthl ... ...ownload
Content-Type: text/html; charset=iso-8859-1

I have it set to follow redirects, so any idea what the problem could be? The other peculiar thing is that the Location URL is http instead of https, and when I request that Location URL directly, I get no headers at all.
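
The 301 above sends an https request to an http URL. A small sketch of pulling the redirect target out of captured headers and flagging that kind of scheme change follows; it assumes the raw header text is already available (for example via CURLOPT_HEADER). find_location() and is_https_downgrade() are hypothetical names, and example.com in the usage comment stands in for the redacted host.

```php
<?php
// Hypothetical helper: return the value of the first Location header, or null.
function find_location($headers)
{
    // Header names are case-insensitive; match at the start of any line.
    if (preg_match('/^Location:[ \t]*(.*)$/mi', $headers, $m)) {
        return trim($m[1]); // trim() also drops the trailing "\r"
    }
    return null;
}

// Hypothetical helper: true when a redirect goes from https to plain http.
function is_https_downgrade($from_url, $to_url)
{
    $from = strtolower((string) parse_url($from_url, PHP_URL_SCHEME));
    $to   = strtolower((string) parse_url($to_url, PHP_URL_SCHEME));
    return $from === 'https' && $to === 'http';
}

// Usage:
// $target = find_location($raw_headers);
// if ($target !== null && is_https_downgrade('https://login.example.com/', $target)) {
//     print "Redirect leaves HTTPS: $target\n";
// }
```

If the flag is set, one thing worth checking is whether the session cookie carries the Secure attribute, since such cookies are only sent over HTTPS. Whether that applies to this site is a guess; the "Connect failed" error could equally mean the http address is simply not reachable from the server running the script.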

feyd | shortened the url reference. Breaking page layout's not nice.

Posted: Thu Aug 17, 2006 2:46 pm
by Joe826
One more bump, in case anyone out there has an answer! I've tried going to the stats request page: http://login.xxxxxx.com/partners/monthl ... stats.html, and I get the same error. I'm not sure what it is about that page. Could anyone recommend any other methods of achieving this? I had heard that OpenSSL might do the trick, but I'm not familiar with it at all.

Thanks,
Joe
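
On the OpenSSL question: the login POST already succeeds over HTTPS, so the cURL build in use presumably has SSL support. A quick way to confirm is to inspect the array returned by curl_version(). The sketch below does that; describe_ssl_support() is a hypothetical helper, and the version string in the usage comment is only an illustration.

```php
<?php
// Hypothetical helper: summarise SSL support from a curl_version() array.
function describe_ssl_support($info)
{
    $has_https = isset($info['protocols'])
        && in_array('https', $info['protocols'], true);
    $ssl = !empty($info['ssl_version']) ? $info['ssl_version'] : 'none';

    return sprintf(
        'https supported: %s; SSL library: %s',
        $has_https ? 'yes' : 'no',
        $ssl
    );
}

// Usage:
// print describe_ssl_support(curl_version());
// might print something like "https supported: yes; SSL library: OpenSSL/0.9.8"
```

If https is listed, switching to OpenSSL directly is unlikely to change the outcome, and the redirect and connection error are the better leads.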