programmatic login with curl

SweatCoder
Forum Newbie
Posts: 5
Joined: Fri Sep 02, 2011 10:32 pm

Post by SweatCoder »

There's a local site where I sometimes buy and sell stuff, kind of like eBay. I'm trying to log in programmatically so I can do some account management from a script. I turned on Fiddler and logged in like normal. I captured all the headers, content, etc., then carefully attempted to build those same headers and content in PHP and submit the form with my login credentials. Unfortunately it was an epic fail. The curl_exec call sits and spins forever and never returns. If I kill the browser window executing my PHP code (locally), the response flushes to a text file. When I examine the file, all I see is gobbledygook, like what you see when unprintable characters are printed.

Before submitting my username/password via programmatic POST, I curl the login page and obtain a session ID from the cookie header. I grab this and place it in the header going back to the server, which seems to me to simulate what the server wants. Here's my function:

Code: Select all

function Login($email, $password, $sessionID){
	$url = "http://www.ksl.com/public/member/signin";
	$post_data["member[email]"] = $email;  
	$post_data["member[password]"] = $password;  
	
	$headerArray = array(
		"Accept: image/jpeg, application/x-ms-application, image/gif, application/xaml+xml, image/pjpeg, application/x-ms-xbap, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, */*",
        "Referer: http://www.ksl.com/public/member/signin",
		"Accept-Language: en-US",
		"User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; SLCC2;)",
		"Content-Type: multipart/form-data; boundary=---------------------------7db29d24150a56",
		"Accept-Encoding: gzip, deflate",
		"Host: www.ksl.com",
		"Content-Length: 682",
		"Connection: Keep-Alive",
		"Pragma: no-cache",
		"Cookie: PHPSESSID=" . $sessionID . "; s_cc=true; s_sq=%5B%5BB%5D%5D; s_vi=[CS]v1|2730AD6F85011AD9-6000010FC023FD31[CE]; _chartbeat2=jsxshjtoqwh4xhbe"
    );

	//traverse array and prepare data for posting (key1=value1)  
	foreach ($post_data as $key => $value) {  
		$post_items[] = $key . '=' . $value;  
	}  

	//create the final string to be posted using implode()  
	$post_string = implode('&', $post_items);  

	//create cURL connection  
	$curl_connection = curl_init($url);  

	//set options  
	curl_setopt($curl_connection, CURLOPT_CONNECTTIMEOUT, 30);
	curl_setopt($curl_connection, CURLOPT_USERAGENT, $userAgent);
	curl_setopt($curl_connection, CURLOPT_RETURNTRANSFER, true);
	curl_setopt($curl_connection, CURLOPT_SSL_VERIFYPEER, false);
	curl_setopt($curl_connection, CURLOPT_FOLLOWLOCATION, 0);
	curl_setopt($curl_connection, CURLOPT_HTTPHEADER, $headerArray);
	curl_setopt($curl_connection, CURLOPT_POSTFIELDS, $post_string);

	//execute post  
	$result = curl_exec($curl_connection);

	WriteFile("KSL_login.txt",$result . "\n\rcurl_getinfo: " . curl_getinfo($curl_connection) . "\n\rError: " . curl_error($curl_connection));

	//close the connection  
	curl_close($curl_connection);  

	return $result;
}
I must be doing some things wrong. Any ideas?
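One thing that stands out in the function above: the body is built as key=value pairs joined with '&', yet the headers declare multipart/form-data and a fixed Content-Length of 682 copied from the Fiddler capture. If the server waits for bytes that never arrive, that could plausibly explain the hang. A minimal sketch, assuming the form accepts an ordinary URL-encoded POST (the helper name buildLoginBody is made up for illustration), lets PHP encode the fields and leaves Content-Type and Content-Length for curl to compute:

```php
<?php
// Hypothetical helper: URL-encodes the login fields instead of
// concatenating raw key=value pairs by hand.
function buildLoginBody(string $email, string $password): string
{
    return http_build_query([
        'member[email]'    => $email,
        'member[password]' => $password,
    ]);
}

// Special characters such as "@", "[" and "&" are percent-encoded.
echo buildLoginBody('user@example.com', 'p&ss'), "\n";
```

The resulting string can be passed to CURLOPT_POSTFIELDS as-is, with the hand-written Content-Type and Content-Length lines removed from the header array.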
Eric!
DevNet Resident
Posts: 1146
Joined: Sun Jun 14, 2009 3:13 pm

Post by Eric! »

Try adding this to your code so you can see what problems cURL is running into.

Code: Select all

$debug=fopen("debug.txt","w"); // this will give you insight into what cURL is doing
curl_setopt($curl_connection, CURLOPT_STDERR, $debug);
curl_setopt($curl_connection, CURLOPT_VERBOSE, TRUE);
Also set up a cookie jar so you can see what's going in and out:

Code: Select all

$cookies = 'cookiejar.txt';
curl_setopt($curl_connection, CURLOPT_COOKIEFILE, $cookies);
curl_setopt($curl_connection, CURLOPT_COOKIEJAR, $cookies);
You also might need to set FOLLOWLOCATION to true. Also I don't see where $userAgent is defined. And I would trim down the headers quite a bit while experimenting because curl handles a lot of that already. Try:

Code: Select all

	$headerArray = array(
		"Accept: image/jpeg, application/x-ms-application, image/gif, application/xaml+xml, image/pjpeg, application/x-ms-xbap, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, */*",
		"Referer: http://www.ksl.com/public/member/signin",
		"Accept-Language: en-US",
		"Accept-Encoding: gzip, deflate",
		"Host: www.ksl.com",
		"Connection: Keep-Alive",
		"Pragma: no-cache"
	);
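A possible explanation for the unreadable output mentioned in the first post: the header array asks for "Accept-Encoding: gzip, deflate", but a hand-written header does not tell curl to decompress the reply, so the body may arrive still compressed. A small sketch, assuming the zlib extension is available (looksGzipped is an illustrative name, not part of any library), checks a response for the gzip magic bytes:

```php
<?php
// Hypothetical check: gzip streams start with the bytes 0x1f 0x8b.
function looksGzipped(string $body): bool
{
    return strncmp($body, "\x1f\x8b", 2) === 0;
}

$html       = '<html><body>Signed in</body></html>';
$compressed = gzencode($html);   // stands in for a gzip-encoded reply

var_dump(looksGzipped($compressed));        // the compressed bytes
var_dump(looksGzipped($html));              // the plain markup
var_dump(gzdecode($compressed) === $html);  // round trip restores the page
```

If that turns out to be the cause, dropping the Accept-Encoding line from the header array and setting CURLOPT_ENCODING to an empty string asks curl to negotiate and decode the compression itself.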

Post by SweatCoder »

Eric,

Thanks for your reply to my post. Sorry it took so long for me to come back, but I never got an email saying someone had replied to my question, so I gave up for a while.

Anyway, I tried what you said. The debug file gets written but it's empty. No cookie file gets written at all. Still no luck.

I think I'm really close, because curl itself works and I get a "screenscrape" page response back, BUT the site is not logging me in. I'm missing something. If I wanted to pay someone $100 to get this script working for me, any ideas where I could go? It wouldn't be building something from scratch, just taking what I've got and tweaking it until it works.

Any ideas?

Thanks!

Post by Eric! »

I've never seen a case where the file is created but nothing appears. Did you fclose() the file? Try changing the permissions on those two files. Make sure the verbose option is set to true. Without the debug output working it is very hard to see what is failing with curl. If you can't get it to work, repost the code you have now so we can see it.
Last edited by Eric! on Thu Oct 13, 2011 6:37 pm, edited 1 time in total.

Post by SweatCoder »

Permissions are wide open, still no dice; the files still get created, but empty. I want to make it clear that curl is working fine, and I'm getting back content from the site; it's just that I'm not getting successfully logged in, and that's what I'm trying to figure out.

Here is my full code (feel free to log in with my test credentials):

Code: Select all

<?php 
$url = "http://www.ksl.com/public/member/signin";
$usn = "info@zerogravpro.com";
$pwd = "testuser1";

$sessionID = GetSessionID($url) . "<br><br>";
Login($url, $usn, $pwd, $sessionID);

echo download_page("http://www.ksl.com/index.php?nid=13", $sessionID);

function download_page($url2, $sessionID){         
	$debug=fopen("KSL_debug2.txt","w"); // this will give you insight into what cURL is doing
	$cookies = 'KSL_cookiejar.txt';

	$ch = curl_init();
    curl_setopt($ch, CURLOPT_URL,$url2);         
	curl_setopt($ch, CURLOPT_FAILONERROR,1);         
	curl_setopt($ch, CURLOPT_FOLLOWLOCATION,1);         
	curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);    
	curl_setopt($ch, CURLOPT_COOKIE,$sessionID);	
	curl_setopt($ch, CURLOPT_TIMEOUT, 15);   
	curl_setopt($curl_connection, CURLOPT_STDERR, $debug);
	curl_setopt($curl_connection, CURLOPT_VERBOSE, TRUE);
	curl_setopt($curl_connection, CURLOPT_COOKIEFILE, $cookies);
	curl_setopt($curl_connection, CURLOPT_COOKIEJAR, $cookies);

	$retValue = curl_exec($ch);                               
	curl_close($ch);         
	return $retValue; 
}  

function Login($url, $email, $password, $sessionID){
	$debug=fopen("KSL_debug1.txt","w"); // this will give you insight into what cURL is doing
	$cookies = 'KSL_cookiejar.txt';

	//set POST variables
	$post_data = array(
		'MAX_FILE_SIZE' => '50000000',
		'dado_form_3' => '1',
		'member[email]' => $email,
		'member[password]' => $password,
		'x' => '46',
		'y' => '8'
	);

	//url-ify the data for the POST
	foreach($post_data as $key=>$value) { 
		$fields_string .= $key.'='.$value.'&'; 
	}
	rtrim($fields_string,'&');

	//open connection
	$ch = curl_init();

	//set the url, number of POST vars, POST data
	curl_setopt($ch,CURLOPT_URL,$url);
	curl_setopt($ch,CURLOPT_POST,count($post_data));
	curl_setopt($ch,CURLOPT_POSTFIELDS,$fields_string);
	
	//remove below?
	curl_setopt($ch, CURLOPT_FOLLOWLOCATION,1);         
	curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);
	curl_setopt($ch, CURLOPT_COOKIE,$sessionID);
	curl_setopt($curl_connection, CURLOPT_STDERR, $debug);
	curl_setopt($curl_connection, CURLOPT_VERBOSE, TRUE);
	curl_setopt($curl_connection, CURLOPT_COOKIEFILE, $cookies);
	curl_setopt($curl_connection, CURLOPT_COOKIEJAR, $cookies);
	
	//execute post
	$result = curl_exec($ch);

	//close connection
	curl_close($ch);

	return $result;
}


function GetSessionID($url){
	$ch = curl_init();
	curl_setopt($ch, CURLOPT_URL, $url);
	curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 0);
	curl_setopt($ch,CURLOPT_RETURNTRANSFER,1);
	curl_setopt($ch, CURLOPT_HEADER, 1);
	$buffer = curl_exec($ch);
	$curl_info = curl_getinfo($ch);
	curl_close($ch);
	$header_size = $curl_info[header_size];
	$headers = substr($buffer, 0, $header_size);
	$startPos = strpos($headers, "PHPSESSID=") + 10;
	$endPos = strpos($headers, ";", $startPos);
	return substr($headers, $startPos, $endPos-$startPos);
}

function WriteFile($filename, $contents){
	$fh = fopen($filename, 'w') or die("can't open file");
	fwrite($fh, $contents);
	fclose($fh);
}

function GetRequestHeaders() {
	foreach($_SERVER as $h=>$v){
		if(ereg('HTTP_(.+)',$h,$hp)){
			$headers[$hp[1]]=$v;
		}
	}
	return $headers;
}

function OutputArrayValues($array){
	while (list($key, $value) = each($array))
	{
		if ("" != $value)
		{
			echo "$key = $value<br>";
		}
	}
}

?>
Last edited by Benjamin on Thu Oct 13, 2011 6:44 pm, edited 1 time in total.
Reason: Added [syntax=php|sql|css|javascript] and/or [text] tags.

Post by Eric! »

You need to change $curl_connection to $ch to match the handle returned by your curl_init().

You also need an fclose() when you're done.

Also you need to put the curl debug lines (fopen, debug, verbose, cookies, etc) into your other functions too. Your functions should share the same cookie file. The debug.txt can be a different filename so you can see what is going on each time you call curl.

FYI -- it is a lot easier to read code if you remember to use the [code] tags.
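Putting the suggestions so far together, a minimal sketch of a single request with verbose logging and a shared cookie jar might look like this. The function names, file names, and example URL are placeholders rather than anything from the original script:

```php
<?php
// Hypothetical helper: the debugging options discussed in this thread.
function debugOptions($debugHandle, string $cookieFile): array
{
    return [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_VERBOSE        => true,
        CURLOPT_STDERR         => $debugHandle,  // verbose log goes here
        CURLOPT_COOKIEFILE     => $cookieFile,   // cookies read from here
        CURLOPT_COOKIEJAR      => $cookieFile,   // cookies saved here
    ];
}

function fetchWithDebug(string $url, string $debugPath, string $cookieFile)
{
    $debug = fopen($debugPath, 'w');
    if ($debug === false) {
        throw new RuntimeException("Cannot open $debugPath for writing");
    }

    // Every option is set on the same handle that curl_init() returned.
    $ch = curl_init($url);
    curl_setopt_array($ch, debugOptions($debug, $cookieFile));
    $body = curl_exec($ch);

    curl_close($ch);   // the cookie jar is written out when the handle is released
    fclose($debug);    // flush and close the verbose log
    return $body;
}

// Usage (performs a real HTTP request):
// $html = fetchWithDebug('http://www.example.com/', 'debug.txt', 'cookiejar.txt');
```

Using a different debug file per call while keeping one cookie file matches the advice above: separate logs per request, one shared session.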

Post by SweatCoder »

Eric,

Sorry about my mistakes. Here are the contents of the cookie jar:

# Netscape HTTP Cookie File
# http://curl.haxx.se/rfc/cookie_spec.html
# This file was generated by libcurl! Edit at your own risk.

.ksl.com TRUE / FALSE 1319158753 PHPSESSID 3c43dd202150fde1c9ced95b986ded54


And here are the contents of KSL_debug1.txt:

* About to connect() to www.ksl.com port 80 (#0)
* Trying 64.147.130.20... * connected
* Connected to www.ksl.com (64.147.130.20) port 80 (#0)
> POST /public/member/signin HTTP/1.1
Host: www.ksl.com
Accept: */*
Cookie: bf87bea0a69386e6a75b7085a85b310e
Content-Length: 106
Content-Type: application/x-www-form-urlencoded

< HTTP/1.1 302 Found
< Date: Fri, 14 Oct 2011 00:59:13 GMT
< Server: Apache
* Added cookie PHPSESSID="3c43dd202150fde1c9ced95b986ded54" for domain ksl.com, path /, expire 1319158753
< Set-Cookie: PHPSESSID=3c43dd202150fde1c9ced95b986ded54; expires=Fri, 21 Oct 2011 00:59:13 GMT; path=/; domain=.ksl.com
< Expires: Thu, 19 Nov 1981 08:52:00 GMT
< Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
< Pragma: no-cache
< X-CMS-Server: m20
< Location: /public/member/home
< X-Server: m20
< Content-Length: 0
< Content-Type: text/html
<
* Connection #0 to host www.ksl.com left intact
* Issue another request to this URL: 'http://www.ksl.com/public/member/home'
* Violate RFC 2616/10.3.3 and switch from POST to GET
* Re-using existing connection! (#0) with host www.ksl.com
* Connected to www.ksl.com (64.147.130.20) port 80 (#0)
> GET /public/member/home HTTP/1.1
Host: www.ksl.com
Accept: */*
Cookie: PHPSESSID=3c43dd202150fde1c9ced95b986ded54; bf87bea0a69386e6a75b7085a85b310e

< HTTP/1.1 200 OK
< Date: Fri, 14 Oct 2011 00:59:13 GMT
< Server: Apache
< Expires: Thu, 19 Nov 1981 08:52:00 GMT
< Cache-Control: no-cache
< Pragma: no-cache
< X-CMS-Server: m20
< X-CMS-State: active
< X-CMS-Collection: slc
< X-CMS-CRMSet: ksl
< X-CMS-nid: 578
< X-CMS-sid: 3288922
< X-CMS-tid: 103
< X-CMS-stage: http://stage-v2.ksl.com
< X-CMS-live: http://www.ksl.com
< X-Server: m20
< Transfer-Encoding: chunked
< Content-Type: text/html
<
* Connection #0 to host www.ksl.com left intact
* Closing connection #0


And here are the contents of KSL_debug2.txt:

* About to connect() to www.ksl.com port 80 (#0)
* Trying 64.147.130.20... * connected
* Connected to www.ksl.com (64.147.130.20) port 80 (#0)
> GET /index.php?nid=13 HTTP/1.1
Host: www.ksl.com
Accept: */*
Cookie: PHPSESSID=3c43dd202150fde1c9ced95b986ded54; bf87bea0a69386e6a75b7085a85b310e

< HTTP/1.1 200 OK
< Date: Fri, 14 Oct 2011 00:59:14 GMT
< Server: Apache
< Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
< Pragma: no-cache
< Expires: Thu, 19 Nov 1981 08:52:00 GMT
< X-CMS-Server: m28
< X-CMS-State: active
< X-CMS-Collection: slc
< X-CMS-CRMSet: ksl
< X-CMS-nid: 47
< X-CMS-sid: 65207
< X-CMS-tid: 103
< X-CMS-stage: http://stage-v2.ksl.com
< X-CMS-live: http://www.ksl.com
< X-Server: m28
< Transfer-Encoding: chunked
< Content-Type: text/html
<
* Connection #0 to host www.ksl.com left intact
* Closing connection #0

Post by Eric! »

Sorry, I don't have much time right now. It looks like you are ending up with two different session IDs. I suggest you try it without fetching the session ID yourself and forcing it. Just let curl handle the cookies and the session ID.

Also you seem to have a problem with posting the data to the login:
* Issue another request to this URL: 'http://www.ksl.com/public/member/home'
* Violate RFC 2616/10.3.3 and switch from POST to GET
This is because it is getting a 302 redirect. So try adding:
curl_setopt($ch, CURLOPT_POSTREDIR, 3);
after your
curl_setopt($ch, CURLOPT_FOLLOWLOCATION,1);

Edit: SORRY! It appears your PHP curl build may not have the CURLOPT_POSTREDIR option (it was only added in PHP 5.3.2), so I'll have to search the internet some more for your solution. Anyway, try fixing the session ID stuff first. Probably the best thing is for you to figure out the real URL the form data is posted to and use that as your login URL. You can use a Firefox extension like Tamper Data to see what, how, why and where form data is getting submitted, then duplicate that with your curl code.
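A rough sketch of the "let curl handle it" approach: one handle with a cookie jar fetches the sign-in page, then posts the form, with no CURLOPT_COOKIE line anywhere. The function names are invented for illustration, and the field names are simply copied from the script above; whether the site needs more than these is an assumption.

```php
<?php
// Hypothetical helper: options for the form POST. curl derives the
// Content-Type and Content-Length from the encoded body.
function postOptions(string $url, array $fields): array
{
    return [
        CURLOPT_URL        => $url,
        CURLOPT_POST       => true,
        CURLOPT_POSTFIELDS => http_build_query($fields),
    ];
}

function signIn(string $url, array $fields, string $cookieFile)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_COOKIEFILE     => $cookieFile, // turns on curl's cookie engine
        CURLOPT_COOKIEJAR      => $cookieFile, // saved when the handle is released
    ]);

    curl_exec($ch);                                      // GET: server sets PHPSESSID
    curl_setopt_array($ch, postOptions($url, $fields));
    $page = curl_exec($ch);                              // POST: curl resends the cookie

    curl_close($ch);
    return $page;
}

// Usage (performs real HTTP requests):
// $page = signIn('http://www.ksl.com/public/member/signin',
//     ['member[email]' => $email, 'member[password]' => $password],
//     'cookiejar.txt');
```

Because both requests go through the same handle, whatever session cookie the server issues on the first request is the one sent with the login POST.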

Post by SweatCoder »

Eric, you're a genius. Removing the cookie/sessionid line caused everything to start working. I am mystified by this, because I didn't realize that curl is so proactive about keeping a session alive and exchanging the correct headers in order to do so. I assumed I had to tell it everything I wanted done.

Anyway, thanks!
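For anyone landing here later, a likely reason the forced cookie got in the way: the debug log shows the request sending "Cookie: bf87bea0a69386e6a75b7085a85b310e" with no cookie name, while the server issued a different PHPSESSID. CURLOPT_COOKIE takes name=value pairs, so a bare ID is not read as a session cookie. If the ID ever does have to be set by hand, a sketch along these lines (helper names are hypothetical) extracts it from the raw response headers and formats it properly:

```php
<?php
// Hypothetical helper: pulls one cookie value out of raw response headers.
function extractCookie(string $headers, string $name): ?string
{
    $pattern = '/^Set-Cookie:\s*' . preg_quote($name, '/') . '=([^;\r\n]*)/mi';
    return preg_match($pattern, $headers, $m) === 1 ? $m[1] : null;
}

// Hypothetical helper: formats the value the way CURLOPT_COOKIE expects.
function cookieHeaderValue(string $name, string $value): string
{
    return $name . '=' . $value;
}

// Sample headers modelled on the debug log earlier in the thread.
$headers = "HTTP/1.1 302 Found\r\n"
         . "Set-Cookie: PHPSESSID=3c43dd202150fde1c9ced95b986ded54; path=/; domain=.ksl.com\r\n"
         . "Location: /public/member/home\r\n\r\n";

$id = extractCookie($headers, 'PHPSESSID');
echo cookieHeaderValue('PHPSESSID', $id), "\n";
```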