curl cookie data missing

Posted: Fri Jun 17, 2011 12:10 am
by Eric!
Is there a way to take a closer look at the cookie data that curl is gathering? I'm using verbose to see the headers and the jarfile/cookiefile settings but it seems that curl is missing some cookie data.

For example, when I submit a form and inspect the request with Firefox and Tamper Data, it shows the cookie as:
[text]JSESSIONID=5773B26C263F8FD970E3CB1EE0489952; CASTGC=TGT-357-byQlDfXFUQdyHteWj5udStbFLSFFUCwud9spceumfUDrYtsSjl-cas; s_lv=1308262427562; s_nr=1308260076243; s_vnum=1310849735520%26vn%3D2; BIGipServerprovision_http_pool=2646695178.36895.0000; s_sq=%5B%5BB%5D%5D; s_evar4=Weekday; s_evar3=Thursday; s_evar2=6%3A00PM; s_cc=true
[/text]
However, with curl, the cookie file for the same page shows only:
[text]domain.com FALSE /cas FALSE 0 JSESSIONID 7FE9E3BBF06D63E95652A22B19AD0EF1
domain.com FALSE / FALSE 0 BIGipServerprovision_http_pool 2025938186.36895.0000[/text]

Is there a way to see all the raw header data and try to parse some of this information? I don't see any javascript that could be adding to the cookie info which curl would miss. What else could I try?

Re: curl cookie data missing

Posted: Fri Jun 17, 2011 1:50 pm
by McInfo
The key:values you see in cURL's cookie jar/file are all the cookies that cURL knows about. The difference between what cURL knows and what Firefox knows is related to what resources each has accessed. While your Firefox has been sociable, the cURL script has focused on just one or two resources.
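
To compare what cURL knows against what the browser sends, it can help to load the jar into an array. Below is a minimal sketch, assuming the jar uses the Netscape cookie file layout (seven tab-separated fields per line; the forum display above has flattened the tabs to spaces). The function name parseCookieJar is hypothetical, not part of cURL or PHP.

```php
<?php
// Hypothetical helper: parse Netscape-format cookie jar text into a name => value map.
function parseCookieJar(string $jarText): array
{
    $cookies = [];
    foreach (preg_split('/\R/', $jarText) as $line) {
        $line = trim($line);
        // cURL marks HttpOnly cookies with a "#HttpOnly_" prefix; other "#" lines are comments.
        if (strpos($line, '#HttpOnly_') === 0) {
            $line = substr($line, strlen('#HttpOnly_'));
        } elseif ($line === '' || $line[0] === '#') {
            continue;
        }
        // Fields: domain, subdomain flag, path, secure flag, expiry, name, value.
        $fields = explode("\t", $line, 7);
        if (count($fields) < 6) {
            continue; // not a cookie line
        }
        $cookies[$fields[5]] = $fields[6] ?? '';
    }
    return $cookies;
}

// Sample data taken from the jar contents quoted in the first post.
$jar = "domain.com\tFALSE\t/cas\tFALSE\t0\tJSESSIONID\t7FE9E3BBF06D63E95652A22B19AD0EF1\n"
     . "domain.com\tFALSE\t/\tFALSE\t0\tBIGipServerprovision_http_pool\t2025938186.36895.0000\n";
print_r(parseCookieJar($jar));
```

Diffing the keys of this array against the names in the browser's Cookie header shows which cookies the script never received.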

Re: curl cookie data missing

Posted: Sat Jun 18, 2011 10:47 am
by Eric!
Any recommendations of a technique to view the full headers or how to capture the full cookie data so I can feed them to cURL? All cookies have to be set via HTTP headers, right?

At least that is what I get from the RFCs:
http://tools.ietf.org/html/rfc2109
http://tools.ietf.org/html/rfc2965

Edit: I did come across a JavaScript method using window.name: http://www.thomasfrank.se/sessionvars.html that sets session variables, but would Tamper Data pick this up as cookie data? I don't think so...?

Re: curl cookie data missing

Posted: Sat Jun 18, 2011 3:18 pm
by McInfo
Eric! wrote: Any recommendations of a technique to view the full headers or how to capture the full cookie data so I can feed them to cURL?
Use network-sniffing software like Wireshark to see the exact headers that cURL (or Firefox) is sending (unless the site uses HTTPS).

You can prime the Cookie header by setting the CURLOPT_COOKIE option or by having cURL maintain its cookie jar as it visits all of the locations you would normally visit when using a browser. Keep in mind that even one-pixel images trigger requests and some sites use them for tracking. If you want just the head (for the cookie) and not the body of a response, enable CURLOPT_NOBODY to send a HEAD request instead of GET or POST.

You can use Firefox to simulate cURL by disabling images and JavaScript before browsing the target site. Also clear your cookies to make it easier to identify when they are set. Of course, monitor the requests with Tamper Data. If the site works as intended under those conditions, it should be straightforward to navigate the site with cURL. If not, you will have to investigate further to determine what effect the images or scripts have.
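
The CURLOPT_COOKIE priming described above can be sketched as follows. This is only an illustration: buildCookieHeader is a hypothetical helper, and the values are assumed to be copied exactly as the browser sent them (already percent-encoded where needed), so nothing is re-encoded here.

```php
<?php
// Hypothetical helper: join name => value pairs into a Cookie header value.
function buildCookieHeader(array $cookies): string
{
    $pairs = [];
    foreach ($cookies as $name => $value) {
        $pairs[] = $name . '=' . $value;
    }
    // Pairs in a Cookie header are separated by "; ".
    return implode('; ', $pairs);
}

// Two of the cookies Tamper Data showed in the first post.
$cookieHeader = buildCookieHeader([
    'JSESSIONID' => '5773B26C263F8FD970E3CB1EE0489952',
    's_cc'       => 'true',
]);
echo $cookieHeader, "\n";

// With an existing cURL handle (here called $process, as in this thread):
// curl_setopt($process, CURLOPT_COOKIE, $cookieHeader);
```

Cookies passed this way are sent in addition to whatever cURL has already collected in its own cookie engine, so this is mainly useful for values that only JavaScript would have set.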
Eric! wrote: All cookies have to be set via HTTP headers, right?
As far as I know, HTTP headers are the only standard mechanism for transmitting cookie data: the server sets cookies with Set-Cookie response headers, and the client returns them in the Cookie request header. However, cookie data can also be manipulated by JavaScript. In that case, the browser will send the modified cookie with its next request.
Eric! wrote: I did come across a JavaScript method using window.name: http://www.thomasfrank.se/sessionvars.html that sets session variables, but would Tamper Data pick this up as cookie data? I don't think so...?
The method of maintaining state whereby a string is stored in the window.name property bypasses cookies entirely, so I think it is unrelated to your original issue.

Re: curl cookie data missing

Posted: Sun Jun 19, 2011 2:58 pm
by Eric!
The packets are encrypted (https). Also it appears the javascript is messing with the cookie data somewhere during the login. I can't think of any way around that. Perhaps I could take a look at what the javascript is doing and simulate that in php including any ajax calls for data. I'll work on it for a while and see how stuck I get. Thanks for the ideas so far.

Re: curl cookie data missing

Posted: Sun Jun 19, 2011 7:37 pm
by Eric!
I found it difficult to access the cookie jar while a connection was open, even for reading. I'm not sure why this happens but no file can be found. So I used the CURLOPT_HEADER to get the headers and parse the tokens for building urls and priming the pseudo-javascript.
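
A likely reason the jar cannot be read mid-session is that libcurl writes the CURLOPT_COOKIEJAR file only when the handle is closed or cleaned up. The header-parsing workaround described above can be sketched like this. The helper name extractSetCookies and the sample cookie values AAA111 and BBB222 are made up for illustration; the input is assumed to be the string returned by curl_exec() with CURLOPT_HEADER and CURLOPT_RETURNTRANSFER enabled, which contains one header block per response when redirects are followed.

```php
<?php
// Hypothetical helper: collect cookies from the Set-Cookie lines of a raw response.
function extractSetCookies(string $rawResponse): array
{
    $cookies = [];
    // Capture the cookie name and its value up to the first ";" or end of line.
    $pattern = '/^Set-Cookie:[ \t]*([^=;\s]+)=([^;\r\n]*)/mi';
    if (preg_match_all($pattern, $rawResponse, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            // A later Set-Cookie for the same name replaces the earlier value.
            $cookies[$m[1]] = $m[2];
        }
    }
    return $cookies;
}

// Made-up sample: a redirect followed by the final response.
$raw = "HTTP/1.1 302 Found\r\n"
     . "Set-Cookie: JSESSIONID=AAA111; Path=/cas\r\n"
     . "Location: https://sub.domain.com/cas/login\r\n\r\n"
     . "HTTP/1.1 200 OK\r\n"
     . "Set-Cookie: JSESSIONID=BBB222; Path=/cas; Secure\r\n"
     . "Set-Cookie: BIGipServerprovision_http_pool=2025938186.36895.0000; path=/\r\n\r\n";
print_r(extractSetCookies($raw));
```

This ignores cookie attributes such as Path and Domain, so it is only suitable for pulling out tokens, not for deciding which cookies to send where.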

So I have it working to the point where I can log in. However when I make a query, which also seems to be picking up most of the cookie tokens I need, I get the following response which I don't understand:
[TEXT]< HTTP/1.1 200 OK
[junk snipped]
< Keep-Alive: timeout=40, max=100
< Connection: Keep-Alive
< Content-Type: text/html; charset=ISO-8859-1
* no chunk, no close, no size. Assume close to signal end[/TEXT]
I don't know what this means for a response. Is there still some security token missing and the server just ignoring the request?

Re: curl cookie data missing

Posted: Sun Jun 19, 2011 10:36 pm
by McInfo
The server responded with a 200 OK status, so it didn't ignore your request; but the absence of a blank line after the last header line and the presence of that "no...end" message seem to indicate that you received only part of the response before the connection timed out.

Can you post some code?

Re: curl cookie data missing

Posted: Mon Jun 20, 2011 9:18 am
by Eric!
I'm sorry the code is a bit of a mess as I have been hacking at it -- adding options at random while exploring curl's features. Maybe you can see something there that I am missing.

A couple of odd things: if the cURL STDERR file is opened in mode "w", it seems to overwrite itself when the connection switches from http to https. Also, I can't access the cookie file in any way while the connection is open, so I had to request the headers and parse them to get at some of the data. I removed the JavaScript stuff that fetches a few tokens with Ajax because it doesn't really matter for this problem. The very last call to getcurl is the one where I get no chunks or data in response. This site is pretty Ajax-heavy, so I'm not expecting much other than some HTML with code I have to parse to fetch more data. The site also doesn't handle malformed queries well, but it usually sends some kind of HTML response.

There are a lot of 302 redirects that curl chases down fantastically. Perhaps the key to my problem is that I can't get curl to consistently switch between HEAD and GET requests. By the time curl closes $process there are 3 connections to shut down and maybe they all aren't configured the same way due to the redirects where it is opening other connections. I don't know.

I didn't post the debug.txt log file because it would take a while to sanitize, but I can if you want to see that too.

Code:

$postdata = array(
    "username" => "username",
    "password" => "password",
    "rememberMe" => "true",
    "lt" => "e1s1",
    "_eventId" => "submit",
    "submit" => "Login"
);
// Build the URL-encoded POST body; http_build_query encodes keys and values and adds no trailing "&"
$postfield = http_build_query($postdata);

unlink("debug.txt");  //remove previous debug file
$debug=fopen("debug.txt","a"); // has to be append or cURL will overwrite
$cookie_file = 'cookiejar.txt';
$process = curl_init();
clrcookie($process); // reset cookie session
$url = 'http://domain.com/service-portal';
$results=(getcurl($process,$url,1, $debug,$cookie_file,"1"));
//echo nl2br($results);
// get id from header
preg_match_all("/JSESSIONID=(.*?);/ms", $results, $match);
$ssid=trim($match[1][count($match[1])-1]); //get last or only id
//echo "<br>ssid=".$ssid."<br>";
$results=getcurl($process,"https://sub.domain.com/cas/login?service=http%3A%2F%2Fdomain.com%2Fservice-portal%2Fhome",1, $debug,$cookie_file,"1");
//echo nl2br($results);
//copy($cookie_file,'cookie.txt'); // tried to make local copy for access to cookies...not working

// REMOVED JAVASCRIPT stuff to generate parts of next URL to login
$url = "https://sub.domain.com/cas/login;jsessionid=" . $ssid . "?service=http%3A%2F%2Fdomain.com%2Fservice-portal%2Fhome";
$results=post($process,$url, $postfield, $debug,$cookie_file,"https://sub.domain.com/cas/login?service=http%3A%2F%2Fdomain.com%2Fservice-portal%2Fhome");

// LOGIN successful, fetch a page
$url="http://sub2.domain.com/sold/listing/cache/sb_sold_results.jsp?slim=sold&cit=true&sm=3&searchPage=%2Fsold%2Flisting%2Fcache%2Fsb_search_page2.jsp&robw=&is=&type=&fromSDint=&fromSDmon=1&fromSDyr=&toSDmon=12&toSDyr=&fromLength=35&toLength=35&luom=126&fromYear=2001&toYear=2006&fromPrice=&toPrice=&currencyid=100&hmid=0&ftid=0&enid=0&city=&cint=&spid=&rid=&ywbid=&ps=30&bn=&includeSoldComments=";
$results=(getcurl($process,$url,0, $debug,$cookie_file,"1"));
echo $results;

curl_close($process);

function clrcookie($process){
    curl_setopt($process,CURLOPT_COOKIESESSION,1);
}

function getcurl($process,$url, $head_only=1,$debug=NULL, $cookies=NULL, $refer="1") {
    $headers[] = "Accept[text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8]";
    $headers[] = "Accept-Charset[ISO-8859-1,utf-8;q=0.7,*;q=0.7]";
    $headers[] = "Keep-Alive[115]";
    $headers[] = "Connection: Keep-Alive";
    $headers[] = "Content-type: application/x-www-form-urlencoded";
    $user_agent = "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.17) Gecko/20110422 Ubuntu/10.10 (maverick) Firefox/3.6.17";
    curl_setopt($process, CURLOPT_FOLLOWLOCATION, 20);
    curl_setopt($process, CURLOPT_NOBODY,1); // only headers
    if ($refer == "1")
        curl_setopt($process, CURLOPT_AUTOREFERER, 1);
    else
        curl_setopt($process, CURLOPT_REFERER, $refer);
    curl_setopt($process, CURLOPT_URL, $url);
    curl_setopt($process, CURLOPT_HTTPHEADER, $headers);
    curl_setopt($process, CURLOPT_USERAGENT, $user_agent);
    if ($cookies != NULL) {
        curl_setopt($process, CURLOPT_COOKIEFILE, $cookies);
        curl_setopt($process, CURLOPT_COOKIEJAR, $cookies);
    }
    curl_setopt($process, CURLOPT_TIMEOUT, 30);
    curl_setopt($process, CURLOPT_RETURNTRANSFER, 1);
    if ($debug != NULL) {
        curl_setopt($process, CURLOPT_STDERR, $debug);
        curl_setopt($process, CURLOPT_VERBOSE, TRUE);
    }
    if($head_only==1) curl_setopt($process, CURLOPT_HEADER, true); // header will be at output
    else
        curl_setopt($process, CURLOPT_HEADER, false);
    $return = curl_exec($process);
    return $return;
}

function post($process,$url, $data, $debug=NULL,$cookies=NULL,$refer="1") {
    curl_setopt($process,CURLOPT_URL,$url);
    $headers[] = "Accept[text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8]";
    $headers[] = "Connection: Keep-Alive";
    $headers[] = "Content-type: application/x-www-form-urlencoded";
    $user_agent = "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.17) Gecko/20110422 Ubuntu/10.10 (maverick) Firefox/3.6.17";
    curl_setopt($process, CURLOPT_HEADER, FALSE); // response headers are not included in the output
    curl_setopt($process, CURLOPT_FOLLOWLOCATION, 20);
    if ($refer == "1")
        curl_setopt($process, CURLOPT_AUTOREFERER, 1);
    else
    {
        curl_setopt($process, CURLOPT_AUTOREFERER, 0);
        curl_setopt($process, CURLOPT_REFERER, $refer);
    }
    curl_setopt($process, CURLOPT_HTTPHEADER, $headers);
    curl_setopt($process, CURLOPT_USERAGENT, $user_agent);
    if ($cookies != NULL) {
        curl_setopt($process, CURLOPT_COOKIEFILE, $cookies);
        curl_setopt($process, CURLOPT_COOKIEJAR, $cookies);
    }
    curl_setopt($process, CURLOPT_TIMEOUT, 30);
    curl_setopt($process, CURLOPT_RETURNTRANSFER, 1);
    if ($debug != NULL) {
        curl_setopt($process, CURLOPT_STDERR, $debug);
        curl_setopt($process, CURLOPT_VERBOSE, TRUE);
    }
    $return = curl_exec($process);
    curl_setopt($process, CURLOPT_POSTFIELDS, $data);
    curl_setopt($process, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($process, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($process, CURLOPT_POST, 1);
    if ($debug != NULL) {
        curl_setopt($process, CURLOPT_STDERR, $debug);
        curl_setopt($process, CURLOPT_VERBOSE, 1);
    }
    $return = curl_exec($process);
    return $return;
}

Re: curl cookie data missing

Posted: Mon Jun 20, 2011 10:49 am
by McInfo
I'm still examining the code, but the syntax of some of the custom headers popped out at me. I'll use this one as an example because it is short:

Code:

Keep-Alive[115]
The correct syntax is

Code:

Keep-Alive: 115
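
If many headers were copied from Tamper Data in the bracketed form, they can be converted in bulk rather than by hand. A minimal sketch, assuming every copied header looks like Name[value]; the function name normalizeHeader is hypothetical.

```php
<?php
// Hypothetical helper: turn "Name[value]" into "Name: value"; leave other strings alone.
function normalizeHeader(string $header): string
{
    // Header name, then "[", then everything up to the final "]".
    if (preg_match('/^([A-Za-z0-9-]+)\[(.*)\]$/s', $header, $m)) {
        return $m[1] . ': ' . $m[2];
    }
    return $header;
}

$headers = array_map('normalizeHeader', [
    'Accept[text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8]',
    'Keep-Alive[115]',
    'Connection: Keep-Alive', // already in the correct form
]);
print_r($headers);
```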

Re: curl cookie data missing

Posted: Mon Jun 20, 2011 11:05 am
by Eric!
Yeah, that's not right. I pulled bits from Tamper Data to try to make the headers match, and I didn't notice that Tamper Data was changing the formatting for some reason. Since I didn't notice a difference in curl when I switched to using the Tamper Data header info, I didn't continue changing them in the post() function. But now that you mention it, the Content-Type should probably be changed too, depending on the case. I'll have to look deeper into the headers this evening. Thanks!

Re: curl cookie data missing

Posted: Mon Jun 20, 2011 12:04 pm
by McInfo
Here, "jsessionid" seems out-of-place. It's not part of the query string.

Code:

$url = "https://sub.domain.com/cas/login;jsessionid=" . $ssid . "?service=http%3A%2F%2Fdomain.com%2Fservice-portal%2Fhome";
When you need a placeholder domain, use domain.tld because domain.com is a real site.
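
For context, the ";jsessionid=" segment is a path parameter rather than a query parameter. Java servlet containers commonly accept it as a URL-rewriting fallback for session tracking, which would explain why it sits before the "?". A small sketch of building such a URL with proper encoding; buildCasLoginUrl is a hypothetical helper, and the values are the ones from this thread.

```php
<?php
// Hypothetical helper: append a jsessionid path parameter and an encoded service query.
function buildCasLoginUrl(string $base, string $sessionId, string $service): string
{
    return $base
        . ';jsessionid=' . rawurlencode($sessionId)
        . '?service=' . rawurlencode($service);
}

echo buildCasLoginUrl(
    'https://sub.domain.com/cas/login',
    '7FE9E3BBF06D63E95652A22B19AD0EF1',
    'http://domain.com/service-portal/home'
), "\n";
```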

Re: curl cookie data missing

Posted: Mon Jun 20, 2011 3:18 pm
by Weirdan
Eric! wrote:
[TEXT]< HTTP/1.1 200 OK
[junk snipped]
< Keep-Alive: timeout=40, max=100
< Connection: Keep-Alive
< Content-Type: text/html; charset=ISO-8859-1
* no chunk, no close, no size. Assume close to signal end[/TEXT]
I don't know what this means for a response. Is there still some security token missing and the server just ignoring the request?
As far as curl can see, the server is misbehaving. Keep-alive requires that it either send a Content-Length (size) header or transfer the data with Transfer-Encoding: chunked.

You could try to disable keep-alive by taking out all keep-alive headers from your requests and adding Connection: close header.
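
That suggestion can be sketched as a filter over the request header list before it is passed to CURLOPT_HTTPHEADER. The function name forceConnectionClose is hypothetical, and the headers are assumed to already be in "Name: value" form.

```php
<?php
// Hypothetical helper: drop Keep-Alive/Connection headers and ask the server to close.
function forceConnectionClose(array $headers): array
{
    $kept = array_filter($headers, function (string $h): bool {
        // Compare header names case-insensitively.
        $name = strtolower(trim(explode(':', $h, 2)[0]));
        return $name !== 'keep-alive' && $name !== 'connection';
    });
    $kept = array_values($kept); // reindex after filtering
    $kept[] = 'Connection: close';
    return $kept;
}

print_r(forceConnectionClose([
    'Accept: text/html',
    'Keep-Alive: 115',
    'Connection: Keep-Alive',
]));

// With an existing cURL handle (here called $process, as in this thread):
// curl_setopt($process, CURLOPT_HTTPHEADER, forceConnectionClose($headers));
```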

Re: curl cookie data missing

Posted: Tue Jun 21, 2011 6:26 pm
by Eric!
@McInfo -- yeah, that's a weird URL structure, but that's what their server wants to see. I've never seen one with a semicolon before either....

That's a good point about the keep-alive. I've never seen this "no chunk" type of message before. I also know that this server chokes a lot if there are tokens out of place or if it gets confused by the URL parameters. It could be that the server sends a 200 status but then fails to do anything.

I'll keep at it. Thanks for all the ideas. FYI, I found this Wikipedia page on HTTP header fields pretty handy: http://en.wikipedia.org/wiki/List_of_HTTP_header_fields

Eric

Re: curl cookie data missing

Posted: Wed Jun 22, 2011 7:28 am
by Eric!
Thanks for the ideas; I got it working. There were two problems. First, changing the Content-Type header and sending Connection: close removed the "no chunk" message. (I also changed the headers to the proper format instead of what Tamper Data was generating.)

Code:

$headers[] = "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
$headers[] = "Connection: close";
$headers[] = "Content-type: application/x-www-form-urlencoded, text/html";
$user_agent = "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.17) Gecko/20110422 Ubuntu/10.10 (maverick) Firefox/3.6.17";
Second, I couldn't get curl to switch from HEAD back to GET requests until I added CURLOPT_HTTPGET to reset it, using both commands:

Code:

curl_setopt($process, CURLOPT_HEADER, false);
curl_setopt($process, CURLOPT_HTTPGET, true);// forces reset to GET from HEAD

Re: curl cookie data missing

Posted: Wed Jun 22, 2011 7:16 pm
by McInfo
It's good to know it's working finally. Thanks for the update.