cURL and cookies - weird behaviour

Posted: Tue Nov 01, 2005 2:14 pm
by Ree
I need help figuring out what is causing some odd behaviour I get when using cURL. First, the code I am using:

  function getData()
  {
    if ($this->type == 'POST')
    {
      $this->login();
      $data_pack = $this->prepareVars($this->vars);
      $curl = &new cURLManager($this->URL);
      $curl->setOption(CURLOPT_HEADER, true);
      $curl->setOption(CURLOPT_POST, true);
      $curl->setOption(CURLOPT_POSTFIELDS, $data_pack);
      $curl->setOption(CURLOPT_RETURNTRANSFER, true);
      $curl->setOption(CURLOPT_COOKIEFILE, 'cookie.txt');
      $curl->setOption(CURLOPT_COOKIEJAR, 'cookie.txt');

      $curl->setOption(CURLOPT_SSL_VERIFYPEER, false);
      $curl->setOption(CURLOPT_SSL_VERIFYHOST, false);

      $curl->setOption(CURLOPT_REFERER, 'https://www.site.com/search.aspx');
      $curl->setOption(CURLOPT_USERAGENT, 'Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)');
      return $curl->getOutput();
    }
  }

  function login()
  {
    $vars = array('...');

    $data_pack = $this->prepareVars($vars);

    $curl = &new cURLManager('https://www.site.com/login.aspx?ReturnUrl=search.aspx');
    $curl->setOption(CURLOPT_HEADER, true);
    $curl->setOption(CURLOPT_POST, true);
    $curl->setOption(CURLOPT_POSTFIELDS, $data_pack);
    $curl->setOption(CURLOPT_RETURNTRANSFER, true);

    $curl->setOption(CURLOPT_COOKIEFILE, 'cookie.txt');
    $curl->setOption(CURLOPT_COOKIEJAR, 'cookie.txt');

    $curl->setOption(CURLOPT_SSL_VERIFYPEER, false);
    $curl->setOption(CURLOPT_SSL_VERIFYHOST, false);

    $curl->setOption(CURLOPT_FOLLOWLOCATION, true);
    $curl->setOption(CURLOPT_REFERER, 'https://www.site.com/login.aspx');
    $curl->setOption(CURLOPT_USERAGENT, 'Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)');
    $data = $curl->getOutput();
  }

These are two methods of the Request class I use to get HTML data from the site. The site requires a login before you can search its data. I send all the POST vars to the search.aspx page which, if everything is OK, returns the results I need. For now, as you can see, I log in ($this->login()) each time before POSTing data to search.aspx. After a successful login, the site redirects to the search.aspx page (note the ReturnUrl var in the URL), which by default displays a search form (cURL is set to follow this redirect, and it does).

I'll explain the behaviour I am getting. Let's say I have no cookie.txt file (a fresh request). After running my script (calling the getData() method) for the first time, cookie.txt is created, but I do not get any search results (only the form). Just a note: I DO know that the POST vars I send are correct and that the login does not fail. After running the script a second time (with cookie.txt already present), I DO get the search results. I really don't get why this works only on the second run. Once the cookie is present, I receive whatever search results I need without any problem on each subsequent request.

You could say: hey, you've got the cookie and it works when it's there, so what's the problem? The problem is that the login session expires after 20 minutes of inactivity, so if I let the cookie rot without doing any search requests for 20 minutes, I get no search results again. So I need to be able to log in again and get the search results whenever I want, which, as I described, does not work with my code (it only works after running it twice).

When writing the code, this is the process I imagined and wanted to follow:

1. cURL logs into login.aspx and gets a fresh cookie.txt
2. It then POSTs my vars to search.aspx, sending the cookies from cookie.txt along, and gets the search results back.

But for some reason it doesn't work correctly as I explained above. Do you guys have some hints regarding the problem? What should I try/correct? Any help would be nice.
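
The two-step process described above (log in, then POST the search with the cookies from the login) could be sketched with PHP's plain curl_* functions roughly as follows. This is only a sketch under assumptions: the cURLManager class from the post isn't shown, so it is replaced by direct curl calls; http_build_query() stands in for prepareVars(); the search URL is assumed to be https://www.site.com/search.aspx; and the function names and field arrays are hypothetical. It reuses one handle for both requests, so cookies set by the login response are held by that handle and sent with the search request.

```php
<?php
// Sketch only: plain curl_* calls instead of the cURLManager class from the post.
// Function names and the $loginFields / $searchFields arrays are hypothetical.

/** Options shared by both requests (taken from the post's setOption() calls). */
function commonOptions(string $cookieFile): array
{
    return [
        CURLOPT_HEADER         => true,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_COOKIEFILE     => $cookieFile, // read existing cookies, enable the cookie engine
        CURLOPT_COOKIEJAR      => $cookieFile, // written by libcurl when the handle is cleaned up
        CURLOPT_SSL_VERIFYPEER => false,       // as in the post; not advisable in general
        CURLOPT_SSL_VERIFYHOST => 0,
        CURLOPT_USERAGENT      => 'Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)',
    ];
}

/** Options for one POST request, applied on top of the shared ones. */
function postOptions(string $url, array $fields, string $referer, bool $follow): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => http_build_query($fields), // stand-in for prepareVars()
        CURLOPT_REFERER        => $referer,
        CURLOPT_FOLLOWLOCATION => $follow,
    ];
}

/** Step 1 (login) and step 2 (search) on the same handle. */
function loginAndSearch(array $loginFields, array $searchFields, string $cookieFile = 'cookie.txt')
{
    $base = 'https://www.site.com';
    $ch = curl_init();
    curl_setopt_array($ch, commonOptions($cookieFile));

    // Step 1: log in; cookies from the response stay in memory on $ch.
    curl_setopt_array($ch, postOptions(
        "$base/login.aspx?ReturnUrl=search.aspx", $loginFields, "$base/login.aspx", true
    ));
    curl_exec($ch);

    // Step 2: POST the search on the same handle, so those cookies are sent.
    curl_setopt_array($ch, postOptions(
        "$base/search.aspx", $searchFields, "$base/search.aspx", false
    ));
    $result = curl_exec($ch);

    unset($ch); // releasing the handle lets libcurl write the cookie jar file
    return $result;
}
```

Reusing the handle is a design choice of this sketch, not something the thread confirms as the fix. libcurl writes the CURLOPT_COOKIEJAR file only when a handle is cleaned up, so a second handle would see the login cookies on disk only after the first one has been released.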

Posted: Tue Nov 01, 2005 2:19 pm
by redmonkey
I'd start by looking at the headers being returned from the site you are accessing.
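
One way to inspect those headers, sketched under assumptions: with CURLOPT_HEADER enabled, the string cURL returns contains the headers followed by the body, and with CURLOPT_FOLLOWLOCATION each redirect hop adds its own header block. The helper names below are hypothetical, and the sketch assumes CRLF line endings and a body that does not itself begin with "HTTP/".

```php
<?php
/**
 * Split a response fetched with CURLOPT_HEADER (and possibly
 * CURLOPT_FOLLOWLOCATION) into its header blocks and the final body.
 * Hypothetical helper; assumes the body does not start with "HTTP/".
 */
function splitResponse(string $raw): array
{
    $blocks = [];
    // Each hop of a followed redirect contributes one "HTTP/..." header block.
    while (strncmp($raw, 'HTTP/', 5) === 0) {
        $end = strpos($raw, "\r\n\r\n");
        if ($end === false) {          // headers only, no body follows
            $blocks[] = $raw;
            $raw = '';
            break;
        }
        $blocks[] = substr($raw, 0, $end);
        $raw = (string) substr($raw, $end + 4);
    }
    return ['headers' => $blocks, 'body' => $raw];
}

/** Return the Set-Cookie header values found in one header block. */
function setCookieLines(string $headerBlock): array
{
    $values = [];
    foreach (explode("\r\n", $headerBlock) as $line) {
        if (stripos($line, 'Set-Cookie:') === 0) {
            $values[] = trim(substr($line, strlen('Set-Cookie:')));
        }
    }
    return $values;
}
```

Calling setCookieLines() on each entry of splitResponse($data)['headers'] shows which hop set which cookie.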

Posted: Tue Nov 01, 2005 3:39 pm
by Ree
Might be a good idea. I have thought of something to try, but first one little question: what should I put into CURLOPT_COOKIE? Let's say I receive this header from one page and want to use it with CURLOPT_COOKIE:

Set-Cookie: Default-Search-Area=PROXIMITY; expires=Wed, 01-Nov-2006 21:00:15 GMT; path=/

Which part of it should go to curl_setopt()?

Posted: Tue Nov 01, 2005 3:42 pm
by feyd
None of it; it goes into the cookie jar.

Posted: Tue Nov 01, 2005 3:43 pm
by Ree
I'll try not to use the cookie jar. Will try setting and sending cookies manually.

Posted: Tue Nov 01, 2005 4:05 pm
by feyd
A browser only sends the name-value pair, nothing more. Be careful to filter out cookies that should be deleted (expired, or set to an empty value).
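
As an illustration of that filtering, here is a minimal sketch that turns Set-Cookie header values into the "name=value; name2=value2" string that CURLOPT_COOKIE accepts. The function name is hypothetical, and the sketch deliberately ignores domain, path and secure matching, which a real cookie jar also applies.

```php
<?php
/**
 * Build a CURLOPT_COOKIE string from Set-Cookie header values.
 * Hypothetical helper: keeps only name=value, drops cookies that are
 * expired or have an empty value, and ignores domain/path matching.
 */
function cookieHeaderFrom(array $setCookieValues, ?int $now = null): string
{
    $now = $now ?? time();
    $cookies = [];
    foreach ($setCookieValues as $line) {
        $parts = array_map('trim', explode(';', $line));
        $pair = array_shift($parts);           // first part is name=value
        if (strpos($pair, '=') === false) {
            continue;                          // malformed, skip it
        }
        [$name, $value] = explode('=', $pair, 2);

        $expired = false;
        foreach ($parts as $attr) {
            if (stripos($attr, 'expires=') === 0) {
                $ts = strtotime(substr($attr, 8));
                // An unparseable date is treated as "not expired".
                $expired = ($ts !== false && $ts <= $now);
            }
        }

        if ($value === '' || $expired) {
            unset($cookies[$name]);            // the server asked for deletion
        } else {
            $cookies[$name] = $value;          // a later header wins for the same name
        }
    }

    $pairs = [];
    foreach ($cookies as $name => $value) {
        $pairs[] = "$name=$value";
    }
    return implode('; ', $pairs);
}
```

The result would then be passed along the lines of curl_setopt($handle, CURLOPT_COOKIE, cookieHeaderFrom($lines)).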

Posted: Tue Nov 01, 2005 4:18 pm
by Ree
It is curl_setopt($handle, CURLOPT_COOKIE, 'name=value') then, right?

Btw, these are the headers I receive.


Clean (first) request (without cookie.txt present):


Headers extracted from data received via login() method:
-------------------------------
HTTP/1.1 302 Found
Date: Tue, 01 Nov 2005 21:46:50 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
X-AspNet-Version: 1.1.4322
Location: /aucsearchbyveh.aspx
Set-Cookie: ASP.NET_SessionId=ncuorhau3f0wg555izgdab55; path=/
Set-Cookie: aai=1BA7445D7EFAE1A1E5B146439944A839BB81932B3B52EBF9E989010534E0E76ACA9458D41609CB56FCC34C72867ED1515B1581739723BAE58F2DF152ED9AFF97F9D68ED806F67377; path=/
Set-Cookie: aai=E46EAA8DC4AC2E3D596552488F5158A78506D4454EEBC014D3619848B0D2D5C68125349B0A8BF90586ECA719BC0D62EE30004D0BA07D1B40C158A8835B652C0966756466B85CE774; path=/
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Length: 33094

HTTP/1.1 200 OK
Date: Tue, 01 Nov 2005 21:46:52 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
X-AspNet-Version: 1.1.4322
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Length: 34505
-------------------------------


Headers extracted from data received via getData() method:
-------------------------------
HTTP/1.1 200 OK
Date: Tue, 01 Nov 2005 21:52:37 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
X-AspNet-Version: 1.1.4322
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Length: 35384
-------------------------------


Second request (cookie.txt already present after the first request):


Headers extracted from data received via login() method:
-------------------------------
HTTP/1.1 302 Found
Date: Tue, 01 Nov 2005 21:50:06 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
X-AspNet-Version: 1.1.4322
Location: /default.aspx
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Length: 130

HTTP/1.1 200 OK
Date: Tue, 01 Nov 2005 21:50:06 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
X-AspNet-Version: 1.1.4322
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Length: 22560
-------------------------------


Headers extracted from data received via getData() method:
-------------------------------
HTTP/1.1 200 OK
Date: Tue, 01 Nov 2005 21:53:28 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
X-AspNet-Version: 1.1.4322
Set-Cookie: Default-Zip-Code=10000; expires=Wed, 01-Nov-2006 21:53:26 GMT; path=/
Set-Cookie: Default-Search-Radius=Nationwide; expires=Wed, 01-Nov-2006 21:53:26 GMT; path=/
Set-Cookie: Default-Search-Area=PROXIMITY; expires=Wed, 01-Nov-2006 21:53:26 GMT; path=/
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Length: 87346
-------------------------------

The cookies are the interesting part here: as you can see, they differ between the two requests...