
Using Curl to GRAB a https:// webpage, URGENT! Help PLZ

Posted: Sun Jan 08, 2006 10:47 pm
by jclarkkent2003

Code:

function fetch($url, $post = false, $referer = "")
{
    $useragent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)";
    $ch = curl_init();

    // Send a POST request when a request body is supplied.
    if ($post) {
        curl_setopt($ch, CURLOPT_POST, 1);
        curl_setopt($ch, CURLOPT_POSTFIELDS, "$post");
    }

    // Use a proxy? ($this is only defined when fetch() is a class method.)
    if ($this->useProxies && $this->CURRENT_PROXY) {
        curl_setopt($ch, CURLOPT_PROXY, "http://" . $this->CURRENT_PROXY);
    }

    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
    curl_setopt($ch, CURLOPT_COOKIEJAR, 'c:\cookie.txt');
    curl_setopt($ch, CURLOPT_HEADER, 0);          // leave response headers out of the result
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);  // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_TIMEOUT, 60);
    if (!empty($referer)) {
        curl_setopt($ch, CURLOPT_REFERER, $referer);
    }

    $content = curl_exec($ch);  // false on failure
    curl_close($ch);
    return $content;
}

$input_file = fetch("https://secure.telewest.com.au/myacen/", true, "");

echo $input_file;

Why doesn't this work?

That's not the real https page I want to grab, just an example. I first tried file_get_contents, since that's how I usually grab pages before running regular expressions on them, but it gave me SSL-related errors.

I need to grab all the source code of a webpage on https:// into ONE PHP string variable, such as $input_file, so I can then run my preg_match and validate what needs to be validated.

Any help is very much appreciated. I'm trying to do this as fast as possible for a friend and ran into this problem: right now the script above outputs nothing, and I want it to output the real page. (For example, this script might be used to set up a web proxy, but a very simple, easy-to-read one, nothing like the open source ones out there that are over 20 KB in size and three dozen files.) Help me please!

Thanks in advance!!!!!

Posted: Sun Jan 08, 2006 11:07 pm
by feyd
Can you verify that your server supports SSL as a file stream? phpinfo() will tell you about this under the label "Registered PHP Streams".
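
If scanning the phpinfo() page is a hassle, the same information can probably be read from a script. This is only a minimal sketch: the variable names are made up for illustration, and it assumes nothing beyond the built-in stream_get_wrappers() and stream_get_transports() functions.

```php
<?php
// List the stream wrappers and socket transports registered in this PHP build.
$wrappers   = stream_get_wrappers();
$transports = stream_get_transports();

// https:// is a wrapper (usable with file_get_contents);
// ssl:// is a socket transport (usable with fsockopen-style calls).
$hasHttpsWrapper = in_array('https', $wrappers, true);
$hasSslTransport = in_array('ssl', $transports, true);

echo 'Registered PHP Streams: ', implode(', ', $wrappers), "\n";
echo 'Registered Stream Socket Transports: ', implode(', ', $transports), "\n";
echo 'https wrapper available: ', $hasHttpsWrapper ? 'yes' : 'no', "\n";
echo 'ssl transport available: ', $hasSslTransport ? 'yes' : 'no', "\n";
```

If the https wrapper shows up as unavailable, the usual cause is a PHP build without the OpenSSL extension.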

Posted: Sun Jan 08, 2006 11:27 pm
by jclarkkent2003
Registered PHP Streams: php, file, http, ftp, compress.zlib, https, ftps
Registered Stream Socket Transports: tcp, udp, ssl, sslv3, sslv2, tls

Anything else you need?

Did you try the script, or is there another one that works with SSL that you'd advise I take, test, and try to modify?

Posted: Mon Jan 09, 2006 12:08 am
by feyd
Have you tried:

Code:

$content = file_get_contents('ssl://the.url.org');

Note the use of ssl as the protocol.

Posted: Mon Jan 09, 2006 3:05 am
by onion2k
This isn't a particularly safe way of doing things, but...
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
...switches off SSL certificate verification. The connection will still use SSL; it just skips the check on whether or not the certificate is OK.
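
A side note on why the original script printed nothing at all: curl_exec() returns false when the transfer fails, and echoing false prints an empty string. Asking cURL for its own error message usually shows what went wrong. The sketch below is only an illustration; fetch_with_error() is a made-up helper name, and the URL uses a deliberately unsupported scheme so that the failure path runs without touching the network.

```php
<?php
// Fetch a URL and keep cURL's error number and message alongside the result.
function fetch_with_error($url)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
    curl_setopt($ch, CURLOPT_TIMEOUT, 60);

    $content = curl_exec($ch);  // false on failure

    // The handle is released when $ch goes out of scope.
    return [
        'content' => $content,
        'errno'   => curl_errno($ch),
        'error'   => curl_error($ch),
    ];
}

// Hypothetical URL with a scheme cURL does not support, to force a failure.
$result = fetch_with_error('nosuchscheme://example.invalid/');

if ($result['content'] === false) {
    echo 'cURL error ', $result['errno'], ': ', $result['error'], "\n";
} else {
    echo $result['content'];
}
```

With a real https:// URL and verification left on, a certificate problem should be reported here as an error message instead of as silent empty output.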

Posted: Mon Jan 09, 2006 8:59 am
by jclarkkent2003
Thanks feyd and onion2k, I knew I loved this place!!!

Points to you, onion2k: your method worked, thanks. Now I just have to perfect my function to pass the appropriate headers, because I've figured out this site is very particular about headers.

I didn't try
$content = file_get_contents('ssl://the.url.org');

but I'll try that too. The thing is, I really have to pass the headers like I said, so I'd probably have to combine that with fsockopen or something.
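
For what it's worth, passing headers with file_get_contents may not need fsockopen at all: a stream context can carry them. As far as I know, ssl:// is a socket transport meant for fsockopen()-style calls, while file_get_contents() goes through the https:// wrapper, so the sketch below uses https://. The helper name build_https_context(), the header values, and the example.com URL are all placeholders, not what the real site requires.

```php
<?php
// Build a stream context that carries custom request headers.
function build_https_context(array $headers, $userAgent)
{
    return stream_context_create([
        // The https:// wrapper reads its request options from the 'http' key.
        'http' => [
            'method'     => 'GET',
            'header'     => implode("\r\n", $headers),
            'user_agent' => $userAgent,
            'timeout'    => 60,
        ],
    ]);
}

// Placeholder headers for illustration only.
$context = build_https_context(
    ['Accept: text/html', 'Referer: https://example.com/'],
    'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)'
);

// Uncomment to perform the request against a real URL:
// $content = file_get_contents('https://example.com/', false, $context);
```

With cURL, the equivalent would be the CURLOPT_HTTPHEADER option, which takes the same kind of array of "Name: value" strings.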

BTW, are there any other suggestions? Just in case? Heh...

Peace~!

Posted: Wed Jan 11, 2006 9:06 pm
by redmonkey
Personally I prefer to use...

Code:

curl_setopt($ch, CURLOPT_CAINFO, '/path/to/ca_cert.crt');

... this assumes that the server you are connecting to uses a certificate signed by a known and trusted CA.

If you already have the cURL libraries installed on your system (the actual cURL library, not the PHP cURL module), then you should find a CA cert file (ca-bundle.crt) within the lib directory. If not, you can download the cURL package and get the file from there.
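
Putting that together, a set of options that keeps certificate checking switched on might look roughly like this. It is a sketch under assumptions: verified_curl_options() is a made-up helper name, example.com is a placeholder URL, and the CA path is the same placeholder as above, to be replaced with wherever ca-bundle.crt lives on your system.

```php
<?php
// Build cURL options that verify the server certificate against a CA bundle.
function verified_curl_options($url, $caBundle)
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_TIMEOUT        => 60,
        CURLOPT_SSL_VERIFYPEER => true,      // check the certificate chain
        CURLOPT_SSL_VERIFYHOST => 2,         // check the certificate matches the host name
        CURLOPT_CAINFO         => $caBundle, // file of trusted CA certificates
    ];
}

$options = verified_curl_options('https://example.com/', '/path/to/ca_cert.crt');

// Uncomment to perform the request once the CA path points at a real file:
// $ch = curl_init();
// curl_setopt_array($ch, $options);
// $content = curl_exec($ch);
```

Returning the options as an array keeps them easy to inspect, and curl_setopt_array() applies them in one call.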