Using Curl to GRAB a https:// webpage, URGENT! Help PLZ

jclarkkent2003
Forum Contributor
Posts: 123
Joined: Sat Dec 04, 2004 9:14 pm

Post by jclarkkent2003 »

Code:

function fetch($url, $post = false, $referer = "")
{
    $useragent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)";
    $ch = curl_init();

    // Send the request as a POST when a body is supplied
    if ($post)
    {
        curl_setopt($ch, CURLOPT_POST, 1);
        curl_setopt($ch, CURLOPT_POSTFIELDS, "$post");
    }

    // Use a proxy?
    if ($this->useProxies && $this->CURRENT_PROXY)
    {
        curl_setopt($ch, CURLOPT_PROXY, "http://" . $this->CURRENT_PROXY);
    }

    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
    curl_setopt($ch, CURLOPT_COOKIEJAR, 'c:\cookie.txt');  // cookies are written here when the handle is closed
    curl_setopt($ch, CURLOPT_HEADER, 0);                   // leave response headers out of the output
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);           // return the body as a string instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);           // follow redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, 60);                 // give up after 60 seconds

    if (!empty($referer))
    {
        curl_setopt($ch, CURLOPT_REFERER, $referer);
    }

    $content = curl_exec($ch);  // false on failure
    curl_close($ch);
    return $content;
}

$input_file = fetch("https://secure.telewest.com.au/myacen/", true, "");

echo $input_file;
Why doesn't this work?

That's not the real https page I want to grab, but it's an example. I tried file_get_contents first, since that's how I usually do my grabbing before running regular expressions on the result, but it gave me errors about SSL or something.

I need to grab all the source code of a webpage on https:// into ONE PHP string variable, such as $input_file, so I can then do my preg_match and validate what needs to be validated.

Any help is very much appreciated. I'm trying to do this as fast as possible for a friend and ran into this problem: right now the above script outputs nothing, and I want it to output the real page. (For example, this script may be used for setting up a web proxy, but a very simple one that's easy to read, nothing like the open source ones out there that are over 20 kb in size and three dozen files.) Help me please!

Thanks in advance!!!!!
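
A note on the "outputs nothing" symptom: curl_exec() returns false when the transfer fails, and echoing false prints an empty string. The sketch below is not part of the original post; fetch_with_diagnostics is a hypothetical helper name, and it only shows how curl_errno() and curl_error() could be used to surface the reason for a failure.

```php
<?php
// Hypothetical helper (not from the original post): returns the body together
// with cURL's own error details, so a failed transfer is not silently empty.
function fetch_with_diagnostics($url)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_TIMEOUT, 60);

    $content = curl_exec($ch);  // false on failure
    $result = array(
        'content' => $content,
        'errno'   => curl_errno($ch),  // 0 when no error occurred
        'error'   => curl_error($ch),  // human-readable message, '' when no error
    );
    curl_close($ch);
    return $result;
}

// Usage (performs a network request, so it is left commented out):
// $result = fetch_with_diagnostics('https://secure.telewest.com.au/myacen/');
// if ($result['content'] === false) {
//     echo 'cURL error ' . $result['errno'] . ': ' . $result['error'];
// } else {
//     echo $result['content'];
// }
```

With a check like this, a certificate problem would typically show up as an explicit error message rather than a blank page.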
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

Can you verify that your server supports SSL as a file stream? phpinfo() will tell you about this under the label: Registered PHP Streams
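
The same information can also be checked from code rather than read off the phpinfo() page. A minimal sketch, assuming only the built-in stream_get_wrappers() and stream_get_transports() functions; ssl_stream_support is a made-up helper name for illustration.

```php
<?php
// Hypothetical helper: reports whether this PHP build registers the https://
// stream wrapper and the ssl socket transport.
function ssl_stream_support()
{
    return array(
        'https_wrapper' => in_array('https', stream_get_wrappers()),
        'ssl_transport' => in_array('ssl', stream_get_transports()),
    );
}

var_dump(ssl_stream_support());
```

Both entries are normally true when PHP was built with OpenSSL support.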
jclarkkent2003
Forum Contributor
Posts: 123
Joined: Sat Dec 04, 2004 9:14 pm

Post by jclarkkent2003 »

Registered PHP Streams php, file, http, ftp, compress.zlib, https, ftps
Registered Stream Socket Transports tcp, udp, ssl, sslv3, sslv2, tls

Anything else you need?

Did you try the script, or is there another one that works with SSL that you'd advise I take, test, and try to modify?
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

Have you tried

Code:

$content = file_get_contents('ssl://the.url.org');
Note the use of ssl as the protocol.
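
One caveat worth noting here: in the phpinfo() output quoted earlier, ssl is listed under Registered Stream Socket Transports (what fsockopen() uses), while https is listed under Registered PHP Streams. file_get_contents() looks for a stream wrapper, so the https:// form is the one it can open. The following is a rough sketch of that approach, not something posted in the thread; build_https_context is a hypothetical helper and the user agent string is a placeholder.

```php
<?php
// Hypothetical helper: builds a stream context carrying a User-Agent header.
function build_https_context($userAgent)
{
    return stream_context_create(array(
        // The https:// wrapper reads its request options from the 'http' key
        'http' => array(
            'method'     => 'GET',
            'user_agent' => $userAgent,
            'timeout'    => 60,
        ),
    ));
}

// Usage (performs a network request, so it is left commented out):
// $context = build_https_context('Mozilla/4.0 (compatible)');
// $content = file_get_contents('https://the.url.org/', false, $context);
// if ($content === false) {
//     echo 'Request failed';
// }
```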
onion2k
Jedi Mod
Posts: 5263
Joined: Tue Dec 21, 2004 5:03 pm
Location: usrlab.com

Post by onion2k »

This isn't a particularly safe way of doing things, but
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
switches off SSL certificate verification. The connection will still use SSL; it just skips the check to see whether or not the certificate is OK.
jclarkkent2003
Forum Contributor
Posts: 123
Joined: Sat Dec 04, 2004 9:14 pm

Post by jclarkkent2003 »

Thanks feyd and onion2k, I knew I loved this place!!!

Points to you onion2k, your method worked, thanks. Now I just have to perfect my function to pass the appropriate headers, because I figured out this site is very specific about headers.

Didn't try
$content = file_get_contents('ssl://the.url.org');

but I'll try that too. Thing is, I really have to pass the headers like I said, so I'd probably have to combine that with fsockopen or something.

BTW, are there any other suggestions? Just in case? heh....

Peace~!
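
On passing headers: cURL can send custom request headers itself through CURLOPT_HTTPHEADER, so dropping down to fsockopen may not be necessary. A small sketch under that assumption; build_header_lines is a hypothetical helper, and the header names and values are placeholders rather than what the target site actually requires.

```php
<?php
// Hypothetical helper: turns array('Name' => 'value') into the
// "Name: value" strings that CURLOPT_HTTPHEADER expects.
function build_header_lines(array $headers)
{
    $lines = array();
    foreach ($headers as $name => $value) {
        $lines[] = $name . ': ' . $value;
    }
    return $lines;
}

// Usage inside a function like fetch() (header values are placeholders):
// curl_setopt($ch, CURLOPT_HTTPHEADER, build_header_lines(array(
//     'Accept'          => 'text/html',
//     'Accept-Language' => 'en-us',
// )));
```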
redmonkey
Forum Regular
Posts: 836
Joined: Thu Dec 18, 2003 3:58 pm

Post by redmonkey »

Personally I prefer to use...

Code:

curl_setopt($ch, CURLOPT_CAINFO, '/path/to/ca_cert.crt');
This assumes that the server you are connecting to uses a certificate signed by a known and trusted CA.

If you already have the curl libraries installed on your system (the actual cURL library, not the PHP cURL module), then you should find a CA cert file (ca-bundle.crt) within the lib directory. If not, you can download the cURL package and get the file from there.
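
Put together, a verified setup might look like the sketch below. It is only an illustration: verified_ssl_options is a hypothetical helper, and the CA bundle path is whatever applies on your system.

```php
<?php
// Hypothetical helper: cURL options that keep certificate checks switched on
// and point cURL at a CA bundle file.
function verified_ssl_options($caBundlePath)
{
    return array(
        CURLOPT_SSL_VERIFYPEER => true,           // verify the certificate chain
        CURLOPT_SSL_VERIFYHOST => 2,              // verify the certificate matches the host name
        CURLOPT_CAINFO         => $caBundlePath,  // CA certificates to trust
        CURLOPT_RETURNTRANSFER => true,           // return the body as a string
    );
}

// Usage (performs a network request, so it is left commented out):
// $ch = curl_init('https://the.url.org/');
// curl_setopt_array($ch, verified_ssl_options('/path/to/ca_cert.crt'));
// $content = curl_exec($ch);
// curl_close($ch);
```

Unlike setting CURLOPT_SSL_VERIFYPEER to 0, this keeps the protection against a forged certificate.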