
Weird fsockopen problem, help?

Posted: Thu Aug 28, 2008 7:56 am
by kaisellgren
Hello,

Whenever I try to fetch the full content of a website using fsockopen, I get weird extra data in the output.

Here's sample code:

Code:

$fp = fsockopen("www.google.fi",80, $errno, $errstr, 30);
if (!$fp) {
    echo "$errstr ($errno)<br />\n";
} else {
    $out = "GET / HTTP/1.1\r\n";
    $out .= "Host: www.google.fi\r\n";
    $out .= "Connection: Close\r\n\r\n";
 
    fwrite($fp, $out);
    while (!feof($fp)) {
        echo fgets($fp, 128);
    }
    fclose($fp);
}
For some reason, this is the returned header:
HTTP/1.1 200 OK
Cache-Control: private, max-age=0
Date: Thu, 28 Aug 2008 12:54:13 GMT
Expires: -1
Content-Type: text/html; charset=ISO-8859-1
Set-Cookie: PREF=ID=254a8e3ff08e0e9f:TM=1219928053:LM=1219928053:S=gahRNMEyAykxQ5Xo; expires=Sat, 28-Aug-2010 12:54:13 GMT; path=/; domain=.google.fi
Server: gws
Transfer-Encoding: chunked
Connection: Close

6ef
What is 6ef? Also, the website's HTML content ALWAYS ends with the number 0...

Another one:

Code:

$fp = fsockopen("www.kaisellgren.name", 80, $errno, $errstr, 30);
if (!$fp) {
    echo "$errstr ($errno)<br />\n";
} else {
    $out = "GET / HTTP/1.1\r\n";
    $out .= "Host: www.kaisellgren.name\r\n";
    $out .= "Connection: Close\r\n\r\n";
 
    fwrite($fp, $out);
    while (!feof($fp)) {
        echo fgets($fp, 128);
    }
    fclose($fp);
}
Headers returned:
HTTP/1.1 200 OK
Date: Thu, 28 Aug 2008 12:55:08 GMT
Server: Apache/2.2.9 (Unix) mod_ssl/2.2.9 OpenSSL/0.9.7a DAV/2 mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635
X-Powered-By: PHP/5.2.6
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html

698
What is 698?

What are those weird things? And the content ended with 0 again:
</body>
</html>
0
What's going on?

Re: Weird fsockopen problem, help?

Posted: Thu Aug 28, 2008 8:48 am
by Ziq
Look at response headers:
HTTP/1.1 200 OK
Date: Thu, 28 Aug 2008 12:55:08 GMT
Server: Apache/2.2.9 (Unix) mod_ssl/2.2.9 OpenSSL/0.9.7a DAV/2 mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635
X-Powered-By: PHP/5.2.6
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html
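"Transfer-Encoding: chunked" means the data is sent to you in pieces (chunks). 698 is the size of the first chunk in bytes, written in hexadecimal: hexdec('698') = 1688 bytes (1688 characters in a single-byte charset such as ANSI). The 0 at the end is the final, zero-size chunk that marks the end of the body. You have to take this into account when reading the response. This code should help you understand: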

Code:

 
<?php
define('N', "\r\n");

// Fetch the raw HTTP response (headers + body) over a plain socket.
function get_content($hostname, $path)
{
    $response = "";

    $fp = fsockopen($hostname, 80, $errno, $errstr, 30);

    if (!$fp) {
        echo "$errstr ($errno) <br>\r\n";
    } else {
        $headers = "GET $path HTTP/1.1".N;
        $headers .= "Host: $hostname".N;
        //  Safari 3.1.2 user agent string
        $headers .= "User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; ru-RU) AppleWebKit/525.19 (KHTML, like Gecko) Version/3.1.2 Safari/525.21;".N;
        $headers .= "Connection: Close".N.N;

        fwrite($fp, $headers);

        while (!feof($fp)) {
            $response .= fgets($fp, 1024);
        }
        fclose($fp);
    }
    return $response;
}

$content = get_content('www.google.fi', '/');
// Headers and body are separated by an empty line.
// array_pad() avoids a notice when the connection failed and $content is empty.
list($head, $body) = array_pad(explode(N.N, $content, 2), 2, '');

// Header names are case-insensitive, so use stripos().
if (stripos($head, 'Transfer-Encoding: chunked') !== false) {
    // Each chunk is: <size in hex> CRLF <data> CRLF. A chunk of size 0 ends the body.
    while (($pos = strpos($body, N)) !== false) {
        $size = hexdec(substr($body, 0, $pos));  //  chunk size line
        if ($size == 0) {
            break;                               //  last chunk reached
        }
        echo substr($body, $pos + 2, $size);     //  echo 1 piece
        $body = substr($body, $pos + 2 + $size + 2);  //  skip the data and its trailing CRLF
    }
} else {
    echo $body;
}
?>
 
IMPORTANT: This code was written quickly and is not optimized or hardened in any way.
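
To see the chunk format without touching the network, the decoding step can also be tried on a hand-made string. The following is only a minimal sketch: decode_chunked() and the $sample string are made-up names for illustration (they are not from the posts above), and trailer headers after the last chunk are ignored.

```php
<?php
// Hypothetical helper (not from the thread): decode a chunked HTTP body held in a string.
function decode_chunked($body)
{
    $decoded = '';
    $offset = 0;
    $length = strlen($body);

    while ($offset < $length && ($eol = strpos($body, "\r\n", $offset)) !== false) {
        // The size line may carry ";extension" parameters; keep only the part before them.
        $sizeLine = substr($body, $offset, $eol - $offset);
        $parts = explode(';', $sizeLine, 2);
        $size = hexdec(trim($parts[0]));

        if ($size == 0) {
            break; // the zero-size chunk marks the end of the body
        }

        $decoded .= substr($body, $eol + 2, $size);
        $offset = $eol + 2 + $size + 2; // skip the chunk data and the CRLF after it
    }

    return $decoded;
}

// Hand-made sample: two chunks of 5 and 7 bytes, then the terminating 0 chunk.
$sample = "5\r\nHello\r\n7\r\n, world\r\n0\r\n\r\n";
echo decode_chunked($sample), "\n"; // prints "Hello, world"
```

The same arithmetic explains the numbers in the question: hexdec('698') is 1688, so the first chunk from www.kaisellgren.name was 1688 bytes long.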

You can also use cURL. It's simpler.
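
A rough sketch of the cURL route, assuming the PHP cURL extension is installed. fetch_with_curl() is a made-up name for illustration, and the timeout and redirect settings are just example choices. cURL decodes chunked responses itself, so no size lines or trailing 0 show up in the result.

```php
<?php
// Hypothetical helper (not from the thread): fetch a URL with the cURL extension.
// Returns the response body as a string, or false on failure.
function fetch_with_curl($url)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects (example choice)
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);          // same 30 second limit as the fsockopen calls

    $body = curl_exec($ch);
    if ($body === false) {
        echo curl_error($ch), "\n";
    }

    return $body;
}
```

Usage would be something like: echo fetch_with_curl('http://www.google.fi/');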

You might also be interested in:
RFC2616 Hypertext Transfer Protocol -- HTTP/1.1