Inspecting headers debug GET requests (linux/FreeBSD/Apache)

Heavy
Forum Contributor
Posts: 478
Joined: Sun Sep 22, 2002 7:36 am
Location: Viksjöfors, Hälsingland, Sweden

Post by Heavy »

I tried to get as much info into the subject line as possible.

I have commissioned a new FAMP server (running FreeBSD, otherwise it would have been LAMP).

I am having trouble with Internet Explorer, which seems to receive something that confuses it.

I do this:

Code: Select all

header("Expires: Mon, 26 Jul 1997 05:00:00 GMT");
header("Last-Modified: " . gmdate("D, d M Y H:i:s") . " GMT");
$File = $objDB->query_get_first_row("select [... ...]");
if ($File !== false){

   header("Content-Length: ".$File['Size']);
   header("Content-type: ".$File['MimeType']);

   if ($BrowserIsGecko){
      // Gecko browsers: send an ASCII-only filename by transliterating the Swedish characters
      header("Content-Disposition: attachment; filename=\"".str_replace(Array("Å","å","Ä","ä","Ö","ö"),Array("A","a","A","a","O","o"),$File['FileName'])."\"");
   }else{
      header("Content-Disposition: attachment; filename=\"".$File['FileName']."\"");
   }

   if ($ArrContent = $objDB->query_get_rows("select Content from FileContent where FileID={$File['ID']} order by ID")){
      foreach ($ArrContent as $Chunk)
         echo $Chunk['Content'];
   }
}
The application has been running fine for a year on a Debian box.
I have tested the very same code and database on two other boxes running Gentoo Linux with Apache 2 and PHP 4. The FreeBSD server (the one with the problem) is running Apache 1.3 with PHP 4.

Symptom:
Whenever I, or any of our students, use Internet Explorer to download files from the db as described above, Internet Explorer fails to find the PHP script. It bails with the error (loosely translated from Swedish): "Could not open file-download.php?ID=1280
This URL cannot be opened. The location is either unavailable or I cannot find it. Please try again later."

The thing is, of course, that everything works as expected in Firefox or any other normal browser. <insert wild, intense, detailed and extensive rant, deserved from years of intellectual torture :evil: :evil:>

I have isolated the problem: it occurs only when my app runs on our primary web site. I am trying to inspect the headers sent by Apache to see the differences and maybe patch in a workaround or change the Apache config so the problem disappears, but I can only do that for normal pages, using a Firefox extension called "Web Developer", where I click "Information->View response headers".

From a working system I get:

Code: Select all

Date: Wed, 12 Jan 2005 09:01:27 GMT
Server: Apache/2.0.52 (Gentoo/Linux) PHP/4.3.10
X-Powered-By: PHP/4.3.10
Expires: Mon, 26 Jul 1997 05:00:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Last-Modified: Wed, 12 Jan 2005 09:01:27 GMT
Content-Length: 344
Keep-Alive: timeout=15, max=99
Connection: Keep-Alive
Content-Type: text/html
and from the faulty one:

Code: Select all

Date: Wed, 12 Jan 2005 10:46:39 GMT
Server: Apache/1.3.33 (Unix) PHP/4.3.9
X-Powered-By: PHP/4.3.9
Expires: Mon, 26 Jul 1997 05:00:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Last-Modified: Wed, 12 Jan 2005 10:46:39 GMT
Keep-Alive: timeout=15, max=93
Connection: Keep-Alive
Transfer-Encoding: chun ked
Content-Type: text/html
NOTE: Whenever I try to post the word "chun ked" above without the split, I get an internal server error. It is meant to be a single, unsplit word, but posting it that way fails.

Now, I didn't explicitly add the "Transfer-Encoding" header, and I don't see it in the httpd.conf file. (It doesn't have includes either.) I don't know if this is what causes Internet Explorer to fail, but I would like to be able to catch the headers output by the script above, so I can debug that as well. The Firefox extension I used to catch the above info doesn't seem to operate on specific URLs, only on the page already loaded. Thus, I cannot extract the headers sent in the file download case.
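To make the difference between the two header dumps above explicit, here is a minimal sketch that lists the header names present in only one of two saved dumps. It assumes each dump has been saved to a file with one "Name: value" header per line; the function names `header_names` and `diff_header_names` are made up for this illustration.

```shell
#!/bin/sh
# Sketch only: compare the header names in two saved header dumps.
# Assumption: each file holds one "Name: value" header per line.

# header_names FILE: print the lowercased, sorted, unique header names.
header_names() {
    cut -d: -f1 "$1" | tr '[:upper:]' '[:lower:]' | LC_ALL=C sort -u
}

# diff_header_names FILE_A FILE_B: names only in FILE_A are printed
# unindented, names only in FILE_B are printed after one tab.
diff_header_names() {
    a=$(mktemp) && b=$(mktemp) || return 1
    header_names "$1" > "$a"
    header_names "$2" > "$b"
    # comm -3 suppresses the lines that both inputs share
    LC_ALL=C comm -3 "$a" "$b"
    rm -f "$a" "$b"
}
```

Run against the two dumps above (working server first), this should report content-length as present only on the working server and transfer-encoding as present only on the faulty one.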
I tried with telnet, which was a funny experience:

Code: Select all

jonas@jw jonas $ telnet hostname.com 80
Trying 211.xxx.xxx.xxx...
Connected to hostname.com.
Escape character is '^]'.
GET http://hostname.com/file-download.php?ID=42020

<html><head><link rel="StyleSheet" href="main.css" type="text/css"></head>
<body><table height="100%" align="center"><tr><td valign="middle" align="center">
an error occured. File not found in database.<br><br><a href="overview.php?WhatNow=View">Back</a>
</td></tr></table></body></html>
Connection closed by foreign host.
jonas@jw jonas $
In this example, I chose to try an ID I know doesn't exist, which is well and good, because otherwise I wouldn't get such a short nice output. What's bad is that it doesn't show me the headers.

So I lack the knowledge to debug my case. Does anyone know how to easily extract the HTTP response headers from a GET request with common Linux tools?
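One possible approach, sketched under assumptions: the request typed into telnet above has no HTTP version on the request line, and a server may answer such an HTTP/0.9-style request with the body only. The sketch below builds a GET request with an explicit version and a Host header, and trims a raw response down to its status line and headers. The helper names `build_request` and `headers_only` are hypothetical, and `nc` (netcat) in the usage note is assumed to be installed.

```shell
#!/bin/sh
# Sketch only: request and header-extraction helpers for manual debugging.

# build_request HOST PATH: print a GET request with an explicit HTTP
# version. "Connection: close" asks the server to hang up after the reply.
build_request() {
    printf 'GET %s HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n' "$2" "$1"
}

# headers_only: read a raw HTTP response on stdin and print the status
# line and headers, stopping at the first empty line (where the body starts).
headers_only() {
    tr -d '\r' | awk '/^$/ { exit } { print }'
}
```

Possible usage, with the host and path taken from the telnet session: `build_request hostname.com '/file-download.php?ID=42020' | nc hostname.com 80 | headers_only`. The same request lines can also be typed into telnet by hand, ending with an empty line.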
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

Here's a headers/source/preview browser I've been working on, on and off. It requires curl, and it's nearly bug free, as far as I can tell. There are some minor bugs in the handling of certain URLs stored in the HTML response, but most that I've used it on work well. Basically, it's a browser that, instead of rendering the HTML like you'd normally do, shows the source code as sent. The headers are shown above. And I've started work on a preview below the code. Hope this helps.

Code: Select all

<?php

    ini_set( 'display_errors', '1' );
    error_reporting( E_ALL );
    
    if( !defined( '_DEBUG_' ) ) define( '_DEBUG_', 0 );
    
    if(!empty($_GET['url']))
    {
        $get = trim($_GET['url']);
        if(empty($get))
            $_GET['url'] = '';
        elseif( !preg_match('#^[a-zA-Z]{3,}://#',$_GET['url']))
            $_GET['url'] = 'http://' . $_GET['url'];
    }


    $output = "<html>\n\t<head>\n\t\t<title>{PAGE_TITLE}</title>\n\t</head>\n\t<body>\n\t\t<div align=\"center\"><form><input type=\"text\" name=\"url\" value=\"".(isset($_GET['url'])?htmlentities($_GET['url'],ENT_QUOTES):'')."\" size=\"50\"><input type=\"submit\" value=\" get \"></form><div>";
    
    if( !empty($_GET['inline']) && !empty( $_GET['url'] ) && ( $data = @getimagesize( $_GET[ 'url' ] ) ) !== false )
    {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_HEADER, 1);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($ch, CURLOPT_NOBODY, 1);
        curl_setopt($ch, CURLOPT_URL, $_GET['url']);
        curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);
        
        $header = '<fieldset><legend style="font-family:sans-serif">&nbsp;Headers&nbsp;</legend><pre style="text-align:left">%s</pre></fieldset>';
        $arr = preg_split("#\n#sS",$raw = curl_exec($ch));
        for($x = 0, $y = sizeof($arr); $x < $y; $x++)
        {
            $arr[$x] = rtrim($arr[$x]);
            if(!empty($arr[$x]) && !isset($found))
                $headers[] = $arr[$x];
            elseif(!empty($arr[$x]))
                $data[] = $arr[$x];
            else
                $found = $x;
        }
        $headers = sprintf($header,htmlentities(implode("\n",$headers),ENT_QUOTES));
        $output .= $headers.'<fieldset><legend style="font-family:sans-serif">&nbsp;Image&nbsp;</legend><img src="' . $_GET['url'] . '" /></fieldset>' . "\n\t\t";
        $output = str_replace('{PAGE_TITLE}', $data['mime'] . ' :: ' . $_GET['url'], $output);
    }
    elseif(!empty($_GET['inline']))
    {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_HEADER, 0);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($ch, CURLOPT_URL, $_GET['url']);
        curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);
    }
    elseif(!empty($_GET['url']))
    {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_HEADER, 1);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($ch, CURLOPT_URL, $_GET['url']);
        curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);

        //$data = file_get_contents( $_GET[ 'url' ] );
        $header = '<fieldset><legend style="font-family:sans-serif">&nbsp;Headers&nbsp;</legend><pre style="text-align:left">%s</pre></fieldset>';
        $arr = preg_split("#\n#sS",$raw = curl_exec($ch));
        for($x = 0, $y = sizeof($arr); $x < $y; $x++)
        {
            $arr[$x] = rtrim($arr[$x]);
            if(!empty($arr[$x]) && !isset($found))
                $headers[] = $arr[$x];
            elseif(!empty($arr[$x]))
                $data[] = $arr[$x];
            else
                $found = $x;
        }
        $headers = sprintf($header,htmlentities(implode("\n",$headers),ENT_QUOTES));
        $data = implode("\n",$data);
        curl_close($ch);
        if(preg_match('#<\s*title.*?>(.*?)<\s*/\s*title.*?>#is',$data,$title))
        {
            $output = str_replace('{PAGE_TITLE}', $_GET['url'] . ' :: ' . $title[1], $output);
        }
        else
        {
            $output = str_replace('{PAGE_TITLE}', 'No page title', $output);
        }
        $urls = array( 'href', 'src', 'action', 'background' );    //    resolve these attributes from the text
        
        $urls = implode( '|', $urls );
        preg_match_all( '#\s+?(' . $urls . ')\s*?=\s*?([\'"]?)(.*?)\\2[\s\>]#is', $data, $matches );
        
        $data = htmlentities( $data, ENT_QUOTES );
        
        $site = $_GET[ 'url' ];
        $bits = parse_url( $site );
        $root = $bits[ 'scheme' ] . '://' .
            ( isset( $bits[ 'user' ] ) ? $bits[ 'user' ] : '' ) .
            ( isset( $bits[ 'password' ] ) ? ':' . $bits[ 'password' ] : '' ) .
            ( isset( $bits[ 'user' ] ) ? '@' : '' ) .
            $bits[ 'host' ] .
            ( isset( $bits[ 'port' ] ) ? ':' . $bits[ 'port' ] : '' );
        $path = (isset($bits[ 'path' ]) ? explode( '/', $bits[ 'path' ] ) : array());
        array_pop( $path );
        $path = $root . implode( '/', $path ) . '/';
        
        if( _DEBUG_ )
        $output .= '<div align="left"><pre>' . var_export($matches, true) . '</pre></div>';
            
        
        $pos = 0;
        foreach($matches[0] as $match)
        {
            $url = $matches[ 3 ][ $pos ];
            if(!empty($url))
            {
                list( $left, $right ) = explode( $url, $match );

                $left = htmlentities( $left, ENT_QUOTES );
                $right = htmlentities( $right, ENT_QUOTES );

                //echo $left . $right . "<br />";

                if( preg_match( '#://#', $url ) )
                {    //    full url
                    $newurl = $url;
                }
                elseif( $url[0] == '/' )
                {    //    literal
                    $newurl = $root . $url;
                }
                elseif( preg_match( '#^[a-z0-9_-]+:#i', $url ) )
                {
                    $pos++;
                    continue;
                }
                else
                {
                    $newurl = $path . $url;
                }

                $data = preg_replace( '^' . preg_quote( htmlentities( $match, ENT_QUOTES ), '^' ) .'^', $left . '<a href="' . $_SERVER['SCRIPT_NAME'] . '?url=' . urlencode( $newurl ) .'">' . $url . '</a>' . $right, $data, 1 );
            }
            
            $pos++;
        }
        
        $output .= $headers.'<fieldset><legend style="font-family:sans-serif">&nbsp;HTML&nbsp;</legend><pre style="text-align:left">' . $data . '</pre></fieldset>' . "\n\t\t";
        $output .= '<fieldset><legend style="font-family:sans-serif">&nbsp;Page&nbsp;</legend><iframe src="' . $_SERVER['SCRIPT_NAME'] . '?inline=1&url=' . urlencode($site) . '" width="100%" height="100%">iframe required</iframe></fieldset>';
    }
    else
        $output = str_replace('{PAGE_TITLE}' , 'Welcome to feyd\'s Page Source Browser', $output);

    $output .= "</div>\n\t</body>\n</html>";
    
    echo $output;

?>
[edit]Fixed a bug introduced by unwinding from previous versions of the php highlighter.
Last edited by feyd on Tue Feb 14, 2006 11:09 pm, edited 2 times in total.

Post by Heavy »

Now THAT IS how a reply should read! :D
Big thanks!
I would never have come up with that idea myself!

I'll need to rebuild PHP with curl, but that's a breeze on my Gentoo laptop.

Post by feyd »

cheers. :)
hokiebear
Forum Newbie
Posts: 1
Joined: Sun Jan 30, 2005 3:45 pm

Post by hokiebear »

Feyd, I tried your script and it worked like a charm! Wow. I am wondering how I can periodically retrieve data from a website and store it in an Excel spreadsheet.

Thanks.

Post by Heavy »

hokiebear wrote: Feyd, I tried your script and it worked like a charm! Wow. I am wondering how I can periodically retrieve data from a website and store it in an Excel spreadsheet.

Thanks.
Well...
step one:
retrieve data

step two:
store it in a format that Excel can understand

step three:
set up a periodic fetching scheme.

:wink:

When you ask questions that sound like "Please provide me an application", you are unlikely to get much help.
So, please, do as much as you can completely by yourself and ask well-thought-out questions whenever you get stuck. But make sure you search for existing solutions/answers first and link to any examples within your request for help.

Good luck.
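
For reference, a rough sketch of step two, assuming CSV counts as "a format that Excel can understand". The helper names `csv_field` and `append_row` are hypothetical. Steps one and three are only hinted at in the comments, on the assumption that tools such as curl and cron are available.

```shell
#!/bin/sh
# Sketch only: append fetched values to a CSV file that Excel can open.
# Step one (assumed): obtain the values, e.g. from the output of curl.
# Step three (assumed): run this script from cron at the desired interval.

# csv_field VALUE: wrap VALUE in double quotes, doubling embedded quotes.
csv_field() {
    printf '"%s"' "$(printf '%s' "$1" | sed 's/"/""/g')"
}

# append_row FILE VALUE...: append one comma-separated row to FILE.
append_row() {
    file=$1
    shift
    row=
    for value in "$@"; do
        # Add a comma before every field except the first
        row="${row:+$row,}$(csv_field "$value")"
    done
    printf '%s\n' "$row" >> "$file"
}
```

A call such as `append_row data.csv "$(date +%Y-%m-%d)" "$value"` would add one dated row per run; `data.csv` and `$value` are placeholders.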