
Download Missing Last Chunk

Posted: Wed Feb 09, 2011 3:32 pm
by cesarcesar
I'm using the following script to download large files (>100 MB). It works well, except that it never seems to save the final chunk. If the file is 149,499 on the server, the download finishes at 145,996. Why? How do I get the last 2% or so to flush and complete the download? Thanks much for your help.

FYI, the same thing happens on smaller files, so it's not stopping because of time or file size limits.

Code: Select all

 
$path = "the/file/path.mp4";

$headsize = get_headers($path,1);

$ext = str_from_last_occurrence($_vars['filepath'],".");

if ($ext=="mp3") { $type = "audio"; }
elseif ($ext=="mp4") { $type = "video"; }

function readfile_chunked($filename,$retbytes=true) {

	// Stream file
	$handle = fopen($filename, 'rb');
	$chunksize = 1*(1024*1024); // how many bytes per chunk
	$buffer = '';
	$cnt =0;

   if ($handle === false) {
       return false;
   }

   while (!feof($handle)) {
       $buffer = fread($handle, $chunksize);
       echo $buffer;
       ob_flush();
       flush();
       if ($retbytes) {
           $cnt += strlen($buffer);
       }
   }

   $status = fclose($handle);

   if ($retbytes && $status) {
       return $cnt; // return num. bytes delivered like readfile() does.
   }

   return $status;

}

header('Cache-Control: no-cache, no-store, max-age=0, must-revalidate');
header("Content-type: ".$type."/".$ext);
header('Content-Length: ' . (string)($headsize['Content-Length']));
header('Content-Disposition: attachment; filename="'.str_from_last_occurrence($_vars['filepath'],"/").'"');
header("Content-Transfer-Encoding: binary");

readfile_chunked($path);

exit;

Re: Download Missing Last Chunk

Posted: Thu Feb 10, 2011 6:23 am
by Mordred
Lose the Content-length header (also there should be better ways of obtaining a file size, no? :) )

Re: Download Missing Last Chunk

Posted: Thu Feb 10, 2011 6:30 am
by Weirdan
Mordred wrote:(also there should be better ways of obtaining a file size, no? :) )
get_headers() kind of suggests the file is on a remote server.

Re: Download Missing Last Chunk

Posted: Thu Feb 10, 2011 6:36 am
by cesarcesar
Weirdan wrote:
Mordred wrote:(also there should be better ways of obtaining a file size, no? :) )
get_headers() kind of suggests the file is on a remote server.
So do I keep it or lose it? Yes, the file is on a remote CDN.

Re: Download Missing Last Chunk

Posted: Thu Feb 10, 2011 7:07 am
by Mordred
Lose it, chunked encoding takes care of that. Btw are you 100% sure that the echo / ob_flush pattern forces chunked? I guess it should, but DOES IT?

Re: Download Missing Last Chunk

Posted: Thu Feb 10, 2011 7:21 am
by cesarcesar
Mordred wrote:Lose it, chunked encoding takes care of that. Btw are you 100% sure that the echo / ob_flush pattern forces chunked? I guess it should, but DOES IT?
When I remove it, I lose the ability for the download engine to know how large the file is and how long it will take to download. The loading bar just sits at 1% until it finishes. I think I need the header there to appease my clients.

Re: Download Missing Last Chunk

Posted: Thu Feb 10, 2011 7:44 am
by Mordred
When I remove it, I lose the ability for the download engine to know how large the file is...
... on the upside you gain the ability to correctly transfer the content, right ;)

You can't chunk encode your cake and content-length it too, pick one.

If you also own/control your "loading bar", whatever that is, make it smarter so it works correctly.

Re: Download Missing Last Chunk

Posted: Thu Feb 10, 2011 7:47 am
by cesarcesar
If you also own/control your "loading bar", whatever that is, make it smarter so it works correctly.
It's the standard browser download manager.

Re: Download Missing Last Chunk

Posted: Thu Feb 10, 2011 7:49 am
by cesarcesar
You can't chunk encode your cake and content-length it too, pick one.
So what you're saying is that I will be one of the only sites that serves large downloads without supplying the browser with proper content length info? That just doesn't sound right.

Re: Download Missing Last Chunk

Posted: Thu Feb 10, 2011 8:02 am
by Weirdan
Mordred, the file is not being sent as chunked because Transfer-encoding header is not set. Thus there's no reason to omit Content-length.
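
If the response keeps its Content-Length, a useful check is whether the script really emits as many bytes as it declares. The following is only a minimal sketch, not the original poster's script: `stream_in_chunks()` and its `$emit` callback are hypothetical names introduced here. It relies on the fact that `fread()` on a network stream may return fewer bytes than requested, so it counts what was actually read.

```php
<?php
// Hypothetical helper (not from the original script): copies a stream to a
// sink in fixed-size reads and returns the number of bytes actually emitted.
function stream_in_chunks($handle, int $chunkSize, callable $emit): int
{
    $sent = 0;
    while (!feof($handle)) {
        // On remote streams fread() may return fewer than $chunkSize bytes.
        $buffer = fread($handle, $chunkSize);
        if ($buffer === false || $buffer === '') {
            break; // read error, timeout, or end of data
        }
        $emit($buffer);
        $sent += strlen($buffer);
    }
    return $sent;
}

// Possible use in a download script (assumption: $declared holds the value
// sent in the Content-Length header):
//
// $sent = stream_in_chunks($handle, 1024 * 1024, function (string $chunk) {
//     echo $chunk;
//     if (ob_get_level() > 0) {
//         ob_flush(); // only flush a PHP output buffer if one is active
//     }
//     flush();
// });
// if ($sent !== $declared) {
//     error_log("sent $sent bytes, declared $declared");
// }
```

Logging the two numbers side by side shows whether the shortfall comes from the read loop or from the declared length itself.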

Re: Download Missing Last Chunk

Posted: Thu Feb 10, 2011 8:07 am
by cesarcesar
Weirdan wrote:Mordred, the file is not being sent as chunked because Transfer-encoding header is not set. Thus there's no reason to omit Content-length.

Code: Select all

header("Content-Transfer-Encoding: binary");
Isn't this the Transfer-Encoding header? Am I not using it properly?

Re: Download Missing Last Chunk

Posted: Thu Feb 10, 2011 8:20 am
by Mordred
Weirdan wrote:Mordred, the file is not being sent as chunked because Transfer-encoding header is not set. Thus there's no reason to omit Content-length.
As I said, I suspected that the ob_flush and/or a server setting (gzip?) forces chunked encoding behind the scenes. That's why I asked cesarcesar to check his "local" http response.

That said, running the numbers again: if we assume 150M of content in 1M output chunks, that gives a loss of 150*8 bytes = 1200 bytes, not the ~3M reported here. (Initially I misread the numbers as bytes, not kbytes, and didn't calculate the chunked/non-chunked difference exactly.)

So, examine the HTTP responses, both from the CDN and the hosting server.
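
One detail worth checking while examining the CDN response: `get_headers()` in associative mode returns an array for any header that appears in more than one response, for example when the URL redirects. The sketch below assumes that situation. `final_content_length()` is a hypothetical helper, and taking the last value is a heuristic, since not every response in a redirect chain has to carry the header. Nothing in this thread confirms that the CDN redirects.

```php
<?php
// Hypothetical helper: picks a Content-Length value out of the array returned
// by get_headers($url, 1). Returns null if the header is missing or malformed.
function final_content_length(array $headers): ?int
{
    foreach ($headers as $name => $value) {
        // Header names are case-insensitive; integer keys are status lines.
        if (is_string($name) && strcasecmp($name, 'Content-Length') === 0) {
            // With several responses the value is an array; take the last one.
            $last = is_array($value) ? end($value) : $value;
            return ctype_digit((string) $last) ? (int) $last : null;
        }
    }
    return null;
}

// Possible use (assumption: $path is the remote URL from the first post):
//
// $length = final_content_length(get_headers($path, 1));
// if ($length !== null) {
//     header('Content-Length: ' . $length);
// }
```

Omitting the header when the length is unknown avoids declaring a size the script cannot back up.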

Re: Download Missing Last Chunk

Posted: Thu Feb 10, 2011 8:23 am
by Weirdan
cesarcesar wrote:Isn't this the Transfer-encoding header? Am i not using it properly?
Content-Transfer-Encoding is not part of the HTTP standard (http://tools.ietf.org/html/rfc2616#section-19.4.5); it's for MIME (email, for example). And in your script it wasn't set to chunked anyway.

The term 'chunked encoding' is not the same as 'outputting data in chunks'. It is a special encoding that requires adding size information about every chunk to the output stream (read more: http://en.wikipedia.org/wiki/Chunked_transfer_encoding). Note that it is specifically designed for cases where the size of the response entity is unknown, so if you use it you won't get a proper progress indicator.
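
To illustrate the difference, here is a rough sketch of the framing that chunked transfer encoding adds on the wire. The function names are made up for this example, and in practice the web server normally applies this framing itself rather than the PHP script. Each chunk is prefixed with its length in hexadecimal and followed by CRLF, and a zero-length chunk ends the body.

```php
<?php
// Frames one piece of data as an HTTP/1.1 chunk: hex length, CRLF, data, CRLF.
function frame_chunk(string $data): string
{
    return dechex(strlen($data)) . "\r\n" . $data . "\r\n";
}

// The zero-length chunk that terminates a chunked body (no trailers).
const LAST_CHUNK = "0\r\n\r\n";

// Framing overhead in bytes for a body split into chunks of $chunkBytes,
// plus one shorter final chunk if the body does not divide evenly.
function chunked_overhead(int $bodyBytes, int $chunkBytes): int
{
    $full = intdiv($bodyBytes, $chunkBytes);
    $rest = $bodyBytes % $chunkBytes;
    // Per chunk: hex length digits + CRLF after the length + CRLF after the data.
    $overhead = $full * (strlen(dechex($chunkBytes)) + 4);
    if ($rest > 0) {
        $overhead += strlen(dechex($rest)) + 4;
    }
    return $overhead + strlen(LAST_CHUNK);
}
```

Under these assumptions, `chunked_overhead(150 * 1048576, 1048576)` comes to 1505 bytes for a 150 MiB body in 1 MiB chunks. That is slightly more than the 1200-byte estimate earlier in the thread, but still far too small to account for a multi-megabyte shortfall.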
Mordred wrote: As I said, I suspected that the ob_flush and/or a server setting (gzip?) forces chunked encoding behind the scenes. That's why I asked cesarcesar to check his "local" http response.
Ah, now it's clear.