Archive (.zip .tar) extraction security

Posted: Thu Apr 29, 2010 12:56 pm
by AlexC
Morning,

As part of a media module I'm tweaking, which lets you display images as a gallery, for example, I'm adding the ability to upload multiple images in a single .zip or .tar archive. However, I'm wondering how to handle the security side of this. I recall reading a while back that it's quite easy to create a very small .tar or .zip archive that expands to something huge when extracted and fills up the disk.

Is there any way to see what the extracted size will be first? What is the best way to handle this situation?

Regards,

Re: Archive (.zip .tar) extraction security

Posted: Thu Apr 29, 2010 3:23 pm
by John Cartwright
I've never heard of this behavior. Compression size is usually only about 30% smaller than the uncompressed size.

Re: Archive (.zip .tar) extraction security

Posted: Thu Apr 29, 2010 3:28 pm
by Benjamin
There is a way to create massive differences between compressed and uncompressed file sizes. It's fairly easy to do, so I won't say how. You may want to review the compression libraries for information on how to obtain the uncompressed file sizes. I'm sure this information is stored in the archive itself some place.
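
For zip archives, the sizes Benjamin mentions can be read without extracting anything. The following is a minimal sketch using Python's standard `zipfile` module rather than the PHP library the module would actually use. The limits `MAX_TOTAL_BYTES` and `MAX_RATIO` and the helper names are assumptions made up for illustration. The declared sizes come from the archive itself, so a hostile uploader can forge them; treat this as a first filter, not a guarantee.

```python
import io
import zipfile

# Hypothetical limits for illustration; tune them for your own site.
MAX_TOTAL_BYTES = 50 * 1024 * 1024
MAX_RATIO = 100


def declared_sizes(source):
    """Return (uncompressed, compressed) byte totals declared in the zip directory."""
    with zipfile.ZipFile(source) as zf:
        infos = zf.infolist()
    uncompressed = sum(info.file_size for info in infos)
    compressed = sum(info.compress_size for info in infos)
    return uncompressed, compressed


def is_acceptable(source, max_total=MAX_TOTAL_BYTES, max_ratio=MAX_RATIO):
    """Reject archives that declare too much data or an implausible ratio."""
    uncompressed, compressed = declared_sizes(source)
    if uncompressed > max_total:
        return False
    # max(compressed, 1) avoids dividing by zero for archives of empty files.
    return uncompressed / max(compressed, 1) <= max_ratio


def make_demo_zip(payload_size):
    """Build an in-memory zip holding one highly compressible file of zero bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("zeros.bin", b"\0" * payload_size)
    buf.seek(0)
    return buf
```

Calling `is_acceptable(make_demo_zip(10 * 1024 * 1024))` should reject the demo archive on ratio alone, even though its total size is under the byte limit.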

Re: Archive (.zip .tar) extraction security

Posted: Thu Apr 29, 2010 6:56 pm
by Weirdan
John Cartwright wrote: I've never heard of this behavior.
You've never heard of the Zip of Death? I'm surprised.
I once had a multi-layer zip file, 42 KB in size, which would take more than 4 petabytes when recursively unpacked.

Re: Archive (.zip .tar) extraction security

Posted: Fri Apr 30, 2010 3:17 am
by AlexC
John Cartwright wrote: I've never heard of this behavior. Compression size is usually only about 30% smaller than the uncompressed size.

Code:

$ yes a | head -c 20485760 > 20MiB
$ ls -la 20MiB && gzip 20MiB && ls -la 20MiB.gz 
-rw-r--r-- 1 acartwright acartwright 20485760 2010-04-30 08:59 20MiB
-rw-r--r-- 1 acartwright acartwright 19920 2010-04-30 08:59 20MiB.gz
That's a file of roughly 20 MB (20,485,760 bytes) compressed to about 20 KB (19,920 bytes).
Benjamin wrote: You may want to review the compression libraries for information on how to obtain the uncompressed file sizes. I'm sure this information is stored in the archive itself some place.
Yep, found it. I'm not sure how I missed it before, but gzip has a '-l' flag that lists the compressed and uncompressed sizes and the compression ratio. 'tar' on its own does no compression, so an archive should be roughly the same size as its contents (I got a little muddled before and had it in my head that 'tar' sizes could differ).

Code:

$ gzip -l 20mb.gz 
         compressed        uncompressed  ratio uncompressed_name
              19920            20485760  99.9% 20mb
Should be able to work with this. Thanks.
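
One caveat about relying on the listed size: the gzip format stores the uncompressed size in the file's trailer, modulo 2^32, and whoever created the file wrote that value. It is a hint rather than a guarantee. A more defensive approach is to decompress in chunks and stop once a limit is passed. The sketch below shows that idea with Python's standard `gzip` module; the function name `gunzip_capped` and the chunk size are assumptions for illustration, not part of any library.

```python
import gzip
import io

CHUNK = 64 * 1024  # read size per step; an arbitrary illustrative value


def gunzip_capped(src, dst, max_bytes):
    """Decompress the gzip stream src into dst, refusing to exceed max_bytes.

    Returns the number of bytes written. Raises ValueError as soon as the
    decompressed output would pass the limit, so the disk is never filled.
    """
    written = 0
    with gzip.GzipFile(fileobj=src) as gz:
        while True:
            chunk = gz.read(CHUNK)
            if not chunk:
                break
            written += len(chunk)
            if written > max_bytes:
                raise ValueError("decompressed data exceeds limit")
            dst.write(chunk)
    return written
```

Because the check happens while data is being produced, it does not matter what size the archive claims to contain. On a `ValueError` the caller would still need to delete whatever partial output was already written.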

Re: Archive (.zip .tar) extraction security

Posted: Fri Apr 30, 2010 10:35 pm
by John Cartwright
Weirdan wrote:
John Cartwright wrote: I've never heard of this behavior.
You've never heard of the Zip of Death? I'm surprised.
I once had a multi-layer zip file, 42 KB in size, which would take more than 4 petabytes when recursively unpacked.
Looks like I need to do my research before commenting :banghead:

Re: Archive (.zip .tar) extraction security

Posted: Sun May 09, 2010 2:56 am
by kaisellgren
Apart from compression technologies, NTFS sparse files are also interesting. They're an easy way to fool DC++-style file-sharing systems that require you to share a certain number of gigabytes.

Anyway, if you are using PHP's Zip library, make sure you are on PHP 5.2.7 or newer, or you will be vulnerable to directory traversal attacks. Then make sure you read the uncompressed size from the Zip archive before extracting. Zip archives store that detail; I don't know about Tar, but I bet it stores it too.
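
On the Tar question: each tar member header does record that member's size, so the total can be added up before anything is extracted. The same pass can flag member names that would escape the extraction directory, which is the directory traversal problem mentioned above. Below is a minimal sketch with Python's standard `tarfile` module. The helper names `is_safe_member_name` and `inspect_tar` are made up for illustration, and the name check is deliberately conservative (it also rejects harmless names such as `a:b.jpg`).

```python
import io
import tarfile


def is_safe_member_name(name):
    """True if a member name cannot point outside the extraction directory."""
    name = name.replace("\\", "/")
    # Reject absolute paths and anything that looks like a drive letter.
    if name.startswith("/") or (len(name) > 1 and name[1] == ":"):
        return False
    # Reject any ".." path component.
    return ".." not in name.split("/")


def inspect_tar(source):
    """Return (total declared size of regular files, list of suspicious names)."""
    total = 0
    unsafe = []
    with tarfile.open(fileobj=source, mode="r:*") as tf:
        for member in tf:
            # Links are flagged too, since they can redirect later writes.
            if (not is_safe_member_name(member.name)
                    or member.issym() or member.islnk()):
                unsafe.append(member.name)
            if member.isfile():
                total += member.size
    return total, unsafe
```

An upload handler could refuse the archive when the returned list is non-empty or the total exceeds its quota, and only then extract. As with zip, the sizes are whatever the archive's creator wrote, so a cap enforced during extraction is still worth having.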