While I can no longer dispute that gzip is superior to custom compression, I have determined the following using
http://www.codeproject.com as an example (just the HTML - View Source):
Normal: 46386 bytes
w3compiler: 44129 bytes
gZip: 12352 bytes
Both: 11699 bytes
Apparently whitespace/formatting doesn't take up as much space as I first assumed; I'm not sure why that didn't occur to me.
Anyway, from the above benchmarks I have determined that optimizing the source first with a native HTML optimizer (
http://www.w3compiler.com) and then applying gzip compression did in fact reduce the file size further.
The additional benefits are that:
a. People cannot *easily* steal source code ideas

b. The browser doesn't have to parse extraneous whitespace, which should have some effect on the overall experience.
Using both methods, I saw a savings of 75%.
Using gzip alone, I saw a savings of 73%.
Using w3compiler alone, I saw a savings of 5%.
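For anyone who wants to check the arithmetic, those percentages follow directly from the byte counts above; a quick sketch:

```python
# Recompute the savings percentages from the byte counts measured above.
sizes = {
    "normal": 46386,
    "w3compiler": 44129,
    "gzip": 12352,
    "both": 11699,
}

def savings(compressed: int, original: int) -> float:
    """Percent reduction relative to the original size."""
    return (1 - compressed / original) * 100

for name in ("w3compiler", "gzip", "both"):
    print(f"{name}: {savings(sizes[name], sizes['normal']):.1f}%")
```

This prints roughly 4.9%, 73.4%, and 74.8%, which round to the 5/73/75 figures quoted.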
So why do the two methods' independent savings add up to more than their combined savings? gzip is slightly less effective when the HTML has already been optimized, for the same reason you cannot infinitely compress an already-compressed file: compression, like encryption, raises the entropy of the data, which makes further compression difficult or impossible. Compressors look for patterns, and once data has been encrypted or compressed those patterns are reduced or removed outright.
In case anyone else was curious (not directed at feyd).
Anyway, the important question to ask now is:
Does the 5% savings from stripping whitespace, which might make the browser render slightly faster, justify the cost of optimizing the HTML source in the first place? To determine that, we would have to benchmark average rendering times for optimized and unoptimized HTML, weigh the time savings there alongside the 5% bandwidth savings, and profile the code that performs the HTML optimization.
If rendering were faster and the 5% bandwidth savings outweighed the cost of the HTML optimization, then an Apache module which optimized the HTML and then gzipped it would obviously make sense.
Of course, the above was done on only a single file, so these averages could fluctuate greatly across varying web sites. Suffice it to say, it is likely easier to optimize JavaScript/HTML/CSS on the server side before delivery and have Apache serve cached, pre-compressed versions, rather than perform the optimizations on each request. The problem arises when your HTML/CSS/JavaScript is completely or partially generated by PHP, in which case the ease of using an Apache module might then make sense.
I'm too lazy to pursue this any further, so mod_deflate is likely the method I will continue to use...
Cheers
