Apache Extensions

alex.barylski
DevNet Evangelist
Posts: 6267
Joined: Tue Dec 21, 2004 5:00 pm
Location: Winnipeg

Apache Extensions

Post by alex.barylski »

So I'm sitting here thinking...you know what might make for a cool Apache extension...is a file compressor

Strip whitespace, comments, etc...

1) HTML
2) JavaScript
3) CSS

Saving you bandwidth galore...

Anyone know of such an Apache extension?
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

You're aware we have a board pretty much dedicated to Apache, right?

Post by alex.barylski »

feyd wrote:You're aware we have a board pretty much dedicated to Apache, right?
My bad, I posted this after reading the compiling PHP thread, without thinking to check the location... :?

Know of any Apache extensions BTW? :P

Post by feyd »

You'd probably save more by turning on actual compression.

No, I don't know of an extension that does that, nor have I looked for one. I've built a few in PHP however.

Post by alex.barylski »

Edit: Just read that you built a few in PHP. You wouldn't be a friend and email one to me? Although I'd prefer an Apache C module, PHP might work for the time being...this way I could also benchmark to see whether there is indeed an improvement.

Using both would likely yield the best results...

mod_deflate: Uses gzip, which will compress any file (images, text, sound, etc.) using a combination of compression routines that work best on files with repeated sequences of bytes.

The method I mention above would likely work better on its own input because it is a specific technique that only works on text files, but it has more thorough knowledge of what can and can't be removed.

Applying a source compression filter first and then mod_deflate would likely yield the best results...
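
A rough way to try the "strip first, then gzip" idea outside Apache is sketched below. This is only an illustration: `strip_whitespace` is a hypothetical, deliberately naive stripper (not w3compiler and not an Apache module), the sample markup is made up, and Python's `gzip` module stands in for mod_deflate.

```python
import gzip
import re


def strip_whitespace(html: str) -> str:
    """Naive stripper: drop HTML comments, collapse whitespace runs to one space."""
    html = re.sub(r"<!--.*?-->", "", html, flags=re.DOTALL)
    html = re.sub(r"\s+", " ", html)
    return html.strip()


def sizes(html: str) -> dict:
    """Byte counts for the four cases discussed: normal, stripped, gzip, both."""
    raw = html.encode("utf-8")
    stripped = strip_whitespace(html).encode("utf-8")
    return {
        "normal": len(raw),
        "stripped": len(stripped),
        "gzip": len(gzip.compress(raw)),
        "both": len(gzip.compress(stripped)),
    }


# Made-up sample page: an indented table repeated many times.
ROW = "    <tr>\n      <!-- a cell -->\n      <td>cell</td>\n    </tr>\n"
SAMPLE = "<table>\n" + ROW * 200 + "</table>\n"

result = sizes(SAMPLE)
for label, size in result.items():
    print(f"{label}: {size} bytes")
```

The numbers from a toy page like this say nothing about real sites; the point is only that all four sizes can be measured with a few lines before deciding whether a dedicated module is worth writing.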

mod_deflate wouldn't compress already-stripped text *as* well as files full of whitespace...but I am sure some compression would still come of it. Besides, with gzip the browser decompresses the response back into the fully expanded HTML document, which it then has to parse (whitespace included).

If it could ignore the whitespace, it would render even faster, or appear to, because of parsing times being reduced.

The benefits are there, it's just complicated. Could potentially screw up some oddly coded HTML, etc...

I'm going to investigate :P

Post by feyd »

In my experience, gzip yields far better results (smaller message length) the vast majority of the time.

Post by alex.barylski »

While I can no longer dispute that gZip is superior to custom compression, I have determined the following using http://www.codeproject.com as an example (just HTML - View Source):

Normal: 46386 bytes
w3compiler: 44129 bytes
gZip: 12352 bytes
Both: 11699 bytes

Apparently whitespace/formatting doesn't take up as much space as I first assumed, not sure why that didn't occur to me. :P

Anyways, from the above benchmarks I have determined that compressing source code first using a native HTML optimizer (http://www.w3compiler.com) followed by gZip compression did in fact reduce the file even more.

The additional benefit is that:
a. People cannot *easily* steal source code ideas :P
b. The browser doesn't have to parse extraneous whitespace, which should have an effect on overall experience.

Using both methods I experienced a savings of: 75%
Using gZip method I experienced a savings of: 73%
Using w3cc method I experienced a savings of: 5%
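
For anyone who wants to check the arithmetic, the percentages above follow directly from the byte counts. A small sketch (the helper name `savings_percent` is just for illustration):

```python
def savings_percent(original: int, reduced: int) -> float:
    """Percentage of bytes saved going from `original` to `reduced`."""
    return (1 - reduced / original) * 100


NORMAL = 46386  # uncompressed page size in bytes, from the benchmark above

for label, size in [("w3cc", 44129), ("gZip", 12352), ("both", 11699)]:
    print(f"{label}: {savings_percent(NORMAL, size):.1f}% saved")

# Extra bytes saved by stripping before gZip, compared with gZip alone.
extra_bytes = 12352 - 11699
print(f"extra saving from stripping first: {extra_bytes} bytes")
```

Put another way, stripping before gZip saves a further 653 bytes on this page, which is about 1.4% of the original size.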

So why do the two methods' individual savings add up to more than the savings from using them together? gZip is slightly less effective when the HTML has already been stripped down, because there is less redundancy left for it to exploit. It's the same reason you cannot keep recompressing an already compressed file. Compression, like encryption, produces high-entropy output, which makes further compression difficult or impossible. Compression looks for patterns, and when data is encrypted or compressed those patterns are reduced or removed outright.

In case anyone else was curious (not directed at feyd) :)
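
The "can't keep recompressing" point is easy to see for yourself. A minimal sketch, assuming Python's `gzip` module as a stand-in for any gzip implementation; the word list and the fixed seed are arbitrary choices for the demo:

```python
import gzip
import os
import random

# Build some pattern-rich "text" from a small vocabulary (fixed seed so it is repeatable).
rng = random.Random(0)
WORDS = ["div", "span", "class", "href", "table", "style", "script", "body",
         "the", "and", "with", "from", "page", "link", "width", "height"]
text = " ".join(rng.choice(WORDS) for _ in range(5000)).encode("ascii")

once = gzip.compress(text)    # first pass finds plenty of patterns
twice = gzip.compress(once)   # second pass has almost nothing left to find

# Random bytes behave like encrypted data: no patterns to exploit.
random_bytes = os.urandom(len(text))
random_gz = gzip.compress(random_bytes)

print("text:", len(text), "once:", len(once), "twice:", len(twice))
print("random:", len(random_bytes), "random gzipped:", len(random_gz))
```

The first pass shrinks the text a lot, the second pass gains essentially nothing, and the random input typically comes out slightly larger than it went in because of the gzip header and framing.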

Anyways, the important question to ask now is:

Does the 5% savings in whitespace, which might make the browser render slightly faster, justify the cost of compressing the HTML source code in the first place? To determine that, we would have to benchmark average rendering times of compressed and uncompressed HTML, then weigh those time savings plus the 5% bandwidth savings against the profiled cost of the code that performs the HTML compression.

If the rendering times were faster and the 5% bandwidth savings outweighed the HTML compression, then obviously having an Apache module which compressed HTML then gZipped it would make sense.

Of course the above was only done on a single file, so these averages could fluctuate greatly across web sites...suffice it to say that it is likely easier to compress JavaScript/HTML/CSS once on the server side before delivery and have Apache just deliver the cached/compressed versions, rather than optimizing the HTML, JavaScript and CSS on each delivery. The problem arises when your HTML/CSS/JavaScript is completely or partially generated from PHP...in which case the ease of using an Apache module might then make sense.

I'm too lazy to pursue this any further, so mod_deflate is likely the method I will continue to use... :P
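
The "compress once, deliver the cached copy" approach for static files could look something like the sketch below. Everything here is an assumption for illustration: the `precompress_tree` helper, the `.gz`-next-to-the-source layout, and the file names are made up, and the web server would still have to be configured separately to serve the precompressed copies.

```python
import gzip
import tempfile
from pathlib import Path

COMPRESSIBLE = {".html", ".css", ".js"}


def precompress_tree(root: Path) -> list:
    """Write a .gz copy next to each HTML/CSS/JS file; skip copies that are up to date."""
    written = []
    for path in sorted(root.rglob("*")):  # sorted() snapshots the listing first
        if not path.is_file() or path.suffix not in COMPRESSIBLE:
            continue
        target = path.with_name(path.name + ".gz")
        # Only redo the work when the source is newer than the cached copy.
        if target.exists() and target.stat().st_mtime >= path.stat().st_mtime:
            continue
        target.write_bytes(gzip.compress(path.read_bytes()))
        written.append(target)
    return written


# Demo on a throwaway directory with made-up files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "index.html").write_text("<html>\n  <body>hi</body>\n</html>\n")
    (root / "style.css").write_text("body {\n  margin: 0;\n}\n")
    (root / "logo.png").write_bytes(b"\x89PNG not really an image")
    first_run = [p.name for p in precompress_tree(root)]
    second_run = [p.name for p in precompress_tree(root)]

print(first_run)
print(second_run)
```

The second run does nothing because the cached copies are already newer than their sources, which is the whole attraction over filtering every response. It obviously doesn't help with output generated by PHP on each request.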

Cheers :)

Post by feyd »

Something you have to take great care with when using a destructive compression such as removing whitespace is not removing it in places where whitespace is critical, such as in <pre> and <tt> tags. Here's the catch: CSS can make almost any tag whitespace critical. Also take note that if you didn't write your Javascript carefully (no multiline strings, always using semicolons to end statements, etc.), the Javascript will break, horribly.
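
To illustrate the kind of care involved, here is a minimal sketch of a stripper that leaves <pre> and <textarea> contents alone. It is an assumption-laden toy: the function name is made up, it uses a regex rather than a real HTML parser, and it does nothing about elements that CSS makes whitespace critical (for example via `white-space: pre`), which is exactly the catch described above.

```python
import re

# Blocks whose contents must be passed through untouched.
PROTECTED = re.compile(r"<(pre|textarea)\b.*?</\1\s*>", re.DOTALL | re.IGNORECASE)


def collapse_outside_protected(html: str) -> str:
    """Collapse whitespace runs to one space, except inside <pre>/<textarea> blocks."""
    out = []
    pos = 0
    for match in PROTECTED.finditer(html):
        out.append(re.sub(r"\s+", " ", html[pos:match.start()]))
        out.append(match.group(0))  # protected block copied verbatim
        pos = match.end()
    out.append(re.sub(r"\s+", " ", html[pos:]))
    return "".join(out)


page = "<p>a   b</p>\n<pre>x\n   y</pre>\n<p>c\n\nd</p>"
squeezed = collapse_outside_protected(page)
print(squeezed)
```

Even this conservative version only collapses whitespace to a single space instead of deleting it, since removing the space between two inline elements can change how the page renders.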