File Generating CMS
Posted: Thu Mar 01, 2007 10:07 pm
I'm a big fan of static HTML files. They're simple, self-contained, and easy to edit; in short, they carry very little baggage. You can easily version control them with SVN, you have complete control over their structure, and they're fast.
But they obviously have their limitations: no dynamic content, no templates, no guards to make sure you write standards-compliant code, no syntax highlighting, etc. To combat these problems, most CMSes out there have taken to redirecting all requests through PHP files, which generate the HTML on the fly and serve it to the browser. If you're lucky, you'll have a single PHP front controller that all the requests get redirected to.
I would like to propose a different approach: use Apache and .htaccess as your front controller, and have it serve static HTML files, calling a PHP file to generate the HTML from a source XHTML file whenever some condition is met.
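A minimal .htaccess sketch of what I have in mind (the handler name main.php and the paths are just placeholders):

```apache
# Apache serves static files directly when they exist; anything
# missing falls through to our generator, which doubles as the
# 404 handler and application entry point.
ErrorDocument 404 /main.php

# Or, equivalently, with mod_rewrite:
# RewriteEngine On
# RewriteCond %{REQUEST_FILENAME} !-f
# RewriteRule \.html$ /main.php [L]
```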
Example:
You have demo.xhtml, which is your source file. It is a well-formed XHTML document, probably importing a few other namespaces which our post-processor will handle. Let's say it contains some text, uses <q> tags for quotations, contains the website's common header, and has some programming code.
When Apache receives a request for http://www.example.com/demo.html, the file does not yet exist, so Apache forwards the request along to main.php, our 404 handler and also our main application entry point. It translates the requested URI into a source file, tests whether the source file exists, and then processes it: it canonicalizes URIs from index.xhtml to index.html, replaces <q> tags with curly quotes, substitutes in the common header, and runs GeSHi on the programming code (all of this done with the help of DOM and XPath). Then, it writes the result into the new HTML file and serves the data itself.
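To make the compile step concrete, here's a rough sketch of the DOM/XPath processing in PHP. The function name compile_page is hypothetical, and I've left out the header substitution and GeSHi pass; it just shows the URI canonicalization and the <q>-to-curly-quotes rewrite:

```php
<?php
// Hypothetical sketch of the compile step: load the source XHTML,
// post-process it with DOM/XPath, and return the finished markup
// ready to be written to the cached .html file.
function compile_page($source_path) {
    $doc = new DOMDocument();
    $doc->load($source_path);

    $xpath = new DOMXPath($doc);
    $xpath->registerNamespace('h', 'http://www.w3.org/1999/xhtml');

    // Canonicalize internal links: index.xhtml -> index.html
    foreach ($xpath->query('//h:a[@href]') as $a) {
        $a->setAttribute('href',
            preg_replace('/\.xhtml$/', '.html', $a->getAttribute('href')));
    }

    // Replace <q> elements with their contents wrapped in curly quotes
    foreach ($xpath->query('//h:q') as $q) {
        $frag = $doc->createDocumentFragment();
        $frag->appendChild($doc->createTextNode("\u{201C}"));
        while ($q->firstChild) {
            $frag->appendChild($q->firstChild);
        }
        $frag->appendChild($doc->createTextNode("\u{201D}"));
        $q->parentNode->replaceChild($frag, $q);
    }

    return $doc->saveHTML();
}
```

The real main.php would call something like this, write the return value to demo.html, and then echo it to the client.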
The next time a client requests the file, Apache will direct them straight to the static HTML: no fuss necessary. When the source XHTML gets updated, delete the compiled HTML and let the 404 handler do its magic. You could also set up a cron job that compares filemtime() between the source and compiled files, or a special GET flag that flushes the cache.
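The cron-job variant could be as simple as the following sketch (flush_stale is a made-up name; it deletes any compiled page whose source has been modified since it was generated, so the 404 handler regenerates it on the next hit):

```php
<?php
// Hypothetical cache-invalidation pass: for each source .xhtml,
// delete its compiled .html if the source is newer. The next
// request then falls through to the 404 handler, which rebuilds it.
function flush_stale($root) {
    foreach (glob("$root/*.xhtml") as $source) {
        $compiled = preg_replace('/\.xhtml$/', '.html', $source);
        if (file_exists($compiled)
            && filemtime($source) > filemtime($compiled)) {
            unlink($compiled);
        }
    }
}
```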
(End example)
This is by no means meant to replace dynamic systems; if you route everything through a PHP file, you end up with a standard front controller with filesystem caching. But with this method we harness the power of Apache and mod_rewrite and let them take care of the caching for us, gaining the convenience of PHP processing while removing the overhead.
So... thoughts? Comments? Prior art?