I'm not quite following the two numbered items above, but I have had to deal with a larger site with high traffic (44k unique visitors a month, which was high in my experience; I know others deal with more). The code it was built on was quite wasteful of resources and had not been tested for high volume. (It was a new framework where I worked, so between working out performance issues and not knowing how best to use the code that had been developed, it needed help.)
The framework we used relied on Smarty. However, we (the developers at the company) were new to Smarty and to the proper way of using its built-in caching, so I sat down and figured out how to manage it better with my own system.
Key things to factor out:
--What parts of the site HAVE to be dynamic (e.g. code that shows your current shopping cart, or "Login / My Account" type links). These need to be handled outside of the cached portions.
--What parts of the page are affected by other aspects of the site (e.g. a product category page listing the products: if you modify, add, or remove a product listed on that page, the whole page needs to be rebuilt).
--What chunks of information can be cached separately (e.g. a main navigation that goes several levels deep and appears on more than one page). Not only did I cache the final page output, I also cached results or data (e.g. an array of all page links with their names).
Now, here is the route I took with my caching. There was a directory that contained all the cached data. If you went to, say,
http://example.com/product-my-widget it would look in the cache directory. If the file existed and was within a certain time frame (I like having pages rebuilt after about a week), it would use the cached page; otherwise it would generate the page and save it to the cache. I usually name the cache file after the URL-encoded request, the same as you would see in log files: GET (or POST) /path/path/file?query=string. This makes sure that they are all unique.
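A rough sketch of that lookup could look like the following. The helper names (cache_file_for, cache_read, cache_fetch) and the page-building callback are hypothetical, made up for this example; the one-week lifetime and the URL-encoded request as file name follow the description above.

```php
<?php
// Hypothetical helpers: a sketch of the file cache described above,
// not the code from the original site.

// Build a unique cache file path from the request line.
function cache_file_for($strCacheDir, $strMethod, $strUri) {
    // urlencode() turns "/" and "?" into %2F and %3F, so the key is a safe file name
    return rtrim($strCacheDir, '/') . '/' . urlencode($strMethod . ' ' . $strUri);
}

// Return the cached page, or null when it is missing or too old.
function cache_read($strFile, $intMaxAge = 604800) { // 604800 seconds = 7 days
    if (!is_file($strFile)) {
        return null;
    }
    if (time() - filemtime($strFile) > $intMaxAge) {
        return null;
    }
    $strPage = file_get_contents($strFile);
    return ($strPage === false) ? null : $strPage;
}

// Return the cached page, or build it with $fnBuild and store it.
function cache_fetch($strFile, $fnBuild, $intMaxAge = 604800) {
    $strPage = cache_read($strFile, $intMaxAge);
    if ($strPage === null) {
        $strPage = call_user_func($fnBuild);
        file_put_contents($strFile, $strPage, LOCK_EX);
    }
    return $strPage;
}
```

One caveat with this naming scheme: very long query strings can run past the file name length limit of the file system, so hashing the request (for example with md5) would be a reasonable alternative.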
Now, back to the things that have to stay dynamic. The dynamic data is not included in the cached page generated above; instead, the page contains placeholders that trigger functions to produce the dynamic data. The placeholders use the following format: the function name, followed by a | and the parameters passed to it (saved in query-string style, using urlencode):
Code:
<div id="account-info">[[login_info]]</div>
<div id="cart-info">[[cart_items|pageid=44]]</div>
So then we do a preg_match_all to find these. (The pattern looks really nasty, since [ ] and | have special meaning in regex and have to be escaped.)
Code:
if (preg_match_all('/\[\[([^|\]]+)(\|([^\]]+))?\]\]/', $strPage, $dynamic)) {
    $aryDynData = array(); // Dynamic output, keyed by the full placeholder text
    // Load up the dynamic information
    foreach ($dynamic[0] as $index => $key) {
        if (!isset($aryDynData[$key])) {
            // Not processed yet during this page call, so run it now
            $fnName = 'dyncode_' . $dynamic[1][$index];
            if (function_exists($fnName)) {
                // Group 3 holds the query-string style parameters (may be empty)
                $aryParams = array();
                parse_str($dynamic[3][$index], $aryParams);
                $aryDynData[$key] = $fnName($aryParams);
            }
            else {
                $aryDynData[$key] = '[ERROR: Could not find function ' . $dynamic[1][$index] . ']';
            }
        }
    }
    // Put dynamic info into page:
    foreach ($aryDynData as $key => $val) {
        $strPage = str_replace($key, $val, $strPage);
    }
}
This code finds all placeholders and, if the matching function exists (functions are named dyncode_function_name), calls it and stores the result in an array. This way you can use a placeholder more than once on the page, but it is only processed once. If the function can't be found, the placeholder is replaced with an error message.
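To make the regex a little less mysterious, here is a small standalone sketch of what preg_match_all captures for the two sample placeholders. The variable names are only for illustration.

```php
<?php
// Sample page fragment with the two placeholders from the example above
$strPage = '<div id="account-info">[[login_info]]</div>'
         . '<div id="cart-info">[[cart_items|pageid=44]]</div>';

preg_match_all('/\[\[([^|\]]+)(\|([^\]]+))?\]\]/', $strPage, $dynamic);

// $dynamic[0] holds the full placeholders:   '[[login_info]]', '[[cart_items|pageid=44]]'
// $dynamic[1] holds the function names:      'login_info', 'cart_items'
// $dynamic[3] holds the raw parameter text:  '', 'pageid=44'

// parse_str() fills its second argument; it does not return the array
$aryParams = array();
parse_str($dynamic[3][1], $aryParams);
// $aryParams is now array('pageid' => '44')
```

Note that the parameters live in group 3, not group 1, and that the values come back as strings, which is why the cart function casts pageid to an int.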
Here is sample code for the functions above. For demonstration purposes, the cart one gets passed the pageid, so you can do something like not make it clickable when you are actually on the cart page.
Code:
function dyncode_login_info($params) {
    if (isset($_SESSION['user'])) {
        return 'Hello ' . htmlspecialchars($_SESSION['user']['first_name'], ENT_QUOTES) . '!';
    }
    else {
        return '<a href="/login">Login</a>';
    }
}
function dyncode_cart_items($params) {
    $intPageID = (isset($params['pageid'])) ? (int)$params['pageid'] : 0;
    $strCartInfo = '2 Items ($45.90)'; // Code to calculate cart info
    if (in_array($intPageID, array(44, 122, 244))) {
        // This page is one that should only display the cart info, not link to cart
        // (example, you are on the cart page or a checkout page)
        return $strCartInfo;
    }
    else {
        // All other pages, make this a link to the cart
        return '<a href="/cart">' . $strCartInfo . '</a>';
    }
}
So now you have all that and it is working, but there is still the issue of what to do when you change something. You mention that it takes a long time to "reindex" everything. I assume you mean rebuilding all the cached pages. Well, here is the thing: there is no need to do that with the approach I described, as the system will create the cached copy the first time a page is requested and no cached copy exists (or the cached copy is over a week old).
So how do you determine what gets deleted when it comes to clearing out the cache? The easiest way is to just wipe the whole cache. If you are not editing pages/products much at all, just set it up so that any time a POST occurs on an editor page in your admin, the directory gets wiped. Otherwise, add a command to your admin that clears the directory.
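Wiping the directory only takes a few lines. This is a minimal sketch assuming a flat cache directory with no subdirectories; cache_clear is a hypothetical name, and the commented-out call at the bottom uses a made-up path.

```php
<?php
// Hypothetical helper: delete every regular file in the cache directory
// and return how many files were removed.
function cache_clear($strCacheDir) {
    $strCacheDir = rtrim($strCacheDir, '/');
    if (!is_dir($strCacheDir)) {
        return 0; // Nothing to clear
    }
    $intDeleted = 0;
    foreach (scandir($strCacheDir) as $strName) {
        $strFile = $strCacheDir . '/' . $strName;
        // is_file() skips "." and ".." as well as any subdirectories
        if (is_file($strFile) && unlink($strFile)) {
            $intDeleted++;
        }
    }
    return $intDeleted;
}

// Assumed usage in an admin editor script: wipe the cache after any POST.
// if ($_SERVER['REQUEST_METHOD'] === 'POST') {
//     cache_clear('/path/to/cache'); // made-up path
// }
```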
Doing this sped up the site I was working on considerably. Unfortunately, the client didn't want to pay for it, and my bosses didn't want me "wasting time" developing it. I may be a geek who would write it all just for my own experience, but the company wouldn't let me change the site for free, and I needed permission to change a live site that large. Again, I didn't only cache output; I cached the results of data queries as well, so anywhere I needed the main nav, it read the array in from a file (it was stored serialized). That is a lot better than (in this case) several hundred queries.
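The data caching can be sketched the same way as the page caching. Here data_cache_fetch and the loader callback are hypothetical names; the loader stands in for whatever queries build the array.

```php
<?php
// Hypothetical helper: return data from a serialized cache file, or build it
// with $fnLoad (for example, the queries that produce the nav array) and store it.
function data_cache_fetch($strFile, $fnLoad, $intMaxAge = 604800) { // 7 days
    if (is_file($strFile) && time() - filemtime($strFile) <= $intMaxAge) {
        $strRaw = file_get_contents($strFile);
        if ($strRaw !== false) {
            // unserialize() returns false on corrupt data, so fall through and rebuild
            $mixData = unserialize($strRaw);
            if ($mixData !== false) {
                return $mixData;
            }
        }
    }
    $mixData = call_user_func($fnLoad);
    file_put_contents($strFile, serialize($mixData), LOCK_EX);
    return $mixData;
}
```

Since a stored value of false can't be told apart from a failed unserialize, this sketch only suits data such as arrays that are never plain false.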
Hope this information helps.
-Greg