Best pattern for caching objects?
Hello
In a web environment, what is the best method for caching objects?
I'm dubious about hitting the database too frequently and would like to construct commonly used objects only once in the lifetime of the webserver.
What is the best pattern for creating an object and having it available for all HTTP sessions? Is it even possible?
Cheers
Jem
ps. this is not x-posted. I have moved it from the code forum.
You might find this interesting.
I sure did.
It deals with why Smarty is pointless as a templating engine, since PHP can do the same things anyway if you know the concepts of templating.
It also has a way to cache. I didn't look very closely at how it does it, though.
http://www.sitepoint.com/article/1218/1
Re: Best pattern for caching objects?
seine wrote:
Hello
In a web environment, what is the best method for caching objects?

In a scripting language like php you've always got to build everything from scratch. However, performance is rarely an issue. Go ahead and hit that database. Beat it, bash it, smash it, crash it. It likes hard work. Wait and see if you have a problem before trying to fix it.
Re: Best pattern for caching objects?
McGruff wrote:
In a scripting language like php you've always got to build everything from scratch.

Or if you want the simple option, use libraries created by others that have been tested and well designed.
McGruff wrote:
However, performance is rarely an issue. Go ahead and hit that database. Beat it, bash it, smash it, crash it. It likes hard work. Wait and see if you have a problem before trying to fix it.

Hmm .. if you're expecting hundreds of thousands of hits a day, caching is extremely important to reduce the massive server load. It also speeds up delivery of the page to the clients, which is important in this day of the "I want it NOW" attitude (which I also have).
"Hmm .. if you're expecting hundreds of thousands of hits a day, caching is extremely important to reduce the massive server load."

It is extremely counter-productive to waste time worrying over performance. You should not worry about this while you're coding (assuming you're not doing anything totally crazy). Just try to write good, maintainable code.
Wait and see how the finished product performs. Think about optimising if you really need to - and if it isn't cheaper and easier just to get new hardware.
Yeah, I agree that writing maintainable code is the most important thing.
But you should keep any possible issues in mind, because you might end up having to rewrite the system if you're not careful.
And caching is not complicated at all. I read that link more closely and it's absurdly easy, and it will reduce server load enormously.
-
ilovetoast
- Forum Contributor
- Posts: 142
- Joined: Thu Jan 15, 2004 7:34 pm
I'll add my support for caching as well. I think that a caching system, such as that found in either of the PEAR Cache classes or in the Site Point article, is a simple and effective way to improve the performance of some projects.
Specifically, I find them most useful in projects where the data changes, but not that often. If the data is only changing a couple times per day or less and you're getting a lot of page loads (thousands per day), caching can be a great help.
I didn't read through the Site Point code as that d*** site is apparently made by someone with no concept of text contrast issues. I mean really... pink on white? I'm not even going to try to squint through that code. I'm assuming from the description and surrounding text it's the same concept as the PEAR Cache classes.
peace
Lord Whitewash Hutton
Are we talking about page caching or caching objects?
Sure, there's no point in serving pages dynamically if the content doesn't change often. Probably most pages on most sites are better wrapped in a buffer & written to file.
Objects could be serialized and stored in a file/db for use by each and every HTTP request (if that's what the original poster meant). The object doesn't persist exactly - it still has to be rebuilt in much the same way that an object is instantiated in a normal script. While different HTTP requests can all instantiate from the same class, they can't share a single object - at least not if the request processing might change the object's state.
What exactly do you want to achieve?
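To make the serialize-and-store idea concrete, here is a minimal sketch of a file-based object cache. The class name, key hashing, and TTL handling are my own inventions for illustration; a real implementation would also need file locking and error handling.

```php
<?php
// Minimal file-based object cache: serialize an object to disk on first
// build, unserialize it on later requests. Names and layout are illustrative.
class FileObjectCache
{
    private $dir;

    public function __construct($dir)
    {
        $this->dir = rtrim($dir, '/');
    }

    // Return the cached object if fresh, or rebuild it via $builder and cache it.
    public function get($key, $ttl, $builder)
    {
        $file = $this->dir . '/' . md5($key) . '.cache';
        if (file_exists($file) && (time() - filemtime($file)) < $ttl) {
            return unserialize(file_get_contents($file)); // rebuild from disk
        }
        $object = call_user_func($builder);           // expensive build (e.g. db hit)
        file_put_contents($file, serialize($object)); // persist for later requests
        return $object;
    }
}
```

Note that each request still pays the cost of unserialize(), so the object is rebuilt rather than truly shared - exactly the caveat described above.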
The op (me) wants to achieve global access to object instances.
Every visitor to a particular page will see a different dynamic page, but many of the underlying objects that are used to populate that page are configurable, but basically never change. Currently I'm recreating the objects from the datastore with each page request. I want to create and cache on first access and use the cached instances from that time on.
Why? Because I believe it's a better software engineering practice than just bashing at the database. Also, I want to know if it can be done.
Thanks for your discussions guys.
Jem
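Within a single request, the "create on first access, reuse thereafter" part can be sketched as a lazy-loading registry. The class and method names here are hypothetical; to make the objects survive across requests you'd combine this with a persistent store (file, shared memory, etc.).

```php
<?php
// Hypothetical lazy-loading registry: each object is built at most once
// per request; later lookups hand back the cached instance.
class Registry
{
    private $instances = array();
    private $builders  = array();

    // Register a builder callback without running it yet.
    public function register($name, $builder)
    {
        $this->builders[$name] = $builder;
    }

    // Build on first access, then return the same instance thereafter.
    public function get($name)
    {
        if (!isset($this->instances[$name])) {
            $this->instances[$name] = call_user_func($this->builders[$name]);
        }
        return $this->instances[$name];
    }
}
```

This gives controlled global access without scattering `global` statements through the code: everything goes through the registry's get().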
I haven't seen the pear module, but the caching concept is easy to follow in the sitepoint article.
There are 2 classes - each about 30 lines (most of it comments).
The second class is the one that deals with caching.
Not sure about caching objects, but caching pages - or parts of pages - is done in an elegantly simple way there. So to avoid excessive db access for display of info, that would be the way to go.
Of course, the pear module might do it also. But to understand how it's done, you might want to see the sitepoint one, as it's very basic and easy to understand - largely because it's so small.
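The page-caching trick those classes implement boils down to PHP's output buffering: capture the rendered HTML, write it to a file, and serve the file on later hits while it is fresh. This is a sketch of the concept, not the actual sitepoint code; the function names and TTL are mine.

```php
<?php
// Sketch of output-buffer page caching (illustrative, not the sitepoint code).
// Serve the cached copy if it is fresh, otherwise start capturing output.
function page_cache_start($file, $ttl)
{
    if (file_exists($file) && (time() - filemtime($file)) < $ttl) {
        readfile($file);   // serve the cached page
        return false;      // caller should skip rendering
    }
    ob_start();            // capture everything echoed from here on
    return true;
}

function page_cache_end($file)
{
    file_put_contents($file, ob_get_contents()); // save the rendered page
    ob_end_flush();                              // and still send it to the client
}
```

Typical use: `if (page_cache_start($file, 300)) { /* render the page */ page_cache_end($file); }` - the cost of rendering (and its db queries) is paid only when the cache file has expired.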
seine wrote:
The op (me) wants to achieve global access to object instances.
Every visitor to a particular page will see a different dynamic page, but many of the underlying objects that are used to populate that page are configurable, but basically never change. Currently I'm recreating the objects from the datastore with each page request. I want to create and cache on first access and use the cached instances from that time on.
Why? Because I believe it's a better software engineering practice than just bashing at the database. Also, I want to know if it can be done.
Thanks for your discussions guys.
Jem

I agree. PHP could really use two things:
1) A way to store its compiled state (like Tcl, for example)
2) Or something akin to an application server that would provide app vars
(including objects)
There is one project to create a PHP application server (that I know of). That's SRM, but it's been dead for some time. I emailed one of the developers and he says they are going to pick up the pace again sometime this year.
Another is something newer and really isn't intended to be an application server as such, but provides some cool things just the same. It's called Continuity and can be found at http://www.ashpool.com/software.php.
Short of the above mentioned, Turck (the accelerator) allows one to store variables in shared memory and provides an API for storing and retrieving that information. The only thing it doesn't do, which would be awesome, is store database connection resources.
On the other hand, this is something I think Continuity is capable of. I'm not sure.
Anyways, give Turck MMcache a look see.
http://turck-mmcache.sourceforge.net/
Beyond that, there are third-party (money required) solutions such as ObjectCache, which may or may not work with PHP. It can be found at http://www.objectstore.net/products/obj ... /index.ssp. I'm also not really sure if it requires the use of an ODBMS.
Cheers,
BDKR
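If memory serves, Turck MMCache exposes mmcache_put()/mmcache_get() for its shared-memory store, but check its documentation - those function names are quoted from memory, not verified. A thin wrapper like the sketch below lets code fall back to a plain per-request array when the extension isn't loaded, so it stays runnable either way.

```php
<?php
// Thin wrapper over a shared-memory store. The mmcache_put()/mmcache_get()
// names are an assumption (verify against the Turck MMCache docs); without
// the extension we fall back to an ordinary per-request array.
class SharedStore
{
    private static $fallback = array();

    public static function put($key, $value, $ttl = 0)
    {
        if (function_exists('mmcache_put')) {
            return mmcache_put($key, $value, $ttl); // shared across requests
        }
        self::$fallback[$key] = $value;             // per-request only
        return true;
    }

    public static function get($key)
    {
        if (function_exists('mmcache_get')) {
            return mmcache_get($key);
        }
        return isset(self::$fallback[$key]) ? self::$fallback[$key] : null;
    }
}
```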
Ech, couple comments...
"The op (me) wants to achieve global access to object instances. "
global = evil
Avoid globals in general.
-----------
"Why? Because I believe it's a better software engineering practice than just bashing at the database. "
Not necessarily. "Good" software engineering practice is getting a working program that satisfies the customer's needs. Overanalyzing and trying to get a perfect program is impossible, so you just write something, and then you test it. I can't tell you how many things like this I hear about, only to run a test and see that there really is no issue.
For example, if you spend too much focus on this and you miss an index on your database, then it won't matter what you do, data access will be slow (yes, this happened in one case). Get the overall program running, then identify specific problems, then define what would make each problem "fixed", and then address those specific problems. As you get more experience, you will simply have fewer and fewer problems in your programs. But the key here is to just do it, then test it.
In other words, you're trying to solve a problem that you have no clue if it is a problem even. Trust me, there are plenty of problems out there, but you need to be sure it's a problem first.
My last company was ALWAYS worried about our software being too slow. So there was CONSTANTLY a push to write more "efficient" code, even at the expense of readability, usability and reusability. Problem one was that 95% of our time was spent getting data from the database, not in our code... so they wanted to cache everything. Did nothing. Period.
Why? Because the cache was not the right solution. Or rather, it was a solution, but not to the problem we had. The problem we had (found by our resident nutjobber using VTune) was that the database drivers themselves were the issue. The timings in our code were fine, and the SQL we ran on the command line was fine on the servers, but when the same SQL ran through the drivers on the clients, it was slow as hell. The solution? Would have been to write a custom driver.
Lastly and MOST important, ANY time you are doing something to "speed something up", it is ENTIRELY worthless to do it until you make a test program to get timings. Unless you can say MethodA runs in X time and MethodB runs in Y time, it's useless. More often than not, when you make your test program and get the timing, you'll probably find out it's fast enough already anyway.
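The "get timings first" advice is easy to act on: a few lines around microtime() will tell you whether MethodA or MethodB is actually faster. The two methods below are placeholders, of course; substitute whatever you suspect is slow.

```php
<?php
// Crude timing harness: run each candidate many times and compare wall time.
// The two closures are placeholder "MethodA"/"MethodB" implementations.
function time_it($fn, $iterations = 10000)
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        call_user_func($fn);
    }
    return microtime(true) - $start; // elapsed seconds
}

$methodA = function () { return str_repeat('x', 100); };
$methodB = function () { $s = ''; for ($i = 0; $i < 100; $i++) { $s .= 'x'; } return $s; };

printf("MethodA: %.4fs\nMethodB: %.4fs\n", time_it($methodA), time_it($methodB));
```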
---------
"It is extremely counter-productive to waste time worrying over performance. You should not worry about this while you're coding (assuming you're not doing anything totally crazy). Just try to write good, maintainable code. "
Smart man. You know there is going to be a problem somewhere. It may well be performance, but chances are it won't be. I can tell you're a professional, McGruff.
----------
"Wait and see how the finished product performs. Think about optimising if you really need to - and if it isn't cheaper and easier just to get new hardware."
If you charge your client, say, 50 bucks an hour for your labor, and it will take a week to fix something, you just charged them 2000 bucks. For 800 bucks I can build you a computer that would blow your socks off... you'd be surprised how often a performance problem can be solved just by upgrading the hardware. It's business, not perfection in code.
-----------
"Hmm .. if you're expecting hundreds of thousands of hits a day, caching is extremely important to reduce the massive server load."
Huh? You need to be more specific. What kind of implementation are we talking about? The most important caching is on the database itself. Tuning the database properly is everything. If a certain query is run often, the database should be caching it anyway. Not to mention that if it is run often, the query should be hitting proper indices. The variance in the speed of queries from changing the query a little or changing a setting on the database server is astounding. Apache is the same... the place to address this is in the database itself or in Apache. Trying to share data/objects between users potentially across the world (am I really reading that right?) to save a few database hits is not what I would consider a good starting point.