Best pattern for caching objects?

Posted: Wed Jan 28, 2004 12:27 am
by seine
Hello

In a web environment, what is the best method for caching objects?

I'm dubious about hitting the database too frequently and would like to construct commonly used objects only once in the lifetime of the webserver.

What is the best pattern for creating an object and having it available for all HTTP sessions? Is it even possible?

Cheers
Jem

ps. this is not x-posted. I have moved it from the code forum.

Posted: Wed Jan 28, 2004 12:49 am
by lazy_yogi
You might find this interesting.
I sure did.
It deals with why Smarty is pointless as a templating engine, since PHP can do the same thing anyway if you understand the concepts of templating.

It also shows a way to cache. I didn't look very closely at how it does it, though.

http://www.sitepoint.com/article/1218/1

Re: Best pattern for caching objects?

Posted: Wed Jan 28, 2004 9:48 am
by McGruff
seine wrote:Hello

In a web environment, what is the best method for caching objects?
In a scripting language like php you've always got to build everything from scratch.

However, performance is rarely an issue. Go ahead and hit that database. Beat it, bash it, smash it, crash it. It likes hard work. Wait and see if you have a problem before trying to fix it.

Re: Best pattern for caching objects?

Posted: Wed Jan 28, 2004 3:07 pm
by lazy_yogi
McGruff wrote:In a scripting language like php you've always got to build everything from scratch.
Or if you want the simple option, use libraries created by others that have been tested and are well designed.
McGruff wrote:However, performance is rarely an issue. Go ahead and hit that database. Beat it, bash it, smash it, crash it. It likes hard work. Wait and see if you have a problem before trying to fix it.
Hmm .. if you're expecting hundreds of thousands of hits a day, caching is extremely important to reduce the massive server load. It also speeds up delivery of the page to clients - which is important in this day of the "I want it NOW" attitude (which I also have :D )

Posted: Wed Jan 28, 2004 11:35 pm
by McGruff
lazy_yogi wrote:Hmm .. if you're expecting hundreds of thousands of hits a day, caching is extremely important to reduce the massive server load. It also speeds up delivery of the page to clients - which is important in this day of the "I want it NOW" attitude (which I also have)
It is extremely counter-productive to waste time worrying over performance. You should not worry about this while you're coding (assuming you're not doing anything totally crazy). Just try to write good, maintainable code.

Wait and see how the finished product performs. Think about optimising if you really need to - and if it isn't cheaper and easier just to get new hardware.

Posted: Thu Jan 29, 2004 3:39 am
by lazy_yogi
Yeh I agree that writing maintainable code is the most important thing.

But you should keep any possible issues in mind, because you might end up having to rewrite the system if you're not careful.

And caching is not complicated at all. I read that link more closely, and it's absurdly easy and will reduce server load enormously.

Posted: Thu Jan 29, 2004 10:48 am
by ilovetoast
I'll add my support for caching as well. I think that a caching system, such as the one found in either of the PEAR Cache classes or in the SitePoint article, is a simple and effective way to improve the performance of some projects.

Specifically, I find them most useful in projects where the data changes, but not that often. If the data is only changing a couple times per day or less and you're getting a lot of page loads (thousands per day), caching can be a great help.

I didn't read through the SitePoint code, as that d*** site is apparently made by someone with no concept of text contrast issues. I mean really... pink on white? I'm not even going to try to squint through that code. I'm assuming from the description and surrounding text that it's the same concept as the PEAR Cache classes.

peace

Lord Whitewash Hutton

Posted: Thu Jan 29, 2004 10:09 pm
by timvw
Instead of reinventing the wheel, what's wrong with using the caching modules in Apache?

Posted: Fri Jan 30, 2004 12:39 am
by McGruff
Are we talking about page caching or caching objects?

Sure, there's no point in serving pages dynamically if the content doesn't change often. Probably most pages on most sites are better wrapped in a buffer & written to file.
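
The buffer-and-write-to-file idea can be sketched roughly like this (the cache path, key and TTL here are made up for illustration - this isn't code from the SitePoint article or PEAR):

```php
<?php
// Rough sketch of page caching via an output buffer written to a file.
// $cacheFile and $ttl are illustrative names, not from any article.
$cacheFile = '/tmp/cache_' . md5('example-page') . '.html';
$ttl = 300; // serve the stored copy for up to 5 minutes

if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $ttl) {
    readfile($cacheFile); // cache hit: send the stored page and stop
    exit;
}

ob_start(); // cache miss: start buffering all output

// ... build the page as usual (database queries, templates, etc.) ...
echo "<html><body>Hello from the generated page</body></html>";

$html = ob_get_contents();            // grab the finished page
file_put_contents($cacheFile, $html); // store it for the next request
ob_end_flush();                       // and still send it to this client
```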

Objects could be serialized and stored in a file/db for use by each and every HTTP request (if that's what the original poster meant). The object doesn't persist exactly - it still has to be rebuilt, in much the same way that an object is instantiated in a normal script. While different HTTP requests can all instantiate from the same class, they can't share a single object - at least not if the request processing might change the object's state.
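
As a sketch of that serialize-and-rebuild approach (the Catalogue class and the file path are invented for illustration, not taken from anyone's code):

```php
<?php
// Sketch: cache an object by serializing it to a file on first build,
// then rebuild it from that file on later requests. All names invented.
class Catalogue {
    public $products;

    public function __construct() {
        // imagine expensive database queries happening here
        $this->products = array('widget', 'sprocket');
    }
}

function load_catalogue($cacheFile) {
    if (is_file($cacheFile)) {
        // rebuild the object from its stored form - note that the class
        // definition must still be loaded for unserialize() to work
        return unserialize(file_get_contents($cacheFile));
    }
    $cat = new Catalogue();                         // first request: build it
    file_put_contents($cacheFile, serialize($cat)); // store for later requests
    return $cat;
}

$catalogue = load_catalogue('/tmp/catalogue.cache');
```

One catch: you have to delete the cache file whenever the underlying data changes, or every later request keeps seeing a stale object.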

What exactly do you want to achieve?

Posted: Sun Feb 01, 2004 4:10 am
by seine
The OP (me) wants to achieve global access to object instances.

Every visitor to a particular page will see a different dynamic page, but many of the underlying objects that are used to populate that page are configurable, but basically never change. Currently I'm recreating the objects from the datastore with each page request. I want to create and cache on first access and use the cached instances from that time on.

Why? Because I believe it's a better software engineering practice than just bashing at the database. Also, I want to know if it can be done.

Thanks for your discussions guys.
Jem

Posted: Sun Feb 01, 2004 9:03 am
by lazy_yogi
I haven't seen the PEAR module, but the caching concept is easy to follow in the SitePoint article.
There are 2 classes - each about 30 lines (most of it comments).
The second class is the one that deals with caching.

Not sure about caching objects, but caching pages - or parts of pages - is done in an elegantly simple way there. So to avoid excessive db access just to display info, that would be the way to go.

Of course, the PEAR module might do it too. But to understand how it's done, you might want to look at the SitePoint one, as it's very basic and easy to understand - largely because it's so small.

Posted: Sun Feb 01, 2004 9:07 am
by timvw
I haven't seen any programs that provide you with a pool of objects that live all the time (but that's not to say they don't exist). I do know about some programs that provide such pools (e.g. the JBoss appserver), but they all seem to have a database backend.

Posted: Wed Feb 11, 2004 6:52 am
by BDKR
seine wrote:The OP (me) wants to achieve global access to object instances.

Every visitor to a particular page will see a different dynamic page, but many of the underlying objects that are used to populate that page are configurable, but basically never change. Currently I'm recreating the objects from the datastore with each page request. I want to create and cache on first access and use the cached instances from that time on.

Why? Because I believe it's a better software engineering practice than just bashing at the database. Also, I want to know if it can be done.

Thanks for your discussions guys.
Jem
I agree. PHP could really use two things.

1) A way to store its compiled state (like TCL, for example)
2) Something akin to an application server that would provide app vars (including objects)

There is one project to create a PHP application server (that I know of). That's SRM, but it's been dead for some time. I emailed one of the developers and he says they are going to pick up the pace again sometime this year.

Another project is newer and isn't really intended to be an application server as such, but it provides some cool things just the same. It's called Continuity and can be found at http://www.ashpool.com/software.php.

Short of the above, Turck (the accelerator) allows one to store variables in shared memory and provides an API for storing and retrieving that information. The only thing it doesn't do, which would be awesome, is store database connection resources.

On the other hand, this is something I think Continuity is capable of. I'm not sure.

Anyway, give Turck MMCache a look:
http://turck-mmcache.sourceforge.net/
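
For the curious, using that shared-memory store would look roughly like this. I'm going from my reading of the Turck MMCache docs, so treat the mmcache_put()/mmcache_get() names and signatures as an assumption and check the project page before relying on them:

```php
<?php
// Hedged sketch of Turck MMCache's shared-memory API. mmcache_put() and
// mmcache_get() are assumed from the project docs (verify them yourself);
// the function_exists() guard lets this run even without the extension.
if (function_exists('mmcache_get')) {
    $config = mmcache_get('site_config');          // shared across requests
    if ($config === null) {
        $config = array('title' => 'My Site');     // expensive build goes here
        mmcache_put('site_config', $config, 3600); // keep it for an hour
    }
} else {
    // extension not available: just build the value on every request
    $config = array('title' => 'My Site');
}
```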

Beyond that, there are third-party (money required) solutions such as ObjectCache, which may or may not work with PHP. It can be found at http://www.objectstore.net/products/obj ... /index.ssp. I'm also not really sure whether it requires the use of an ODBMS.

Cheers,
BDKR

Posted: Wed Feb 11, 2004 6:54 am
by BDKR
timvw wrote:Instead of reinventing the wheel, whats wrong with using the caching modules in apache?
Do you have a link to any info on this?

Cheers,
BDKR

Posted: Sat Feb 21, 2004 1:25 am
by eletrium
Ech, a couple of comments...

"The OP (me) wants to achieve global access to object instances."

global = evil

Avoid globals in general.
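
For what it's worth, one common middle ground (my own sketch, not from anything in this thread) is a small registry class, so an object is reachable everywhere without littering the code with global statements:

```php
<?php
// A minimal registry: one well-known access point instead of scattered
// globals. PHP 5 style; all names here are illustrative.
class Registry {
    private static $items = array();

    public static function set($key, $value) {
        self::$items[$key] = $value;
    }

    public static function get($key) {
        return isset(self::$items[$key]) ? self::$items[$key] : null;
    }
}

// store a shared object once, fetch it anywhere later
Registry::set('config', array('db_name' => 'shop'));
$config = Registry::get('config');
```

It still has global-ish state, but at least every access goes through one named, greppable interface instead of $GLOBALS.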

-----------

"Why? Because I believe it's a better software engineering practice than just bashing at the database. "

Not necessarily. "Good" software engineering practice is getting a working program that satisfies the customer's needs. Overanalyzing and trying to write a perfect program is impossible, so you just write something, and then you test it. I can't tell you what percentage of things like this I hear about, only to run a test and see that there really is no issue.

For example, if you spend too much focus on this and you miss an index on your database, then it won't matter what you do - data access will be slow (yes, this happened in one case). Get the overall program running, then identify specific problems, define what would make each problem "fixed", and then address those specific problems. As you get more experience, you will simply have fewer and fewer problems in your programs. But the key here is to just do it, then test it.

In other words, you're trying to solve a problem without even knowing whether it is a problem. Trust me, there are plenty of problems out there, but you need to be sure it's a problem first.

My last company was ALWAYS worried about our software being too slow. So there was CONSTANTLY a push to write more "efficient" code, even at the expense of readability, usability and re-usability. Problem one was that 95% of our time was spent getting data from the database, not in our code... so they wanted to cache everything. It did nothing. Period.

Why? Because the cache was not the right solution. Or rather, it was a solution, but not to the problem we had. The problem we had (found by our resident nutjobber using VTune) was that the database drivers themselves were the issue. The timings in our code were fine, and the SQL we ran on the command line was fine on the servers, but when the same SQL ran through the drivers on the clients, it was slow as hell. The solution? Would have been to write a custom driver.

Lastly and MOST important: ANY time you are doing something to "speed something up", it is ENTIRELY worthless to do it until you make a test program to get timings. Unless you can say MethodA runs in X time and MethodB runs in Y time, it's useless. More often than not, when you make your test program and get the timings, you'll probably find out it's fast enough already anyway.
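
A minimal timing harness along those lines might look like this (my own sketch; the two methods compared are arbitrary examples, and microtime(true) plus anonymous functions need a reasonably modern PHP):

```php
<?php
// Tiny harness for putting numbers on "MethodA vs MethodB" before
// optimising anything. The compared methods are arbitrary examples.
function time_it($fn, $iterations = 10000) {
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return microtime(true) - $start; // elapsed seconds
}

$a = time_it(function () { return implode('', array('a', 'b', 'c')); });
$b = time_it(function () { return 'a' . 'b' . 'c'; });

printf("implode: %.6fs  concat: %.6fs\n", $a, $b);
```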

---------

"It is extremely counter-productive to waste time worrying over performance. You should not worry about this while you're coding (assuming you're not doing anything totally crazy). Just try to write good, maintainable code. "

Smart man. You know there is going to be a problem somewhere. It may well be performance. But chances are it won't be. I can tell you're a professional, McGruff.

----------
"Wait and see how the finished product performs. Think about optimising if you really need to - and if it isn't cheaper and easier just to get new hardware."

If you charge your client, say, 50 bucks an hour for your labour, and it takes a week to fix something, you've just charged them 2000 bucks. For 800 bucks I can build you a computer that would blow your socks off... you'd be surprised how often a performance problem can be solved just by upgrading the hardware. It's business, not perfection in code.

-----------

"Hmm .. if you're expecting hundreds of thousands of hits a day, caching is extremely important to reduce the massive server load."

Huh? You need to be more specific. What kind of implementation are we talking about? The most important caching is on the database itself. Tuning the database properly is everything. If a certain query is run often, the database should be caching it anyway. Not to mention that if it is run often, the query should be hitting proper indexes. The variance in query speed from changing the query a little, or changing a setting on the database server, is astounding. Apache is the same... the place to address this is in the database itself or in Apache.

Trying to share data/objects between users potentially across the world (am I really reading that right?) to save a few database hits is not what I would call a good starting point.