
Cache Concurrency Control

Posted: Wed Jun 17, 2009 1:10 pm
by dbsights
Hey, I am currently considering some new concepts for cache concurrency control and I would like your opinions on them. The solution would be implemented on memcached running in front of MySQL. Basically, I need a way to be certain that no writes are 'lost', while keeping performance as high, overhead as low, and the design as simple as possible.

So as for concurrency, here are my ideas compared:

1. Object locking. All objects in memcached carry a simple flag that, when set, prevents scripts from reading them. The flag is set after a read that will be acted upon (i.e. will result in a write), and removed once the write is complete. (Variant: the flag only blocks other writing processes, not read-only processes.)

PROS
Low overhead
Simple

CONS
What to do with blocked reads?
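To make option 1 concrete, here is a minimal single-process sketch. Since memcached can't atomically flip a flag inside a stored value, I'm assuming the usual realization: a separate `lock:<key>` entry claimed with an atomic `add`. The `FakeCache` class is a hypothetical in-memory stand-in for a memcached client (real clients expose the same `add`/`get`/`set`/`delete` operations); the key names are made up for illustration.

```python
class FakeCache:
    """Hypothetical in-memory stand-in for a memcached client."""
    def __init__(self):
        self._store = {}
    def add(self, key, value):
        # Atomic add: succeeds only if the key is absent (memcached 'add' semantics).
        if key in self._store:
            return False
        self._store[key] = value
        return True
    def get(self, key):
        return self._store.get(key)
    def set(self, key, value):
        self._store[key] = value
    def delete(self, key):
        self._store.pop(key, None)

def locked_update(cache, key, update_fn):
    """Read-modify-write under an object lock; returns False if already locked."""
    lock_key = "lock:" + key
    if not cache.add(lock_key, 1):   # flag already set: another writer holds it
        return False
    try:
        value = cache.get(key)       # read made with intent to write
        cache.set(key, update_fn(value))
        return True
    finally:
        cache.delete(lock_key)       # remove the lock once the write completes

cache = FakeCache()
cache.set("counter", 0)
locked_update(cache, "counter", lambda v: v + 1)
```

What this doesn't answer is exactly the CON above: a caller who gets `False` back must decide whether to spin, sleep-and-retry, or serve the (possibly stale) value anyway.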

2. Property locking. Identical to the above, but with a separate flag for each property.

PROS
Reduces instances of blocked reads when many properties are in demand

CONS
Larger overhead
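The payoff of per-property flags is that writers to different properties of the same object don't block each other. A minimal sketch, using a plain dict and set as hypothetical stand-ins for memcached storage and the per-property flags (the `user:1` key and its properties are made up for illustration):

```python
cache = {}      # stand-in for memcached: key -> object
locks = set()   # per-property lock flags: (object_key, property) pairs

def lock_property(key, prop):
    """Set the flag for one property; fails if another writer already holds it."""
    if (key, prop) in locks:
        return False
    locks.add((key, prop))
    return True

def unlock_property(key, prop):
    locks.discard((key, prop))

# Two writers can work on different properties of the same object at once:
cache["user:1"] = {"name": "alice", "visits": 0}
a = lock_property("user:1", "visits")    # writer A locks 'visits' -> True
b = lock_property("user:1", "name")      # writer B locks 'name'   -> True, no conflict
c = lock_property("user:1", "visits")    # writer C is blocked     -> False
cache["user:1"]["visits"] += 1
unlock_property("user:1", "visits")
```

The larger overhead is visible here too: one flag per property per object, instead of one per object.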

3. Object versioning. For each object in memcached there is also a 'super object' containing metadata about the object it manages. Whenever a process needs to read an object, it first reads the associated super object to find the most up-to-date version, then reads that version unconditionally. If the process is reading with intent to write, it also records its read time in the super object. When the process needs to write, it instead creates a new object with the modified properties. It then updates its super object entry (the one containing the read time) with a reference to the new object and the write time. Finally, it parses the super object, destroying any out-of-date objects and marking the new object as the most recent version, combining versions if necessary by taking only the differences. If the same property has been altered in more than one version, the value written latest wins.

PROS
Solution to write blocking on object

CONS
Double access time for reads and writes
Additional overhead to parse super object
What if multiple processes overwrite the super object?
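Here is a stripped-down sketch of the option 3 flow for a single writer. A plain dict stands in for memcached, the `super:<key>` / `<key>:v<N>` naming scheme is my own invention for illustration, and (matching the last CON) nothing here protects the super object itself from concurrent overwrites:

```python
import time

cache = {}  # stand-in for memcached

def read_latest(key):
    """Read via the super object to find the most recent version of the object."""
    super_obj = cache["super:" + key]
    return cache["%s:v%d" % (key, super_obj["version"])]

def versioned_write(key, new_props):
    """Write by creating a new versioned object, then updating the super object."""
    super_obj = cache["super:" + key]
    stale = super_obj["version"]
    new_version = stale + 1
    obj = dict(cache["%s:v%d" % (key, stale)])
    obj.update(new_props)                        # take only the differences
    cache["%s:v%d" % (key, new_version)] = obj   # new object, modified properties
    super_obj["version"] = new_version           # reference to the new object
    super_obj["written"] = time.time()           # ...and the time it was written
    del cache["%s:v%d" % (key, stale)]           # destroy the out-of-date object

cache["super:item"] = {"version": 1, "written": time.time()}
cache["item:v1"] = {"price": 10, "stock": 5}
versioned_write("item", {"price": 12})
```

The double access time is visible in `read_latest`: every read costs one fetch for the super object plus one for the object itself.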

4. Property versioning. A mix of all the above methods. Every property of an object has an associated timestamp. When a process wants to read a property with intent to write, it first records the time, then reads the property. When writing, the process checks that the property's timestamp is not greater than its recorded read time. If it is, it re-reads the property and recalculates. Once the timestamp is less than the read time, it writes the property along with the current time.

PROS
Simple, holistic solution

CONS
High overhead
Code may run longer (possibly much longer, depending on demand for that object; this applies only to read/write operations, pure reads proceed at normal speed)
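Option 4 is essentially optimistic concurrency with per-property timestamps. A minimal single-threaded sketch, assuming each property is stored as a `(value, last_written_timestamp)` pair (my own representation; a real memcached version would need an atomic check-and-set, e.g. `gets`/`cas`, to make the timestamp check and the write one step):

```python
import time

# property -> (value, timestamp of last write); the object and values are made up
obj = {"balance": (100, time.time())}

def optimistic_update(obj, prop, compute):
    """Record the read time, recalculate, and only write if nobody wrote in between."""
    while True:
        value, _ = obj[prop]
        read_time = time.time()              # record the time, then read
        new_value = compute(value)           # possibly slow recalculation
        _, written = obj[prop]
        if written <= read_time:             # timestamp not greater than read time
            obj[prop] = (new_value, time.time())
            return new_value
        # else: a concurrent write landed first -- re-read and recalculate

optimistic_update(obj, "balance", lambda v: v - 30)
```

The "code may run longer" CON is the retry loop: under heavy write contention on one property, a process can go around it many times before its write sticks, while plain reads never enter the loop at all.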

So what do you think? Or do you have a better solution of your own? Right now I am leaning towards either #1 or #4. Maybe multiple systems could be used in a framework of some kind, so that objects only get as much write protection as they need...