Page 1 of 1

How do you implement versioning?

Posted: Thu Oct 26, 2006 1:23 am
by alex.barylski
In designing any system, versioning is usually an important topic. As for you applications data, how do you typically implement versioning?

I mean high level as well. No using the DB native commit/rollback features. Something designed for your unique system so someone can, using an admin panel rifle through changes made, review comments and restore as required...

Do you take a snaphot of an entire record before changes storing that snapshot in another mirrored table? Or do you break changes into atomic updates and store only those fields which are changed?

Perhaps some other method?

In anycase, I'd like to hear how you implement custom versioning in your applications...be detailed too please... :)

Cheers :)

Posted: Thu Oct 26, 2006 2:09 am
by Weirdan
Do you take a snaphot of an entire record before changes storing that snapshot in another mirrored table? Or do you break changes into atomic updates and store only those fields which are changed?
Timestamped records.

Re: How do you implement versioning?

Posted: Thu Oct 26, 2006 3:26 am
by johno
Both versioning approaches mentioned have pros and cons.

If you timestamp records as Weirdan noted you have easy access to all of them at any time. But if one record is really big and changes are made really often space soon starts to be a problem.

On the other hand inserting only changed values/fields has the best space performance, but constructing a specific record version from that is pure pain.
Do you take a snaphot of an entire record before changes storing that snapshot in another mirrored table?
Maybe a nice trigger could do that automatically.

Posted: Thu Oct 26, 2006 11:03 am
by alex.barylski
Weirdan wrote:
Do you take a snaphot of an entire record before changes storing that snapshot in another mirrored table? Or do you break changes into atomic updates and store only those fields which are changed?
Timestamped records.
Ok Weirdan...for the sake of brevity....I'm going to have to request you cut down on the number of characters when replying to me :P

Just kidding... :lol:

johno I wanted DB agnostic answers for a reason. Mostly because I am only familiar with triggers, etc. Also, not all databases support triggers I don't think? :P

I realize that each implementation has it's advantages and disadvantages, but this wasn't what I was after. I am sure there are other methods which one could use to implement CVS like functionality.

Cheers :)

Posted: Thu Oct 26, 2006 12:53 pm
by feyd
Diff(erential) information is generally the best to go with I would say. Supporting the storage of key-points (a.k.a. tagged) where the full version could be stored may minimize the backtracking one would have to do to retrieve a specific version.

It may be interesting to div into Subversion's or even CVS' code to see how they organize it all, although I know they store via differentials. Subversion's scheme supports differencing binaries as well.

Posted: Thu Oct 26, 2006 2:57 pm
by alex.barylski
While I agree that diff is certainly most efficient...it's also most complicated...especially when you want to implement versioning on countless tables...

Clearly I need to sit down and think about this one :P

Posted: Thu Oct 26, 2006 3:49 pm
by Ambush Commander
MediaWiki also has its own versioning system which is less featured than SVN or CVS. Also note that SVN and CVS are slow: getting logs can easily take 20 seconds, something unacceptable for a web application.

Posted: Sat Oct 28, 2006 2:36 pm
by alex.barylski
The diff algorithm is pretty complex...thats why snapshops appeal to me...the problem is, where one has excellent speed, the other does not...and where the other has excellent minimal storage requirements...the other does not...

So it's a real catch-22 :?

One method I'm thinking...is to store snapshots on an immediate basis...and during regularly maintenance schedules...you could also convert the snapshot into a calculated diff and store that instead...once converted you could then remove the snapshots permanently :)

Hmmmm...that actually might work 8)

Posted: Sat Oct 28, 2006 3:01 pm
by Ambush Commander
No need to be that complicated. You can always just gzip the content in the database (that's how MediaWiki does it). Also, note that snapshots offer redundancy.

If speed for retrieving diffs is absolutely necessary, their output can be cached.

Posted: Sat Oct 28, 2006 3:25 pm
by alex.barylski
Ambush Commander wrote:No need to be that complicated. You can always just gzip the content in the database (that's how MediaWiki does it). Also, note that snapshots offer redundancy.

If speed for retrieving diffs is absolutely necessary, their output can be cached.
You could gzip...I considered that...but still...most changes are trivial & occur frequently...so you'd be compressing *alot* of redundant data...

I'm not sure what you mean by snapshots providing redundancy???

Cheers

Posted: Sat Oct 28, 2006 3:35 pm
by Ambush Commander
If you only use differentials, and one of them is damaged, you theoretically lose everything after that data loss. Redundancy from, usually, caching the most recent revision, prevents that from happening.

Posted: Sat Oct 28, 2006 4:01 pm
by alex.barylski
Ambush Commander wrote:If you only use differentials, and one of them is damaged, you theoretically lose everything after that data loss. Redundancy from, usually, caching the most recent revision, prevents that from happening.
Ok, thats where I figured you were heading...and it's a valid point...but maybe that's being to pedantic?

I mean, we leave those details in the hands of a RDBMS for a reason right? If something were to corrupt one table...it's possible that same fire, earth quake, whatever...would likely destroy all other tables as well...

As for power failure, in that case, you could simply implement the versioning tables using InnoDB no?

Thanks for the feedback... :)