How do you implement versioning?

Not for 'how-to' coding questions but PHP theory instead, this forum is here for those of us who wish to learn about design aspects of programming with PHP.

Moderator: General Moderators

Post Reply
alex.barylski
DevNet Evangelist
Posts: 6267
Joined: Tue Dec 21, 2004 5:00 pm
Location: Winnipeg

How do you implement versioning?

Post by alex.barylski »

In designing any system, versioning is usually an important topic. As for you applications data, how do you typically implement versioning?

I mean high level as well. No using the DB native commit/rollback features. Something designed for your unique system so someone can, using an admin panel rifle through changes made, review comments and restore as required...

Do you take a snaphot of an entire record before changes storing that snapshot in another mirrored table? Or do you break changes into atomic updates and store only those fields which are changed?

Perhaps some other method?

In anycase, I'd like to hear how you implement custom versioning in your applications...be detailed too please... :)

Cheers :)
User avatar
Weirdan
Moderator
Posts: 5978
Joined: Mon Nov 03, 2003 6:13 pm
Location: Odessa, Ukraine

Post by Weirdan »

Do you take a snaphot of an entire record before changes storing that snapshot in another mirrored table? Or do you break changes into atomic updates and store only those fields which are changed?
Timestamped records.
User avatar
johno
Forum Commoner
Posts: 36
Joined: Fri May 05, 2006 6:54 am
Location: Bratislava/Slovakia
Contact:

Re: How do you implement versioning?

Post by johno »

Both versioning approaches mentioned have pros and cons.

If you timestamp records as Weirdan noted you have easy access to all of them at any time. But if one record is really big and changes are made really often space soon starts to be a problem.

On the other hand inserting only changed values/fields has the best space performance, but constructing a specific record version from that is pure pain.
Do you take a snaphot of an entire record before changes storing that snapshot in another mirrored table?
Maybe a nice trigger could do that automatically.
alex.barylski
DevNet Evangelist
Posts: 6267
Joined: Tue Dec 21, 2004 5:00 pm
Location: Winnipeg

Post by alex.barylski »

Weirdan wrote:
Do you take a snaphot of an entire record before changes storing that snapshot in another mirrored table? Or do you break changes into atomic updates and store only those fields which are changed?
Timestamped records.
Ok Weirdan...for the sake of brevity....I'm going to have to request you cut down on the number of characters when replying to me :P

Just kidding... :lol:

johno I wanted DB agnostic answers for a reason. Mostly because I am only familiar with triggers, etc. Also, not all databases support triggers I don't think? :P

I realize that each implementation has it's advantages and disadvantages, but this wasn't what I was after. I am sure there are other methods which one could use to implement CVS like functionality.

Cheers :)
User avatar
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

Diff(erential) information is generally the best to go with I would say. Supporting the storage of key-points (a.k.a. tagged) where the full version could be stored may minimize the backtracking one would have to do to retrieve a specific version.

It may be interesting to div into Subversion's or even CVS' code to see how they organize it all, although I know they store via differentials. Subversion's scheme supports differencing binaries as well.
alex.barylski
DevNet Evangelist
Posts: 6267
Joined: Tue Dec 21, 2004 5:00 pm
Location: Winnipeg

Post by alex.barylski »

While I agree that diff is certainly most efficient...it's also most complicated...especially when you want to implement versioning on countless tables...

Clearly I need to sit down and think about this one :P
User avatar
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

MediaWiki also has its own versioning system which is less featured than SVN or CVS. Also note that SVN and CVS are slow: getting logs can easily take 20 seconds, something unacceptable for a web application.
alex.barylski
DevNet Evangelist
Posts: 6267
Joined: Tue Dec 21, 2004 5:00 pm
Location: Winnipeg

Post by alex.barylski »

The diff algorithm is pretty complex...thats why snapshops appeal to me...the problem is, where one has excellent speed, the other does not...and where the other has excellent minimal storage requirements...the other does not...

So it's a real catch-22 :?

One method I'm thinking...is to store snapshots on an immediate basis...and during regularly maintenance schedules...you could also convert the snapshot into a calculated diff and store that instead...once converted you could then remove the snapshots permanently :)

Hmmmm...that actually might work 8)
User avatar
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

No need to be that complicated. You can always just gzip the content in the database (that's how MediaWiki does it). Also, note that snapshots offer redundancy.

If speed for retrieving diffs is absolutely necessary, their output can be cached.
alex.barylski
DevNet Evangelist
Posts: 6267
Joined: Tue Dec 21, 2004 5:00 pm
Location: Winnipeg

Post by alex.barylski »

Ambush Commander wrote:No need to be that complicated. You can always just gzip the content in the database (that's how MediaWiki does it). Also, note that snapshots offer redundancy.

If speed for retrieving diffs is absolutely necessary, their output can be cached.
You could gzip...I considered that...but still...most changes are trivial & occur frequently...so you'd be compressing *alot* of redundant data...

I'm not sure what you mean by snapshots providing redundancy???

Cheers
User avatar
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

If you only use differentials, and one of them is damaged, you theoretically lose everything after that data loss. Redundancy from, usually, caching the most recent revision, prevents that from happening.
alex.barylski
DevNet Evangelist
Posts: 6267
Joined: Tue Dec 21, 2004 5:00 pm
Location: Winnipeg

Post by alex.barylski »

Ambush Commander wrote:If you only use differentials, and one of them is damaged, you theoretically lose everything after that data loss. Redundancy from, usually, caching the most recent revision, prevents that from happening.
Ok, thats where I figured you were heading...and it's a valid point...but maybe that's being to pedantic?

I mean, we leave those details in the hands of a RDBMS for a reason right? If something were to corrupt one table...it's possible that same fire, earth quake, whatever...would likely destroy all other tables as well...

As for power failure, in that case, you could simply implement the versioning tables using InnoDB no?

Thanks for the feedback... :)
Post Reply