[PLUG] Anyone know of "lossy" version-control software?

wcooley at nakedape.cc
Sun Dec 20 03:35:11 UTC 2009


I am looking for something that, as far as I have been able to find, does
not exist. I want a version-control package that intentionally and
programmatically destroys revisions.

Before I explain further, let me provide some background use cases. We
maintain a lot of data automatically and usually do a plain-text dump for
backup; in some cases we stuff these dumps into Subversion, primarily
for ease of roll-back in case something should break. For example, we have
LDIF dumps of our LDAP tree; text dumps of our Cyrus IMAP mailbox list;
generated /etc/group files that are stored and distributed via Subversion,
etc. At other times, when I was subject to Windows DNS admins, I stored
periodic snapshots of DNS zones with RCS.

The problem with keeping this data in a traditional VCS like Subversion is
that I don't actually need all of those revisions. I really need hourly
dumps for the last few days for roll-back purposes. It is sometimes nice
to have a daily or weekly dump to use for data-mining purposes, to answer
questions like, "When did we start setting this attribute like this?" or
"At what rate has our mailbox list grown?" Having snapshots every 5
minutes for a couple of hours wouldn't be bad either, but it's not really
feasible if they exist forever. If you think of the way rrdtool maintains
numerical time-series data, then the idea should seem pretty familiar.
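
To make the idea concrete, here is a minimal sketch in Python of the kind of
thinning policy I have in mind. It is not an existing tool. The tier values,
the TIERS table and the select_keepers function are all hypothetical names
and numbers picked to mirror the description above. It only decides which
snapshot timestamps survive; actually destroying the others is left out.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

# Hypothetical tiers: (maximum age, minimum spacing between kept snapshots).
# The numbers are assumptions, loosely modelled on rrdtool-style archives.
TIERS = [
    (timedelta(hours=2), timedelta(minutes=5)),  # every 5 minutes for 2 hours
    (timedelta(days=3), timedelta(hours=1)),     # hourly for a few days
    (timedelta(days=30), timedelta(days=1)),     # daily for a month
    (timedelta(days=365), timedelta(weeks=1)),   # weekly for a year
]


def select_keepers(timestamps, now, tiers=TIERS):
    """Return the set of snapshot timestamps to keep.

    Each snapshot falls into the first tier whose maximum age covers it.
    Within a tier, time is cut into buckets of the tier's spacing and only
    the oldest snapshot in each bucket is kept. Snapshots older than the
    last tier are not kept at all.
    """
    keep = set()
    filled = set()  # (tier index, bucket number) pairs that already have a keeper
    for ts in sorted(timestamps):  # oldest first
        age = now - ts
        for index, (max_age, spacing) in enumerate(tiers):
            if age <= max_age:
                bucket = (ts - EPOCH) // spacing
                if (index, bucket) not in filled:
                    filled.add((index, bucket))
                    keep.add(ts)
                break
    return keep


if __name__ == "__main__":
    now = datetime(2009, 12, 20, 3, 35)
    # Ten days of snapshots taken every 5 minutes.
    snapshots = [now - timedelta(minutes=5 * k) for k in range(12 * 24 * 10)]
    keepers = select_keepers(snapshots, now)
    print(f"{len(snapshots)} snapshots, {len(keepers)} kept")
```

Run periodically, something like this would hand back the list of revisions
to retain, and anything not on the list would be a candidate for destruction.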

So what I want is some sort of archival or version-control software that
can be configured for the programmed destruction of old versions and
keeping selected samples. Back-up software sort of does this and many
packages will back up to a file, but it seems like overkill to have to
set up and configure Bacula or Amanda just to manage a handful of files.

Part of the problem is that I just don't know what to call something like
this. "Round-robin archive" sounds good, but that's what rrdtool calls its
archive files. "Rolling-archive" also sounds right, but from what I can
tell that's a feature of the K2 WordPress UI theme. "Lossy VCS" or "lossy
SCM" leads only to snide comments about Visual SourceSafe.

I have used CVS and Subversion enough to know that getting them to throw
away revisions is painful. I've been using Git a lot lately and it
likewise seems unwilling to get rid of stuff. There are a lot of other
ones, like Arch, Darcs, Bazaar, Mercurial, etc., that I know little to
nothing of, but what I am asking for is something they are fundamentally
designed to not do. I've been wondering about something built on top of
the Git plumbing-layer, but do not have enough of a grasp of it to know if
it is possible.

Any ideas?

Wil



