[PLUG] Dirvish backups are using *far* too much space

Rogan Creswick creswick at gmail.com
Mon Apr 13 21:20:34 UTC 2009


I started using Dirvish to manage backups of a couple of important
directories at work in early March, and I just realized that the
nightly backups are taking up an enormous amount of space (about 30x
more than expected ;).

The vast majority of the content being backed up is static, but there
seems to be a full copy of everything made each night.  (And our
backup server just ran out of space, so I don't think it's just my
eyes playing tricks on me.)

For example, my git_repos directory is about 48 MB (total) and very,
very little has changed in there over the last month, yet running du
on that vault gives me this:

$ du . -h --max-depth=1
12K	./dirvish
46M	./2009_02_10-1511
48M	./2009_03_05-1110
48M	./2009_03_06-0025
48M	./2009_03_07-0026
48M	./2009_03_08-0036
....
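
As a sanity check on how to read that listing: GNU du counts a
hard-linked file only once per invocation, so if the images were
sharing unchanged files, every image after the first should show up as
nearly empty. A minimal sketch on scratch data (the directory and file
names below are made up for illustration; this is not my real vault):

```shell
#!/bin/sh
# Sketch: compare du totals for two hard-linked "images" versus two
# independently copied ones. All paths are hypothetical scratch paths.
set -eu
work=$(mktemp -d)
mkdir -p "$work/linked/img1" "$work/linked/img2" \
         "$work/copied/img1" "$work/copied/img2"

# 1 MiB of random data stands in for an unchanged backed-up file.
dd if=/dev/urandom of="$work/linked/img1/data" bs=1024 count=1024 2>/dev/null
# Second image shares the same inode via a hard link.
ln "$work/linked/img1/data" "$work/linked/img2/data"

# Same data, but the second image is a full copy.
cp "$work/linked/img1/data" "$work/copied/img1/data"
cp "$work/linked/img1/data" "$work/copied/img2/data"

# du -sk prints the total in KiB; the hard-linked file is counted once.
linked_kb=$(du -sk "$work/linked" | cut -f1)
copied_kb=$(du -sk "$work/copied" | cut -f1)
echo "hard-linked images: ${linked_kb}K, copied images: ${copied_kb}K"

rm -rf "$work"
```

The hard-linked pair should come out at roughly half the size of the
copied pair, which is the pattern I expected from the vault and am not
seeing.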


The Dirvish setup is as follows:

--------------- /etc/dirvish/master.conf -------------------------------------

# /mnt/seabackup is an sshfs-mounted volume (the filesystem is ext3,
# but that machine doesn't have dirvish installed, and I don't have root
# on it):
bank:
	/mnt/seabackup/vaults

image-default: %Y_%m_%d-%H%M

log: gzip
index: gzip

# don't cross devices, just in case I mount something in home...
xdev: 0
expire-default: +3 months

# rsync-option is a list type:
rsync-option:
	--inplace

# Don't try to maintain permissions.  I don't have that kind of access
# on seabackup.
permissions: 0

Runall:
	gnuwestlake-documents
	gnuwestlake-development
	gnuwestlake-git_repos
	gnuwestlake-workspace
	gnuwestlake-email
----------------------------------------------------------------------------------------------

Each vault has a config like this:

client: gnuwestlake
tree: /home/rcreswick/git_repos/


I've compared the output of `stat` for a couple of files in the dirvish
trees and for the "real" files in my live file system, and they all
point to different inodes (is that the right way to check for
hardlinks?).  The access/modify/change times on the live files that
really haven't changed are also much older than the corresponding
timestamps in the dirvish trees.  My googling so far hasn't turned
anything up, and #dirvish seems fairly dead.
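
To make the check concrete, here is a minimal sketch of what I
understand a hard-linked file should look like under GNU stat: the
same path in two images reports the same inode number (%i) and a link
count (%h) above 1, while an independent copy gets its own inode. The
image_a/image_b/image_c directories are hypothetical scratch names,
not my actual images:

```shell
#!/bin/sh
# Sketch: tell hard-linked files from copies using GNU stat.
# All paths are hypothetical scratch paths.
set -eu
work=$(mktemp -d)
mkdir "$work/image_a" "$work/image_b" "$work/image_c"
echo "unchanged content" > "$work/image_a/file.txt"

# image_b shares the file with image_a through a hard link.
ln "$work/image_a/file.txt" "$work/image_b/file.txt"
# image_c holds an independent copy of the same content.
cp "$work/image_a/file.txt" "$work/image_c/file.txt"

# %i = inode number, %h = number of hard links.
inode_a=$(stat -c %i "$work/image_a/file.txt")
inode_b=$(stat -c %i "$work/image_b/file.txt")
inode_c=$(stat -c %i "$work/image_c/file.txt")
links_a=$(stat -c %h "$work/image_a/file.txt")
links_c=$(stat -c %h "$work/image_c/file.txt")

echo "image_a inode=$inode_a links=$links_a"
echo "image_b inode=$inode_b"
echo "image_c inode=$inode_c links=$links_c"

rm -rf "$work"
```

In my vaults, every file I have looked at behaves like image_c here.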

Thanks!
--Rogan


