[PLUG] Favorite Backup Systems?

Russell Senior seniorr at aracnet.com
Fri May 21 16:35:01 UTC 2010


>>>>> "Tim" == Tim Wescott <tim at wescottdesign.com> writes:

Tim> I'm way bad -- it's been ages since I've backed up my machine,
Tim> and my, it's gotten big in there!  I'm backing up now just by
Tim> copying to CD -- but those CD's have gotten awfully small now
Tim> that customers are sending me multiple 100-MB files of data.

Tim> Ubuntu has a bunch of backup options; I'd like to choose one that
Tim> does what I need without having so many cool options that I spend
Tim> a year experimenting with all the wrong ways to do it before I
Tim> hit on the one that actually works.

Tim> What is your preferred backup system?  I'd like something that
Tim> supports a scheme that lets me do a monthly complete backup with
Tim> daily incrementals.  I'd like to be able to tag certain
Tim> directories as "don't back up".  Something that could backup to a
Tim> remote disk would be nice, but not essential.  Automatic backup,
Tim> at least for the incrementals, would be especially nice, as I'm
Tim> absent minded.

Tim> Also nice would be something that lets me plug a big USB disk
Tim> into the cruddy old laptop that I'm using as a server (stupid, I
Tim> know -- I'll replace it when it dies).  I'm thinking that one way
Tim> or another I want to use a two- or three-disk backup system, so
Tim> that if any one of them dies I'll be left with a fairly recent
Tim> image of my computer.

Tim> Suggestions?

Dirvish is a layer on top of rsync.  I use rsync directly.  The cool
thing about rsync is that incrementals and full backups are the same.
Every time you take an rsync snapshot (using the --link-dest option),
you get a parallel tree of directories with common files hardlinked
across. 
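
To make the hardlinking concrete, here is a small sketch that fakes two
snapshots by hand with ln, which is roughly what --link-dest does for a
file that has not changed.  The directory and file names are made up for
the illustration, and everything happens in a throwaway temp directory.

```shell
#!/bin/sh
# Illustration only: mimic two snapshots that share an unchanged file.
# Everything lives in a throwaway temp directory; the names are made up.
demo=$(mktemp -d)
mkdir "$demo/snap1" "$demo/snap2"
echo "unchanged data" > "$demo/snap1/report.txt"

# Roughly what rsync --link-dest does for a file that has not changed:
# create a second name for the same inode instead of a second copy.
ln "$demo/snap1/report.txt" "$demo/snap2/report.txt"

# Hardlink count of a file (second column of ls -l output).
link_count() { ls -ld "$1" | awk '{print $2}'; }

# Both names show the same inode number and a link count of 2.
ls -i "$demo/snap1/report.txt" "$demo/snap2/report.txt"
echo "link count: $(link_count "$demo/snap2/report.txt")"
```

The data is stored once no matter how many snapshots carry a name for it.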

You can blow away old snapshots in any order you like, and the
filesystem (just by virtue of how it normally works) keeps track of
decrementing link counts and removing only the files that were unique
to the blown-away tree.  The main disadvantage to this is that it's
a little hard to tell how much space you are really going to free by
blowing away a particular snapshot.
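
One way to get a rough estimate is to lean on the fact that du counts a
hardlinked file only once per invocation: list every other snapshot
first and the candidate last, and the figure printed for the candidate
covers only what the others do not already hold.  The sketch below
assumes all snapshots sit directly under one root directory; unique_kb
is a name I made up for the helper.

```shell
#!/bin/sh
# Estimate how many KiB deleting one snapshot would free.
# Usage: unique_kb ROOT SNAPSHOT_NAME
# Assumes every snapshot is a directory directly under ROOT.
unique_kb() {
    root=$1
    target=$2
    # Rebuild the argument list: all other snapshots, target left out.
    set --
    for d in "$root"/*/; do
        d=${d%/}
        [ "${d##*/}" = "$target" ] || set -- "$@" "$d"
    done
    # du skips inodes it has already counted, so the last line shows
    # only the blocks that no earlier-listed snapshot references.
    du -sk "$@" "$root/$target" | tail -n 1 | awk '{print $1}'
}
```

For example, unique_kb /backups 2010-05-21-0300 would print the estimate
for that snapshot.  Files that also have links outside the backup root
are counted as freeable even though they would not be, so treat the
number as an upper bound.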

You should be aware that the USB port on a cruddy old laptop is
probably going to run at a cruddy old USB speed too (USB 1.1 tops out
at 12 Mbit/s), so backups over it are going to be slow.  USB2 is going
to be much better, and eSATA is going to be pretty fast, if you can
get it.

The rsync command looks something like this:

  rsync -v -a -H --numeric-ids --delete \
    --exclude='/sys/*' --exclude='/proc/*' --exclude='/backups/*' \
    --link-dest=../<last-snapshot-name>/ \
    / /backups/<new-snapshot-name>/

assuming you mount your backup media at /backups/.  A relative
--link-dest path is interpreted relative to the destination directory.
I tend to name my snapshots with timestamps, e.g. YYYY-MM-DD-HHmm,
which makes identifying the last-snapshot-name and creating a
new-snapshot-name easily scriptable.  Note that the trailing slash is
significant; see the rsync manpage for details.
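
A minimal sketch of that scripting, assuming snapshots are directories
named YYYY-MM-DD-HHmm directly under the backup root.  The function
names latest_snapshot and new_snapshot_name are mine, not anything rsync
provides.

```shell
#!/bin/sh
# Name of the most recent snapshot under a backup root.  Timestamps in
# YYYY-MM-DD-HHmm form sort correctly as plain strings.
latest_snapshot() {
    for d in "$1"/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9]/; do
        [ -d "$d" ] || continue   # skip the unexpanded pattern if nothing matches
        d=${d%/}
        echo "${d##*/}"
    done | sort | tail -n 1
}

# Name for a new snapshot, taken from the current local time.
new_snapshot_name() { date +%Y-%m-%d-%H%M; }
```

With last=$(latest_snapshot /backups) and new=$(new_snapshot_name), the
placeholders in the rsync command become --link-dest="../$last/" and
"/backups/$new/".  On the very first run there is no previous snapshot
and latest_snapshot prints nothing, so leave --link-dest off that time.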

Keeping track of hardlinks (the -H option) can be a burden on rsync
while it is making the backup, and can chew up RAM during the run.

Also, you want to avoid ever writing to the files in the backup tree,
since some applications modify files in place.  Because the files may
be hardlinked across snapshots, modifying one in one tree also
modifies it in all the other trees.  Be sure to treat the backups as
read-only, except when deleting old snapshots or creating new ones.
On an NFS or Samba server, you can export the backups read-only and
safely make them available that way.
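
Here is a small demonstration of the hazard, again with made-up names
in a throwaway temp directory.  It contrasts an in-place write, which
leaks into the other snapshot, with replace-by-rename, which does not
(rsync itself updates files the second way unless told otherwise with
--inplace).

```shell
#!/bin/sh
# Two fake snapshots sharing one file through a hardlink.
demo=$(mktemp -d)
mkdir "$demo/snap1" "$demo/snap2"
echo "original" > "$demo/snap1/notes.txt"
ln "$demo/snap1/notes.txt" "$demo/snap2/notes.txt"

# In-place write through one name: the shell truncates and rewrites the
# existing inode, so the "older" snapshot changes too.
echo "edited in place" > "$demo/snap2/notes.txt"
cat "$demo/snap1/notes.txt"    # prints: edited in place

# Replace-by-rename: write a new file, then move it over the old name.
# Only snap2 now points at the new inode; snap1 keeps the old one.
echo "replaced" > "$demo/snap2/notes.txt.new"
mv "$demo/snap2/notes.txt.new" "$demo/snap2/notes.txt"
cat "$demo/snap1/notes.txt"    # still prints: edited in place
```

Since you cannot know which way a given application writes, keeping the
whole backup tree read-only is the safe habit.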

Also, where possible, avoid situations where a large file changes a
little bit frequently.  For example, large mboxes are bad.  Large log
files rotated the traditional way, by renaming, are bad (timestamped
logfiles are better).  Renaming large trees is bad.  Rsync compares
the same filename in the two trees to see whether the files differ at
all; if they do, you get a whole new copy, chewing up space.
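
To spot the files that are costing you space this way, you can look in a
snapshot for large files with a link count of 1, meaning no other
snapshot shares them.  This is a sketch; unshared_files is a name I made
up, and the k size suffix is what GNU and BSD find accept rather than
strict POSIX.

```shell
#!/bin/sh
# List regular files under SNAPSHOT_DIR that are bigger than MIN_KB KiB
# and have a link count of 1, i.e. are not shared with another snapshot.
# Usage: unshared_files SNAPSHOT_DIR MIN_KB
unshared_files() {
    find "$1" -type f -links 1 -size +"$2"k
}
```

For example, unshared_files /backups/2010-05-21-0300 102400 would list
unshared files over 100 MB in that snapshot.  Files that are hardlinked
to each other inside the source tree (kept that way by -H) will not show
up even when no other snapshot has them.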

Once you have a near copy on the other side of a network, remote
backups with this method are not so painful either.  Since most of
the content is the same between snapshots, the amount that has to go
over the network is small.  As long as you have remote ssh access, you
can maintain offsite backups without having to move media.
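
A sketch of the remote variant, wrapped in a function so the command can
be inspected before it is run.  The host name in the example and the
remote_snapshot name are placeholders of mine, and it assumes the remote
side also keeps its snapshots under /backups/.  A relative --link-dest
is resolved against the destination directory, so it refers to the
previous snapshot on the remote machine.

```shell
#!/bin/sh
# Push a new snapshot to a remote host over ssh, hardlinking unchanged
# files against the previous snapshot on that host.
# Usage: remote_snapshot REMOTE LAST_SNAPSHOT NEW_SNAPSHOT
# With DRY_RUN=1 the command is printed instead of executed.
remote_snapshot() {
    # Replace the arguments with the full rsync command line; $1..$3
    # are expanded before set takes effect.
    set -- rsync -a -H --numeric-ids --delete -e ssh \
        --exclude='/sys/*' --exclude='/proc/*' --exclude='/backups/*' \
        --link-dest="../$2/" / "$1:/backups/$3/"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

For example, with DRY_RUN=1 set, remote_snapshot
backup@offsite.example.org 2010-05-20-0300 2010-05-21-0300 prints the
rsync command it would otherwise run.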


-- 
Russell Senior         ``I have nine fingers; you have ten.''
seniorr at aracnet.com


