Edit: The script in this article has been superseded by the Linux Time Machine project on GitHub.
Time Machine is Apple’s backup solution for Mac computers. It makes incremental backups and provides a GUI (the star field) for browsing and restoring data from arbitrary points in time. I use it for my Macs and it is nice, although the star field is limited since it provides no terminal access.
OK, but my important computers run Linux and I want something similar for them. I don't need the GUI, just incremental backups that are easy to access. The magic bullet is to use rsync with the --link-dest= option. This makes rsync create a hard link for every file that is unchanged between backup versions. A hard link takes virtually no space, so the backups together use no more space than one copy of all files plus whatever changes occur between versions.
Here’s how I did it, together with a small script to get you started. You will need:
- rsync. It is probably already installed on your Linux machine. If not, install it through your package manager.
- Somewhere to put the backups. I use a NAS from QNAP, but any kind of external storage will do. Please note that the target filesystem must support hard links. Microsoft FAT32 does not.
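Whether a given target supports hard links can be probed before the first backup. This is a rough sketch under stated assumptions: the function name supports_hard_links and the probe file names are hypothetical, and the check simply tries to create a hard link inside the directory.

```shell
#!/usr/bin/env bash
# supports_hard_links DIR
# Returns 0 if a hard link can be created inside DIR, non-zero otherwise
# (including when DIR is missing or not writable).
supports_hard_links() {
    local dir=$1 probe status=1
    probe=$(mktemp "$dir/.hardlink-probe.XXXXXX" 2>/dev/null) || return 1
    if ln "$probe" "$probe.link" 2>/dev/null; then
        status=0
    fi
    rm -f "$probe" "$probe.link"
    return "$status"
}

# Example: check the directory given as first argument (default: /tmp).
target=${1:-/tmp}
if supports_hard_links "$target"; then
    echo "$target supports hard links"
else
    echo "$target does NOT support hard links (or is not writable)"
fi
```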
Configuring the script
My backup storage is mounted over NFS, so $BACKUP_MOUNTPOINT will be /mnt/qnap-backup. You could also use ssh to back up over the network. After setting up SSH keys for password-less connection, $BACKUP_MOUNTPOINT will be a remote rsync destination of the form USER@HOST:/PATH rather than a local directory.
$BACKUP_EXCLUDE is the path to a file specifying what to exclude from the backup. It contains rsync exclude patterns, one per line. See man rsync for details.
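As a rough illustration of the format, an exclude file could be created as follows. The patterns are assumptions chosen for a backup whose source is /, not the list used on piano, and /tmp/backup-exclude.example is just a scratch path for the example.

```shell
#!/usr/bin/env bash
# Scratch location for the example; point BACKUP_EXCLUDE wherever suits you.
BACKUP_EXCLUDE=${BACKUP_EXCLUDE:-/tmp/backup-exclude.example}

cat > "$BACKUP_EXCLUDE" <<'EOF'
# Lines starting with '#' are comments; one pattern per line.
# A leading '/' anchors the pattern at the root of the transfer.
/proc/*
/sys/*
/dev/*
/run/*
/tmp/*
/mnt/*
/media/*
lost+found
EOF

# rsync reads the file through --exclude-from, for example:
#   rsync -a --exclude-from="$BACKUP_EXCLUDE" SOURCE/ DEST/
echo "wrote exclude patterns to $BACKUP_EXCLUDE"
```

Excluding /mnt/* matters when the backup storage is mounted below /mnt, since otherwise the backup would try to include itself.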
Set $BACKUP_MOUNTPOINT and $BACKUP_EXCLUDE, then schedule the script to run (as root) periodically. I run mine every night at 3 am. This is what it looked like after a few nights (piano is my hostname):
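One way to schedule a nightly run at 3 am is an entry in /etc/cron.d. The sketch below assumes a cron implementation that reads that directory; the helper name write_cron_entry and both paths in the usage comment are hypothetical.

```shell
#!/usr/bin/env bash
# write_cron_entry FILE SCRIPT
# Writes a cron.d-style entry that runs SCRIPT as root every night at 03:00.
write_cron_entry() {
    local file=$1 script=$2
    # Fields: minute hour day-of-month month day-of-week user command
    printf '0 3 * * * root %s\n' "$script" > "$file"
}

# As root, something like this would install it (both paths are made up):
#   write_cron_entry /etc/cron.d/nightly-backup /usr/local/sbin/backup.sh
```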
    piano:/mnt/qnap-backup$ ls -1
    piano-backup-2013-02-20
    piano-backup-2013-02-21
    piano-backup-2013-02-22
    piano-backup-2013-02-23
    piano-backup-2013-02-24
    piano-backup-2013-02-25
    piano-backup-2013-02-26
    piano-backup-current
The last entry is a symlink to the most recent backup. This is important, since every new backup is compared to the current one, and only new or changed files are actually stored on the filesystem. Everything else is hard-linked, meaning that it takes virtually no extra space.
This was just to get you started. The script should probably be extended to do things like:
- Erase old backups over a certain age.
- Check that the backup storage is not full.
- Log success or failure somewhere.
- Alert you if there is a problem.
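The first two extensions could be sketched roughly as below. The function names, the 30-day retention and the free-space threshold are all arbitrary assumptions, and the date arithmetic relies on GNU date. Removing an old backup directory is safe for the newer ones, because a hard-linked file's data stays on disk as long as any backup still links to it.

```shell
#!/usr/bin/env bash
# prune_old_backups MOUNTPOINT HOST KEEP_DAYS
# Deletes HOST-backup-YYYY-MM-DD directories older than KEEP_DAYS days,
# judging age by the date in the directory name rather than by mtime.
prune_old_backups() {
    local mountpoint=$1 host=$2 keep_days=$3 cutoff dir stamp
    cutoff=$(date -d "$keep_days days ago" +%Y-%m-%d)   # GNU date syntax
    for dir in "$mountpoint/$host"-backup-????-??-??; do
        [ -d "$dir" ] || continue      # also skips an unmatched glob
        stamp=${dir##*-backup-}
        # ISO dates sort correctly when compared as plain strings.
        if [[ "$stamp" < "$cutoff" ]]; then
            rm -rf -- "$dir"
        fi
    done
}

# free_kib DIR: space available on DIR's filesystem, in KiB.
free_kib() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Possible use after a backup run (thresholds are arbitrary examples):
#   prune_old_backups "$BACKUP_MOUNTPOINT" "$(uname -n)" 30
#   [ "$(free_kib "$BACKUP_MOUNTPOINT")" -gt 1048576 ] \
#       || logger -t backup "less than 1 GiB free on backup storage"
```

Note that this sketch does not check whether the current symlink still points at a directory it is about to delete; if backups have not run for longer than the retention period, that case would need handling.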