I have been working on a way to back up some of my remote systems. In the past, I would have a timer/cron job tar everything I wanted to back up, then use scp to distribute the tarballs to their off-site locations. This works well and is quite simple. However, it is inefficient with network bandwidth, and some of my VMs have a metered connection. Incremental updates with tar are not ideal (update mode, -u, appends new copies of changed files rather than replacing them, so the archive accumulates duplicates) and restoration is very tedious (incrementals require restoring the main tar plus every incremental file, then manually cleaning up whatever was removed in between).
This made me think of using rsync instead. Rsync is great for, well, synchronizing files, but it isn’t designed for snapshot-type use. Plus, instead of one tidy tar file, you have crap all over the place to worry about.
The solution I came up with is a bit odd, but it has been working well. It does require scratch space equal to the amount of data being backed up. What I do is an rsync pull from a VM located at the backup site to its own disk. That disk holds a living copy of the data locally, and since rsync only transfers the changes (and supports compression), a lot of bandwidth is saved. Then I use tar to take a snapshot of the synchronized folders, keeping 30 days’ worth.
Security-wise, it is quite secure, with everything occurring over SSH. Depending on the access required, the largest security hole is sudo access to root with no password. If you are just backing up application data, and not system files, that is probably not required. Here is what is needed:
- Create (or use an existing) account with SSH keys and passwordless sudo access (if required) on the system that will be backed up.
- VM with enough disk space to hold a live copy of the backup data.
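For the passwordless sudo piece, a sudoers entry along these lines does the job. The account name backup is an assumption (use whatever account you created), and scoping it to the rsync binary alone is tighter than a blanket NOPASSWD rule:

```
# /etc/sudoers.d/backup -- edit with: visudo -f /etc/sudoers.d/backup
# Allow the backup account to run only rsync as root, with no password.
backup ALL=(root) NOPASSWD: /usr/bin/rsync
```

This way a compromised backup key can read data via rsync, but cannot get a general root shell without a password.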
Steps:
- Do an rsync pull on the VM where the backups will be stored. Note: -p port can be added to the ssh string below, if required. --rsync-path="sudo rsync" may be removed if sudo access is not required to back up your data.
rsync -ahP -e "ssh -o StrictHostKeyChecking=no -i /your/ssh/key/here" --delete --rsync-path="sudo rsync" \
--compress --compress-level=9 \
user@host:/backup1/dir \
... as many as needed ... \
/backup-scratch-directory > /backup/destination/directory/`date +%F`.log
- You can then do a tar snapshot of the rsync directory:
tar -czpf /backup/destination/directory/`date +%F`.tar.gz /backup-scratch-directory
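Restoring from one of these snapshots is then a single extract, with no incremental chain to replay. Here is a self-contained sketch of the snapshot/restore round trip; all the /tmp/backup-demo paths are invented stand-ins for the scratch and destination directories:

```shell
#!/bin/sh
set -e

# Stand-in paths for the demo (assumptions, not the real layout)
SCRATCH=/tmp/backup-demo/scratch    # plays the role of the rsync scratch area
DEST=/tmp/backup-demo/dest          # plays the role of the snapshot directory
RESTORE=/tmp/backup-demo/restore    # where we extract to

rm -rf /tmp/backup-demo
mkdir -p "$SCRATCH" "$DEST" "$RESTORE"
echo "hello" > "$SCRATCH/app.conf"

# Take a dated snapshot of the scratch area, like the nightly job does
tar -czpf "$DEST/$(date +%F).tar.gz" -C "$SCRATCH" .

# Restore is one command: extract the snapshot into the target directory
tar -xzpf "$DEST/$(date +%F).tar.gz" -C "$RESTORE"
```

Using -C keeps the archive paths relative, so a snapshot can be restored to any directory, not just the original one.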
- I then make sure to clean up old files with:
find /backup/destination/directory -type f -mtime +30 -exec rm {} \;
I run all of this nightly, so I end up with a directory of 30 tarballs and their corresponding log files.
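The nightly schedule itself is just a crontab entry (or a systemd timer, if you prefer). The script path here is hypothetical; it would be a wrapper containing the rsync, tar, and find commands above:

```
# m h dom mon dow  command -- run the backup at 02:30 every night
30 2 * * * /usr/local/bin/nightly-backup.sh
```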