Friday, May 2, 2008

Automatic Backup

We need to back up our files in case disaster strikes. One way is to back up to external media such as tape, USB drives, or CDs/DVDs. However, working with them is tedious and requires manual intervention. People tend to be lazy, so we are reluctant to back up when the process is inconvenient. If we have an additional machine reachable over the network (throughout this article, we assume the remote machine is named remotebox), we can set up an automatic backup mechanism.

Overview

We use crontab to schedule rsync jobs that back up our files to another machine. Since rsync runs over SSH here, we set up key-based authentication with keychain so that the scheduled jobs are not stopped by a password prompt.

Setup SSH Keychain

Generate Keys

This section mainly follows this nice guide.

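We use the DSA approach. (Recent OpenSSH releases disable DSA keys by default, so on a current system substitute another key type such as rsa or ed25519, and the matching file name, throughout.)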

$ ssh-keygen -t dsa
$ scp ~/.ssh/id_dsa.pub user@remotebox:

Then log in to remotebox and append the public key to the ~/.ssh/authorized_keys file like so:

cat id_dsa.pub >> ~/.ssh/authorized_keys
rm id_dsa.pub
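
The two commands above assume that ~/.ssh already exists on remotebox and that its permissions are acceptable to sshd, which by default ignores authorized_keys when the file or its directory is writable by other users. A slightly more defensive sketch follows; install_pubkey is a hypothetical helper name chosen for illustration, not part of OpenSSH, and it also avoids appending the same key twice.

```shell
# Hypothetical helper: append a public key to authorized_keys once,
# with permissions that sshd accepts.
# Usage: install_pubkey <public-key-file> [ssh-directory]
install_pubkey() {
    pubfile=$1
    ssh_dir=${2:-$HOME/.ssh}
    auth="$ssh_dir/authorized_keys"

    # Create the directory and file if missing, readable by the owner only.
    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"
    touch "$auth"
    chmod 600 "$auth"

    # -F: fixed strings, -x: whole-line match, -q: quiet, -f: patterns from file.
    # Append only when the key line is not already present.
    if ! grep -qxF -f "$pubfile" "$auth"; then
        cat "$pubfile" >> "$auth"
    fi
}
```

On remotebox this would be called as install_pubkey id_dsa.pub, followed by rm id_dsa.pub as before.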

Install Keychain

Keychain is a nice package that prompts you for your passphrase once when you log into the system and then reuses the unlocked key for later SSH sessions, so you are not asked again. Note that this carries a security risk (there is always a tradeoff between security and convenience), so use it with care.

This part mainly follows this guide.

First, install keychain. On Gentoo, type emerge keychain.

Then add the following lines to your ~/.bash_profile:

keychain id_dsa
. ~/.keychain/$HOSTNAME-sh
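
If the same ~/.bash_profile is shared with machines where keychain is not installed, the two lines can be guarded so that login does not print errors there. This is only a sketch: the keychain_env variable name is made up for illustration, and the fallback to uname -n is an assumption for shells that do not set HOSTNAME.

```shell
# Path of the file in which keychain stores the ssh-agent variables.
# keychain_env is a hypothetical variable name used only in this sketch.
keychain_env="$HOME/.keychain/${HOSTNAME:-$(uname -n)}-sh"

# Start (or reuse) the agent only when keychain is actually installed.
if command -v keychain >/dev/null 2>&1; then
    keychain --quiet id_dsa
fi

# Load SSH_AUTH_SOCK and SSH_AGENT_PID if keychain has written them.
if [ -f "$keychain_env" ]; then
    . "$keychain_env"
fi
```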

Setup Crontab Job

We can now set up the automatic backup. Since there is an almost infinite number of crontab and rsync combinations, I show just one example below (many interesting examples of rsync usage can be found by typing man rsync).

0 9,16 * * 1-5 rsync -az -e ssh --delete ~/work/ remotebox:~/backup/

The above example synchronizes the local directory ~/work to the remote directory ~/backup on host remotebox at 9 a.m. and 4 p.m. every working day (Monday through Friday). Note that I have used the --delete option, which deletes extraneous files from the destination directory. This ensures that the destination directory is an exact copy of the source directory, but it is dangerous if the destination directory already contains other files, because they will be removed. So test this option (for example, with rsync's -n dry-run flag) before you actually use it.
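
One caveat: cron does not read ~/.bash_profile, so a job started by cron will not see the ssh-agent variables that keychain set up at login unless it loads them itself. A small wrapper function can do that before calling rsync. The sketch below is an assumption-laden illustration, not a tested production script: run_backup and the RSYNC, SRC, and DEST variables are hypothetical names, and the destination is written as remotebox:backup/ because a relative remote path is resolved from the remote home directory.

```shell
# Hypothetical settings; override them in the environment if needed.
RSYNC=${RSYNC:-rsync}
SRC=${SRC:-$HOME/work/}
DEST=${DEST:-remotebox:backup/}

# Run the backup. Pass -n as the first argument for a dry run that only
# lists what would be transferred or deleted.
run_backup() {
    # Load the agent variables written by keychain, if present.
    env_file="$HOME/.keychain/${HOSTNAME:-$(uname -n)}-sh"
    if [ -f "$env_file" ]; then
        . "$env_file"
    fi

    # ${1:+"$1"} expands to the first argument only when one was given.
    "$RSYNC" -az ${1:+"$1"} -e ssh --delete "$SRC" "$DEST"
}
```

Saved as a script that ends with a call to run_backup "$@", this can be tried by hand with -n first, and the crontab entry can then invoke the script instead of calling rsync directly.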
