Backups using rsync
Background
We want all data to have a “single source of truth”. As such, important data has its reference location at only one site, and all day-to-day usage of that data goes over the network to that reference location. The downside is that a network interruption, such as a power outage at the remote data storage location, makes this data unavailable.
For this reason, important remote datastores are also synced locally, just in case. This document details an example of one such local sync.
Pre-requisites
SSH tunnels forwarding through the SAMBA port of the remote file share
For the purpose of this document it will be assumed that this share is accessible at rccc-ssh/Physics at port 44448. The directory to be backed up is Physics.
A local SAMBA share for storing the backup
For the purpose of this document it will be assumed that this share is accessible at rccc-ssh/D. The directory to back up to is PhysicsDriveBackup.
A username and password that is able to access both SAMBA shares
For the purpose here, this username will be pexit, the remote share will be on the domain nbccc, and the local share will be on the domain rccc.
An Ubuntu 20.04 instance with access to both the forwarded SAMBA share and the local SAMBA share
For the purpose here, this instance is a VM within Hyper-V with user login name pexit.
Overview
The general approach here will be to:
Create the permanent SAMBA mount points via fstab
Set up rsync to run via cron
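The fstab step is not spelled out here, so the following is only a rough sketch of what the two entries might look like. The mount points are taken from the cron line used in this document. The credentials file paths (/home/pexit/.smb-nbccc and /home/pexit/.smb-rccc) and the choice of mount options are assumptions for illustration and may not match the real setup. The helper fstab_line is a hypothetical name; it only prints candidate lines and does not touch /etc/fstab.

```shell
#!/bin/sh
# Sketch only: print candidate /etc/fstab lines for the two SAMBA (CIFS) mounts.
# Assumed, not taken from the original setup: the credentials file paths
# and the option set (credentials=, noauto, user, port=).

fstab_line() {
    # $1 = share (//host/share), $2 = mount point,
    # $3 = credentials file, $4 = extra options (may be empty)
    opts="credentials=$3,noauto,user"
    if [ -n "$4" ]; then
        opts="$opts,$4"
    fi
    printf '%s %s cifs %s 0 0\n' "$1" "$2" "$opts"
}

# Remote share, reached through the SSH tunnel on port 44448.
fstab_line //rccc-ssh/Physics /media/tunnel-nbcc-pdc/Physics /home/pexit/.smb-nbccc port=44448
# Local share that holds the backup.
fstab_line //rccc-ssh/D /media/rccc-ssh/D /home/pexit/.smb-rccc ""
```

In this sketch, noauto leaves the mounting to the cron job and user lets a non-root account run mount on the entry. Each credentials file would hold username=, password= and domain= lines (pexit with nbccc for the remote share, pexit with rccc for the local one) and should be readable only by its owner. The printed lines would be reviewed and then appended to /etc/fstab by hand, after creating the two mount point directories.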
Setup rsync crontab
These instructions for setting up rsync are adapted from
https://www.howtogeek.com/135533/how-to-use-rsync-to-backup-your-data-on-linux/
To set up the crontab run crontab -e, then append the following to the
bottom of that file:
0 1 * * * mount /media/rccc-ssh/D ; mount /media/tunnel-nbcc-pdc/Physics ; timeout 4h rsync -av --delete /media/tunnel-nbcc-pdc/Physics/Physics/ /media/rccc-ssh/D/PhysicsDriveBackup/
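This sets up cron to mount the appropriate directories and then run rsync each night at 1 am. The -a option copies the directory tree recursively and preserves timestamps and permissions where the destination supports them, -v lists each file as it is transferred, and --delete removes files from the backup that no longer exist in the source. Because of timeout 4h, a task that hasn’t completed by 5 am is stopped. rsync skips files that are already up to date, so the run on the following night continues from where the previous one left off.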
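One thing the single cron line does not guard against is a failed mount: if the backup share is not mounted, rsync would write into the empty mount point directory on the VM's own disk. A possible variation, sketched below under assumptions, moves the logic into a small script that only runs rsync once both paths are confirmed to be mount points. The function names ensure_mounted and run_backup, and the idea of a wrapper script at all, are not part of the original setup.

```shell
#!/bin/sh
# Sketch only: run the nightly rsync only when both shares are really mounted.
# Paths are the ones from the cron line; the function names are hypothetical.

SRC_MOUNT=/media/tunnel-nbcc-pdc/Physics
DST_MOUNT=/media/rccc-ssh/D

ensure_mounted() {
    # Succeeds if $1 is already a mount point, or if mounting it now works.
    mountpoint -q "$1" || mount "$1"
}

run_backup() {
    # Give up before touching any files if either share is unavailable.
    ensure_mounted "$SRC_MOUNT" || return 1
    ensure_mounted "$DST_MOUNT" || return 1
    # Same rsync call as in the cron line, still limited to four hours.
    timeout 4h rsync -av --delete \
        "$SRC_MOUNT/Physics/" "$DST_MOUNT/PhysicsDriveBackup/"
}

run_backup || echo "backup did not complete (exit status $?)" >&2
```

If this were saved as an executable file, for example at the hypothetical path /home/pexit/bin/physics-backup.sh, the crontab entry would shrink to 0 1 * * * /home/pexit/bin/physics-backup.sh. An exit status of 124 in the message means timeout stopped rsync at the four hour limit, which is the expected outcome for a sync that needs more than one night.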