Automate AWS EC2 Backups (with Python and Virtualenv) (almost)

This post will walk you through automating snapshots of your AWS volumes.  I say "almost" in the post title because although the AWS CLI tools are Python-based, the backup script itself is in Bash. If I knew anything about coding, I would probably convert it to Python.  I should add that to the list.

I have a confession (other than the fact that I am absolutely terrible at coding): I am not really good about backing up my data.  I have gotten better over the years and recently got around to using my CrashPlan account to back up my data at home, but there's always room for improvement.

Oh, crap

Case in point: last week I realized that I had made a bunch of posts to this blog, which lives on an AWS instance, but that I wasn't backing up the instance at all. So I added it to the list of things to do.

Then I woke up this morning and found the instance down.  I panicked.  I've been running this server on an AWS instance for many years, and have NEVER had an instance crash.  I managed to get it to reboot using the AWS EC2 console and everything was ok, but that was enough of a scare to get me to finally start backing it up.  And here we are.

Ironically, the instance crashed because I was messing around last night with a backup application Docker container and it managed to drain the server memory and bring it all down.  I noticed some memory errors in the container logs when I was setting it up, so I knew exactly what caused the problem.  The fix was simply deleting the container.  Docker is my friend.

The process isn't very complicated.  I run the script from cron, nightly in my case, on my home server as a non-root user.  It automatically creates snapshots for any volumes that I tell it to and deletes any snapshots older than the given retention period.

The script itself is a lightly modified version of the one in this post. The post has some manual backup methods as well and is worth reading.  In the spirit of virtualization and containerization, we'll also cover installing the AWS CLI tools in a Python Virtualenv so you don't muck up your base system.  I might cover Virtualenv in a future post, but there is a ton of information out there on it right now.

Install/Update Python and Virtualenv

I'm assuming you already have Python 3 installed.  If not, installing the python3-pip package below should pull it in as a dependency.

First, let's update our Ubuntu package database (this process will likely also work on a Raspberry Pi, but I haven't tested it):

sudo apt update

Install pip3 (pip for Python version 3.x):

sudo apt install python3-pip

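Next, we'll install Virtualenv using pip (newer Ubuntu releases may refuse system-wide pip installs; if that happens, installing the python3-virtualenv package with apt is an alternative):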

sudo pip3 install virtualenv

Python Virtualenv configuration and AWS CLI tool install

As a regular user (I don't want to run this as root), we will create our Python virtual environment. I called mine cli-ve (this pathname is referenced in the script, so be careful here):

virtualenv ~/cli-ve

Activate the virtual environment:

source ~/cli-ve/bin/activate

Install the AWS CLI tools and dependencies inside of our virtual environment:

pip install --upgrade awscli

Now the AWS CLI tools are installed in your Python virtual environment without impacting anything Python-related on your system. Virtualenv is cool - not as cool as Docker, but still cool.
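
If you want to double-check that the CLI really landed inside the virtual environment, something like the following sketch should do it. The function name venv_has_aws is my own invention (it is not part of the AWS tools), and it assumes the ~/cli-ve path used above.

```shell
#!/bin/bash
# Sketch: confirm the AWS CLI was installed inside the virtual environment.
# venv_has_aws is a hypothetical helper name; ~/cli-ve is the path used in this post.
venv_has_aws() {
    local venv_dir="$1"
    if [ -x "$venv_dir/bin/aws" ]; then
        echo "found: $venv_dir/bin/aws"
    else
        echo "missing: $venv_dir/bin/aws" >&2
        return 1
    fi
}

# Usage: report whether the CLI is in place without aborting the shell.
venv_has_aws "$HOME/cli-ve" || echo "AWS CLI not found in the virtualenv yet"
```

The check only looks for an executable file at bin/aws, so it works whether or not the environment is currently activated.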

AWS Configuration

The next step will require some information from your AWS account. The needed information and steps to gather it are covered here. It's a bit of an AWS rabbit hole, but it only takes a couple of minutes to set things up.

Run the AWS configuration tool and answer the questions using the above information:

aws configure
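
Once the configuration is saved, a quick way to confirm the credentials actually work is to ask AWS who you are. This is only a sketch: check_aws_credentials and the AWS_BIN variable are names I made up, and the default path assumes the cli-ve environment from earlier. The sts get-caller-identity call itself is a standard AWS CLI command.

```shell
#!/bin/bash
# Sketch: check that the stored credentials work by asking AWS who we are.
# check_aws_credentials and AWS_BIN are hypothetical names; the default path
# assumes the ~/cli-ve virtual environment from this post.
AWS_BIN="${AWS_BIN:-$HOME/cli-ve/bin/aws}"

check_aws_credentials() {
    local account
    # The call fails if credentials are missing or invalid.
    if account=$("$AWS_BIN" sts get-caller-identity --query Account --output text 2>/dev/null); then
        echo "credentials OK for account $account"
    else
        echo "credentials check failed" >&2
        return 1
    fi
}
```

Running check_aws_credentials after aws configure should print your account number if everything is set up correctly, and a failure message otherwise.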

The script

As I mentioned, I took the script from this post and made the following modifications:

  • Removed the spaces around the = symbols to make Bash happy
  • Adjusted the directories so we can run this as a normal user
  • Commented out the mail statements as I don't have a mail application on my server
  • Adjusted the path to the AWS CLI tool to reflect our Virtualenv installation

Before running the script, make sure you change the following:

  • Set the region for your AWS instance(s)
  • Set your retention days - I stuck with the default of 6
  • Create your volume list in ~/aws_backup/volumes-list with the format Volume-id:Volume-name - This information is pulled right from the AWS EC2 Dashboard under Volumes in the sidebar
  • Uncomment the two mail lines (each sits below an "Uncomment if you have mail support" comment) and update the EMAIL_LIST variable if your server has the ability to send mail

#!/bin/bash
# Volume list file uses the format volume-id:volume-name, one entry per line
VOLUMES_LIST=~/aws_backup/volumes-list
SNAPSHOT_INFO=~/aws_backup/snapshot_info
DATE=$(date +%Y-%m-%d)
REGION="us-east-1"
# Snapshot retention period, in days, for each volume snapshot
RETENTION=6
SNAP_CREATION=~/aws_backup/snap_creation
SNAP_DELETION=~/aws_backup/snap_deletion
EMAIL_LIST=user@domain.com

echo "List of Snapshots Creation Status" > "$SNAP_CREATION"
echo "List of Snapshots Deletion Status" > "$SNAP_DELETION"

# Check whether the volumes list file exists
if [ -f "$VOLUMES_LIST" ]; then
    # Create a snapshot for each volume in the list
    for VOL_INFO in $(cat "$VOLUMES_LIST")
    do
        # Split the entry into the volume ID and the volume name
        VOL_ID=$(echo "$VOL_INFO" | awk -F":" '{print $1}')
        VOL_NAME=$(echo "$VOL_INFO" | awk -F":" '{print $2}')
        # Create the snapshot with a description of the form name_date
        DESCRIPTION="${VOL_NAME}_${DATE}"
        ~/cli-ve/bin/aws ec2 create-snapshot --volume-id "$VOL_ID" --description "$DESCRIPTION" --region "$REGION" &>> "$SNAP_CREATION"
    done
else
    # Uncomment if you have mail support on your system
    # echo "Volumes list file is not available : $VOLUMES_LIST Exiting." | mail -s "Snapshots Creation Status" $EMAIL_LIST
    exit 1
fi

echo >> "$SNAP_CREATION"
echo >> "$SNAP_CREATION"

# Delete the snapshots that are older than the retention period
for VOL_INFO in $(cat "$VOLUMES_LIST")
do
    # Split the entry into the volume ID and the volume name
    VOL_ID=$(echo "$VOL_INFO" | awk -F":" '{print $1}')
    VOL_NAME=$(echo "$VOL_INFO" | awk -F":" '{print $2}')
    # Get the completed snapshots of this volume (query is quoted so the shell
    # cannot glob it; region matches the one used for creation and deletion)
    ~/cli-ve/bin/aws ec2 describe-snapshots --region "$REGION" --query 'Snapshots[*].[SnapshotId,VolumeId,Description,StartTime]' --output text --filters "Name=status,Values=completed" "Name=volume-id,Values=$VOL_ID" | grep -v "CreateImage" > "$SNAPSHOT_INFO"
    # Check each snapshot against the retention period
    while read -r SNAP_INFO
    do
        SNAP_ID=$(echo "$SNAP_INFO" | awk '{print $1}')
        echo "$SNAP_ID"
        SNAP_DATE=$(echo "$SNAP_INFO" | awk '{print $4}' | awk -F"T" '{print $1}')
        echo "$SNAP_DATE"
        # Whole days between the snapshot date and today
        # (-u keeps daylight-saving shifts from skewing the count)
        RETENTION_DIFF=$(( ( $(date -u -d "$DATE" "+%s") - $(date -u -d "$SNAP_DATE" "+%s") ) / 86400 ))
        echo "$RETENTION_DIFF"
        # Delete the snapshot if it is older than the retention period
        if [ "$RETENTION" -lt "$RETENTION_DIFF" ]
        then
            ~/cli-ve/bin/aws ec2 delete-snapshot --snapshot-id "$SNAP_ID" --region "$REGION" --output text > /tmp/snap_del
            echo "DELETING $SNAP_INFO" >> "$SNAP_DELETION"
        fi
    done < "$SNAPSHOT_INFO"
done

echo >> "$SNAP_DELETION"
# Merge the snapshot creation and deletion reports
cat "$SNAP_CREATION" "$SNAP_DELETION" > ~/aws_backup/mail_report
# Send the mail update - Uncomment if you have mail support on your system
# cat ~/aws_backup/mail_report | mail -s "Volume Snapshots Status" $EMAIL_LIST
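
The retention check is the part of the script that is easiest to get wrong, so here is a small standalone sketch of what it does to a single line of describe-snapshots output. The sample line, the fixed "today" date, and the days_between helper are all made up for illustration, and the date math assumes GNU date (as shipped with Ubuntu).

```shell
#!/bin/bash
# Sketch of the retention arithmetic applied to one snapshot line.
# days_between is a hypothetical helper; the snapshot line below is made-up sample data.
# Requires GNU date (the -d option), as found on Ubuntu.
days_between() {
    local older="$1" newer="$2"
    # -u pins both dates to UTC so daylight-saving shifts cannot skew the result.
    echo $(( ( $(date -u -d "$newer" +%s) - $(date -u -d "$older" +%s) ) / 86400 ))
}

RETENTION=6
TODAY="2024-03-08"   # fixed date so the example is reproducible

# One tab-separated line: snapshot ID, volume ID, description, start time.
SAMPLE_LINE=$'snap-0123456789abcdef0\tvol-0123456789abcdef0\tblog-root_2024-03-01\t2024-03-01T02:00:05.000Z'
SNAP_ID=$(echo "$SAMPLE_LINE" | awk '{print $1}')
SNAP_DATE=$(echo "$SAMPLE_LINE" | awk '{print $4}' | awk -F"T" '{print $1}')

AGE=$(days_between "$SNAP_DATE" "$TODAY")
echo "$SNAP_ID is $AGE days old"
if [ "$RETENTION" -lt "$AGE" ]; then
    echo "older than $RETENTION days: would delete"
else
    echo "within retention: keep"
fi
```

Here the snapshot is 7 days old, so it would be deleted. Because the comparison is a strict "less than", a snapshot that is exactly 6 days old is still kept with RETENTION=6.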

After making the modifications, save the script to aws_snapshot.sh in your home directory and make it executable:

chmod +x ~/aws_snapshot.sh
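
Before the first run, it can help to sanity-check the volume list, since a malformed line gets passed straight to the AWS CLI. This is a minimal sketch, assuming each line follows the Volume-id:Volume-name format described above and that names contain no spaces (the script's for loop splits the file on whitespace). The validate_volume_list name and the sample volume IDs are made up.

```shell
#!/bin/bash
# Sketch: validate a volumes-list file before handing it to the backup script.
# validate_volume_list is a hypothetical helper; the volume IDs below are made-up examples.
validate_volume_list() {
    local file="$1" line bad=0
    # Expect vol-<hex>:<name>, with no whitespace or extra colons in the name.
    local re='^vol-[0-9a-f]+:[^[:space:]:]+$'
    while IFS= read -r line || [ -n "$line" ]; do
        [ -z "$line" ] && continue          # ignore blank lines
        if [[ ! "$line" =~ $re ]]; then
            echo "bad entry: $line" >&2
            bad=1
        fi
    done < "$file"
    return "$bad"
}

# Usage with a throwaway sample file.
SAMPLE=$(mktemp)
printf '%s\n' 'vol-0123456789abcdef0:blog-root' 'vol-0fedcba9876543210:blog-data' > "$SAMPLE"
validate_volume_list "$SAMPLE" && echo "volume list looks valid"
rm -f "$SAMPLE"
```

To check the real file, point the function at ~/aws_backup/volumes-list instead of the sample.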

You should now be able to run the script. If it's successful, you should see the snapshots being created in your AWS EC2 Dashboard under Snapshots in the sidebar.

Running the script in crontab

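The modifications that we made to the script allow it to be run from crontab, even though the AWS CLI tool lives in a Virtualenv: the script calls the CLI by its full path (~/cli-ve/bin/aws), so the environment never has to be activated.

As the normal user, load up your crontab in the editor: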

crontab -e

Add the following line to the end, and edit the time/date parameters to your desired schedule:

0 2 * * * ~/aws_snapshot.sh

My entry will run the script nightly at 2 AM. More information about the crontab format can be found here.

And, that's it!

TODO:

  • Convert the script from Bash to Python.  Also, learn Python.  Also, learn to code.
  • Turn this into a Docker container as a way to learn how to create Docker containers

If you have any issues with the script or if I missed anything in the steps above, please feel free to let me know in the comments below or Tweet at me @eiddor.