How I Learned to Stop Worrying and Backup the Server

Server system administration is not my primary interest. For most of our services, we prefer to offload the headache of system administration to the fine folks at WP Engine. We do, however, run some web services outside of WordPress, including our Help Desk and Knowledge Base, which run on Jira and Confluence on an EC2 instance provided by Amazon Web Services.

We gain control of what we can run on the server, but the price is lost sleep because we need to manage our own backups. EC2 makes it fairly easy to do backups manually. An EBS-backed EC2 instance like ours keeps its disk on a virtual volume in Amazon's Elastic Block Store (EBS) service. The EBS volume can be backed up at the block level (like a full hard disk clone) using the Snapshots feature. Manual backups are not what we want, though, because they still rely on one of us remembering to do them, which is error-prone and annoying.

Instead, we set up the EC2 instance to snapshot itself automatically every day using cron and to email us a summary of the task.

Who's behind the camera?

The server may be snapshotting itself, but it still needs an AWS user account with permission to take snapshots of the server. In IAM, the identity and access management section of AWS, I added a group called "Snapshotters":

[Screenshot: Groups in AWS IAM Console]

To this group, I added an inline policy called CreateDeleteSnapshots, which defines the permissions available to users who belong to this group:

[Screenshot: IAM Group Policy - CreateDeleteSnapshots]

The actual content of the policy is:

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Stmt1426541828000",
                "Effect": "Allow",
                "Action": [
                    ...
                ],
                "Resource": [
                    "*"
                ]
            }
        ]
    }

The Action and Resource entries are the most important bits. The Actions are the smallest set of permissions needed to take and prune snapshots automatically. The Resource is set to *, so the policy works on all of our servers.
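
The action list itself does not appear in the excerpt above. As a rough sketch only, and assuming the group needs exactly the four EC2 calls the backup script makes later in this post (describe-volumes, create-snapshot, describe-snapshots, delete-snapshot), a policy of this shape could be written to a file from the shell. The function name write_snapshot_policy is hypothetical, and the action list is an assumption rather than the original policy:

```shell
# Hypothetical helper: write a minimal snapshot policy to the file given as $1.
# ASSUMPTION: the four actions mirror the EC2 calls made by the backup script;
# they are not copied from the original policy.
write_snapshot_policy() {
    cat > "$1" <<'EOF'
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeVolumes",
                "ec2:CreateSnapshot",
                "ec2:DescribeSnapshots",
                "ec2:DeleteSnapshot"
            ],
            "Resource": "*"
        }
    ]
}
EOF
}
```

Running write_snapshot_policy policy.json and then aws iam put-group-policy --group-name Snapshotters --policy-name CreateDeleteSnapshots --policy-document file://policy.json should attach it as the group's inline policy, which is the command-line counterpart of the console steps above.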

I created a user called "snapshots" and added it to the Snapshotters group, so the "snapshots" user has all of those privileges I set up for the group.

[Screenshot: AWS IAM User list of groups]

Robots Taking Snapshots

It's all well and good to have a user to do the snapshotting, but we need a snapshotting mechanism too. Doc Brown isn't time traveling without the DeLorean. We can create a script to take care of the snapshots for us, and then set it to run periodically using cron.

The AWS CLI tools are a great set of utilities for administering all kinds of AWS services from the command line, including snapshotting. I installed them easily on our Ubuntu server using sudo pip install awscli, which places the single executable in /usr/local/bin/aws.

Breaking down the steps, we need to:

  1. Identify the current EC2 instance ID. One strategy is to use the ec2metadata command that's built into all Ubuntu EC2 servers since Ubuntu 12.04. Here's another way to do it.
  2. Identify the volume ID of the EBS volume attached to this instance and acting as the main drive (i.e. /dev/sda1). We can use the ec2 describe-volumes command here.
  3. Create the snapshot, using the predictably-named ec2 create-snapshot command.
  4. Identify the snapshot IDs for all snapshots associated with this EBS volume and sort them by date. We can use the command ec2 describe-snapshots.
  5. We want to delete all but the 3 most recent snapshots. We can do this using ec2 delete-snapshot.
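
For step 1, one widely used alternative to ec2metadata (not necessarily the one originally linked) is to ask the EC2 instance metadata endpoint directly. The sketch below is hedged: the function name get_instance_id is hypothetical, and the plain curl request assumes the instance still answers unauthenticated (IMDSv1-style) metadata requests, whereas instances configured to require IMDSv2 also need a session token.

```shell
# Hypothetical helper: print this machine's EC2 instance ID.
# Prefers Ubuntu's ec2metadata tool and falls back to the metadata endpoint.
get_instance_id() {
    if command -v ec2metadata >/dev/null 2>&1; then
        ec2metadata --instance-id
    else
        # ASSUMPTION: unauthenticated metadata requests are allowed on this instance
        curl -s --max-time 2 http://169.254.169.254/latest/meta-data/instance-id
    fi
}
```

In the script, INSTANCE_ID=$(get_instance_id) would then work whether or not the Ubuntu helper is installed.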

The meat of the script is:

# Keep only this many of the most recent snapshots
NUMBER_OF_SNAPSHOTS_TO_KEEP=3
export DATE_STR=`date +%y.%m.%d.%I`;
export INSTANCE_ID=`ec2metadata --instance-id`;
# Get the ID of the volume mounted as the root device on this instance
export VOLUME_ID=`/usr/local/bin/aws ec2 describe-volumes --filters Name=attachment.instance-id,Values=$INSTANCE_ID Name=attachment.device,Values=/dev/sda1 --query 'Volumes[*].{ID:VolumeId}' | grep ID | awk '{print $2}' | tr -d '"'`
echo "Initiating EBS volume snapshot of volume $VOLUME_ID attached to instance ID $INSTANCE_ID...";
/usr/local/bin/aws ec2 create-snapshot --volume-id "$VOLUME_ID" --description "$VOLUME_ID";
echo "Done.";
echo "Deleting old snapshots...";
# List this volume's snapshots oldest first, then skip the newest $NUMBER_OF_SNAPSHOTS_TO_KEEP
for SNAPSHOT_ID in `/usr/local/bin/aws ec2 describe-snapshots --filters Name=volume-id,Values=$VOLUME_ID --query 'sort_by(Snapshots, &StartTime)[*].{ID:SnapshotId}' | grep ID | head -n -$NUMBER_OF_SNAPSHOTS_TO_KEEP | awk '{print $2}' | tr -d '"'` ; do
    echo "Deleting snapshot $SNAPSHOT_ID...";
    /usr/local/bin/aws ec2 delete-snapshot --snapshot-id "$SNAPSHOT_ID";
done
echo "Done.";
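
The pruning loop only deletes the right snapshots if the list it reads is ordered oldest first, because head with a negative count drops the last lines of its input. Here is a minimal sketch of that retention rule on its own, so it can be tried without touching AWS. It assumes input lines of the form "START_TIME SNAPSHOT_ID" with ISO 8601 timestamps (which sort correctly as plain text), and the function name snapshots_to_delete is hypothetical:

```shell
# Hypothetical helper: read "START_TIME SNAPSHOT_ID" lines on stdin and print
# the IDs of every snapshot except the newest $1.
snapshots_to_delete() {
    keep="$1"
    # Sort oldest first, then print all IDs except the last $keep
    LC_ALL=C sort | awk -v keep="$keep" '
        { ids[NR] = $2 }
        END { for (i = 1; i <= NR - keep; i++) print ids[i] }
    '
}
```

Fed with the output of aws ec2 describe-snapshots --filters Name=volume-id,Values=$VOLUME_ID --query 'Snapshots[*].[StartTime,SnapshotId]' --output text, snapshots_to_delete 3 should print the IDs that the delete loop would remove, and nothing at all when three or fewer snapshots exist.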

In order for this to work in the context of a cron job, we need to set the environment variables AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_DEFAULT_REGION (this last one because our EC2 region, us-west-2, is different from the default of us-east-1) according to this AWS CLI guide. I also need to set the PATH to include /usr/local/bin, the location of the aws command.
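
Because cron runs jobs with a very small environment, it can help to have the script check its own prerequisites before calling aws. The following is a sketch under assumptions rather than part of the original script: the function name missing_aws_vars is hypothetical, and where the three variables actually get their values (the crontab, a sourced file, or the script itself) is left open.

```shell
# Make sure cron can find the aws command installed in /usr/local/bin.
PATH="/usr/local/bin:$PATH"
export PATH

# Hypothetical helper: print the names of any required AWS variables that are
# unset or empty. Runs in a subshell so it leaves no stray variables behind.
missing_aws_vars() (
    missing=""
    for name in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_DEFAULT_REGION; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    echo "${missing# }"
)
```

Near the top of the script, a line such as [ -z "$(missing_aws_vars)" ] || { echo "Missing: $(missing_aws_vars)"; exit 1; } would make a misconfigured cron run fail with a clear message in the emailed summary.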

Scheduling for Peace of Mind

I saved this script in a folder full of cron scripts in our home folder, /home/newsapps/cron/, made it executable with chmod +x ~/cron/ and then scheduled it to run every day at midnight server time by running crontab -e and adding the following lines (this crontab generator helped greatly):

# EC2 EBS Snapshot -- run once a day
0 0 * * * /home/newsapps/cron/

The extra MAILTO= setting in the crontab is what emails us the output of the script. The only trick to getting that to work was that I had to install a mail server. I innocently installed the mail program by doing sudo apt-get install mail, which in the process installed the mail server postfix and configured it for our Fully-Qualified Domain Name.

[Screenshot: iPhone screenshot of the email from the cron job]

Et voilà! I can check from my phone that the server backed itself up, and I can go run carefree through a field full of daisies in my dreams.