Tag: S3

Previously, I backed up my websites using passwordless SSH and rsync, but I have decided to change that for a couple of reasons: passwordless SSH can have its security problems, and I don’t trust hardware. Using S3 I can securely keep as many full backups as I’d like and not have to worry about hardware failing, all for pennies a month. I already back up my desktop with JungleDisk, so I thought I may as well consolidate all my backups on S3.

To back up your website this way you will need two things: SSH access and s3sync, a handy little S3 client written in Ruby.

  1. Download and set up s3sync from http://s3sync.net/wiki. Once you untar the file, open up s3config.yml (it may be named s3config.sample.yml; just rename it). This is where you set the path to your certificates and your AWS access key and secret access key, which can be found in the “Access Identifiers” section of your AWS account.
  2. Now let’s test this out. s3sync needs to know where its configuration file is, so we export the variable S3CONF with the path to the directory that holds your s3config.yml file.
    [user@machine s3sync]$ export S3CONF=/home/user/s3sync

    To make sure everything is working, run a simple command to list your buckets.

    [user@machine s3sync]$ ./s3cmd.rb listbuckets
    bucket1
    bucket2
    bucket3

    Yay, we have S3 connectivity!

  3. Next we dump the databases, then tar and gzip everything up.
    [user@machine]$ mysqldump -u user -pPassword --all-databases > /home/user/backup.sql
    [user@machine]$ tar -czf /home/user/backup.tar.gz /home/user/public_html /home/user/backup.sql
  4. Copy the backup to your S3 bucket using s3sync’s s3cmd.rb.
    [user@machine s3sync]$ ./s3cmd.rb put bucket-name:folder/target_name.tar.gz /home/user/backup.tar.gz
  5. Success! You have now backed up your website and database to your S3 account.
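
As a rough sketch, steps 2 and 3 could be wrapped in two small helper functions so mistakes show up before anything is uploaded. The names check_s3conf and make_archive are made up for illustration, and the check assumes s3sync looks for s3config.yml inside the directory named by S3CONF, as set up above.

```shell
#!/bin/bash

# Succeed only if S3CONF names a directory that contains s3config.yml.
check_s3conf() {
  local dir="${S3CONF:-}"
  if [ -z "$dir" ]; then
    echo "S3CONF is not set" >&2
    return 1
  fi
  if [ ! -f "$dir/s3config.yml" ]; then
    echo "no s3config.yml in $dir" >&2
    return 1
  fi
  echo "using $dir/s3config.yml"
}

# Bundle files into one gzip-compressed tarball, then list it to confirm
# that the archive is readable.
#   $1       output archive, e.g. /home/user/backup.tar.gz
#   $2...    files or directories to include (passed straight to tar)
make_archive() {
  local out="$1"
  shift
  tar -czf "$out" "$@" && tar -tzf "$out" > /dev/null
}

# Possible use (not run here):
#   check_s3conf && make_archive /home/user/backup.tar.gz \
#     /home/user/public_html /home/user/backup.sql
```

Using tar -czf writes the compressed archive in one go, so there is no separate gzip step to get wrong.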

s3sync has its own sort of built-in rsync for making incremental backups, but I prefer to keep my own daily backups. Here’s the script I use to create daily backups of my websites and files. It creates a backup with the month, day, and year in the filename and keeps backups for the last 10 days or so. (Note: the sed and awk commands are really messy due to my lack of sed/awk knowledge.) This script is run every day at 3am by a cron job, with the output written to a log.

Crontab:

0 3 * * * export S3CONF=/home/user/s3sync; /home/user/backup.sh >> /home/user/backup.log

backup.sh

#!/bin/bash
 
export S3CONF=/home/user/s3sync
 
cd /home/user
 
TIMESTAMP=`date +%m%d%Y`
 
echo "$TIMESTAMP :: Backing up the databases"
mysqldump -u user -pPassword --all-databases > /home/user/backup.sql
 
echo "$TIMESTAMP :: Bundling all the files up"
 
tar -cf /home/user/backup_${TIMESTAMP}.tar public_html backup.sql
gzip -f /home/user/backup_${TIMESTAMP}.tar
 
echo "$TIMESTAMP :: Copying backup to S3"
 
#we use full path because this script is running in a cron job
/usr/local/bin/ruby /home/user/s3sync/s3cmd.rb put bucket:folder/backup_${TIMESTAMP}.tar.gz /home/user/backup_${TIMESTAMP}.tar.gz
 
echo "$TIMESTAMP :: Cleaning up"
rm -f /home/user/backup_${TIMESTAMP}.tar.gz
rm -f /home/user/backup.sql
 
echo "$TIMESTAMP :: Checking for old backups"
 
#check how many backups are saved
num=`/usr/local/bin/ruby /home/user/s3sync/s3cmd.rb list bucket:folder | wc -l`
 
#we save at least 10 days of backups
#13 is checked for due to other crap s3cmd prints out
#-ge instead of == so one failed delete cannot leave the count stuck above 13
if [ "$num" -ge 13 ]; then
  echo "$TIMESTAMP :: Deleting old backup"
 
  #i know there is a better way to check this, i just dont know how
  last=`/usr/local/bin/ruby /home/user/s3sync/s3cmd.rb list bucket:folder | sed -e 's/-//g' | awk '{printf("%s", $0 (NR==1 ? "" : " "))}' | awk '{print $2}'`
 
  /usr/local/bin/ruby /home/user/s3sync/s3cmd.rb delete bucket:$last
else
  echo "$TIMESTAMP :: No old backup to delete"
fi
 
echo "$TIMESTAMP :: Done"
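
One possibly cleaner way to do the "check for old backups" part is sketched below. It is not what the script above does. The helpers count_backups and oldest_backup are hypothetical names, and they assume the s3cmd.rb listing prints one key per line in the form folder/backup_MMDDYYYY.tar.gz; any other line is ignored.

```shell
#!/bin/bash

# Both helpers read an s3cmd.rb listing on stdin and only look at lines
# that end in backup_MMDDYYYY.tar.gz (an assumption about the listing).

# Count the backups in the listing.
count_backups() {
  grep -cE 'backup_[0-9]{8}\.tar\.gz$'
}

# Print the key of the oldest backup, judged by the date in its name.
# Each key is prefixed with YYYYMMDD so a plain sort is chronological.
oldest_backup() {
  grep -E 'backup_[0-9]{8}\.tar\.gz$' \
    | sed -E 's/^(.*backup_([0-9]{2})([0-9]{2})([0-9]{4})\.tar\.gz)$/\4\2\3 \1/' \
    | sort \
    | head -n 1 \
    | cut -d ' ' -f 2-
}

# Possible use inside backup.sh (not run here):
#   listing=$(/usr/local/bin/ruby /home/user/s3sync/s3cmd.rb list bucket:folder)
#   if [ "$(printf '%s\n' "$listing" | count_backups)" -gt 10 ]; then
#     last=$(printf '%s\n' "$listing" | oldest_backup)
#     /usr/local/bin/ruby /home/user/s3sync/s3cmd.rb delete "bucket:$last"
#   fi
```

Counting only the lines that look like backups avoids the magic number 13, and sorting on year, month, day matters because MMDDYYYY names do not sort chronologically across a year boundary.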

S3 is Amazon Web Services’ cloud storage solution, offering very cheap per-GB storage and bandwidth, which makes it a very cost-effective backup solution. Currently I use JungleDisk for mounting S3 buckets as drives and scheduling backups, and s3sync to back up my files in various Unix environments, but I don’t have anything in between for easily managing my S3 files. That is where Cloudberry Explorer for Amazon S3 comes in. I have been using it for the past couple of hours and, to say the least, I am impressed.

Here is what I’ve noticed while using Cloudberry.

  • First off, this software is completely free. That is a huge plus because with what I was using before, CrossFTP, you had to pay $25 for the pro version to get S3 support, and I had a whole slew of gripes about it.
  • The UI is very clean and responsive.
  • Tabbed! Everything is better with tabs, especially if you are dealing with multiple uploads and/or multiple S3 accounts.
  • Very easy to upload and manage files with the intent of serving them on the web.
  • CloudFront integration.
  • The ability to bill the requester for a file’s S3 usage. I messed around with this for a bit and I think it’s just a flag that is set on the file? Not sure how to implement it.
  • Plugins for Microsoft PowerShell.

That said, after using it there are a couple of things I would like to see added.

  • In the “MyComputer” pane, shortcuts to Desktop, My Documents, etc. would be nice.
  • Also labels for each of the drives I have. I have so many drive labels I often forget which one is a HDD, network drive, or DVD drive.
  • Refresh the S3 pane after I copy a file to a bucket so I can see that file there after it is finished copying.

Overall, this is a pretty solid S3 client and I plan to keep on using it. 

Cloudberry Explorer for Amazon S3