Exploring Riak

Been playing with Riak recently, which is one of the modern Dynamo-derived NoSQL databases (the other main ones being Cassandra and Voldemort). We're evaluating it for use as a really large brackup datastore, the primary attractions being the near-linear scalability available by adding (relatively cheap) new nodes to the cluster, and decent availability options in the face of node failures.

I've built riak packages for RHEL/CentOS 5, available at my repository, and added support for a riak 'target' to the latest version (1.10) of brackup (packages also available at my repo).

The first thing to figure out is the maximum number of nodes you expect your riak cluster to get to. This you use to size the ring_creation_size setting, which is the number of partitions the hash space is divided into. It must be a power of 2 (64, 128, 256, etc.), and the reason it's important is that it cannot be easily changed after the cluster has been created. The rule of thumb is that for performance you want at least 10 partitions per node/machine, so the default ring_creation_size of 64 is really only useful up to about 6 nodes. 128 scales to 10-12, 256 to 20-25, etc. For more info see the Riak Wiki.
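
That rule of thumb is easy to turn into a quick calculation. Here's a minimal sketch, assuming the "at least 10 partitions per node" guideline above and starting from the default of 64; the function name is mine, not anything riak provides:

```shell
# Smallest power-of-2 ring size giving at least 10 partitions per node.
# ring_size_for_nodes is a hypothetical helper, not part of riak.
ring_size_for_nodes() {
  nodes=$1
  want=$((nodes * 10))   # rule of thumb: 10 partitions per node
  size=64                # riak's default ring_creation_size
  while [ "$size" -lt "$want" ]; do
    size=$((size * 2))   # stay on powers of 2
  done
  echo "$size"
}

ring_size_for_nodes 25
```

Feed it the largest cluster you can imagine growing to, not the size you're starting with, since the setting is hard to change later.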

Here's the script I use for configuring a new node on CentOS. The main things to tweak here are the ring_creation_size you want (here I'm using 512, for a biggish cluster), and the interface to use to get the default ip address (here eth0, or you could just hardcode 0.0.0.0 instead of $ip).

#!/bin/sh
# Riak configuration script for CentOS/RHEL

# Install riak (and IO::Interface, for next)
yum -y install riak perl-IO-Interface

# To set app.config:web_ip to use primary ip, do:
perl -MIO::Interface::Simple -i \
  -pe "BEGIN { \$ip = IO::Interface::Simple->new(q/eth0/)->address; }
      s/127\.0\.0\.1/\$ip/" /etc/riak/app.config

# To add a ring_creation_size clause to app.config, do:
perl -i \
  -pe 's/^((\s*)%% riak_web_ip)/$2%% ring_creation_size is the no. of partitions to divide the hash
$2%% space into (default: 64).
$2\{ring_creation_size, 512\},

$1/' /etc/riak/app.config

# To set riak vm_args:name to hostname do:
perl -MSys::Hostname -i -pe 's/127\.0\.0\.1/hostname/e' /etc/riak/vm.args

# display (bits of) config files for checking
echo
echo '********************'
echo /etc/riak/app.config
echo '********************'
head -n30 /etc/riak/app.config
echo
echo '********************'
echo /etc/riak/vm.args
echo '********************'
cat /etc/riak/vm.args

Save this to a file called e.g. riak_configure, and then to configure a couple of nodes you do the following (note that NODE is any old internal hostname you use to ssh to the host in question, but FIRST_NODE needs to use the actual -name parameter defined in /etc/riak/vm.args on your first node):

# First node
NODE=node1
cat riak_configure | ssh $NODE sh
ssh $NODE 'chkconfig riak on; service riak start'
# Run the following until ringready reports TRUE
ssh $NODE riak-admin ringready

# All nodes after the first
FIRST_NODE=riak@node1.example.com
NODE=node2
cat riak_configure | ssh $NODE sh
ssh $NODE "chkconfig riak on; service riak start && riak-admin join $FIRST_NODE"
# Run the following until ringready reports TRUE
ssh $NODE riak-admin ringready

That's it. You should now have a working riak cluster accessible on port 8098 on your cluster nodes.
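
If you'd rather not re-run ringready by hand, the "run until it reports TRUE" step can be wrapped in a small polling loop. This is just a sketch: the function name and the TRIES and DELAY variables are my own inventions, and it assumes only that the output contains the word TRUE once the ring has settled, as described above.

```shell
# Re-run a command until its output contains TRUE, or give up.
# wait_for_true, TRIES and DELAY are hypothetical names for this sketch.
wait_for_true() {
  tries=${TRIES:-30}    # maximum number of attempts
  delay=${DELAY:-5}     # seconds to sleep between attempts
  i=0
  while [ "$i" -lt "$tries" ]; do
    if "$@" 2>/dev/null | grep -q TRUE; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage (not run here):
#   wait_for_true ssh "$NODE" riak-admin ringready || echo "ring not ready" >&2
```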

Backup Regimes with Brackup

After using brackup for a while you find you have a big list of backups sitting on your server, and start to think about cleaning up some of the older ones. The standard brackup tool for this is brackup-target, with its prune and gc (garbage collection) subcommands.

Typical usage is something like this:

# List the backups for a particular target on the server e.g.
TARGET=myserver_images
brackup-target $TARGET list_backups
Backup File                      Backup Date                      Size (B)
-----------                      -----------                      --------
images-1262106544                Thu 31 Dec 2009 03:32:49          1263128
images-1260632447                Sun 13 Dec 2009 08:19:13          1168281
images-1250042378                Wed 25 Nov 2009 06:25:06           977464
images-1239323644                Mon 09 Nov 2009 00:30:34           846523
images-1239577352                Thu 29 Oct 2009 13:03:02           846523
...

# Decide how many backups you want to keep, and prune (delete) the rest
brackup-target --keep-backups 15 $TARGET prune

# Prune just removes the brackup files on the server, so now you need to
# run a garbage collect to delete any 'chunks' that are now orphaned
brackup-target --interactive $TARGET gc

This simple scheme - "keep the last N backups" - works pretty nicely for backups you do relatively infrequently. If you do more frequent backups, however, you might find yourself wanting to be able to implement more sophisticated retention policies. Traditional backup regimes often involve policies like this:

  • keep the last 2 weeks of daily backups
  • keep the last 8 weekly backups
  • keep monthly backups forever

It's not necessarily obvious how to do something like this with brackup, but it's actually pretty straightforward. The trick is to define multiple 'sources' in your brackup.conf, one for each backup 'level' you want to use. For instance, to implement the regime above, you might define the following:

# Daily backups
[SOURCE:images]
path = /data/images
...

# Weekly backups
[SOURCE:images-weekly]
path = /data/images
...

# Monthly backups
[SOURCE:images-monthly]
path = /data/images
...

You'd then use the images-monthly source once a month, the images-weekly source once a week, and the images source the rest of the time. Your list of backups would then look something like this:

Backup File                      Backup Date                      Size (B)
-----------                      -----------                      --------
images-1234567899                Sat 05 Dec 2009 03:32:49          1263128
images-1234567898                Fri 04 Dec 2009 03:19:13          1168281
images-1234567897                Thu 03 Dec 2009 03:19:13          1168281
images-1234567896                Wed 02 Dec 2009 03:19:13          1168281
images-monthly-1234567895        Tue 01 Dec 2009 03:19:13          1168281
images-1234567894                Mon 30 Nov 2009 03:19:13          1168281
images-weekly-1234567893         Sun 29 Nov 2009 03:19:13          1168281
images-1234567892                Sat 28 Nov 2009 03:25:06           977464
...
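
Choosing which source to use on a given day is easy to script for cron. Here's a rough sketch, assuming monthlies run on the 1st and weeklies on Sundays (which matches the listing above); pick_source is a made-up helper, not part of brackup:

```shell
# Pick the brackup source for a given day.
# pick_source is a hypothetical helper; args are base name, day of month, day of week.
pick_source() {
  base=$1
  dom=${2#0}   # day of month, leading zero stripped (date +%d gives e.g. 08)
  dow=$3       # day of week as given by date +%u: 1 = Monday ... 7 = Sunday
  if [ "$dom" -eq 1 ]; then
    echo "$base-monthly"
  elif [ "$dow" -eq 7 ]; then
    echo "$base-weekly"
  else
    echo "$base"
  fi
}

# Usage (not run here):
#   brackup -v --from="$(pick_source images "$(date +%d)" "$(date +%u)")" --to="$TARGET"
```

Checking for the monthly first means a Sunday that falls on the 1st produces a monthly rather than a weekly backup.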

And when you prune, you want to specify a --source argument, and specify separate --keep-backups settings for each level e.g. for the above:

# Keep 2 weeks worth of daily backups (6 per week, since the 7th is a weekly)
brackup-target --source images --keep-backups 12 $TARGET prune

# Keep 8 weeks worth of weekly backups
brackup-target --source images-weekly --keep-backups 8 $TARGET prune

# Keep all monthly backups, so we don't prune them at all

# And then garbage collect as normal
brackup-target --interactive $TARGET gc

Brackup Tips and Tricks

Further to my earlier post, I've spent a good chunk of time implementing brackup over the last few weeks, both at home for my personal backups, and at $work on some really large trees. There are a few gotchas along the way, so I thought I'd document some of them here.

Active Filesystems

First, as soon as you start trying to brackup trees of any size you find that brackup aborts if it finds a file has changed between the time it initially walks the tree and when it comes to back it up. On an active filesystem this can happen pretty quickly.

This is arguably reasonable behaviour on brackup's part, but it gets annoying pretty fast. The cleanest solution is to use some kind of filesystem snapshot to ensure you're backing up a consistent view of your data and a quiescent filesystem.

I'm on Linux with LVM, so I use LVM snapshots for this, with something like:

SIZE=250G
VG=VolGroup00
PART=${1:-export}

mkdir -p /${PART}_snap
lvcreate -L$SIZE --snapshot --permission r -n ${PART}_snap /dev/$VG/$PART && \
  mount -o ro /dev/$VG/${PART}_snap /${PART}_snap

which snapshots /dev/VolGroup00/export to /dev/VolGroup00/export_snap, and mounts the snapshot read-only on /export_snap.

The reverse, post-backup, is similar:

VG=VolGroup00
PART=${1:-export}

umount /${PART}_snap && \
  lvremove -f /dev/$VG/${PART}_snap

which unmounts the snapshot and then deletes it.

You can then do your backup using the /${PART}_snap tree instead of your original /${PART} one.
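
The pre- and post-backup steps can be tied together so the snapshot is always removed, even if the backup fails. What follows is only a sketch under a few assumptions: snapshot_backup, run and DRY_RUN are names I've made up, and the source and target names are whatever you've defined in your brackup config. With DRY_RUN=1 it just prints the commands, which is a handy way to check it before letting it loose as root.

```shell
#!/bin/sh
# Sketch: snapshot, back up, then always clean up the snapshot.
# snapshot_backup, run and DRY_RUN are hypothetical names for this example.
VG=${VG:-VolGroup00}
SIZE=${SIZE:-250G}

# Run a command, or just print it when DRY_RUN=1
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi
}

snapshot_backup() {
  part=$1 source=$2 target=$3
  snap="${part}_snap"

  run mkdir -p "/$snap" &&
    run lvcreate -L"$SIZE" --snapshot --permission r -n "$snap" "/dev/$VG/$part" ||
    return 1

  # Remember any mount or backup failure, but carry on to the cleanup
  status=0
  run mount -o ro "/dev/$VG/$snap" "/$snap" &&
    run brackup -v --from="$source" --to="$target" || status=$?

  run umount "/$snap"
  run lvremove -f "/dev/$VG/$snap"
  return "$status"
}

# Usage (not run here): DRY_RUN=1 snapshot_backup export home server_fs_home
```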

Brackup Digests

So snapshots work nicely. The next wrinkle is that by default brackup writes its digest cache file to the root of your source tree, which in this case is read-only. So you want to tell brackup to put that in the original tree, not the snapshot, which you do in your ~/.brackup.conf file e.g.

[SOURCE:home]
path = /export_snap/home
digestdb_file = /export/home/.brackup-digest.db
ignore = \.brackup-digest\.db$

I've also added an explicit ignore rule for these digest files here. You don't really need to back these up as they're just caches, and they can get pretty large. Brackup automatically skips the digestdb_file for you, but it doesn't skip any others you might have, if for instance you're backing up the same tree to multiple targets.

Synching Backups Between Targets

Another nice hack you can do with brackup is sync backups on filesystem-based targets (that is, Target::Filesystem, Target::Ftp, and Target::Sftp) between systems. For instance, I did my initial home directory backup of about 10GB onto my laptop, and then carried my laptop into where my server is located, and then rsync-ed the backup from my laptop to the server. Much faster than copying 10GB of data over an ADSL line!

Similarly, at $work I'm doing brackups onto a local backup server on the LAN, and then rsyncing the brackup tree to an offsite server for disaster recovery purposes.

There are a few gotchas when doing this, though. One is that Target::Filesystem backups default to using colons in their chunk file names on Unix-like filesystems (for backwards-compatibility reasons), while Target::Ftp and Target::Sftp ones don't. The safest thing to do is just to turn off colons altogether on Filesystem targets:

[TARGET:server_fs_home]
type = Filesystem
path = /export/brackup/nox/home
no_filename_colons = 1

Second, brackup uses a local inventory database to avoid some remote filesystem checks and so improve performance. This means that if you replicate a backup onto another target you also need to make a copy of the inventory database, so that brackup knows which chunks are already on your new target.

The inventory database defaults to $HOME/.brackup-target-TARGETNAME.invdb (see perldoc Brackup::InventoryDatabase), so something like the following is usually sufficient:

cp $HOME/.brackup-target-OLDTARGET.invdb $HOME/.brackup-target-NEWTARGET.invdb
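
If you do this often, a slightly more careful version of that copy is worth having, so you don't clobber an inventory the new target has already built up. A small sketch, assuming the default inventory location above (copy_invdb is my own name for it):

```shell
# Copy a target's inventory database to a new target name, without overwriting.
# copy_invdb is a hypothetical helper; it assumes the default invdb location.
copy_invdb() {
  old="$HOME/.brackup-target-$1.invdb"
  new="$HOME/.brackup-target-$2.invdb"
  if [ ! -f "$old" ]; then
    echo "no inventory database at $old" >&2
    return 1
  fi
  if [ -e "$new" ]; then
    echo "refusing to overwrite $new" >&2
    return 1
  fi
  cp -p "$old" "$new"
}

# Usage (not run here): copy_invdb OLDTARGET NEWTARGET
```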

Third, if you want to do a restore using a brackup file (the SOURCE-DATE.brackup output file brackup produces) from a different target, you typically need to make a copy and then update the header portion for the target type and host/path details of your new target. Assuming you do that and your new target has all the same chunks, though, restores work just fine.

Fun with brackup

I've been playing around with Brad Fitzpatrick's brackup for the last couple of weeks. It's a backup tool that "slices, dices, encrypts, and sprays across the net" - notably to Amazon S3, but also to filesystems (local or networked), FTP servers, or SSH/SFTP servers.

I'm using it to back up my home directories and all my image and music files both to a Linux server I have available in a data centre (via SFTP) and to Amazon S3.

brackup's a bit rough around the edges and could do with some better documentation and some optimisation, but it's pretty useful as it stands. Here are a few notes and tips from my playing so far, to save others a bit of time.

Version: as I write the latest version on CPAN is 1.06, but that's pretty old - you really want to use the current subversion trunk instead. Installation is the standard perl module incantation e.g.

# Checkout from svn or whatever
cd brackup
perl Makefile.PL
make
make test
sudo make install

Basic usage is as follows:

# First-time through (on linux, in my case):
cd
mkdir brackup
cd brackup
brackup
Error: Your config file needs tweaking. I put a commented-out template at:
  /home/gavin/.brackup.conf

# Edit the vanilla .brackup.conf that was created for you.
# You want to setup at least one SOURCE and one TARGET section initially,
# and probably try something smallish i.e. not your 50GB music collection!
# The Filesystem target is probably the best one to try out first.
# See 'perldoc Brackup::Root' and 'perldoc Brackup::Target' for examples
$EDITOR ~/.brackup.conf

# Now run your first backup changing SOURCE and TARGET below to the names
# you used in your .brackup.conf file
brackup -v --from=SOURCE --to=TARGET

# You can also do a dry run to see what brackup's going to do (undocumented)
brackup -v --from=SOURCE --to=TARGET --dry-run

If all goes well you should get some fairly verbose output about all the files in your SOURCE tree that are being backed up for you, and finally a brackup output file (typically named SOURCE-DATE.brackup) should be written to your current directory. You'll need this brackup file to do your restores, but it's also stored on the target along with your backup, so you can also retrieve it from there (using brackup-target, below) if your local copy gets lost, or if you need to restore to somewhere else.

Restores reference that SOURCE-DATE.brackup file you just created:

# Create somewhere to restore to
mkdir -p /tmp/brackup-restore/full

# Restore the full tree you just backed up
brackup-restore -v --from=SOURCE-DATE.brackup --to=/tmp/brackup-restore/full --all

# Or restore just a subset of the tree
brackup-restore -v --from=SOURCE-DATE.brackup --to=/tmp/brackup-restore --just=DIR
brackup-restore -v --from=SOURCE-DATE.brackup --to=/tmp/brackup-restore --just=FILE
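
For a first test restore it's worth checking that what came back matches what you backed up. This isn't a brackup feature, just a plain recursive diff. It's a sketch that assumes your source tree hasn't changed since the backup ran, and it skips the digest cache file, which lives in the source tree but isn't part of the backup (verify_restore is my own name):

```shell
# Compare a source tree against a restored copy, ignoring brackup's digest cache.
# verify_restore is a hypothetical helper, not part of brackup.
verify_restore() {
  src=$1
  restored=$2
  diff -r -x '.brackup-digest.db' "$src" "$restored"
}

# Usage (not run here):
#   verify_restore /path/to/source /tmp/brackup-restore/full && echo "restore matches"
```

Anything you've excluded with ignore rules will show up as a difference too, so expect some noise if you use those.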

You can also use the brackup-target utility to query a target for the backups it has available, and do various kinds of cleanup:

# List the backups available on the given target
brackup-target TARGET list_backups

# Get the brackup output file for a specific backup (to restore)
brackup-target TARGET get_backup BACKUPFILE

# Delete a brackup file on the target
brackup-target TARGET delete_backup BACKUPFILE

# Prune the target to the most recent N backup files
brackup-target --keep-backups 15 TARGET prune

# Remove backup chunks no longer referenced by any backup file
brackup-target TARGET gc

That should be enough to get you up and running with brackup - I'll cover some additional tips and tricks in a subsequent post.