Rebuild Inventory

Here's what I use to take a quick inventory of a machine before a rebuild, both to act as a reference during the rebuild itself, and in case something goes pear-shaped. The whole chunk, from the 'script' command down to 'exit', is cut-and-pastable.

# as root, where you want your inventory file
script $(hostname).inventory
export PS1='\h:\w\$ '               # reset prompt to avoid ctrl chars
fdisk -l /dev/sd?                   # list partition tables
cat /proc/mdstat                    # list raid devices
pvs                                 # list lvm stuff
vgs
lvs
df -h                               # list mounts
ip addr                             # list network interfaces
ip route                            # list network routes
cat /etc/resolv.conf                # show resolv.conf
exit

# Cleanup control characters in the inventory
perl -i -pe 's/\r//g; s/\033\]\d+;//g; s/\033\[\d+m//g; s/\007/\//g' \
  $(hostname).inventory

# And then copy it somewhere else in case of problems ;-)
scp $(hostname).inventory somewhere:

Anything else useful I've missed?

Cronologue

Came across cronologger (blog post) recently (via Dean Wilson). It's a simple wrapper script you use around your cron(8) jobs that captures any stdout and stderr output and logs it to a couchdb database, instead of the traditional behaviour of sending it to you as email.

It's a nice idea, particularly for jobs with important output where it would be nice to be able to look back in time more easily than by trawling through a noisy inbox, or for sites with lots of cron jobs where the sheer volume is difficult to handle usefully as email.

Cronologger comes with a simple web interface for displaying your cron jobs, but so far it's pretty rudimentary. I quickly realised that this was another place (cf. blosxom4nagios) where blosxom could be used to provide a pretty useful gui with very little work.

Thus: cronologue.

cronologue(1) is the wrapper, written in perl, which logs job records and stdout/stderr output via standard HTTP PUTs back to a designated apache server, as flat text files. Parameters can be used to control whether job records are always created, or only when there is output produced. There's also a --passthru mode in which stdout and stderr streams are still output, allowing both email and cronologue output to be produced.
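
The idea is that cronologue simply wraps the existing command in your crontab entry, along these lines (the invocation below is purely illustrative - check the cronologue docs for the actual options; only --passthru is mentioned above):

# Illustrative /etc/crontab-style entry only - exact cronologue options may differ
0 2 * * *  root  cronologue --passthru /usr/local/bin/nightly_report.sh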

On the server side a custom blosxom install is used to display the job records, which can be filtered by hostname or by date. There's also an RSS feed available.

Obligatory screenshot:

Cronologue GUI

Update: I should add that RPMs for CentOS5 (which will probably work on most RPM-based distros) are available from my yum repository.

Parallel Processing Perl Modules

Needed to parallelise some processing in perl the last few days, and did a quick survey of some of the parallel processing modules on CPAN, of which there is the normal bewildering diversity.

As usual, it depends exactly what you're trying to do. In my case I just needed to be able to fork a bunch of processes off, have them process some data, and hand the results back to the parent.

So here are my notes on a random selection of the available modules. The example each time is basically a parallel version of the following map:

my %out = map { $_ => $_ ** 2 } 1 .. 50;

Parallel::ForkManager

Object oriented wrapper around 'fork'. Supports parent callbacks. Passing data back to parent uses files, and feels a little bit clunky. Dependencies: none.

use Parallel::ForkManager 0.7.6;

my @num = 1 .. 50;

my $pm = Parallel::ForkManager->new(5);

my %out;
$pm->run_on_finish(sub {    # must be declared before first 'start'
    my ($pid, $exit_code, $ident, $exit_signal, $core_dump, $data) = @_;
    $out{ $data->[0] } = $data->[1];
});

for my $num (@num) {
    $pm->start and next;   # Parent nexts

    # Child
    my $sq = $num ** 2;

    $pm->finish(0, [ $num, $sq ]);   # Child exits
}
$pm->wait_all_children;

[Version 0.7.9]

Parallel::Iterator

Basically a parallel version of 'map'. Dependencies: none.

use Parallel::Iterator qw(iterate);

my @num = 1 .. 50;

my $it = iterate( sub {
    # sub is a closure, return outputs
    my ($id, $num) = @_;
    return $num ** 2;
}, \@num );

my %out = ();
while (my ($num, $square) = $it->()) {
  $out{$num} = $square;
}

[Version 1.00]

Parallel::Loops

Provides parallel versions of 'foreach' and 'while'. It uses 'tie' to allow shared data structures between the parent and children. Dependencies: Parallel::ForkManager.

use Parallel::Loops;

my @num = 1 .. 50;

my $pl = Parallel::Loops->new(5);

my %out;
$pl->share(\%out);

$pl->foreach( \@num, sub {
    my $num = $_;           # note this uses $_, not @_
    $out{$num} = $num ** 2;
});

You can also return values from the subroutine, as with Parallel::Iterator, avoiding the explicit 'share':

my %out = $pl->foreach( \@num, sub {
    my $num = $_;           # note this uses $_, not @_
    return ( $num, $num ** 2 );
});

[Version 0.03]

Proc::Fork

Provides an interesting perlish forking interface using blocks. No built-in support for returning data from children, but provides examples using pipes. Dependencies: Exporter::Tidy.

use Proc::Fork;
use IO::Pipe;
use Storable qw(freeze thaw);

my @num = 1 .. 50;
my @children;

for my $num (@num) {
    my $pipe = IO::Pipe->new;

    run_fork{ child {
        # Child
        $pipe->writer;
        print $pipe freeze([ $num, $num ** 2 ]);
        exit;
    } };

    # Parent
    $pipe->reader;
    push @children, $pipe;
}

my %out;
for my $pipe (@children) {
    my $entry = thaw( <$pipe> );
    $out{ $entry->[0] } = $entry->[1];
}

[Version 0.71]

Parallel::Prefork

Like Parallel::ForkManager, but adds better signal handling. Doesn't seem to provide built-in support for returning data from children. Dependencies: Proc::Wait3.

[Version 0.08]

Parallel::Forker

More complex module, loosely based on ForkManager (?). Includes better signal handling, and supports scheduling and dependencies between different groups of subprocesses. Doesn't appear to provide built-in support for passing data back from children.

[Version 1.232]

Exploring Riak

Been playing with Riak recently, which is one of the modern dynamo-derived nosql databases (the other main ones being Cassandra and Voldemort). We're evaluating it for use as a really large brackup datastore, the primary attraction being the near linear scalability available by adding (relatively cheap) new nodes to the cluster, and decent availability options in the face of node failures.

I've built riak packages for RHEL/CentOS 5, available at my repository, and added support for a riak 'target' to the latest version (1.10) of brackup (packages also available at my repo).

The first thing to figure out is the maximum number of nodes you expect your riak cluster to get to. This you use to size the ring_creation_size setting, which is the number of partitions the hash space is divided into. It must be a power of 2 (64, 128, 256, etc.), and the reason it's important is that it cannot be easily changed after the cluster has been created. The rule of thumb is that for performance you want at least 10 partitions per node/machine, so the default ring_creation_size of 64 is really only useful up to about 6 nodes. 128 scales to 10-12, 256 to 20-25, etc. For more info see the Riak Wiki.

Here's the script I use for configuring a new node on CentOS. The main things to tweak here are the ring_creation_size you want (here I'm using 512, for a biggish cluster), and the interface to use to get the default ip address (here eth0, or you could just hardcode 0.0.0.0 instead of $ip).

#!/bin/sh
# Riak configuration script for CentOS/RHEL

# Install riak (and IO::Interface, for next)
yum -y install riak perl-IO-Interface

# To set app.config:web_ip to use primary ip, do:
perl -MIO::Interface::Simple -i \
  -pe "BEGIN { \$ip = IO::Interface::Simple->new(q/eth0/)->address; }
      s/127\.0\.0\.1/\$ip/" /etc/riak/app.config

# To add a ring_creation_size clause to app.config, do:
perl -i \
  -pe 's/^((\s*)%% riak_web_ip)/$2%% ring_creation_size is the no. of partitions to divide the hash
$2%% space into (default: 64).
$2\{ring_creation_size, 512\},

$1/' /etc/riak/app.config

# To set riak vm_args:name to hostname do:
perl -MSys::Hostname -i -pe 's/127\.0\.0\.1/hostname/e' /etc/riak/vm.args

# display (bits of) config files for checking
echo
echo '********************'
echo /etc/riak/app.config
echo '********************'
head -n30 /etc/riak/app.config
echo
echo '********************'
echo /etc/riak/vm.args
echo '********************'
cat /etc/riak/vm.args

Save this to a file called e.g. riak_configure, and then to configure a couple of nodes you do the following (note that NODE is any old internal hostname you use to ssh to the host in question, but FIRST_NODE needs to use the actual -name parameter defined in /etc/riak/vm.args on your first node):

# First node
NODE=node1
cat riak_configure | ssh $NODE sh
ssh $NODE 'chkconfig riak on; service riak start'
# Run the following until ringready reports TRUE
ssh $NODE riak-admin ringready

# All nodes after the first
FIRST_NODE=riak@node1.example.com
NODE=node2
cat riak_configure | ssh $NODE sh
ssh $NODE "chkconfig riak on; service riak start && riak-admin join $FIRST_NODE"
# Run the following until ringready reports TRUE
ssh $NODE riak-admin ringready

That's it. You should now have a working riak cluster accessible on port 8098 on your cluster nodes.
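
As a quick smoke test once the cluster is up, you can check ring settings, membership, and readiness from any node (reusing the $NODE variable from above):

# Quick smoke test - ring settings/membership and readiness
ssh $NODE 'riak-admin status | grep ring'
ssh $NODE riak-admin ringready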

Remote Rebuild, CentOS-style

Problem: you've got a remote server that's significantly hosed, either through a screwup somewhere or a power outage that did nasty things to your root filesystem or something. You have no available remote hands, and/or no boot media anyway.

Preconditions: You have another server you can access on the same network segment, and remote access to the broken server, either through a DRAC or iLO type card, or through some kind of serial console server (like a Cyclades/Avocent box).

Solution: in extremis, you can do a remote rebuild. Here's the simplest recipe I've come up with. I'm rebuilding using centos5-x86_64 version 5.5; adjust as necessary.

Note: dnsmasq, mrepo and syslinux are not core CentOS packages, so you need to enable the rpmforge repository to follow this recipe. This just involves:

wget http://packages.sw.be/rpmforge-release/rpmforge-release-0.5.1-1.el5.rf.x86_64.rpm
rpm -Uvh rpmforge-release-0.5.1-1.el5.rf.x86_64.rpm

1. On your working box (which you're now going to press into service as a build server), install and configure dnsmasq to provide dhcp and tftp services:

# Install dnsmasq
yum install dnsmasq

# Add the following lines to the bottom of your /etc/dnsmasq.conf file
# Note that we don't use the following ip address, but the directive
# itself is required for dnsmasq to turn dhcp functionality on
dhcp-range=ignore,192.168.1.99,192.168.1.99
# Here use the broken server's mac addr, hostname, and ip address
dhcp-host=00:d0:68:09:19:80,broken.example.com,192.168.1.5,net:centos5x
# Point the centos5x tag at the tftpboot environment you're going to setup
dhcp-boot=net:centos5x,/centos5x-x86_64/pxelinux.0
# And enable tftp
enable-tftp
tftp-root = /tftpboot
#log-dhcp

# Then start up dnsmasq
service dnsmasq start

2. Install and configure mrepo to provide your CentOS build environment:

# Install mrepo and syslinux
yum install mrepo syslinux

# Setup a minimal /etc/mrepo.conf e.g.
cat > /etc/mrepo.conf
[main]
srcdir = /var/mrepo
wwwdir = /var/www/mrepo
confdir = /etc/mrepo.conf.d
arch = x86_64
mailto = root@example.com
smtp-server = localhost
pxelinux = /usr/lib/syslinux/pxelinux.0
tftpdir = /tftpboot

[centos5]
release = 5
arch = x86_64
metadata = repomd repoview
name = Centos-$release $arch
#iso = CentOS-$release.5-$arch-bin-DVD-?of2.iso
#iso = CentOS-$release.5-$arch-bin-?of8.iso
^D
# (uncomment one of the iso lines above, either the DVD or the CD one)

# Download the set of DVD or CD ISOs for the CentOS version you want
# There are fewer DVD ISOs, but you need to use bittorrent to download
mkdir -p /var/mrepo/iso
cd /var/mrepo/iso
elinks http://isoredirect.centos.org/centos/5.5/isos/x86_64/

# Once your ISOs are available in /var/mrepo/iso, and the 'iso' line
# in /etc/mrepo.conf updated appropriately, run mrepo itself
mrepo -gvv

3. Finally, finish setting up your tftp environment. mrepo should have copied appropriate pxelinux.0, initrd.img, and vmlinuz files into your /tftpboot/centos5-x86_64 directory, so all you need to supply is an appropriate pxelinux boot config:

cd /tftpboot/centos5-x86_64
ls
mkdir -p pxelinux.cfg

# Setup a default pxelinux config (adjust the serial/console and repo params as needed)
cat > pxelinux.cfg/default
default linux
serial 0,9600n8
label linux
  root (nd)
  kernel vmlinuz
  append initrd=initrd.img console=ttyS0,9600 repo=http://192.168.1.1/mrepo/centos5-x86_64
^D

Now get your server to do a PXE boot (via a boot option or the bios or whatever), and hopefully your broken server will find your dhcp/tftp environment and boot up in install mode, and away you go.

If you have problems with the boot, try checking your /var/log/messages file on the boot server for hints.
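
If the install kernel never appears at all, it can also help to watch for the DHCP and TFTP requests on the build server while the broken box tries to PXE boot - something like this (assuming eth0 is the interface facing that segment):

# Watch for DHCP (67/68) and TFTP (69) traffic from the PXE-booting server
tcpdump -ni eth0 port 67 or port 68 or port 69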

Dell OMSA

Following on from my IPMI explorations, here's the next chapter in my getting-down-and-dirty-with-dell-hardware-on-linux adventures. This time I'm setting up Dell's OpenManage Server Administrator software, primarily in order to explore being able to configure bios settings from within the OS. As before, I'm running CentOS 5, but OMSA supports any of RHEL4, RHEL5, SLES9, and SLES10, and various versions of Fedora Core and OpenSUSE.

Here's what I did to get up and running:

# Configure the Dell OMSA repository
wget -O bootstrap.sh http://linux.dell.com/repo/hardware/latest/bootstrap.cgi
# Review the script to make sure you trust it, and then run it
sh bootstrap.sh
# OR, for CentOS5/RHEL5 x86_64 you can just install the following:
rpm -Uvh http://linux.dell.com/repo/hardware/latest/platform_independent/rh50_64/prereq/\
dell-omsa-repository-2-5.noarch.rpm

# Install base version of OMSA, without gui (install srvadmin-all for more)
yum install srvadmin-base

# One of the daemons requires /usr/bin/lockfile, so make sure you've got procmail installed
yum install procmail

# If you're running an x86_64 OS, there are a couple of additional 32-bit
#   libraries you need that aren't dependencies in the RPMs
yum install compat-libstdc++-33-3.2.3-61.i386 pam.i386

# Start OMSA daemons
for i in instsvcdrv dataeng dsm_om_shrsvc; do service $i start; done

# Finally, you can update your path by doing logout/login, or just run:
. /etc/profile.d/srvadmin-path.sh

Now to check whether you're actually functional you can try a few of the following (as root):

omconfig about
omreport about
omreport system -?
omreport chassis -?

omreport is the OMSA CLI reporting/query tool, and omconfig is the equivalent update tool. The main documentation for the current version of OMSA is here. I found the CLI User's Guide the most useful.

Here's a sampler of interesting things to try:

# Report system overview
omreport chassis

# Report system summary info (OS, CPUs, memory, PCIe slots, DRAC cards, NICs)
omreport system summary

# Report bios settings
omreport chassis biossetup

# Fan info
omreport chassis fans

# Temperature info
omreport chassis temps

# CPU info
omreport chassis processors

# Memory and memory slot info
omreport chassis memory

# Power supply info
omreport chassis pwrsupplies

# Detailed PCIe slot info
omreport chassis slots

# DRAC card info
omreport chassis remoteaccess

omconfig allows setting object attributes using a key=value syntax, which can get reasonably complex. See the CLI User's Guide above for details, but here are some examples of messing with various bios settings:

# See available attributes and settings
omconfig chassis biossetup -?

# Turn the AC Power Recovery setting to On
omconfig chassis biossetup attribute=acpwrrecovery setting=on

# Change the serial communications setting (on, with serial redirection via COM1 or COM2)
omconfig chassis biossetup attribute=serialcom setting=com1
omconfig chassis biossetup attribute=serialcom setting=com2

# Change the external serial connector
omconfig chassis biossetup attribute=extserial setting=com1
omconfig chassis biossetup attribute=extserial setting=rad

# Change the Console Redirect After Boot (crab) setting
omconfig chassis biossetup attribute=crab setting=enabled
omconfig chassis biossetup attribute=crab setting=disabled

# Change NIC settings (turn on PXE on NIC1)
omconfig chassis biossetup attribute=nic1 setting=enabledwithpxe

Finally, there are some interesting formatting options available to omreport, useful for scripting e.g.

# Custom delimiter format (default semicolon)
omreport chassis -fmt cdv

# XML format
omreport chassis -fmt xml

# To change the default cdv delimiter
omconfig preferences cdvformat -?
omconfig preferences cdvformat delimiter=pipe

IPMI on CentOS/RHEL

Spent a few days deep in the bowels of a couple of datacentres last week, and realised I didn't know enough about Dell's DRAC baseboard management controllers to use them properly. In particular, I didn't know how to mess with the drac settings from within the OS. So spent some of today researching that.

Turns out there are a couple of routes to do this. You can use the Dell native tools (e.g. racadm) included in Dell's OMSA product, or you can use vendor-neutral IPMI, which is well-supported by Dell DRACs. I went with the latter as it's more cross-platform, and the tools come native with CentOS, instead of having to setup Dell's OMSA repositories. The Dell-native tools may give you more functionality, but for what I wanted to do IPMI seems to work just fine.

So installation is just:

yum install OpenIPMI OpenIPMI-tools
chkconfig ipmi on
service ipmi start

and then from the local machine you can use ipmitool to access and manipulate all kinds of useful stuff:

# IPMI commands
ipmitool help
man ipmitool

# To check firmware version
ipmitool mc info
# To reset the management controller
ipmitool mc reset [ warm | cold ]

# Show field-replaceable-unit details
ipmitool fru print

# Show sensor output
ipmitool sdr list
ipmitool sdr type list
ipmitool sdr type Temperature
ipmitool sdr type Fan
ipmitool sdr type 'Power Supply'

# Chassis commands
ipmitool chassis status
ipmitool chassis identify [<interval>]   # turn on front panel identify light (default 15s)
ipmitool [chassis] power soft            # initiate a soft-shutdown via acpi
ipmitool [chassis] power cycle           # issue a hard power off, wait 1s, power on
ipmitool [chassis] power off             # issue a hard power off
ipmitool [chassis] power on              # issue a hard power on
ipmitool [chassis] power reset           # issue a hard reset

# Modify boot device for next reboot
ipmitool chassis bootdev pxe
ipmitool chassis bootdev cdrom
ipmitool chassis bootdev bios

# Logging
ipmitool sel info
ipmitool sel list
ipmitool sel elist                       # extended list (see manpage)
ipmitool sel clear

For remote access, you need to setup user and network settings, either at boot time on the DRAC card itself, or from the OS via ipmitool:

# Display/reset password for default root user (userid '2')
ipmitool user list 1
ipmitool user set password 2 <new_password>

# Display/configure lan settings
ipmitool lan print 1
ipmitool lan set 1 ipsrc [ static | dhcp ]
ipmitool lan set 1 ipaddr 192.168.1.101
ipmitool lan set 1 netmask 255.255.255.0
ipmitool lan set 1 defgw ipaddr 192.168.1.254

Once this is configured you should be able to connect using the 'lan' interface to ipmitool, like this:

ipmitool -I lan -U root -H 192.168.1.101 chassis status

which will prompt you for your ipmi root password, or you can do the following:

echo <new_password> > ~/.racpasswd
chmod 600 ~/.racpasswd

and then use that password file instead of manually entering it each time:

ipmitool -I lan -U root -f ~/.racpasswd -H 192.168.1.101 chassis status

I'm using an 'ipmi' alias that looks like this:

alias ipmi='ipmitool -I lan -U root -f ~/.racpasswd -H'

# which then allows you to do the much shorter:
ipmi 192.168.1.101 chassis status
# OR
ipmi <hostname> chassis status

Finally, if you configure serial console redirection in the bios as follows:

Serial Communication -> Serial Communication:       On with Console Redirection via COM2
Serial Communication -> External Serial Connector:  COM2
Serial Communication -> Redirection After Boot:     Disabled

then you can setup standard serial access in grub.conf and inittab on com2/ttyS1 and get serial console access via IPMI serial-over-lan using the 'lanplus' interface:

ipmitool -I lanplus -U root -f ~/.racpasswd -H 192.168.1.101 sol activate

which I typically use via a shell function:

# ipmi serial-over-lan function
isol() {
   if [ -n "$1" ]; then
       ipmitool -I lanplus -U root -f ~/.racpasswd -H $1 sol activate
   else
       echo "usage: sol <sol_ip>"
   fi
}

# used like:
isol 192.168.1.101
isol <hostname>
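
For reference, the 'standard serial access in grub.conf and inittab' mentioned above is just the usual serial console setup - a sketch, assuming ttyS1 at 9600 baud (adjust to suit):

# Add a serial getty on ttyS1 to /etc/inittab
echo 'S1:2345:respawn:/sbin/agetty ttyS1 9600 vt100' >> /etc/inittab
# And in /boot/grub/grub.conf, add:
#   serial --unit=1 --speed=9600
#   terminal --timeout=5 serial console
# plus 'console=tty0 console=ttyS1,9600' on your kernel line(s)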


Mocking RPMs on CentOS

Mock is a Fedora project that allows you to build RPM packages within a chroot environment, allowing you to build packages for other systems than the one you're running on (e.g. building CentOS 4 32-bit RPMs on a CentOS 5 64-bit host), and ensuring that all the required build dependencies are specified correctly in the RPM spec file.

It's also pretty under-documented, so these are my notes on things I've figured out over the last week setting up a decent mock environment on CentOS 5.

First, I'm using mock 1.0.2 from the EPEL repository, rather than the older 0.6.13 available from CentOS Extras. There are apparently backward-compatibility problems with versions of mock > 0.6, but as I'm mostly building C5 packages I decided to go with the newer version. So installation is just:

# Install mock and python-ctypes packages (the latter for better setarch support)
$ sudo yum --enablerepo=epel install mock python-ctypes

# Add yourself to the 'mock' group that will have now been created
$ sudo usermod -a -G mock gavin     # -a appends, so existing group memberships are kept

The mock package creates an /etc/mock directory with configs for various OS versions (mostly Fedoras). The first thing you want to tweak there is the site-defaults.cfg file which sets up various defaults for all your builds. Mine now looks like this:

# /etc/mock/site-defaults.cfg

# Set this to true if you've installed python-ctypes
config_opts['internal_setarch'] = True

# Turn off ccache since it was causing errors I haven't bothered debugging
config_opts['plugin_conf']['ccache_enable'] = False

# (Optional) Fake the build hostname to report
config_opts['use_host_resolv'] = False
config_opts['files']['etc/hosts'] = """
127.0.0.1 buildbox.openfusion.com.au nox.openfusion.com.au localhost
"""
config_opts['files']['etc/resolv.conf'] = """
nameserver 127.0.0.1
"""

# Setup various rpm macros to use
config_opts['macros']['%packager'] = 'Gavin Carr <gavin@openfusion.com.au>'
config_opts['macros']['%debug_package'] = '%{nil}'

You can use the epel-5-{i386,x86_64}.cfg configs as-is if you like; I copied them to centos-5-{i386,x86_64}.cfg versions and removed the epel 'extras', 'testing', and 'local' repositories from the yum.conf section, since I typically want to build using only 'core' and 'update' packages.

You can then run a test by doing:

# e.g. initialise a centos-5-i386 chroot environment
$ CONFIG=centos-5-i386
$ mock -r $CONFIG --init

which will setup an initial chroot environment using the given config. If that seemed to work (you weren't inundated with error messages), you can try a build:

# Rebuild the given source RPM within the chroot environment
# usage: mock -r <mock_config> --rebuild /path/to/SRPM e.g.
$ mock -r $CONFIG --rebuild ~/rpmbuild/SRPMS/clix-0.3.4-1.of.src.rpm

If the build succeeds, it drops your packages into the /var/lib/mock/$CONFIG/result directory:

$ ls -1 /var/lib/mock/$CONFIG/result
build.log
clix-0.3.4-1.of.noarch.rpm
clix-0.3.4-1.of.src.rpm
root.log
state.log

If it fails, you can check the mock output and the *.log files above for more info, and/or rerun mock with the -v flag for more verbose messaging.
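
If you need to dig around further, mock also has subcommands for working inside the chroot directly (these are standard mock 1.0.x options; the package name below is just an example):

# Open an interactive shell inside the chroot
mock -r $CONFIG --shell

# Install extra packages into the chroot (e.g. for debugging)
mock -r $CONFIG --install strace

# Scrub the chroot completely and start over
mock -r $CONFIG --clean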

A couple of final notes:

  • the chroot environments are cached, but rebuilding them and checking for updates can be pretty network intensive, so you might want to consider setting up a local repository to pull from. mrepo (available from rpmforge) is pretty good for that.

  • there don't seem to be any hooks in mock to allow you to sign packages you've built, so if you do want signed packages you need to sign them afterwards via an 'rpm --resign $RPMS'.

Backup Regimes with Brackup

After using brackup for a while you find you have a big list of backups sitting on your server, and start to think about cleaning up some of the older ones. The standard brackup tool for this is brackup-target, and the prune and gc (garbage collection) subcommands.

Typical usage is something like this:

# List the backups for a particular target on the server e.g.
TARGET=myserver_images
brackup-target $TARGET list-backups
Backup File                      Backup Date                      Size (B)
-----------                      -----------                      --------
images-1262106544                Thu 31 Dec 2009 03:32:49          1263128
images-1260632447                Sun 13 Dec 2009 08:19:13          1168281
images-1250042378                Wed 25 Nov 2009 06:25:06           977464
images-1239323644                Mon 09 Nov 2009 00:30:34           846523
images-1239577352                Thu 29 Oct 2009 13:03:02           846523
...

# Decide how many backups you want to keep, and prune (delete) the rest
brackup-target --keep-backups 15 $TARGET prune

# Prune just removes the brackup files on the server, so now you need to
# run a garbage collect to delete any 'chunks' that are now orphaned
brackup-target --interactive $TARGET gc

This simple scheme - "keep the last N backups" - works pretty nicely for backups you do relatively infrequently. If you do more frequent backups, however, you might find yourself wanting to be able to implement more sophisticated retention policies. Traditional backup regimes often involve policies like this:

  • keep the last 2 weeks of daily backups
  • keep the last 8 weekly backups
  • keep monthly backups forever

It's not necessarily obvious how to do something like this with brackup, but it's actually pretty straightforward. The trick is to define multiple 'sources' in your brackup.conf, one for each backup 'level' you want to use. For instance, to implement the regime above, you might define the following:

# Daily backups
[SOURCE:images]
path = /data/images
...

# Weekly backups
[SOURCE:images-weekly]
path = /data/images
...

# Monthly backups
[SOURCE:images-monthly]
path = /data/images
...

You'd then use the images-monthly source once a month, the images-weekly source once a week, and the images source the rest of the time. Your list of backups would then look something like this:

Backup File                      Backup Date                      Size (B)
-----------                      -----------                      --------
images-1234567899                Sat 05 Dec 2009 03:32:49          1263128
images-1234567898                Fri 04 Dec 2009 03:19:13          1168281
images-1234567897                Thu 03 Dec 2009 03:19:13          1168281
images-1234567896                Wed 02 Dec 2009 03:19:13          1168281
images-monthly-1234567895        Tue 01 Dec 2009 03:19:13          1168281
images-1234567894                Mon 30 Nov 2009 03:19:13          1168281
images-weekly-1234567893         Sun 29 Nov 2009 03:19:13          1168281
images-1234567892                Sat 28 Nov 2009 03:25:06           977464
...
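
One straightforward way of driving the three levels is from cron, with the daily job simply also running on the weekly/monthly days - for example, something like the following (times and target name are placeholders):

# Hypothetical /etc/crontab entries for the three backup levels
30 2 * * *  root  brackup --from=images --to=myserver_images
30 3 * * 0  root  brackup --from=images-weekly --to=myserver_images
30 4 1 * *  root  brackup --from=images-monthly --to=myserver_images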

And when you prune, you want to specify a --source argument, and specify separate --keep-backups settings for each level e.g. for the above:

# Keep 2 weeks worth of daily backups
brackup-target --source images --keep-backups 12 $TARGET prune

# Keep 8 weeks worth of weekly backups
brackup-target --source images-weekly --keep-backups 8 $TARGET prune

# Keep all monthly backups, so we don't prune them at all

# And then garbage collect as normal
brackup-target --interactive $TARGET gc

Anycast DNS

(Okay, brand new year - must be time to get back on the blogging wagon ...)

Linux Journal recently had a really good article by Philip Martin on Anycast DNS. It's well worth a read - I just want to point it out and record a cutdown version of how I've been setting it up recently.

As the super-quick intro, anycast is the idea of providing a network service at multiple points in a network, and then routing requests to the 'nearest' service provider for any particular client. There's a one-to-many relationship between an ip address and the hosts that are providing services on that address.

In the LJ article above, this means you provide a service on a /32 host address, and then use an interior dynamic routing protocol to advertise that address to your internal routers. If you're a non-cisco linux shop, that means using quagga/ospf.

The classic anycast service is dns, since it's stateless and benefits from the high availability and low latency of a distributed anycast service.

So here's my quick-and-dirty notes on setting up an anycast dns server on CentOS/RHEL using dnsmasq for dns, and quagga zebra/ospfd for the routing.

  1. First, setup your anycast ip address (e.g. 192.168.255.1/32) on a random virtual loopback interface e.g. lo:0. On CentOS/RHEL, this means you want to setup a /etc/sysconfig/network-scripts/ifcfg-lo:0 file containing:

    DEVICE=lo:0
    IPADDR=192.168.255.1
    NETMASK=255.255.255.255
    ONBOOT=yes
    
  2. Setup your dns server to listen to (at least) your anycast dns interface. With dnsmasq, I use an /etc/dnsmasq.conf config like:

    interface=lo:0
    domain=example.com
    local=/example.com/
    resolv-file=/etc/resolv.conf.upstream
    expand-hosts
    domain-needed
    bogus-priv
    
  3. Use quagga's zebra/ospfd to advertise this host address to your internal routers. I use a completely vanilla zebra.conf, and an /etc/quagga/ospfd.conf config like:

    hostname myhost
    password mypassword
    log syslog
    !
    router ospf
      ! Local segments (adjust for your network config and ospf areas)
      network 192.168.1.0/24 area 0
      ! Anycast address redistribution
      redistribute connected metric-type 1
      distribute-list ANYCAST out connected
    !
    access-list ANYCAST permit 192.168.255.1/32
    

That's it. Now (as root) start everything up:

ifup lo:0
for s in dnsmasq zebra ospfd; do
  service $s start
  chkconfig $s on
done
tail -50f /var/log/messages

And then check on your router that the anycast dns address is getting advertised and picked up by your router. If you're using cisco, you probably know how to do that; if you're using linux and quagga, the useful vtysh commands are:

show ip ospf interface <interface>
show ip ospf neighbor
show ip ospf database
show ip ospf route
show ip route
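
You can also check the dns service itself from any client on a local segment by querying the anycast address directly (the hostname here is just an example):

dig @192.168.255.1 myhost.example.com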

Skype 2.1 on CentOS 5

The new skype 2.1 beta (woohoo - Linux users are now only 2.0 versions behind Windows, way to go Skype!) doesn't come with a CentOS rpm, unlike earlier versions. And the Fedora packages that are available are for FC9 and FC10, which are too recent to work on a stock RHEL/CentOS 5 system.

So here's how I got skype working nicely on CentOS 5.3, using the static binary tarball.

Note that while it appears skype has finally been ported to 64-bit architectures, the only current 64-bit builds are for Ubuntu 8.10+, so installing on a 64-bit CentOS box requires 32-bit libraries to be installed (sigh). Otherwise you get the error: skype: /lib/ld-linux.so.2: bad ELF interpreter: No such file or directory.

# the available generic skype binaries are 32-bit, so if you're running a 64-bit
# system you need to make sure you have various 32-bit libraries installed
yum install glib2.i386 qt4.i386 zlib.i386 alsa-lib.i386 libX11.i386 \
  libXv.i386 libXScrnSaver.i386

# installing to /opt (tweak to taste)
cd /tmp
wget http://www.skype.com/go/getskype-linux-beta-static
cd /opt
tar jxvf /tmp/skype_static-2.1.0.47.tar.bz2
ln -s skype_static-2.1.0.47 skype

# Setup some symlinks (the first is required for sounds to work, the second is optional)
ln -s /opt/skype /usr/share/skype
ln -s /opt/skype/skype /usr/bin/skype

You don't seem to need pulseaudio installed (at least with the static binary - I assume it's linked in statically already).

Tangentially, if you have any video problems with your webcam, you might want to check out the updated video drivers available in the kmod-video4linux package from the shiny new ELRepo.org. I'm using their updated uvcvideo module with a Logitech QuickCam Pro 9000 and Genius Slim 1322AF, and both are working well.

Yum Download SRPMs

Found a nice post today on how to use yum to download source RPMs, rather than having to do a manual search on the relevant mirror.
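
For reference, the usual approach is the yumdownloader utility from the yum-utils package - a quick sketch (the package name is just an example, and you may need the matching '-source' repository enabled):

yum install yum-utils
yumdownloader --source clix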

pvmove Disk Migrations

Lots of people make use of linux's lvm (Logical Volume Manager) for providing services such as disk volume resizing and snapshotting under linux. But few people seem to know about the little pvmove utility, which offers a very powerful facility for migrating data between disk volumes on the fly.

Let's say, for example, that you have a disk volume you need to rebuild for some reason. Perhaps you want to change the raid type you're using on it; perhaps you want to rebuild it using larger disks. Whatever the reason, you need to migrate all your data to another temporary disk volume so you can rebuild your initial one.

The standard way of doing this is probably to just create a new filesystem on your new disk volume, and then copy or rsync all the data across. But how do you verify that you have all the data at the end of the copy, and that nothing has changed on your original disk after the copy started? If you did a second rsync and nothing new was copied across, and the disk usage totals exactly match, and you remember to unmount the original disk immediately, you might have an exact copy. But if your original disk data is changing at all, getting a good copy of a large disk volume can actually be pretty tricky.

The elegant lvm/pvmove solution to this problem is this: instead of doing a userspace migration between disk volumes, you add your new volume into the existing volume group, and then tell lvm to move all the physical extents off of your old physical volume, and the migration is magically handled by lvm, without even needing to unmount the logical volume!

# Volume group 'extra' exists on physical volume /dev/sdc1
$ lvs
  LV   VG     Attr   LSize  Origin Snap%  Move Log Copy%  Convert
  data extra  -wi-ao 100.00G

# Add new physical volume /dev/sdd1 into volume group
$ vgextend extra /dev/sdd1
  Volume group "extra" successfully extended
$ lvs
  LV   VG     Attr   LSize  Origin Snap%  Move Log Copy%  Convert
  data extra  -wi-ao 100.00G
# (the LV size is unchanged, of course - the volume group just spans both physical volumes now)

# Use pvmove to move physical extents off of old /dev/sdc1 (verbose mode)
$ pvmove -v /dev/sdc1
# Lots of output in verbose mode ...

# Done - remove old physical volume from the volume group
$ vgreduce extra /dev/sdc1
  Removed "/dev/sdc1" from volume group "extra"
$ pvremove /dev/sdc1
$ lvs
  LV   VG     Attr   LSize  Origin Snap%  Move Log Copy%  Convert
  data extra  -wi-ao 100.00G
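
A couple of useful pvmove variations (standard LVM2 options):

# Move only the extents belonging to a particular logical volume
$ pvmove -n data /dev/sdc1

# Move extents to a specific destination volume, rather than to any free space
$ pvmove /dev/sdc1 /dev/sdd1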

The joys of linux.

Currency On-Screen Display

Here's a quick hack demonstrating a nice juxtaposition between the power of a CPAN module - in this case Christopher Laco's Finance::Currency::Convert::WebserviceX - and the elegance and utility of the little known osd_cat, putting together a desktop currency rates widget in a handful of lines:

#!/usr/bin/perl

use strict;
use IO::File;
use Finance::Currency::Convert::WebserviceX;

# Configuration
my @currencies = map { uc } (@ARGV ? @ARGV : qw(USD GBP));   # default to USD/GBP if no args
my $base_currency = 'AUD';
my $refresh = 300;   # seconds
my $font = '9x15bold';
# X colours: http://sedition.com/perl/rgb.html
my $colour = 'goldenrod3';
my $align = 'right';
my $pos = 'top';
my $offset = 25;

my $lines = scalar @currencies;
my $osd_refresh = $refresh + 1;
my $osd = IO::File->new(
  "|osd_cat -l $lines -d $osd_refresh -c '$colour' -f $font -p $pos -A $align -o $offset"
) or die "can't open to osd_cat $!";
$osd->autoflush(1);
local $SIG{PIPE} = sub { die "pipe failed: $!" };

my $cc = Finance::Currency::Convert::WebserviceX->new;

while (1) {
  my $output = '';
  $output .= "$_ " . $cc->convert(1, $base_currency, $_) . "\n" for @currencies;
  $osd->print($output);
  sleep $refresh;
}

Most of this is just housekeeping around splitting out various osd_cat options for tweaking, and allowing the set of currencies to display to be passed in as arguments. I haven't bothered setting up any option handling in order to keep the example short, but that would be straightforward.

To use, you just run from the command line in the background:

./currency_osd &

and it shows up in the top right corner of your screen, like so:

(Screenshot: currency rates displayed via osd_cat in the top-right corner of the screen)

Tweak to taste, of course.

Delicious CSS Bookmarklet

Further to my Delicious CSS post, here's a javascript bookmarklet to make adding delicious css tags that much easier:

Delicious CSS

Just drag it to your bookmarks toolbar somewhere, and click to use.

Unfortunately, the latest version of the delicious tag form doesn't accept tag arguments in the URL, which is what we'd need to preset the delicious_css tags. To work around this, you need to also install the Auto-Fill Delicious Tag Field greasemonkey script.

Delicious CSS

And from the quick-weekend-hack department ...

Ever wanted to quickly add a style to a page you were on to make it more usable? If you're a Firefox user with Firebug installed you can do that directly, but it's a temporary and local-only solution. User stylesheets are more permanent, but at least in Firefox (as I've complained before) they're relatively difficult to use, and they're still a single-host solution.

I've wanted a lightweight method of defining and applying user styles on the network for ages now, and this weekend it struck me that a simple and relatively elegant hack would be to just store user styles as delicious tags, applying them to a page via a greasemonkey script.

So here it is: available at userscripts.org is a relatively trivial Delicious CSS greasemonkey script. It looks for delicious bookmarks belonging to a list of specified delicious usernames that are tagged with delicious_css=<current_domain>, and applies any 'style tags' it finds on that bookmark to the current page.

Say if for example you wanted to hide the sidebar on my blog and make the content wider, you might do this in CSS:

div#sidebar { display: none }
div#main    { width: 100% }

To define these rules for Delicious CSS you'd just create a bookmark for www.openfusion.net with the following tags:

delicious_css
delicious_css=www.openfusion.net
div#sidebar=display:none
div#main=width:100%

Note that since delicious tags are space-separated, you have to be careful to avoid spaces.

The general format of the style tags is:

ELT[,ELT...]=STYLE[;STYLE...]

so more complex styles are fine too. Here for example are the styles I'm using for the Sydney Morning Herald:

div.header,div.sidebar,div.aside,div.ad,div.footer=display:none
div#content,div.col1,.span-11=width:100%
body=background:none

which turns this:

SMH Article, unstyled

into this:

SMH Article, restyled

And to setup a new machine, all you need to do is install the Delicious CSS greasemonkey script, adjust the usernames you're trusting, and all your styles are available immediately.

I'll be setting up my userstyles under my 'gavincarr' delicious account, so you should be able to find additional examples at http://delicious.com/gavincarr/delicious_css.

Missing Delicious Feeds

I've been playing with using delicious as a lightweight URL database lately, mostly for use by greasemonkey scripts of various kinds (e.g. squatter_redirect).

For this kind of use I really just need a lightweight anonymous http interface to the bookmarks, and delicious provides a number of nice lightweight RSS and JSON feeds suitable for this purpose.

But it turns out the feed I really need isn't currently available. I mostly want to be able to ask, "Give me the set of bookmarks stored for URL X by user Y", or even better, "Give me the set of bookmarks stored for URL X by users Y, Z, and A".

Delicious have a feed for recent bookmarks by URL:

http://feeds.delicious.com/v2/{format}/url/{url md5}

and a feed for all a user's bookmarks:

http://feeds.delicious.com/v2/{format}/{username}

and feeds for a user's bookmarks limited by tag(s):

http://feeds.delicious.com/v2/{format}/{username}/{tag[+tag+...+tag]}

but not one for a user limited by URL, or for URL limited by user.

Neither alternative approach is both feasible and reliable: searching by url will only return the most recent set of N bookmarks; and searching by user and walking the entire (potentially large) set of their bookmarks is just too slow.

So for now I'm having to workaround the problem by adding a special hostname tag to my bookmarks (e.g. squatter_redirect=www.openfusion.net), and then using the username+tag feed as a proxy for my username+domain search.

Any cluesticks out there? Any nice delicious folk want to whip up a shiny new feed for the adoring throngs? :-)

Squatter Domains, Tracked with Delicious

A few weeks ago I hit a couple of domain squatter sites in quick succession and got a bit annoyed. I asked on twitter/identi.ca whether anyone knew of any kind of domain squatter database on the web, perhaps along the lines of the email RBL lists, but got no replies.

I thought at the time that delicious might be useful for this, in much the same way that Jon Udell has been exploring using delicious for collaborative event curation.

So here's the results of some hacking time this weekend: Squatter Redirect, a greasemonkey script (i.e. firefox only, sorry) that checks whether the sites you visit have been tagged on delicious as squatter domains that should be directed elsewhere, and if so, does the redirect in your browser.

Here's a squatter domain to try it out on: www.quagga.org, which squats alongside the real home of the quagga routing software at www.quagga.net.

The script checks two delicious accounts - your personal account, so you can add your own domains without having to wait for them to be pulled into the 'official' squatter_redirect stream; and the official squatter_redirect delicious account, into which other people's tags are periodically pulled after checking.

Marking a new domain as a squatter simply involves creating a delicious bookmark for the squatter page with a few special tags:

  • squatter_redirect - flagging the bookmark for the attention of the squatter_redirect delicious user
  • squatter_redirect=www.realdomain.com - setting the real domain that you want to be redirected to
  • (optional) squatter_domain=www.baddomain.com - marker for the squatter domain itself (only required if you want to use from your own delicious account)

So www.quagga.org above would be tagged:

squatter_redirect squatter_redirect=www.quagga.net
# or, optionally:
squatter_redirect squatter_redirect=www.quagga.net squatter_domain=www.quagga.org

Feedback/comments welcome.

Quick Linux Box Hardware Overview

Note to self: here's how to get a quick overview of the hardware on a linux box:

perl -F"\s*:\s*" -ane "chomp \$F[1];
  print qq/\$F[1] / if \$F[0] =~ m/^(model name|cpu MHz)/;
  print qq/\n/ if \$F[0] eq qq/\n/" /proc/cpuinfo
grep MemTotal /proc/meminfo
grep SwapTotal /proc/meminfo
fdisk -l /dev/[sh]d? 2>/dev/null | grep Disk

Particularly useful if you're auditing a bunch of machines (via an ssh loop or clusterssh or something) and want a quick 5000-foot view of what's there.
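
For instance, a minimal version of the ssh loop mentioned above might look like this (the hosts file is hypothetical):

for h in $(cat hosts.txt); do
  echo "== $h =="
  ssh -n "$h" 'grep MemTotal /proc/meminfo; fdisk -l /dev/[sh]d? 2>/dev/null | grep Disk'
done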

ASX20 Announcements Review

Question: you're a small investor with a handful of small share investments in Australian companies listed on the ASX. How do you keep track of the announcements those companies make to the ASX?

There are a couple of manual methods you can use. You can bookmark the announcements page on the various company websites you're interested in and go and check them periodically, but that's obviously pretty slow and labour intensive.

Or you can go to a centralised point, the ASX Announcements Page, and search for all announcements from there. Unfortunately, the ASX only lets you search for one company at a time, so that's also pretty slow, and still requires you do all the checking manually - there's no automation available without subscribing to the ASX's expensive data feed services.

There are also other third-party subscription services you can pay for that will do this for you, but it's annoying to have to pay for what is rightly public information.

The better answer is for the company themselves to provide their announcements through some sort of push mechanism. The traditional method is via email, where you subscribe to company announcements, and they show up in your inbox shortly after they're released.

But the best and most modern solution is for companies to provide a syndication feed on their website in a format like RSS or Atom, which can be monitored and read using feed readers like Google Reader, Mozilla Thunderbird, or Omea Reader. Feeds are superior to email announcements in that they're centralised and lightweight (big companies don't have to send out thousands of emails, for instance), and they're a standardised format, so they can be remixed and repurposed in lots of interesting ways.

So out of interest I did a quick survey of the current ASX20 (the top 20 list of companies on the ASX according to Standard & Poor's) to see how many of them support syndicating their announcements either by email or by RSS/Atom. Here are the results:

Table: Company Announcement Availability, ASX20

Company                     Via web   Via email   Via RSS/Atom
AMP                         yes       yes         yes (RSS feed)
ANZ                         yes       -           -
BHP                         yes       yes         -
Brambles (BXB)              yes       yes         -
Commonwealth Bank (CBA)     yes       yes         -
CSL                         yes       yes         -
Fosters (FGL)               yes       -           -
Macquarie Group (MQG)       yes       yes         -
NAB                         yes       -           -
Newcrest Mining (NCM)       yes       -           -
Origin Energy (ORG)         yes       yes         yes (RSS feed)
QBE Insurance (QBE)         yes       -           -
Rio Tinto (RIO)             yes       yes         yes (RSS feed)
Suncorp Metway (SUN)        yes       yes         yes (RSS feed)
Telstra (TLS)               yes       yes         yes (RSS feed)
Wesfarmers (WES)            yes       yes         yes (RSS feed)
Westfield (WDC)             yes       yes         -
Westpac (WBC)               yes       -           -
Woodside Petroleum (WPL)    yes       yes         -
Woolworths (WOW)            yes       yes         yes (RSS feed)

Some summary ratings:

  • Grade: A - announcements available via web, email, and RSS: AMP, ORG, RIO, SUN, TLS, WES, WOW (7)
  • Grade: B - announcements available via web and email: BHP, BXB, CBA, CSL, MQG, WDC, WPL (7)
  • Grade: C - announcements available via web: ANZ, FGL, NAB, NCM, QBE, WBC (6)

Overall, I'm relatively impressed that 7 of the ASX20 do support RSS. On the down side, the fact that another 6 don't even provide an email announcements service is pretty ordinary, especially considering the number of retail shareholders who hold some of these stocks (e.g. three of the big four banks, bonus points to CBA, the standout).

Special bonus points go to:

  • Suncorp Metway and Wesfarmers, who also offer RSS feeds for upcoming calendar events;

  • Rio Tinto, who have their own announcements twitter account.

Corrections and updates are welcome in the comments.