pvmove Disk Migrations

Lots of people make use of linux's lvm (Logical Volume Manager) for services such as disk volume resizing and snapshotting. But few seem to know about the little pvmove utility, which offers a very powerful facility for migrating data between disk volumes on the fly.

Let's say, for example, that you have a disk volume you need to rebuild for some reason. Perhaps you want to change the raid type you're using on it; perhaps you want to rebuild it using larger disks. Whatever the reason, you need to migrate all your data to another temporary disk volume so you can rebuild your initial one.

The standard way of doing this is probably to just create a new filesystem on your new disk volume, and then copy or rsync all the data across. But how do you verify that you have all the data at the end of the copy, and that nothing has changed on your original disk after the copy started? If you did a second rsync and nothing new was copied across, and the disk usage totals exactly match, and you remember to unmount the original disk immediately, you might have an exact copy. But if your original disk data is changing at all, getting a good copy of a large disk volume can actually be pretty tricky.
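Roughly, that manual dance looks something like this (just a sketch - the mountpoints and rsync options here are illustrative, not a recipe):

rsync -aHAX /old_volume/ /new_volume/       # initial bulk copy
rsync -aHAXn -i /old_volume/ /new_volume/   # dry-run second pass - should report nothing to do
du -sh /old_volume /new_volume              # usage totals should match
umount /old_volume                          # unmount promptly so nothing changes underneath you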

The elegant lvm/pvmove solution to this problem is this: instead of doing a userspace migration between disk volumes, you add your new physical volume into the existing volume group, then tell lvm to move all the physical extents off your old physical volume. The migration is handled entirely by lvm, without you even needing to unmount the logical volume!

# Volume group 'extra' exists on physical volume /dev/sdc1
$ lvs
  LV   VG     Attr   LSize  Origin Snap%  Move Log Copy%  Convert
  data extra  -wi-ao 100.00G

# Add new physical volume /dev/sdd1 into volume group
$ vgextend extra /dev/sdd1
  Volume group "extra" successfully extended
$ lvs
  LV   VG     Attr   LSize  Origin Snap%  Move Log Copy%  Convert
  data extra  -wi-ao 100.00G
# Note the LV itself is unchanged - the new space shows up as free
# extents in the volume group (check with 'vgs' or 'pvs')

# Use pvmove to move physical extents off of old /dev/sdc1 (verbose mode)
$ pvmove -v /dev/sdc1
# Lots of output in verbose mode ...

# Done - remove the old physical volume from the volume group, then wipe it
$ vgreduce extra /dev/sdc1
$ pvremove /dev/sdc1
$ lvs
  LV   VG     Attr   LSize  Origin Snap%  Move Log Copy%  Convert
  data extra  -wi-ao 100.00G

The joys of linux.

Currency On-Screen Display

Here's a quick hack demonstrating a nice juxtaposition between the power of a CPAN module - in this case Christopher Laco's Finance::Currency::Convert::WebserviceX - and the elegance and utility of the little-known osd_cat, putting together a desktop currency rates widget in a handful of lines:

#!/usr/bin/perl

use strict;
use warnings;
use IO::File;
use Finance::Currency::Convert::WebserviceX;

# Configuration
my @currencies = map { uc } (@ARGV ? @ARGV : qw(USD GBP));
my $base_currency = 'AUD';
my $refresh = 300;   # seconds
my $font = '9x15bold';
# X colours: http://sedition.com/perl/rgb.html
my $colour = 'goldenrod3';
my $align = 'right';
my $pos = 'top';
my $offset = 25;

my $lines = scalar @currencies;
my $osd_refresh = $refresh + 1;
my $osd = IO::File->new(
  "|osd_cat -l $lines -d $osd_refresh -c '$colour' -f $font -p $pos -A $align -o $offset"
) or die "can't open to osd_cat $!";
$osd->autoflush(1);
local $SIG{PIPE} = sub { die "pipe failed: $!" };

my $cc = Finance::Currency::Convert::WebserviceX->new;

while (1) {
  my $output = '';
  $output .= "$_ " . $cc->convert(1, $base_currency, $_) . "\n" for @currencies;
  $osd->print($output);
  sleep $refresh;
}

Most of this is just housekeeping around splitting out various osd_cat options for tweaking, and allowing the set of currencies to display to be passed in as arguments. I haven't bothered setting up any option handling in order to keep the example short, but that would be straightforward.

To use it, just run it from the command line in the background:

./currency_osd &

and it shows up in the top right corner of your screen, like so:

[screenshot: currency rates displayed by osd_cat in the top right corner of the screen]

Tweak to taste, of course.

Delicious CSS Bookmarklet

Further to my Delicious CSS post, here's a javascript bookmarklet to make adding delicious css tags that much easier:

Delicious CSS

Just drag it to your bookmarks toolbar somewhere, and click to use.

Unfortunately, the latest version of the delicious tag form doesn't accept tag arguments in the URL, which is what we'd need to preset the delicious_css tags. To work around this, you need to also install the Auto-Fill Delicious Tag Field greasemonkey script.

Delicious CSS

And from the quick-weekend-hack department ...

Ever wanted to quickly add a style to a page you were on to make it more usable? If you're a Firefox user with Firebug installed you can do that directly, but it's a temporary and local-only solution. User stylesheets are more permanent, but at least in Firefox (as I've complained before) they're relatively difficult to use, and they're still a single-host solution.

I've wanted a lightweight method of defining and applying user styles on the network for ages now, and this weekend it struck me that a simple and relatively elegant hack would be to just store user styles as delicious tags, applying them to a page via a greasemonkey script.

So here it is: available at userscripts.org is a relatively trivial Delicious CSS greasemonkey script. It looks for delicious bookmarks belonging to a list of specified delicious usernames that are tagged with delicious_css=<current_domain>, and applies any 'style tags' it finds on that bookmark to the current page.

Say, for example, you wanted to hide the sidebar on my blog and make the content wider. You might do this in CSS:

div#sidebar { display: none }
div#main    { width: 100% }

To define these rules for Delicious CSS you'd just create a bookmark for www.openfusion.net with the following tags:

delicious_css
delicious_css=www.openfusion.net
div#sidebar=display:none
div#main=width:100%

Note that since delicious tags are space-separated, you have to be careful to avoid spaces within your style tags.

The general format of the style tags is:

ELT[,ELT...]=STYLE[;STYLE...]

so more complex styles are fine too. Here for example are the styles I'm using for the Sydney Morning Herald:

div.header,div.sidebar,div.aside,div.ad,div.footer=display:none
div#content,div.col1,.span-11=width:100%
body=background:none

which turns this:

SMH Article, unstyled

into this:

SMH Article, restyled

And to set up a new machine, all you need to do is install the Delicious CSS greasemonkey script, adjust the usernames you're trusting, and all your styles are available immediately.

I'll be setting up my userstyles under my 'gavincarr' delicious account, so you should be able to find additional examples at http://delicious.com/gavincarr/delicious_css.

Missing Delicious Feeds

I've been playing with using delicious as a lightweight URL database lately, mostly for use by greasemonkey scripts of various kinds (e.g. squatter_redirect).

For this kind of use I really just need a lightweight anonymous http interface to the bookmarks, and delicious provides a number of nice lightweight RSS and JSON feeds suitable for this purpose.

But it turns out the feed I really need isn't currently available. I mostly want to be able to ask, "Give me the set of bookmarks stored for URL X by user Y", or even better, "Give me the set of bookmarks stored for URL X by users Y, Z, and A".

Delicious have a feed for recent bookmarks by URL:

http://feeds.delicious.com/v2/{format}/url/{url md5}

and a feed for all a user's bookmarks:

http://feeds.delicious.com/v2/{format}/{username}

and feeds for a user's bookmarks limited by tag(s):

http://feeds.delicious.com/v2/{format}/{username}/{tag[+tag+...+tag]}

but not one for a user limited by URL, or for URL limited by user.

Neither alternative approach works well: searching by URL only returns the most recent set of N bookmarks, and searching by user and walking their entire (potentially large) set of bookmarks is just too slow.

So for now I'm having to work around the problem by adding a special hostname tag to my bookmarks (e.g. squatter_redirect=www.openfusion.net), and then using the username+tag feed as a proxy for my username+domain search.
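For example, a request like the following (hypothetical, using the feed URL pattern above with my 'gavincarr' account; the '=' in the tag is URL-escaped as %3D just to be safe):

# Fetch bookmarks carrying the hostname tag, as JSON
curl -s 'http://feeds.delicious.com/v2/json/gavincarr/squatter_redirect%3Dwww.openfusion.net'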

Any cluesticks out there? Any nice delicious folk want to whip up a shiny new feed for the adoring throngs? :-)

Squatter Domains, Tracked with Delicious

A few weeks ago I hit a couple of domain squatter sites in quick succession and got a bit annoyed. I asked on twitter/identi.ca whether anyone knew of any kind of domain squatter database on the web, perhaps along the lines of the email RBL lists, but got no replies.

I thought at the time that delicious might be useful for this, in much the same way that Jon Udell has been exploring using delicious for collaborative event curation.

So here's the results of some hacking time this weekend: Squatter Redirect, a greasemonkey script (i.e. firefox only, sorry) that checks whether the sites you visit have been tagged on delicious as squatter domains that should be directed elsewhere, and if so, does the redirect in your browser.

Here are a few squatter domains to try it out on; www.quagga.org is one example.

The script checks two delicious accounts - your personal account, so you can add your own domains without having to wait for them to be pulled into the 'official' squatter_redirect stream; and the official squatter_redirect delicious account, into which other people's tags are periodically pulled after checking.

Marking a new domain as a squatter simply involves creating a delicious bookmark for the squatter page with a few special tags:

  • squatter_redirect - flagging the bookmark for the attention of the squatter_redirect delicious user
  • squatter_redirect=www.realdomain.com - setting the real domain that you want to be redirected to
  • (optional) squatter_domain=www.baddomain.com - marker for the squatter domain itself (only required if you want to use the script from your own delicious account)

So www.quagga.org above would be tagged:

squatter_redirect squatter_redirect=www.quagga.net
# or, optionally:
squatter_redirect squatter_redirect=www.quagga.net squatter_domain=www.quagga.org

Feedback/comments welcome.

Quick Linux Box Hardware Overview

Note to self: here's how to get a quick overview of the hardware on a
linux box:
perl -F"\s*:\s*" -ane "chomp \$F[1];
  print qq/\$F[1] / if \$F[0] =~ m/^(model name|cpu MHz)/;
  print qq/\n/ if \$F[0] eq qq/\n/" /proc/cpuinfo
grep MemTotal /proc/meminfo
grep SwapTotal /proc/meminfo
fdisk -l /dev/[sh]d? 2>/dev/null | grep Disk

Particularly useful if you're auditing a bunch of machines (via an ssh loop or clusterssh or something) and want a quick 5000-foot view of what's there.
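The kind of ssh loop I mean is nothing fancier than this sketch ('hosts.txt' and passwordless root ssh are assumptions here, and the cpuinfo summary is simplified to a plain grep):

# Run the quick hardware overview across a list of hosts
for h in $(cat hosts.txt); do
  echo "=== $h ==="
  ssh -o BatchMode=yes root@$h '
    grep -m1 "model name" /proc/cpuinfo
    grep MemTotal /proc/meminfo
    grep SwapTotal /proc/meminfo
    fdisk -l /dev/[sh]d? 2>/dev/null | grep Disk
  '
done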

ASX20 Announcements Review

Question: you're a small investor with a handful of small share
investments in Australian companies listed on the ASX. How do you
keep track of the announcements those companies make to the ASX?

There are a couple of manual methods you can use. You can bookmark the announcements page on the various company websites you're interested in and go and check them periodically, but that's obviously pretty slow and labour intensive.

Or you can go to a centralised point, the ASX Announcements Page, and search for all announcements from there. Unfortunately, the ASX only lets you search for one company at a time, so that's also pretty slow, and still requires you do all the checking manually - there's no automation available without subscribing to the ASX's expensive data feed services.

There are also other third-party subscription services you can pay for that will do this for you, but it's annoying to have to pay for what is rightly public information.

The better answer is for the company themselves to provide their announcements through some sort of push mechanism. The traditional method is via email, where you subscribe to company announcements, and they show up in your inbox shortly after they're released.

But the best and most modern solution is for companies to provide a syndication feed on their website in a format like RSS or Atom, which can be monitored and read using feed readers like Google Reader, Mozilla Thunderbird, or Omea Reader. Feeds are superior to email announcements in that they're centralised and lightweight (big companies don't have to send out thousands of emails, for instance), and they're a standardised format, so they can be remixed and repurposed in lots of interesting ways.

So out of interest I did a quick survey of the current ASX20 (the top 20 list of companies on the ASX according to Standard & Poor's) to see how many of them support syndicating their announcements either by email or by RSS/Atom. Here are the results:

Table: Company Announcement Availability, ASX20
Company                     via web   via email   via RSS/Atom
AMP                         yes       yes         RSS feed
ANZ                         yes       -           -
BHP                         yes       yes         -
Brambles (BXB)              yes       yes         -
Commonwealth Bank (CBA)     yes       yes         -
CSL                         yes       yes         -
Fosters (FGL)               yes       -           -
Macquarie Group (MQG)       yes       yes         -
NAB                         yes       -           -
Newcrest Mining (NCM)       yes       -           -
Origin Energy (ORG)         yes       yes         RSS feed
QBE Insurance (QBE)         yes       -           -
Rio Tinto (RIO)             yes       yes         RSS feed
Suncorp Metway (SUN)        yes       yes         RSS feed
Telstra (TLS)               yes       yes         RSS feed
Wesfarmers (WES)            yes       yes         RSS feed
Westfield (WDC)             yes       yes         -
Westpac (WBC)               yes       -           -
Woodside Petroleum (WPL)    yes       yes         -
Woolworths (WOW)            yes       yes         RSS feed

Some summary ratings:

  • Grade: A - announcements available via web, email, and RSS: AMP, ORG, RIO, SUN, TLS, WES, WOW (7)
  • Grade: B - announcements available via web and email: BHP, BXB, CBA, CSL, MQG, WDC, WPL (7)
  • Grade: C - announcements available via web: ANZ, FGL, NAB, NCM, QBE, WBC (6)

Overall, I'm relatively impressed that 7 of the ASX20 do support RSS. On the down side, the fact that another 6 don't even provide an email announcements service is pretty ordinary, especially considering the number of retail shareholders who hold some of these stocks (e.g. three of the big four banks; bonus points to CBA, the standout of the four).

Special bonus points go to:

  • Suncorp Metway and Wesfarmers, who also offer RSS feeds for upcoming calendar events;

  • Rio Tinto, who have their own announcements twitter account.

Corrections and updates are welcome in the comments.

The Joy of Scripting

Was going home on the train with Hannah (8) this afternoon, and she says, "Dad, what's the longest word you can make without using any letters with tails or stalks?". "Do you really want to know?", I asked, and whipping out the trusty laptop, we had an answer within a couple of train stops:

egrep -v '[A-Zbdfghjklpqty]' /usr/share/dict/words | \
perl -nle 'chomp; push @words, $_;
  END { @words = sort { length($b) <=> length($a) } @words;
        print join "\n", @words[0 .. 9] }'

noncarnivorousness
nonceremoniousness
overcensoriousness
carnivorousnesses
noncensoriousness
nonsuccessiveness
overconsciousness
semiconsciousness
unacrimoniousness
uncarnivorousness

Now I just need to teach her how to do that.

Cityrail Timetables Greasemonkey Script

I got sufficiently annoyed over last week's Cityrail Timetable fiasco that I thought I'd contribute something to the making-Cityrail-bearable software ecosystem.

So this post is to announce a new Greasemonkey script called Cityrail Timetables Reloaded [CTR], available at the standard Greasemonkey repository on userscripts.org, that cleans up and extensively refactors Cityrail's standard timetable pages.

Here's a screenshot of Cityrail's initial timetable page for the Northern line:

Cityrail standard timetable

and here's the same page with CTR loaded:

Cityrail timetable via CTR

CTR loads a configurable number of pages rather than forcing you to click through them one by one, and in fact will load the whole set if you tell it to.

It also lets you specify the 'from' and 'to' stations you're travelling between, and will highlight them for you, as well as omitting stations well before or after yours and any trains that don't actually stop at your stations. This can compress the output a lot, allowing you to fit more pages on your screen:

Cityrail timetable via CTR

I can't see Cityrail having a problem with this script since it's just taking their pages and cleaning them up, but we shall see.

If you're a firefox/greasemonkey user please try it out and post your comments/feedback here or on the userscripts site.

Enjoy!

Soul Communications FAIL!

What's a blog if not a vehicle for an occasional rant?

I used to have a mobile with Soul Communications, and recently changed to another provider because Soul cancelled the plan I'd been on with them for 3 or 4 years. I ported my number, and gathered that that would close the Soul account, and all would be good. Soul has a credit card on that account that they've billed for the last 3 years or so without problems. I've had nothing from them to indicate there are any issues.

And so today I get a Notice of Demand and Disconnection from Soul advising me that my account is overdue, charging me additional debt recovery fees, and advising that if I don't pay all outstanding amounts immediately it'll be referred to debt collectors.

Nice work Soul.

So let's recap. I've had no notices that my account is overdue, no contact from anyone from Soul, no indication that there are any issues, and then a Notice of Demand?

I go and check my email, in case I've missed something. Two emails from Soul since the beginning of the year, the most recent from a week ago. They're HTML-only, of course, and I use a text email client, but hey, I'll go the extra mile and fire up an HTML email client to work around the fact that multipart/alternative is a bit too hard.

The emails just say, "Your Soul Bill is Now Available", and point to the "MySoul Customer Portal". (Yes, it would be nice if it was a link to the actual bill, of course, rather than expecting me to navigate through their crappy navigation system, but that's clearly a bit too sophisticated as well; but I digress.) There's no indication in any of the emails that anything is amiss, like a "Your account is overdue" message or something. So no particular reason I would have bothered to go and actually login to their portal, find my bill, and review it, right? They've got the credit card, right?

So let's go and check the bill. Go to "MySoul Salvation Portal", or whatever it's called, dig out obscure customer number and sekrit password, and login. Except I can't. "This account is inactive."

Aaaargh!

So let's recap:

  • account has been cancelled due to move to another carrier (yippee!)

  • can't login to super-customer-portal to get bills

  • emails from Soul do not indicate there are any problems with the account

  • no other emails from Soul saying "we have a problem"

  • maybe they could, like, phone my mobile, since they do have the number - no, too hard!

Epic mega stupendous FAIL! What a bunch of lusers.

So now I've phoned Soul, had a rant, and been promised that they'll email me the outstanding accounts. That was half an hour ago, and nothing in the inbox yet. I get the feeling they don't really want to be paid.

And I feel so much better now. :-)

mod_auth_tkt 2.0.1

I'm happy to announce the release of mod_auth_tkt 2.0.1, the first full release of mod_auth_tkt in a couple of years. The 2.0.x release includes numerous enhancements and bugfixes, including guest login support and full support for apache 2.2.

mod_auth_tkt is a lightweight single-sign-on authentication module for apache, supporting versions 1.3.x, 2.0.x, and 2.2.x. It uses secure cookie-based tickets to implement a single-sign-on framework that works across multiple apache instances and servers. It's also completely repository agnostic, relying on a user-supplied script to perform the actual authentication.

The release is available as a tarball and various RPMs from the mod_auth_tkt homepage.

Testing Disqus

I'm trying out disqus, since I like the idea of being able to track/collate my comments across multiple endpoints, rather than have them locked in to various blogging systems. So this is a test post to try out commenting. Please feel free to comment ad nauseam below (and sign up for a disqus account, if you don't already have one).

Open Fusion RPM Repository

Updated 2014-09-26 for CentOS 7.

Over the last few years I've built up quite a collection of packages for CentOS, and distribute them via a yum repository. They're typically packages that aren't included in DAG/RPMForge when I need them, so I just build them myself. In case they're useful to other people, this post documents the repository locations, and how you can get setup to make use of it yourself.

Obligatory Warning: this is a personal repository, so it's primarily for packages I want to use myself on a particular platform i.e. coverage is uneven, and packages won't be as well tested as a large repository like RPMForge. Also, I routinely build packages that replace core packages, so you'll want the repo disabled by default if that concerns you. Use at your own risk, packages may nuke your system and cause global warming, etc. etc.

Locations: the repository trees live under http://repo.openfusion.net/ (e.g. centos5-x86_64, centos6-x86_64, and centos7-x86_64), as used in the commands below.

To add the Open Fusion repository to your yum configuration, just install the following 'openfusion-release' package:

# CentOS 5:
sudo rpm -Uvh http://repo.openfusion.net/centos5-x86_64/openfusion-release-0.7-1.of.el5.noarch.rpm
# CentOS 6:
sudo rpm -Uvh http://repo.openfusion.net/centos6-x86_64/openfusion-release-0.7-1.of.el6.noarch.rpm
# CentOS 7:
sudo rpm -Uvh http://repo.openfusion.net/centos7-x86_64/openfusion-release-0.7-1.of.el7.noarch.rpm
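If the replacing-core-packages thing concerns you, you can flip the repo to disabled and only enable it explicitly when installing. Something like the following sketch works, though note the repo file name and repo id here are assumptions - check what openfusion-release actually installs under /etc/yum.repos.d/:

# Assumed file name and repo id - verify on your system first
sudo sed -i 's/enabled=1/enabled=0/' /etc/yum.repos.d/openfusion.repo
sudo yum --enablerepo=openfusion install some-package   # package name is a placeholder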

The same openfusion-release packages can also be downloaded directly from the URLs above.

Feedback and suggestions are welcome. Packaging requests are also welcome, particularly when they involve my wishlist. ;-)

Enjoy.

Questions That Cannot Be Answered

Was thinking this morning about my interactions with the web over the last couple of weeks, and how I've been frustrated with not being able to (simply) get answers to relatively straightforward questions from the automated web. This is late 2008, and Google and Google Maps and Wikipedia and Freebase etc. etc. have clearly pushed back the knowledge boundaries here hugely, but at the same time lots of relatively simple questions are as yet largely unanswerable.

By way of qualification, I mean that they are not answerable in an automated fashion, not that they cannot be answered by asking the humans on the web (invoking the 'lazyweb'). I also don't mean that these questions are impossible to answer given the time and energy to collate the results available - I mean that they are not simply and reasonably trivially answerable, more or less without work on my part. (e.g. "How do I get to address X?" was kind of answerable before Google Maps, but they were arguably the ones who made it more-or-less trivial, and thereby really solved the problem.)

So in the interests of helping delineate some remaining frontiers, and challenging ourselves, here's my catalogue of questions from the last couple of weeks:

  • what indoor climbing gyms are there in Sydney?

  • where are the indoor climbing gyms in Sydney (on a map)?

  • what are the closest gyms to my house?

  • how much are the casual rates for adults and children for the gyms near my house?

  • what are the opening hours for the gyms near my house?

  • what shops near my house sell the Nintendo Wii?

  • what shops near my house have the Wii in stock?

  • what shops near my house are selling Wii bundles?

  • what is the pricing for the Wii and Wii bundles from shops near my house?

  • of the shops near my house that sell the Wii, who's open late on Thursdays?

  • of the shops near my house that sell the Wii, what has been the best pricing on bundles over the last 6 months?

  • trading off distance to travel against price, where should I buy a Wii?

  • what are the "specials" at the supermarkets near my house this week?

  • given our grocery shopping habits and the current specials, which supermarket should I shop at this week?

  • I need cereal X - do any of the supermarkets have it on special?

That's a useful starting set from the last two weeks. Anyone else? What are your recent questions-that-cannot-be-answered? (And if you blog, tag with #qtcba pretty please).

CSS and Javascript Minification

I've been playing with the very nice YSlow firefox plugin recently, while doing some front-end optimisation on a Catalyst web project.

Most of YSlow's tuning tips were reasonably straightforward, but I wasn't sure how to approach the concatenation and minification of CSS and javascript files that they recommend.

Turns out - as is often the case - there's a very nice packaged solution on CPAN.

The File::Assets module provides concatenation and minification for CSS and Javascript 'assets' for a web page, using the CSS::Minifier (::XS) and JavaScript::Minifier (::XS) modules for minification. To use, you add a series of .css and .js files in building your page, and then 'export' them at the end, which generates a concatenated and minified version of each type in an export directory, and an appropriate link to the exported version. You can do separate exports for CSS and Javascript if you want to follow the Yahoo/YSlow recommendation of putting your stylesheets at the top and your scripts at the bottom.

There's also a Catalyst::Plugin::Assets module to facilitate using File::Assets from Catalyst.

I use Mason for my Catalyst views (I prefer using perl in my views rather than having another mini-language to learn) and so use this as follows.

First, you have to configure Catalyst::Plugin::Assets in your project config file (e.g. $PROJECT_HOME/project.yml):

Plugin::Assets:
    path: /static
    output_path: build/
    minify: 1

Next, I set the per-page javascript and css files I want to include as mason page attributes in my views (using an arrayref if there's more than one item of the given type) e.g.

%# in my person view
<%attr>
js => [ 'jquery.color.js', 'person.js' ]
css => 'person.css'
</%attr>

Then in my top-level autohandler, I include both global and per-page assets like this:

<%init>
# Asset collation, javascript (globals, then per-page)
$c->assets->include('js/jquery.min.js');
$c->assets->include('js/global.js');
if (my $js = $m->request_comp->attr_if_exists('js')) {
  if (ref $js && ref $js eq 'ARRAY') {
    $c->assets->include("js/$_") foreach @$js;
  } else {
    $c->assets->include("js/$js");
  }
}
# The CSS version is left as an exercise for the reader ...
# ...
</%init>

Then, elsewhere in the autohandler, you add an exported link at the appropriate point in the page:

<% $c->assets->export('text/javascript') %>

This generates a link something like the following (wrapped here):

<script src="http://www.example.com/static/build/assets-ec556d1e.js"
  type="text/javascript"></script>

Beautiful, easy, maintainable.

Basic KVM on CentOS 5

I've been using kvm for my virtualisation needs lately, instead of xen, and finding it great. Disadvantages are that it requires hardware virtualisation support, and so only works on newer Intel/AMD CPUs. Advantages are that it's baked into recent linux kernels, and so more or less Just Works out of the box, no magic kernels required.

There are some pretty useful resources covering this stuff out on the web, but not too much specific to CentOS, so here's the recipe I've been using for CentOS 5:

# Confirm your CPU has virtualisation support
egrep 'vmx|svm' /proc/cpuinfo

# Install the kvm and qemu packages you need
# From the CentOS Extras repository (older):
yum install --enablerepo=extras kvm kmod-kvm qemu
# OR from my repository (for most recent kernels only):
ARCH=$(uname -i)
OF_MREPO=http://www.openfusion.com.au/mrepo/centos5-$ARCH/RPMS.of/
rpm -Uvh $OF_MREPO/openfusion-release-0.3-1.of.noarch.rpm
yum install kvm kmod-kvm qemu

# Install the appropriate kernel module - either:
modprobe kvm-intel
# OR:
modprobe kvm-amd
lsmod | grep kvm

# Check the kvm device exists
ls -l /dev/kvm

# I like to run my virtual machines as a 'kvm' user, not as root
chgrp kvm /dev/kvm
chmod 660 /dev/kvm
ls -l /dev/kvm
useradd -r -g kvm kvm

# Create a disk image to use
cd /data/images
IMAGE=centos5x.img
# Note that the specified size is a maximum - the image only uses what it needs
qemu-img create -f qcow2 $IMAGE 10G
chown kvm $IMAGE

# Boot an install ISO on your image and do the install
MEM=1024
ISO=/path/to/CentOS-5.2-x86_64-bin-DVD.iso
# ISO=/path/to/WinXP.iso
qemu-kvm -hda $IMAGE -m ${MEM:-512} -cdrom $ISO -boot d
# I usually just do a minimal install with std defaults and dhcp, and configure later

# After your install has completed restart without the -boot parameter
# This should have outgoing networking working, but pings don't work (!)
qemu-kvm -hda $IMAGE -m ${MEM:-512} &

That should be sufficient to get you up and running with basic outgoing networking (for instance as a test desktop instance). In qemu terms this is using 'user mode' networking, which is easy but slow, so if you want better performance, or if you want to allow incoming connections (e.g. as a server), you need some extra magic, which I'll cover in a subsequent post.

Simple KVM Bridging

Following on from my post yesterday on "Basic KVM on CentOS 5", here's how to set up simple bridging to allow incoming network connections to your VM (and to get other standard network functionality like pings working). This is a simplified/tweaked version of Hadyn Solomon's bridging instructions.

Note that this is all done on your HOST machine, not your guest.

For CentOS:

# Install bridge-utils
yum install bridge-utils

# Add a bridge interface config file
vi /etc/sysconfig/network-scripts/ifcfg-br0
# DHCP version
ONBOOT=yes
TYPE=Bridge
DEVICE=br0
BOOTPROTO=dhcp
# OR, static version
ONBOOT=yes
TYPE=Bridge
DEVICE=br0
BOOTPROTO=static
IPADDR=xx.xx.xx.xx
NETMASK=255.255.255.0

# Make your primary interface part of this bridge e.g.
vi /etc/sysconfig/network-scripts/ifcfg-eth0
# Add:
BRIDGE=br0
# Optional: comment out BOOTPROTO/IPADDR lines, since they're
# no longer being used (the br0 takes precedence)

# Add a script to connect your guest instance to the bridge on guest boot
vi /etc/qemu-ifup
#!/bin/bash
BRIDGE=$(/sbin/ip route list | awk '/^default / { print $NF }')
/sbin/ifconfig $1 0.0.0.0 up
/usr/sbin/brctl addif $BRIDGE $1
# END OF SCRIPT
# Silence a qemu warning by creating a noop qemu-ifdown script
vi /etc/qemu-ifdown
#!/bin/bash
# END OF SCRIPT
chmod +x /etc/qemu-if*

# Test - bridged networking uses a 'tap' networking device
NAME=c5-1
qemu-kvm -hda $NAME.img -name $NAME -m ${MEM:-512} -net nic -net tap &

Done. This should give you VMs that are full network members, able to be pinged and accessed just like a regular host. Bear in mind that this means you'll want to setup firewalls etc. if you're not in a controlled environment.

Notes:

  • If you want to run more than one VM on your LAN, you need to set the guest MAC address explicitly, since otherwise qemu uses a static default that will conflict with any other similar VM on the LAN. e.g. do something like:
# HOST_ID, identifying your host machine (2-digit hex)
HOST_ID=91
# INSTANCE, identifying the guest on this host (2-digit hex)
INSTANCE=01
# Startup, but with explicit macaddr
NAME=c5-1
qemu-kvm -hda $NAME.img -name $NAME -m ${MEM:-512} \
  -net nic,macaddr=00:16:3e:${HOST_ID}:${INSTANCE}:00 -net tap &
  • This doesn't use the paravirtual ('virtio') drivers that Hadyn mentions, as these aren't available until kernel 2.6.25, so they're not available to CentOS linux guests without a kernel upgrade.

Catalyst + Screen

I'm an old-school developer, doing all my hacking using terms, the command line, and vim, not a heavyweight IDE. Hacking perl Catalyst projects (and I imagine other MVC-type frameworks) can be slightly more challenging in this kind of environment because of the widely-branching directory structure. A single conceptual change can easily touch controller classes, model classes, view templates, and static javascript or css files, for instance.

I've found GNU screen to work really well in this environment. I use per-project screen sessions set up specifically for Catalyst - for my 'usercss' project, for instance, I have a ~/.screenrc-usercss config that looks like this:

source $HOME/.screenrc
setenv PROJDIR ~/work/usercss
setenv PROJ UserCSS
screen -t home
stuff "cd ~^Mclear^M"
screen -t top
stuff "cd $PROJDIR^Mclear^M"
screen -t lib
stuff "cd $PROJDIR/lib/$PROJ^Mclear^M"
screen -t controller
stuff "cd $PROJDIR/lib/Controller^Mclear^M"
screen -t schema
stuff "cd $PROJDIR/lib/$PROJ/Schema/Result^Mclear^M"
screen -t htdocs
stuff "cd $PROJDIR/root/htdocs^Mclear^M"
screen -t static
stuff "cd $PROJDIR/root/static^Mclear^M"
screen -t sql
stuff "cd $PROJDIR^Mclear^M"
select 0

(the ^M sequences there are literal Ctrl-M carriage-return characters - in vim you can enter one with Ctrl-V Ctrl-M).

So a:

screen -c ~/.screenrc-usercss

will give me a set of eight labelled screen windows: home, top, lib, controller, schema, htdocs, static, and sql. I usually run a couple of these in separate terms, like this:

dual-screen screenshot

To make this completely brainless, I also have the following bash function defined in my ~/.bashrc file:

sc ()
{
  SC_SESSION=$(screen -ls | egrep -e "\.$1.*Detached" | \
    awk '{ print $1 }' | head -1);
  if [ -n "$SC_SESSION" ]; then
    xtitle $1;
    screen -R $SC_SESSION;
  elif [ -f ~/.screenrc-$1 ]; then
    xtitle $1;
    screen -S $1 -c ~/.screenrc-$1
  else
    echo "Unknown session type '$1'!"
  fi
}

which lets me just do sc usercss, which reattaches to the first detached 'usercss' screen session, if one is available, or starts up a new one.

Fast, flexible, lightweight. Choose any 3.

Brackup Tips and Tricks

Further to my earlier post, I've spent a good chunk of time implementing brackup over the last few weeks, both at home for my personal backups, and at $work on some really large trees. There are a few gotchas along the way, so I thought I'd document some of them here.

Active Filesystems

First, as soon as you start trying to brackup trees of any size you find that brackup aborts if it finds a file has changed between the time it initially walks the tree and when it comes to back it up. On an active filesystem this can happen pretty quickly.

This is arguably reasonable behaviour on brackup's part, but it gets annoying pretty fast. The cleanest solution is to use some kind of filesystem snapshot to ensure you're backing up a consistent view of your data and a quiescent filesystem.

I'm using linux and LVM, so I use LVM snapshots for this, with something like:

SIZE=250G
VG=VolGroup00
PART=${1:-export}

mkdir -p /${PART}_snap
lvcreate -L$SIZE --snapshot --permission r -n ${PART}_snap /dev/$VG/$PART && \
  mount -o ro /dev/$VG/${PART}_snap /${PART}_snap

which snapshots /dev/VolGroup00/export to /dev/VolGroup00/export_snap, and mounts the snapshot read-only on /export_snap.

The reverse, post-backup, is similar:

VG=VolGroup00
PART=${1:-export}

umount /${PART}_snap && \
  lvremove -f /dev/$VG/${PART}_snap

which unmounts the snapshot and then deletes it.

You can then do your backup using the /${PART}_snap tree instead of your original ${PART} one.
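Putting it together, a backup run then becomes something like the following sketch. The snap_create/snap_remove script names are made up for illustration (they're just the two fragments above saved as scripts), and the source/target names match the config examples later in this post:

# Snapshot, back up the quiesced tree, then tear the snapshot down again
./snap_create export
brackup --from=home --to=server_fs_home
./snap_remove export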

Brackup Digests

So snapshots work nicely. The next wrinkle is that by default brackup writes its digest cache file to the root of your source tree, which in this case is read-only. So you want to tell brackup to put that in the original tree, not the snapshot, which you do in your ~/.brackup.conf file e.g.

[SOURCE:home]
path = /export_snap/home
digestdb_file = /exportb/home/.brackup-digest.db
ignore = \.brackup-digest.db$

I've also added an explicit ignore rule for these digest files here. You don't really need to back these up as they're just caches, and they can get pretty large. Brackup automatically skips the digestdb_file for you, but it doesn't skip any others you might have, if for instance you're backing up the same tree to multiple targets.

Synching Backups Between Targets

Another nice hack you can do with brackup is sync backups on filesystem-based targets (that is, Target::Filesystem, Target::Ftp, and Target::Sftp) between systems. For instance, I did my initial home directory backup of about 10GB onto my laptop, and then carried my laptop into where my server is located, and then rsync-ed the backup from my laptop to the server. Much faster than copying 10GB of data over an ADSL line!

Similarly, at $work I'm doing brackups onto a local backup server on the LAN, and then rsyncing the brackup tree to an offsite server for disaster recovery purposes.
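The offsite copy itself is just a plain rsync of the target tree, along these lines (the hostname here is made up for illustration; the path matches the target config below):

# Mirror the local brackup target tree to the offsite box
rsync -av /export/brackup/nox/ offsite.example.com:/export/brackup/nox/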

There are a few gotchas when doing this, though. One is that Target::Filesystem backups default to using colons in their chunk file names on Unix-like filesystems (for backwards-compatibility reasons), while Target::Ftp and Target::Sftp ones don't. The safest thing to do is just to turn off colons altogether on Filesystem targets:

[TARGET:server_fs_home]
type = Filesystem
path = /export/brackup/nox/home
no_filename_colons = 1

Second, brackup uses a local inventory database to avoid some remote filesystem checks and improve performance, so if you replicate a backup onto another target you also need to make a copy of the inventory database so that brackup knows which chunks already exist on your new target.

The inventory database defaults to $HOME/.brackup-target-TARGETNAME.invdb (see perldoc Brackup::InventoryDatabase), so something like the following is usually sufficient:

cp $HOME/.brackup-target-OLDTARGET.invdb $HOME/.brackup-target-NEWTARGET.invdb

Third, if you want to do a restore using a brackup file (the SOURCE-DATE.brackup output file brackup produces) from a different target, you typically need to make a copy and then update the header portion for the target type and host/path details of your new target. Assuming you do that and your new target has all the same chunks, though, restores work just fine.