A stressful upgrade – and a design for a better backup system

My server cluster has been having lots of outages lately. After much experimentation, I tracked the cause down to what is probably the Ethernet driver in NetBSD 3.1, which I was running. The machine in question would just disappear from the network, yet be perfectly happy if spoken to over a serial console, until ifconfig wm0 down ; ifconfig wm0 up hung it. As luck would have it, this was the NFS/NIS server (mental note: make the other server a NIS slave so it can run on its own...).

So, since a machine with the same Ethernet interface but running NetBSD 4.0 was running fine, and I could see there'd been a lot of commits to the driver between the versions, I decided to upgrade it.

Easy enough, right? Stick in the NetBSD 4 boot CD, boot, select Upgrade. But then problems struck: my /usr and /var are on a RAIDframe mirror set, and the install kernel didn't include RAIDframe support. So I just let it install into /usr and /var directories under the root directory, then booted into the new system, mounted the RAID, and copied the new /usr over my old one, upgrading all the binaries while leaving /usr/pkg and /usr/local untouched.


Block device snapshots in NetBSD

fss is a neat feature in NetBSD that I hadn't noticed (and I suspect few people have). It's a filesystem snapshot driver.

You attach it to a mounted filesystem, and whenever a block in the underlying disk partition is written to, it copies the old block to a snapshot file. It also creates a snapshot block device, and any reads to that device look in the snapshot file first, then look in the underlying disk partition if there isn't a corresponding block in the snapshot file. End result? The snapshot block device looks like a frozen-in-time copy of the underlying disk partition, without the space and time requirements of actually copying the whole thing.
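The copy-on-write idea described above can be modelled in a few lines of shell. This is only a toy sketch, not how the fss driver is actually implemented: each "block" is a file in a made-up disk directory, the snap directory stands in for the snapshot file, and the names write_block and read_snapshot are hypothetical.

```shell
#!/bin/sh
# Toy model of copy-on-write snapshots. All names here are hypothetical.
work=$(mktemp -d)
mkdir "$work/disk" "$work/snap"

# Write new data to a block on the "live" disk. The first time a block
# changes after the snapshot was taken, its old contents are saved in snap/.
write_block() {
    block=$1
    data=$2
    if [ -e "$work/disk/$block" ] && [ ! -e "$work/snap/$block" ]; then
        cp "$work/disk/$block" "$work/snap/$block"
    fi
    printf '%s\n' "$data" > "$work/disk/$block"
}

# Read a block as it was at snapshot time: look in the saved copies first,
# then fall back to the live disk for blocks that never changed.
read_snapshot() {
    if [ -e "$work/snap/$1" ]; then
        cat "$work/snap/$1"
    else
        cat "$work/disk/$1"
    fi
}

# Initial contents, in place before the snapshot is "taken".
printf '%s\n' 'THIS IS A TEST' > "$work/disk/0"
printf '%s\n' 'untouched' > "$work/disk/1"

# Change block 0 on the live disk; block 1 is left alone.
write_block 0 'THIS IS NOT A TEST'
```

After that, read_snapshot 0 still prints the old text while the file disk/0 holds the new one, and nothing was ever copied for block 1. That is where the space and time saving comes from: only blocks that change cost anything.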

It works like this:

        -bash-3.2$ cat /home/alaric/test.txt
        THIS IS A TEST
        -bash-3.2$ sudo fssconfig fss0 / /tmp/back
        Password:
        -bash-3.2$ sudo mount /dev/fss0 /mnt
        -bash-3.2$ cat /mnt/home/alaric/test.txt
        THIS IS A TEST
        -bash-3.2$ cat > /home/alaric/test.txt
        THIS IS NOT A TEST
        -bash-3.2$ cat /mnt/home/alaric/test.txt
        THIS IS A TEST
        -bash-3.2$ cat /home/alaric/test.txt
        THIS IS NOT A TEST
        -bash-3.2$ mount
        /dev/wd0a on / type ffs (local)
        kernfs on /kern type kernfs (local)
        procfs on /proc type procfs (local)
        /dev/fss0 on /mnt type ffs (local)
        -bash-3.2$ sudo umount /mnt     
        -bash-3.2$ sudo fssconfig -lv
        fss0: /, taken 2008-11-29 01:21:23, file system internal
        fss1: not in use
        fss2: not in use
        fss3: not in use
        -bash-3.2$ sudo fssconfig -u fss0
        -bash-3.2$ cat /home/alaric/test.txt 
        THIS IS NOT A TEST
        -bash-3.2$ sudo rm /tmp/back
        override rw-------  root/wheel for '/tmp/back'? y

...and you'll have to take it from me that there were no long pauses after any command in that sequence; no operation took time noticeably proportional to the 8GB size of the disk partition.

I created the snapshot file on the same filesystem I was snapshotting, which is only allowed for ffs filesystems because it needs special locking; any other filesystem can still be snapshotted, as long as the snapshot file lives on a different filesystem.

This is useful for various things:

  1. Consistent backups. While a backup runs, which might take hours, the filesystem is changing underneath it, so you can end up with a broken backup when your applications change groups of files together; the backup can end up with non-corresponding versions of things. Imagine running a backup while reinstalling a load of applications and their shared libraries, and it getting a mixture of old and new versions of libraries and binaries. Ick.
  2. Short-term onsite backups. Imagine running a snapshot every hour, from a cron job, and deleting the oldest snapshot, so you have four hourly snapshots on the go. If you do something stupid, you can go back and retrieve old versions of your stuff. Or perhaps a week's nightly snapshots. Not a backup in that it won't protect against system failures, but it's the kind of backup you can go back to when you mess something up at the filesystem level.
  3. Trialling potentially disastrous operations, like major software upgrades. Take a snapshot beforehand. If it fails, then copy the afflicted files back from the snapshot.
  4. Security auditing. Take regular nightly snapshots, then you can compare them to the live system to see what's changed, to help analyse successful break-ins.

There is one caveat: a snapshot taken from a mounted filesystem will, when itself mounted, of course give you a log warning:

  /dev/fss0: file system not clean (fs_clean=4); please fsck(8)
  /dev/fss0: lost blocks 0 files 0

...and you can't fsck it since it's read-only, so you might run into trouble with that, but I think a good sync before taking the snapshot should make the window of opportunity for problems quite small.
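Putting the consistent-backup use case and the sync advice together, a backup script might look roughly like this. It is a sketch under assumptions rather than something I've run in anger: the fssconfig, mount and umount invocations are the ones from the transcript above (plus a standard -o ro), while the run_backup function, the /backup/root.tar destination and the RUN dry-run variable are hypothetical.

```shell
#!/bin/sh
# Sketch of a consistent backup taken from an fss snapshot.
# Set RUN=echo to print the commands instead of running them.
RUN=${RUN:-}

SNAP_DEV=fss0           # snapshot device, as in the transcript
SNAP_FS=/               # filesystem to snapshot
SNAP_FILE=/tmp/back     # snapshot file, as in the transcript
SNAP_MNT=/mnt           # where the frozen view gets mounted
DEST=/backup/root.tar   # hypothetical backup destination

run_backup() {
    # Flush pending writes so the snapshot is as clean as it can be.
    $RUN sync
    $RUN fssconfig "$SNAP_DEV" "$SNAP_FS" "$SNAP_FILE" || return 1
    status=1
    if $RUN mount -o ro "/dev/$SNAP_DEV" "$SNAP_MNT"; then
        # Archive the frozen view, however long that takes.
        $RUN tar -cf "$DEST" -C "$SNAP_MNT" .
        status=$?
        $RUN umount "$SNAP_MNT"
    fi
    # Always tear the snapshot down, even if mount or tar failed.
    $RUN fssconfig -u "$SNAP_DEV"
    $RUN rm -f "$SNAP_FILE"
    return $status
}
```

Running it once with RUN=echo shows the exact command sequence without touching anything, which seems a sensible first step with a driver that is still marked experimental.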

This is really neat stuff. It's been in since NetBSD 2.0, and is still marked as experimental - so more people need to try it out, find any bugs, and otherwise confirm it works fine so it can lose that experimental tag 😉

I’m missing Scheme

I've not done any Scheme programming for ages. In fact, the past few months have been quite a haze of relentless hard work; I like what I do for a living, except that lately I've spent rather a lot of time faffing about with recruitment rather than actually doing it.

I'm having to spend half of my week in London, and the other half working from home - but with Sarah away that half of the week doing her course, I'm working from home alone by day and looking after Jean in the evening, while having my working day bracketed by taking Jean to and from nursery, which is a half hour round trip each way. All the thrill of commuting without the fun of working somewhere different to where you sleep, or with people.

So, no programming-for-fun lately! But that can't last forever, since trying to stop my mind from going exploring for too many months in a row is always rather futile.

So I came across Ventonegro's post on and-let* and it set me thinking. The Lisp family of languages (which includes Scheme) is renowned for its macros, which are the key rationale for the minimalist syntax; without things like if holding a special place in the language, user-written macros are just as powerful as anything that comes built into the language. This lets you extend the language with features that you'd be mad to build into a language core, but which are nonetheless useful reusable constructs, such as and-let*.

As an aside, let me just explain and-let* - the name is a terse mnemonic that makes sense to Schemers and nobody else, but it's a way of compactly writing bits of code that attempt to compute something in steps, where the trail might end at any step and fall back to some default. The example Ventonegro gives is rather good:

  (define (get-session request)
    (and-let* ((cookies (request-cookies request))
               (p (assoc "session_id" cookies))
               (sid-str (cdr p))
               (sid (string->number sid-str))
               ((integer? sid))
               ((exact? sid))
               (sp (assq sid *sessions*)))
      (cdr sp)))

Which translates to:

  • If there are cookies available
  • and there is one called session_id
  • and parsing it as a session id succeeds
  • and the session id is a number
  • and that number is an integer
  • and that number is exact (eg, 3 rather than 3.0)
  • and that number is the ID of an existing session
  • ...return that session

A few languages happen to make that pattern easy to write natively by putting assignments inside an and, as Peter Bex points out, but with Lisp you don't need to rely on that piece of luck; you can roll your own luck.
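Shell, as it happens, is one of the lucky languages: assignments can sit inside an && chain, and the chain stops at the first step that fails. Here's a rough sketch of the same shape of lookup, under made-up assumptions: the "k=v; k=v" cookie string, the SESSIONS table and the get_session function are all hypothetical stand-ins, not a translation of Ventonegro's code.

```shell
#!/bin/sh
# Hypothetical session table: one "id name" pair per line.
SESSIONS='17 alice
42 bob'

# Print the session name for a cookie string such as
# "theme=dark; session_id=42"; fail (non-zero status) if any step fails.
get_session() {
    cookies=$1
    # If there are cookies available...
    [ -n "$cookies" ] &&
    # ...and there is one called session_id...
    sid=$(printf '%s\n' "$cookies" | tr ';' '\n' |
          sed -n 's/^ *session_id=//p' | head -n 1) &&
    [ -n "$sid" ] &&
    # ...and its value is made of digits only...
    case $sid in *[!0-9]*) false ;; esac &&
    # ...and it is the ID of an existing session, print that session.
    printf '%s\n' "$SESSIONS" |
        awk -v id="$sid" '$1 == id { print $2; found = 1 } END { exit !found }'
}
```

It works, but only because && and assignment happen to combine this way in sh; there's no way to package the pattern up under a name of its own, which is what the macro gives you.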

There's a whole library of useful macros and combinators (another handy higher-order programming tool) in most Scheme systems, and any your system lacks can be copied easily enough. But it occurs to me that there are very few educational resources on actually using them. I think a definite theme, if not a whole chapter, in a "practical Scheme" book would have to be the introduction and then applied use of such handy macros (and a damned good reference guide to them all), because reading the definition of and-let* failed to really fill me with inspiration for situations I'd use it in, whereas reading Ventonegro's example reminded me of some ugly code I'd written that could be tidied up by just using and-let*.

It's great to be able to assemble your own syntactic tools, but presenting them as one unorganised mass will just make your language seem as complex and messy as C++ and Perl combined. Yet sticking only to the core language and expecting programmers to spot their own patterns and abstract them out results in duplicated effort, and in every piece of source code starting with a preamble full of rather general-purpose macros; neither of which is good. Rather than choosing a tradeoff between the two, as designers of static programming languages are forced to, we need to find a way of cataloguing such tools so that they can easily be split by category and by priority: the ones that are most widely useful can be learnt wholesale, and the ones that are more useful in certain niches can be glanced at once, then gone back to and studied if required.

The van is fixed!

After the van's sad demise, it went off to Sarah's excellent uncle David to be fixed.

Anyway, he sorted it out, and I picked it up last weekend, but I've only had a moment to write about it now!

Basically, the front right wishbone had broken. It's a big triangular metal thing that attaches to the chassis on two hinges, and then attaches to the wheel at the other end, with the shock absorber coming down into the middle. As the van rides over bumps, it pivots on the hinges, regulated by the shock absorber. So it plays an important part in supporting the weight of the van.

However, knowing I'd be interested, after replacing it with a new one, David put the broken one in the van for me to take a look at!

A broken wishbone

I'd have expected something like this to be a solid casting - but no, it's two pressed sheet steel shapes welded together, making a hollow body. It looks like thick steel, 3mm or so, but near where it's cracked apart, it's more like 1mm. I presume that's due to corrosion over the years.

A closer view of the break

Here's the new one - in situ, under the van. It's the shinier, blacker, cleaner looking part, although it's already picked up quite a bit of mud.

The new wishbone

The old one is now in the little garage, awaiting cutting apart to investigate its construction and exact reason for failure, then WELDING PRACTICE!

Synchrony!

Our week goes like this.

Monday morning, Sarah gets on a train to London.

Wednesday evening, at 8:30pm, I get on a train to London (Jean is with her grandfather for this bit). Then at 10:15pm Sarah gets on a train back from London, the same time as my train gets in. She gets home at about midnight, then on Saturday, I come home and we get the weekend together before she goes off again next Monday morning. This is not a very pleasant state of affairs - we're a close couple, and we miss each other keenly.

Anyway, this evening, my train into Paddington was a bit early - so I rang her as my train was pulling in at 10:09. She was in coach D of the train on platform 2, waiting for it to go; my train came in on platform 3, and I was in coach E. Platforms 2 and 3 are opposite sides of the same physical platform. So I got out, crossed the platform, walked one carriage down, and there was Sarah! She came to the door of the train, and we had an unexpected five minutes together!


Creative Commons Attribution-NonCommercial-ShareAlike 2.0 UK: England & Wales