zmiku: An automation daemon

A few years ago, I wrote my own service monitoring system for my servers and networks; I did this because Nagios, the common choice, was just too complicated for my tastes and didn't cleanly fit my needs. And so, The Eye of Horus was born, and has been monitoring my servers ever since. I open-sourced it, but I've not migrated it to the new Kitten Technologies infrastructure yet, so I don't have a link.

A design goal for Horus was to limit what needed to be installed on the monitored servers: it's a Python script, run from cron, that sshes into the servers and runs shell commands directly, sucking the results back from standard output. The configuration file format is easy to work with, and it's modular - the Python script spits out a status file listing the state of all the services, which a set of CGIs uses to produce HTML reports on demand and to update rrdtool logs of measurables; it also produces a list of changes to be fed to a notification system.

However, it has some rough edges. I decided that the shell commands run on the remote servers should all output reports in a standard format, which means mangling the output of commands such as pidof with sed and awk, and that's a pain to do portably. In general, support for generating different commands to get the same report on different platforms is poor, too. I never got around to implementing hysteresis in the change detectors to put a service that's rapidly failing and recovering into an "unstable" state. And it's written in Python, when I've migrated all of my new development into Scheme.

I was tinkering with the idea of a straight rewrite in Scheme, with the rough edges fixed up, when I noticed a convergence with some of my other projects beginning to form.

I've long wanted to have a system where some small lightweight computer (perhaps a Raspberry Pi), attached to the home LAN, drives speakers as a music player, streaming music from the home file server. There's off the shelf stuff to do that, but I wanted to go a little further and also provide a text-to-speech notification system; the box would also have a queue of messages. If the queue was not empty, it would pause the music (perhaps with a nice fade), emit an announcement ding sound, then play the messages in turn via a text-to-speech engine.

I had previously had some success in helping my wife manage her adult ADHD by putting a cronjob on her Mac that used the "say" command to remind her when it was time to have lunch and the like, as she easily gets too absorbed in something on her laptop and forgets everything else; I thought it would be good to extend that so it worked if she wasn't near her laptop, by making it part of a house-wide music system composed of music streamers in many rooms. And it would be a good place to route notifications from systems like Horus, too. And as the house we lived in then had a very long driveway, we could have a sensor at the end of the drive speak a notification if a car entered the driveway (in the new house, we have a similar requirement for a doorbell that can be heard in distant rooms...). And so on.
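In my head, the core of the player's behaviour is tiny. Here's a minimal sketch in Chicken Scheme, where fade-music!, play-ding! and speak! are hypothetical stand-ins for whatever audio and text-to-speech bindings the real thing would use:

```scheme
;; Sketch of the announcement loop. fade-music!, play-ding! and speak!
;; are assumed primitives, not a real library.

(define message-queue '())

(define (queue-pop!)
  (let ((msg (car message-queue)))
    (set! message-queue (cdr message-queue))
    msg))

(define (announce-pending!)
  (unless (null? message-queue)
    (fade-music! 'down)        ; pause the music, with a nice fade
    (play-ding!)               ; announcement ding
    (let loop ()
      (unless (null? message-queue)
        (speak! (queue-pop!))  ; play each message via text-to-speech
        (loop)))
    (fade-music! 'up)))        ; bring the music back
```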

But that started to lead to design issues similar to those in Horus's notification system: sometimes a single event causes a lot of notifications to be generated, which spam the user when all they really need is a single notification that tells them everything they need to know. Horus has some domain-specific knowledge about which services depend on which, listed in the configuration file, and it automatically suppresses failures that "are to be expected" given a root failure; but it could be smarter (for instance, if a failure occurs after the root service has been checked and found fine, but before the child services have been checked, it will notify of the failure of all the child services rather than noticing the suspicious trend).

And when multiple events occur in the same time span, yet are unrelated so such tricks can't be applied, some notion of priority and rate limiting needs to be applied. If ten thousand notifications suddenly appear in the queue in a single second, what's the system to do? Clearly, it will start fading the music down the very instant a notification arrives, but by the time it gets to start talking a second later, it may have received a lot of messages; now it needs to decide what to do. Repeated messages of the same "type" should be summarised somehow. A single high-priority message should be able to cut through a slew of boring ones. And so on.
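As a sketch of the kind of summarising policy I have in mind (assuming, purely for illustration, that a message is a list of a priority number, a type symbol, and text - none of this is implemented):

```scheme
;; High-priority messages cut through verbatim; repeated messages of
;; the same type collapse into a single "n messages about foo" summary.
(use extras srfi-1)

(define (summarise messages)
  (let* ((urgent (filter (lambda (m) (>= (first m) 9)) messages))
         (boring (remove (lambda (m) (>= (first m) 9)) messages))
         (types  (delete-duplicates (map second boring))))
    (append
     (map third urgent)                  ; urgent ones pass unchanged
     (map (lambda (type)
            (let ((same (filter (lambda (m) (eq? (second m) type)) boring)))
              (if (null? (cdr same))
                  (third (car same))     ; a lone message passes unchanged
                  (sprintf "~a messages about ~a" (length same) type))))
          types))))

;; (summarise '((5 disk "disk sda is full")
;;              (5 disk "disk sdb is full")
;;              (9 driveway "a car has entered the driveway")))
;; => ("a car has entered the driveway" "2 messages about disk")
```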

At the same time, I was looking into home automation and security systems. There you have a bunch of sensors, and actions you want to trigger (often involving yet more notifications...) in response to events. And similarly I wanted to try and automate failover actions; host failure notifications in Horus should trigger certain recovery activities - but only if the failure state lasts for more than a threshold period, to make sure expensive operations are not triggered based on transient failures.

Programming such "rules", be they for automation, analysing the root cause of failures from a wide range of inter-dependent service statuses, or deciding how best to summarise a slew of messages, is complex, as they deal with asynchronous inputs and the timing relationships between them; specialist programming models, generally based around state machines, help a great deal.

Also, having a common infrastructure for hosting such "reactive behaviour" would make it possible to build a distributed fault-tolerant implementation, which would be very useful for many of the above problems...

So, I have decided, it would be a good idea to design and build an automation daemon. It'll be a bit of software that is started (with reference to a configuration file specifying a bunch of state machines), and then sits there waiting for events. Events can be timers expiring, or external events that come from sensors; and the actions of the state machines might be to trigger events themselves, or to activate external actuators (such as the text-to-speech engine or a server reboot). And a bunch of daemons configured to cooperate would all synchronise to the same state in lock-step; if daemons drop out of the cluster, then all that will happen is that sensors and external actions attached to that daemon will become unavailable, and state machines which depend on them will be notified. In the event of a network partition separating groups of daemons, the states can diverge; a resolution mechanism will need to be specified for when they re-merge.
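Nothing is designed yet, but to give a flavour of it, a zmiku configuration might look something like this - an s-expression DSL in which every event, action, and form name is invented for the sake of the sketch:

```scheme
;; Hypothetical zmiku configuration: a state machine that watches the
;; driveway sensor and routes an announcement to the house speakers.

(define driveway-watcher
  '(state-machine driveway-watcher
     (initial idle)
     (state idle
       (on (sensor driveway-beam broken)
           (do (emit (notify (priority 9)
                             (text "A car has entered the driveway"))))
           (goto announced)))
     (state announced
       ;; ignore further beam events for a minute, so one car
       ;; doesn't generate a slew of notifications
       (on (timer 60)
           (goto idle)))))
```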

Having that in place would mean that building a service monitoring system would merely involve writing a sensor that ran check commands, plus a standard state machine for monitoring a service (with reference to the state machines of the services it depends on), generating suitable events for other consumers of service state to use - and human-level notification events in a standard format recognised by a general human-notification handler running in the same automation daemon cluster.
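In the same hypothetical DSL, a service-monitoring state machine might look like this; note the timer guard, which gives the wait-out-transient-failures behaviour I mentioned above:

```scheme
;; Hypothetical service monitor: only declare a hard failure (and
;; trigger recovery actions) if the failure persists beyond a
;; threshold, so transient blips don't fire expensive operations.

(define webserver-monitor
  '(state-machine webserver-monitor
     (depends-on host-monitor)     ; suppress alerts if the host is down
     (initial up)
     (state up
       (on (sensor check-webserver fail)
           (goto failing)))
     (state failing
       (on (sensor check-webserver ok)
           (goto up))               ; transient: no notification
       (on (timer 300)              ; down for five minutes...
           (do (emit (notify (priority 7)
                             (text "webserver is down")))
               (actuate restart-webserver))
           (goto down)))
     (state down
       (on (sensor check-webserver ok)
           (do (emit (notify (priority 3)
                             (text "webserver has recovered"))))
           (goto up)))))
```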

The shared infrastructure would make it easy to integrate automation systems.

Now, this is a medium-term project as what I have is working OK for now and I'm focussing on Ugarit development at the moment, but I am now researching complex event processing systems to start designing a reliable distributed processing model for it. And I've chosen a name: "zmiku", the Lojban word for "automatic" or "automaton"; its goal is, in general, to automate complex systems. As I apply it to more problems, I'd like to bring in tools from the artificial intelligence toolbox to make it able to automate things in "smarter" ways; I feel that many of these techniques are currently difficult to employ in many cases where automation is required, so it would be good to make them more available.

The sorry state of keyboard interfaces

Back in the Dark Ages, keyboards were simple devices. Putting too much processing power in them would have raised the cost unacceptably, so they did little more than tell the host computer when buttons were pressed and released; the host computer had the job of converting that into meaningful information such as entered data or commands.

In particular, keyboards didn't even know what was printed on their buttons. They told the computer what button was pressed as a "scan code", which was loosely tied to the key's position on the keyboard. Keyboards for different alphabets had different things printed on the keys, but generated the same scan codes regardless; the computer had to be told what "keyboard map" to use to convert those scan codes into letters.

This wasn't a big deal in the days of the original PC and AT keyboard interfaces, and the later PS/2 interface, where only one keyboard could be plugged into a computer; telling the computer what kind of keyboard you had was a simple one-off configuration task.

However, by the time USB came to be, micro-controllers were sufficiently cheap that one capable of managing the USB interface of a keyboard could easily have managed its own mapping to a standard set of codes based on what each key did rather than where on the keyboard it was, avoiding the need for keyboard maps. But, they didn't. Oh no. Instead, they standardised a new set of scan codes for the positions of keys on a "standard layout", regardless of what was printed on them. And of course, keyboards that don't follow the standard layout (such as compact laptop keyboards, or ergonomic ones, or keyboard emulators such as chorders) generate the scan codes for keys based on where they would be in a standard layout, meaning that the scan codes don't really relate to anything sensible at all.

And so we still need keyboard maps on the computers.
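To make that concrete: USB HID usage 4 means "the key where A sits on a US keyboard", and it's the host's keymap that decides what character that becomes. A toy illustration in Scheme, with the layouts cut down to two keys each:

```scheme
;; The same USB usage code becomes a different character depending on
;; the host-side keymap. Usages 4 and 20 really are the US-layout A
;; and Q positions.

(define us-keymap     '((4 . #\a) (20 . #\q)))
(define azerty-keymap '((4 . #\q) (20 . #\a))) ; French AZERTY swaps A and Q

(define (decode keymap usage)
  (cdr (assv usage keymap)))

(decode us-keymap 4)     ; => #\a
(decode azerty-keymap 4) ; => #\q - same keyboard, same scan code,
                         ;    different letter
```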

This becomes a real pain when you have more than one keyboard, which is easily done with USB - and is increasingly becoming the norm, as a laptop (with its own keyboard) is used as a desktop computer (with a nicer, external, keyboard). For a while I was using an Apple laptop, but mainly as a VM host for a NetBSD VM. The Apple laptop had an Apple keyboard, but I plugged in a USB PC keyboard. When using Mac OS software outside my VM, the laptop had the correct keymap but the external keyboard did not; when using my VM, it was the other way around. The situation sucked.

Also, I'm sure people who work with multiple languages would love to have multiple keyboards that they can switch between easily depending on what language they're typing, without having to reconfigure their keyboard map when they do so. As a nerd, I would love to be able to buy a small keypad covered in extra function keys and have it work alongside my normal keyboard (maybe even foot pedals!). How about specialist keyboards with function keys for tasks like computed-aided design?

So, here's my proposal: keyboards should identify their buttons with Unicode strings, and a type flag (glyph or function), and an optional position flag for duplicated keys (chosen from nine options: top left, top middle, top right, center, etc; a tenth value can be used for non-duplicated keys). When you press a key with "H" printed on it, the keyboard should say "glyph H is down". When you press the left shift key, the keyboard should say "function shift (bottom left) is down".

Keys with more than one glyph printed on them, corresponding to what should happen when that key is pressed with combinations of modifier keys, can be handled by the keyboard also providing a "modifier table". If I press shift+5 in the hope of getting the % sign printed above 5 on my keyboard, the keyboard should note that a shift key is pressed, then that 5 is pressed; but the modifier table should note that the keyboard's key caps will be giving the user the impression that this combination should produce a "%".
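Here's a sketch in Scheme of what the key events and modifier table might carry; the representation is invented for illustration, as it's the information content that matters, not any particular encoding:

```scheme
;; A key event: type (glyph or function), a Unicode name string,
;; a position for duplicated keys, and whether it went down or up.
(define h-down     '(key-event glyph "H" none down))
(define shift-down '(key-event function "shift" bottom-left down))

;; The modifier table records what the key caps promise: shift plus
;; the key printed "5" looks like it should produce "%", so say so.
(define modifier-table
  '((("shift" "5") . "%")
    (("shift" "6") . "^")
    (("altgr" "4") . "€")))

(define (modified-glyph keys)
  (cond ((assoc keys modifier-table) => cdr)
        (else #f)))

(modified-glyph '("shift" "5")) ; => "%"
```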

Function keys can be given any name, including useful keys such as "help", "cut", "copy", and "paste". And you can have as many soft-bindable F-keys as you want. All these rich function names can be passed through to software as-is, letting apps bind functionality to appropriately-named keys; to make this easier, there should be a shared vocabulary of function key names to avoid synonyms cropping up.

This would be easy to implement.

This would make keyboards plug-and-play.

This would make it easy to use multiple keyboards on the same computer.

This would open up new markets for keyboards with heaps of special function keys.

This could be done in a backwards-compatible manner by making keyboards expose the old USB scancodes by default, along with a note that they can be switched into Sensible Mode if the host computer supports it.

Lobby the USB Implementers Forum to put this into the next version of the USB HID specification now!

Vomit-induced implementations of the 9P protocol in Chicken Scheme

Last Saturday, I came down with what I suspect was norovirus. The rest of the family (apart from the baby) had come down with it on Thursday, and I'd spent the past few days mopping up after them, so this was probably unavoidable, although I'd tried my best by wearing a respirator when performing clean-up operations (it was also nice not to have to smell what I was clearing up...).

But it meant I spent Monday off work, recuperating. I was too weak and exhausted to do any work from home, but I was bored senseless just lying there on the sofa, so I decided to try and extend Chicken Scheme's 9p egg, which is a client implementation of the 9P2000 file server protocol, to also be able to act as a server.

This is something I want for Ugarit; it means that a Chicken Scheme app will be able to provide a virtual filesystem that can be mounted from a computer and used like your "real" filesystem. In particular, I want to be able to let people access their backed-up snapshots from a Ugarit vault as a live read-only filesystem, rather than needing to go in and manually "restore" their desired files back into the filesystem to access them. And it'll really come into its own when I implement archive mode, as it will make it possible to actually use the Ugarit archive seamlessly.

Unfortunately, being rather fuzzy-headed, I kept making rookie mistakes, but I eventually managed to get the core protocol implementation working. In doing so, I found out that a 9P server that puts incorrect lengths in the stat structures returned from directory reads causes 9P mounts in Linux to "hang" in a way that can't be unmounted; you have to power-cycle the machine, as it won't even shut down cleanly. So be careful of that when mounting the few public 9P servers out there!

In order to test it, and as a utility for Chicken apps that would like to provide a 9P "control/status filesystem" in the manner of wmii et al, I started to write a simplified high-level virtual filesystem server library on top of the core server implementation. At the point where I made this status update to my friends in the Chicken Scheme IRC channel, directory listings weren't working (they now are), but you can see the idea - create a filesystem object from Scheme and register files and directories in it, and it appears as a live filesystem under UNIX.
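The rough shape of the library is this; the procedure names below are illustrative rather than the final API, which is still in flux:

```scheme
;; Build a virtual filesystem object, register files and directories
;; in it, then serve it over 9P. make-virtual-filesystem,
;; register-file!, register-directory! and serve-9p are illustrative.
(use extras tcp)

(define fs (make-virtual-filesystem))

;; A static file: contents supplied as a string.
(register-file! fs "/motd" "Hello from Scheme!\n")

;; A dynamic file: contents generated afresh on each read.
(register-file! fs "/uptime"
  (lambda () (sprintf "~a\n" (current-seconds))))

(register-directory! fs "/logs")

;; Serve it up; mountable from Linux with something like
;;   mount -t 9p -o trans=tcp,port=1564 <host> /mnt/scheme
(define listener (tcp-listen 1564))
(receive (in out) (tcp-accept listener)
  (serve-9p fs in out))
```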

Now that I'm feeling a bit better, I've realised several other rookie errors I made (not ones that cause bugs, I hope, but ones that complicated the code unnecessarily) - I'll fix those up before I submit all of my changes to the 9p egg's maintainer for merging in...

Then it'll be time to start on the Ugarit integration. THAT will be fun 🙂

Spring Cleaning

I've spent more time building infrastructure than using it, I suspect. I love building infrastructure, so I've often built it because I can; however, with everything that's happened in the past six years, I've ended up struggling to maintain the infrastructure I already had. So I've had to change tack and become much more pragmatic about my infrastructure astronautics, such as getting rid of my limited company and migrating from a tightly-bound cluster to a single box for my hosting platform.

This has given me some time to tidy up and simplify the infrastructure I want to keep.

So this weekend, I got around to rebuilding the Kitten Technologies web site. This is where I publish my open-source creations; they were all version-controlled in Subversion, and I had a PHP site with some static pages, plus a dynamically generated project browser that pulled files called VERSION.txt and README.txt out of the SVN repositories to build a description page for each project, offered up tarballs of all released versions for downloading, and linked to an SVN web interface for browsing the repo. I wanted to get around to implementing ticket tracking at some point so folks could submit tickets.

However, for a while I've ached to migrate to Fossil for version control, mainly because it has integral ticket tracking and a wiki for each project, along with integral repository browsing; it provides a fully-featured project web site, and it's a distributed VCS to boot, which is also useful. But I wanted it all to still look like one nice integrated site covering all my projects.

So what I've done is to write a Fossil skin stylesheet with my new look in it, then to build the wrapper site, based on Hyde, using the same CSS (e.g., by using Fossil's names for div classes and overall page structure). The CSS is actually generated from an scss master file that Hyde processes as part of the static site, which the Fossil repos just refer to. My deployment script rolls the skin out to all of my repositories whenever it's updated, so they are all kept magically in sync.

It still has a few rough edges (I want to improve the navigation with a consistent site-wide nav bar above the Fossil menu bar, with the current project highlighted; this will be slightly more complex, as I'll need to make the script modify the skin for each project to highlight the correct one), and I am still incapable of making non-ugly CSS, but it means that Kitten Technologies is now live on Fossil. I've a lot of projects still to migrate, but once I've done the "fiddly" ones that need some level of manual tweaking, I hope to produce a script to automate handling the rest.

Secondly, I've been tidying up the home fileserver. It was down for some time for various reasons, which meant that a new archive of photos, music, and PDFs formed on my laptop. I'd pulled our music collection out of the backups onto my laptop, too, which meant I then had a diverging fork of that (there was some new music on the file server since the last backup, which I later retrieved from the disks), so the re-unification of all those tens of gigabytes of files has been fiddly. But it's now largely done, which is great; there's now precisely one master copy of everything, and the home wiki is back up to date and pruned of outdated TODO items from several years ago.

However, this has increased my desire to implement Ugarit's archival mode. Rather than manually curating directory structures to organise my stuff (and the pain of merging changes to them), I'd love to just be able to pour files into a Ugarit library and tag them with metadata (maybe some time after the original import, if I'm in a hurry at the time), then create virtual filesystem views on that which reflect things like "All my music, organised by artist/album/title" or "All my photos, organised by who is in them, year, then event title". Combine that with the proposed Ugarit replication backend, and it will even manage replicas on my laptop as well as the home fileserver, all kept seamlessly in sync; having a home fileserver was easy when I worked from home on a desktop machine so I could just permanently mount the filesystem from the server, but it's a bit trickier with a modern laptop-based lifestyle.
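To daydream in code for a moment (nothing below exists yet; every name is made up):

```scheme
;; Hypothetical archival-mode usage: pour files in, tag them, then
;; ask for virtual filesystem views over the tags.

(define library (open-ugarit-library "/etc/ugarit/vault.conf"))

;; Pour a file in; tagging can happen now or any time later.
(define photo (import-file! library "/home/me/photos/img_1234.jpg"))

(tag! library photo
      '((type . photo)
        (year . 2012)
        (people "Alice" "Bob")
        (event . "Birthday")))

;; A view is a query plus an organising path template, which could
;; then be served as a live read-only filesystem over 9P:
(define photos-by-event
  (make-view library
             '(= type photo)
             '(year event)))   ; yields /2012/Birthday/img_1234.jpg
```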

Also, as the archive is already backed up into Ugarit, migrating it into a Ugarit "library" will be fast and efficient - Ugarit will automatically recognise that it already has the content of the files, and will just need to upload the metadata!

I think with that and my workshop sorted out, I'm done with spring cleaning - my urgent tasks are now sorting out paperwork for my Cub pack, fixing an offline external disk on my fileserver, getting Ethernet to the workshop so I can do useful computer work in there (and move the home fileserver out of poking range of the baby, who loves to turn it off), resurrecting my salmonella install, hacking Ugarit, ring casting, getting the foundry working so I can cast bronze, and wearable computer work! Not to mention endless minor DIY things in the house - we've got pictures to put up, dents in the plasterboard walls to fill, a flue to install for the fire, walls to repaint, ...

My new workshop

I took the day off work on my birthday to do something I'd been dying to do since we moved in - get my workshop set up to a state where I can actually use it.

To begin with, I had a load of things to put away. The floor was covered in boxes that needed unpacking, but as soon as I'd cleared enough to get sufficient access, I put up my big shelf.

It goes on the wall above my welding bench:

Here's the wall I'd like the shelf on, above the welding bench

Drilling into masonry can be a pain, especially coarse breeze blocks like these, which are made of a load of tiny stones joined together with cement; the bit will tend to wander into a convenient gap between stones rather than ploughing through the wall where I want it to go. So the only thing to do was to break out my serious drill, which I call Vera:

This is Vera

Vera is an SDS+ drill, which means it has a special chuck and takes special bits. The chuck fixing is actually designed for hammer drilling, unlike standard drill chucks, which means the drill can apply a much more significant and reliable hammering force. As such, it glides through walls like this in the way a normal drill glides through plywood.

In no time, I had each bracket mounted with 6x40mm screws into 10mm diameter wall plugs:

One bracket up
Two brackets up

I could then lift the shelf into place:

The shelf in place

However, Vera's SDS+ chuck can't drive ordinary bits (I do have an SDS+-to-ordinary-chuck adapter, but Vera would really be overkill for the next step), so I used my cordless drill to pre-drill holes for the screws into the bottom of the shelf. I like to think of this drill as Vera's filthy little sister, as it's fast and easy. The observant will also notice that it goes up to eleven:

Vera's filthy little sister

Having done that, I screwed the shelf onto the brackets so it won't budge:

The shelf screwed to the brackets

With the shelf up, I could then put even more stuff away. I've made a little guided tour movie.

There's still more to do - I need to get Ethernet cabling down there so I can get Internet access, and I need to fix the leaking flat roof, and do something about the draughty eaves and the ivy creeping in. But now that the floor is clear and things are in useful places, I can actually use the workshop, which is great.

So two days later, I performed my first project. We needed a coathook for children's coats and bags, and we found the perfect design in a shop, except it was made to hang over the top of a door rather than to be mounted on the wall.

Not a problem when you own metal working tools.

First off, I used the angle grinder to chop off the bits that go over the top of the door; then, with them out of the way, I went in and neatly chopped off the long metal bits close to the part we wanted:

Unwanted metal bits chopped off

Then I used a center punch to mark where I needed to drill at each end:

Punched mark where I need to drill

To begin with, I drilled a 2mm hole, as that's a lot easier to drill accurately by hand than the 5mm hole I need:

2mm hole drilled

Then I drilled it out to 5mm:

5mm hole drilled

And then the screw could fit in:

Screw in place

And it was done:

The finished product
