Goal-based artificial intelligence for home automation (and maybe piloting a network of autonomous killbots) (by alaric)
For a while, I've been mulling the idea of writing zmiku, a daemon that can be programmed to automatically control various kinds of systems. My application is home automation, and maybe automating the management of servers (restarting and failing over services, dealing with overload situations, gracefully handling disks being full, that sort of thing); but it occurs to me that the same basic problem also applies to controlling autonomous robots such as space probes, and to industrial processes and the like. A good solution to all these problems in one would be quite useful!
You might say that this is a non-problem: normally, people just write programs from scratch to control these kinds of things, sitting in a loop reading inputs, updating state variables, and choosing what output actions to generate. But the complexity of the resulting program tends to increase rapidly as the problem complexity rises.
Rather than a traditional programming language, a better notation for such a reactive system is a state machine. The Wikipedia article on UML state machine diagrams gives a good introduction to one version of this notation, including some discussion of ways to extend the most basic version to increase its expressiveness and modularity.
I'd like to base zmiku on a textual version of UML statecharts; but today I've had a horrible stomach ache, so I've been unable to do much more than lie around and think about stuff, and what my mind settled on was the interesting question of how to integrate state machines with goal-based programming, which is also useful for controlling complex systems. In a goal-based system, various goals are known to the system, each with a priority; for instance, a flying robot may have a destination demanded by the user, which the navigation system tries to fly the robot towards; but a collision-avoidance system may sometimes override the navigation system when it detects that a collision would otherwise result, with a higher-priority goal for the steering system. When the collision has been avoided, that goal will disappear, and the earlier goal of getting to the destination will take over once more. And if the robot's batteries are running low, then flying towards a charging place (or a place where the solar panels are in sunlight) might be a higher priority than the user's chosen destination, but not a higher priority than avoiding collisions. And so on.
So I came up with a model for integrating the two, using the "blackboard" model from artificial intelligence: giving the system a global state shared between a number of concurrent subsystems. This blog post is the result of me writing up my scribbled notes. I'm still in a lot of stomach pain, so I'm afraid it's going to be a bit rambly 🙂
It would be possible to encode all that as concurrent state machines. Perhaps the different subsystems would emit internal events to provide or rescind goals, and a separate state machine would listen for those events, with a state for each subsystem, representing the notion that that subsystem is the one with the currently highest-priority active goal (and another state for when no subsystem has presented a goal of that kind, which in our example would cause the robot to hold its current location). The transitions between those states would load the new "winning" goal into the navigation system by emitting an appropriate internal event for the navigation system to receive.
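To make that concrete, here's a rough Scheme sketch of what such a hand-rolled arbitration machine boils down to (all the names and event shapes here are invented for illustration; none of this is zmiku syntax):

    (import (scheme base) (scheme write))

    ;; Hypothetical arbitration machine: subsystems raise or rescind
    ;; navigation goals via internal events, and we track which subsystem
    ;; currently "wins". Each table entry is (subsystem priority goal).
    (define (apply-event table event)
      (case (car event)
        ((raise)   (cons (cdr event)
                         (remove-subsystem table (cadr event))))
        ((rescind) (remove-subsystem table (cadr event)))))

    (define (remove-subsystem table subsystem)
      (cond ((null? table) '())
            ((eq? (caar table) subsystem) (cdr table))
            (else (cons (car table)
                        (remove-subsystem (cdr table) subsystem)))))

    (define (winner table)
      ;; highest-priority entry, or #f if no subsystem has a goal
      (if (null? table)
          #f
          (let loop ((best (car table)) (rest (cdr table)))
            (cond ((null? rest) best)
                  ((> (cadr (car rest)) (cadr best))
                   (loop (car rest) (cdr rest)))
                  (else (loop best (cdr rest)))))))

    ;; Collision avoidance (priority 20) overrides navigation (priority 10),
    ;; then hands back control when its goal is rescinded:
    (define table '())
    (set! table (apply-event table '(raise navigation 10 (fly-to 51.5 -2.6))))
    (set! table (apply-event table '(raise collision-avoidance 20 (fly-to 51.6 -2.5))))
    (display (winner table)) (newline) ; => (collision-avoidance 20 (fly-to 51.6 -2.5))
    (set! table (apply-event table '(rescind collision-avoidance)))
    (display (winner table)) (newline) ; => (navigation 10 (fly-to 51.5 -2.6))

Every new goal-raising subsystem means more event plumbing of exactly this kind, which is the make-work the next paragraph complains about.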
But doing this would be complex make-work, and would require duplication of information: every time a subsystem that might present a navigational goal was added, the goal-management state machine would need to be extended with an extra state and all its transitions; and duplication like that is both boring and error-prone. Also, subsystems would need to emit internal events stating that they had raised a goal, changed a goal, or no longer had a goal, meaning that goal-generating states would need entry and exit transitions to raise and relinquish the goal; and I don't like the notion of having to remember to pair message sends like that - it seems like more error-prone and boring duplication.
So, I decided, it would be better if goals existed as first-class objects within the system. States within state machines could be annotated with zero or more goals that are raised when the machine is in that state; they'd automatically be abandoned when the state was left. A goal would consist of: a name, indicating what kind of goal it is; parameters representing the details of the goal (initialised by arbitrary expressions within the scope of the state, so able to refer to the values of state parameters and of parameters of the event that transitioned the system into that state); and a numeric priority.
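Here's a minimal sketch of that in Scheme, with a goal as a plain list of name, priority, and parameters (a placeholder representation, not a committed design):

    (import (scheme base) (scheme write))

    ;; A goal bundles a name (what kind of goal), a numeric priority,
    ;; and a list of parameters (the details).
    (define (make-goal name priority params) (list name priority params))
    (define (goal-name g)     (car g))
    (define (goal-priority g) (cadr g))
    (define (goal-params g)   (caddr g))

    ;; The set of currently active goals is part of the global state.
    (define *active-goals* '())

    (define (raise-goal! g)
      (set! *active-goals* (cons g *active-goals*)))

    (define (abandon-goal! g)
      (set! *active-goals* (remove-one g *active-goals*)))

    (define (remove-one g goals)
      (cond ((null? goals) '())
            ((equal? (car goals) g) (cdr goals))
            (else (cons (car goals) (remove-one g (cdr goals))))))

    ;; In the real system, entering an annotated state would call
    ;; raise-goal! and leaving it would call abandon-goal! automatically:
    (raise-goal! (make-goal 'destination 10 '(51.5 -2.6)))
    (for-each (lambda (g) (display g) (newline)) *active-goals*)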
For any given kind of goal (that is, among the set of currently active goals with the same name), the one with the highest priority would be the "winning goal" of that type. (Whether to allow two goals to be raised with the same priority, and what to do about it if so - some kind of goal-merging function would need to be supplied - is an interesting question.) This is part of the global state of the system, so anywhere else in the system it would be possible to refer to it in expressions. For a start, a boolean expression (as might be used in guards controlling state transitions) could check for the existence of a goal matching a pattern, which at its most basic would check for the presence of any winning goal of a given name. Secondly, in more general expressions, it would be possible to extract the parameters of the winning goal, by providing a pattern with bindable variables for some of the goal parameters, an expression to evaluate with those variables bound if the pattern matches, and a default expression to evaluate if the pattern does not match.
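In terms of the sketch above, the winning-goal rule and the two query forms might look like this (illustrative function names only; in zmiku they'd be expression syntax in the DSL):

    (import (scheme base) (scheme write))

    (define (make-goal name priority params) (list name priority params))
    (define (goal-name g)     (car g))
    (define (goal-priority g) (cadr g))
    (define (goal-params g)   (caddr g))

    ;; The winning goal of a kind: the highest-priority active goal
    ;; with that name, or #f if there are none.
    (define (winning-goal name goals)
      (let loop ((gs goals) (best #f))
        (cond ((null? gs) best)
              ((and (equal? (goal-name (car gs)) name)
                    (or (not best)
                        (> (goal-priority (car gs)) (goal-priority best))))
               (loop (cdr gs) (car gs)))
              (else (loop (cdr gs) best)))))

    ;; Boolean form, as might appear in a transition guard:
    (define (goal-exists? name goals)
      (if (winning-goal name goals) #t #f))

    ;; Extraction form: bind the winner's parameters, or fall back
    ;; to a default expression.
    (define (with-goal name goals on-match default)
      (let ((g (winning-goal name goals)))
        (if g (apply on-match (goal-params g)) (default))))

    ;; Collision avoidance (priority 20) beats navigation (priority 10):
    (define goals
      (list (make-goal 'destination 10 '(51.5 -2.6))
            (make-goal 'destination 20 '(51.6 -2.5))))
    (display (with-goal 'destination goals
                        (lambda (lat lon) (list 'steer-towards lat lon))
                        (lambda () '(hold-position))))
    (newline) ; => (steer-towards 51.6 -2.5)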
It might be useful to also offer versions of these two syntaxes which can search all active goals, rather than just winning ones - the collision avoidance system, for instance, might be interested in what destination goal it's overriding, in order to choose from a number of possible collision-avoiding paths by finding the one that ends closest to the original destination. Perhaps the way to implement that would be to allow the syntaxes to take an optional priority argument, such that all goals with priority greater than or equal to it are excluded from the search (while still only matching the highest-priority goal remaining), so that a system generating a goal at a given priority can correctly find what goal it's overriding, even if there are many other goals at different priority levels.
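That priority-ceiling search is a small tweak to the sketch above: exclude goals at or above the ceiling, then take the best of what remains.

    (import (scheme base) (scheme write))

    (define (make-goal name priority params) (list name priority params))
    (define (goal-name g)     (car g))
    (define (goal-priority g) (cadr g))

    ;; Like winning-goal, but ignoring every goal with priority >= ceiling,
    ;; so a goal raised at `ceiling` can see what it is overriding.
    (define (winning-goal-below name ceiling goals)
      (let loop ((gs goals) (best #f))
        (cond ((null? gs) best)
              ((and (equal? (goal-name (car gs)) name)
                    (< (goal-priority (car gs)) ceiling)
                    (or (not best)
                        (> (goal-priority (car gs)) (goal-priority best))))
               (loop (cdr gs) (car gs)))
              (else (loop (cdr gs) best)))))

    ;; The collision avoider, whose own goal sits at priority 20, asks
    ;; what destination it is overriding:
    (define goals
      (list (make-goal 'destination 10 '(51.5 -2.6))
            (make-goal 'destination 20 '(51.6 -2.5))))
    (display (winning-goal-below 'destination 20 goals)) (newline)
    ;; => (destination 10 (51.5 -2.6))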
As well as being used for communication between concurrent state machines within the system, goals would also form part of the input/output mechanism. Rather than generating output events of the form "turn on motor" and "turn off motor", which need to be correctly paired up as states are entered and left, we can just raise a goal named "motor state" with a parameter selecting on or off. That goal can be plumbed directly to the motor controller. Likewise, goals can be raised by external things, ranging from a user interface allowing the setting of goals, to sensors that might raise oddly-worded goals such as "react to the inlet air temperature being X degrees". Goals allow the system to deal much more naturally with conditions that are continuous, rather than the point-in-time events that state machines revel in.
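Plumbing the winning "motor state" goal to a controller might then look like this (set-motor! stands in for whatever hypothetical driver hook the installation provides):

    (import (scheme base) (scheme write))

    (define (make-goal name priority params) (list name priority params))
    (define (goal-name g)     (car g))
    (define (goal-priority g) (cadr g))
    (define (goal-params g)   (caddr g))

    (define (winning-goal name goals)   ; as in the earlier sketch
      (let loop ((gs goals) (best #f))
        (cond ((null? gs) best)
              ((and (equal? (goal-name (car gs)) name)
                    (or (not best)
                        (> (goal-priority (car gs)) (goal-priority best))))
               (loop (cdr gs) (car gs)))
              (else (loop (cdr gs) best)))))

    ;; No paired on/off events: just make reality match the winning goal,
    ;; defaulting to off when nobody wants the motor.
    (define (sync-motor! goals set-motor!)
      (let ((g (winning-goal 'motor-state goals)))
        (set-motor! (if g (car (goal-params g)) 'off))))

    (sync-motor! (list (make-goal 'motor-state 10 '(on)))
                 (lambda (s) (display (list 'motor s)) (newline)))
    ;; => (motor on)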
Given that goals are raised internally by states being entered, we can see how events can be translated into goals - but the story going the other way is not yet complete. The state of goals can be referenced in guard expressions, so goals can affect state transitions, but they can't directly drive them. To finish that functionality off, we also need a mechanism whereby a state transition can be driven by a goal pattern starting to match, or ceasing to match. That means our navigation system can listen for events of the form "destination goal has changed" and use them to trigger appropriate actions.
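One way to synthesise those events is to compare a goal's winner before and after each update to the goal set (the event shapes here are invented for illustration):

    (import (scheme base) (scheme write))

    (define (make-goal name priority params) (list name priority params))
    (define (goal-name g)     (car g))
    (define (goal-priority g) (cadr g))

    (define (winning-goal name goals)   ; as in the earlier sketch
      (let loop ((gs goals) (best #f))
        (cond ((null? gs) best)
              ((and (equal? (goal-name (car gs)) name)
                    (or (not best)
                        (> (goal-priority (car gs)) (goal-priority best))))
               (loop (cdr gs) (car gs)))
              (else (loop (cdr gs) best)))))

    ;; Turn changes in a goal's winner into point-in-time events that a
    ;; state machine can transition on:
    (define (goal-edge-events name before after)
      (let ((old (winning-goal name before))
            (new (winning-goal name after)))
        (cond ((equal? old new) '())
              ((not old) (list (list 'goal-raised  name new)))
              ((not new) (list (list 'goal-dropped name)))
              (else      (list (list 'goal-changed name new))))))

    (define before (list (make-goal 'destination 10 '(51.5 -2.6))))
    (define after  (cons (make-goal 'destination 20 '(51.6 -2.5)) before))
    (display (goal-edge-events 'destination before after)) (newline)
    ;; => ((goal-changed destination (destination 20 (51.6 -2.5))))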
Goal names will normally just be symbols, but they can also be named tuples in their own right. Because goals override other goals with the same name, if we stick to just using symbols, then we have a static set of goal kinds in the entire system. But a collision avoidance system might have to be avoiding multiple obstacles at the same time, fed to it by a radar system that finds them. How do we express this more dynamic form of goal? The radar system can dynamically fork instances of a tracking state machine for each obstacle it detects, responsible for tracking that obstacle's movements, and those state machines can raise a goal named "avoid(x,y,z)" - not a goal named "avoid" with arguments "(x,y,z)" (the goal may well have arguments giving the dimensions and velocity of the object). As the multiple avoid goals will thus have different names, they won't override each other - which would otherwise leave the system avoiding only the highest-priority obstacle!
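Under the list representation from the earlier sketches this just falls out, because the whole tuple is the name and names are compared with equal?:

    (import (scheme base) (scheme write))

    (define (make-goal name priority params) (list name priority params))
    (define (goal-name g) (car g))

    ;; One goal per obstacle: the tuple is the name, so the names differ
    ;; and neither goal overrides the other. (I'm assuming the parameters
    ;; carry the obstacle's dimensions and velocity.)
    (define g1 (make-goal '(avoid 10.0 4.0 0.0) 30 '(2.0 1.5 0.3)))
    (define g2 (make-goal '(avoid -3.0 7.0 1.0) 30 '(0.5 0.5 0.0)))

    (define (same-kind? a b) (equal? (goal-name a) (goal-name b)))
    (display (same-kind? g1 g2)) (newline) ; => #f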
It should also be possible for goals to imply each other, through a set of rules. Imagine a robot that can try to hide, by having a "stealth level" goal that disables certain noisy features, such as drive motors. Say it has a priority 10 stealth goal to not use its drive motors, because it's hiding. But if an imminent threat to the robot's survival appears, such as an incoming attack, then some system will presumably raise a high-priority goal to get away. But that's a different kind of goal (with a different name) to the "stealth level" goal, so it won't override it, even though it's at a higher priority. Now, we could make the state machine that controls the motors, which presumably listens to both the stealth level goals and the motion goals, resolve this manually; but that would complicate the logic of its transition guards, and would leave us with a robot moving around while professing a goal of not using its motors, which might cause bugs in other parts of the system that aren't expecting that confusing state of affairs.
It would be simpler if we could put in a rule that if "drive motors (at priority X)" is a winning goal, then "set the stealth level such that we can use the motors (at priority X)" is raised as a goal. That way, a high-priority move goal will automatically and atomically cause a matching stealth-level goal to be raised, allowing motion.
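A rule like that could be a function from a goal to the goals it implies; here's a sketch (which, for brevity, derives from every active goal, where the real system would presumably key off winning goals only):

    (import (scheme base) (scheme write))

    (define (make-goal name priority params) (list name priority params))
    (define (goal-name g)     (car g))
    (define (goal-priority g) (cadr g))

    ;; A drive-motors goal implies a stealth-level goal, at the same
    ;; priority, permitting motor noise.
    (define (motor-implies-stealth g)
      (if (eq? (goal-name g) 'drive-motors)
          (list (make-goal 'stealth-level (goal-priority g) '(allow-motors)))
          '()))

    (define (apply-rules rules goals)
      ;; one derivation pass: the original goals plus everything they imply
      (append goals
              (apply append
                     (map (lambda (g)
                            (apply append (map (lambda (r) (r g)) rules)))
                          goals))))

    (define goals (list (make-goal 'stealth-level 10 '(silent))
                        (make-goal 'drive-motors 50 '(full-speed))))
    (for-each (lambda (g) (display g) (newline))
              (apply-rules (list motor-implies-stealth) goals))
    ;; The derived (stealth-level 50 (allow-motors)) now outranks the
    ;; priority-10 (stealth-level 10 (silent)) goal.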
So a zmiku installation would involve starting up the zmiku daemon, giving it a configuration file listing a number of concurrent state machines to run, along with named state machine templates to dynamically fork when requested, with much the expressiveness and feature set of UML state charts. But, being a Schemer, I'd rather handle state variables within state machines in a more functional manner than UML prefers: by parameterising states and providing expressions for the values of those parameters when a transition moves to the target state, rather than assigning to state variables. And all the extensions above to support goals would, of course, be grafted on. Input and output can be configured by providing arbitrary code to convert output events into actions and to drive things from the state of output goals, and external systems can connect to the zmiku daemon to feed in input events and to raise and relinquish goals. And lo, complex systems can be controlled with a clear and concise domain-specific language.
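To give the flavour of what such a configuration might look like - and I must stress this is entirely invented surface syntax, not a design I've committed to - here's a sketch of a parameterised navigator machine:

    ;; Hypothetical zmiku configuration sketch; every form here is made up.
    (define-machine navigator
      (state idle
        (on (goal-raised destination lat lon)  ; goal pattern drives transition
            (goto (flying lat lon))))
      (state (flying lat lon)                  ; parameterised state, no assignment
        (goals (course 10 (heading-to lat lon)))
        (on (goal-dropped destination)
            (goto idle))))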
But there's another direction I'd like to extend zmiku in: ideally, one would be able to run a cluster of zmiku daemons, all implementing the same state machines. Inputs could be fed to any daemon in the cluster, output actions would be executed on any daemon listed as capable of running that action, and the addition or removal of daemons on a running cluster would be handled elegantly, allowing for highly available, fault-tolerant operation.
There are well-known algorithms for doing this, based on distributed consensus, which rely on being able to reach a quorum of daemons in order to ensure that a single global consensus on the current state of the system holds. This means that if the network is split in two, at most one of the surviving groups of daemons can still operate; the other (or both, if neither is quorate) will "freeze". That's not very good if a communications failure can make half of the systems in your huge industrial process stop running, even when it's quite possible to run them (in a degraded fashion, perhaps) without complete information on the state of the entire system.
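The arithmetic behind that freeze is just majority quorum: a group may only act if it can see more than half of the configured daemons, so two disjoint groups can never both act.

    (import (scheme base) (scheme write))

    ;; Majority quorum: strictly more than half the cluster.
    (define (quorate? group-size cluster-size)
      (> (* 2 group-size) cluster-size))

    ;; A 5-daemon cluster split 3/2: only the larger side keeps running.
    (display (quorate? 3 5)) (newline) ; => #t
    (display (quorate? 2 5)) (newline) ; => #f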
I'd like to fix that, but it's a bit of an open research question. Part of the issue would be how to simply and reliably merge diverged states when systems re-connect after a partition. Having goals would possibly help somewhat with this: implementing "on" and "off" actions as goals, rather than as paired events, makes that state explicit, which would simplify merging two diverged instances of a state machine into a single final state.
Another issue would be how to express uncertainty. If the state machine raising a current goal might be influenced by events occurring on the other side of a network partition (which isolates us from the sensors driving those transitions), then that goal's validity is suspect. How do we take account of that?