State Machines


Finite state machines, or FSMs, are a fantastic way to describe a system's inner workings. Interestingly, while a system's design phase often produces a flow chart illustrating its logic and the transitions between its steps, the implementation usually throws that flow away. An FSM provides some continuity between the design and implementation stages, and has a number of other benefits as well.

This post reviews how the FSM concept is used within Baremetal's platform code.

The basics

This gist contains an FSM implementation for Python that was extracted from the django-fsm app. That code is the basis for this post, but the concept applies to any language.

Let's take a look at a simple state machine for a small program that collects log information for a Linux Container:

[Figure: Basic State Machine]

The disconnected state's job is to connect to the container, and the connected state's job is to process any output the container may generate. In code, this machine may look something like this (to keep things clear, the code below omits error checking, throttling, etc.):

class Container(object):
    DISCONNECTED = 'disconnected'
    CONNECTED = 'connected'

    state_machine = State(DISCONNECTED)

    def __init__(self, container_id):
        self.state = self.DISCONNECTED
        self.container_id = container_id

    [...]

    # Runs only in DISCONNECTED; on success the machine moves to CONNECTED.
    @transition(state_machine, source=DISCONNECTED, target=CONNECTED)
    def disconnected(self):
        self.container_logs = attach_to_container(self.container_id)

    # Runs only in CONNECTED and stays there, so it can be called repeatedly.
    @transition(state_machine, source=CONNECTED, target=CONNECTED)
    def connected(self):
        for log in self.container_logs.get_logs():
            self.do_something(log)

Built-in guarantees

The state machine enforces that certain actions can only occur under the right conditions. Being in a particular state implies the conditions are met, which means cleaner code with less boilerplate. The @transition decorator enforces that a method can only run when the machine is in the correct state.

In the Container class, connected() will not run unless the state is CONNECTED; it raises a TransitionNotAllowed exception instead.
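
To make the guarantee concrete, here is a minimal sketch of how a guard like this could work. It is not the gist's implementation: this simplified transition() takes only source and target (no State object), and the method bodies are placeholders.

```python
import functools


class TransitionNotAllowed(Exception):
    """Raised when a method is called from the wrong state."""


def transition(source, target):
    """Hypothetical, simplified decorator: guard on source, then move to target."""
    def decorator(method):
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            if self.state != source:
                raise TransitionNotAllowed(
                    "%s() requires state %r, but the state is %r"
                    % (method.__name__, source, self.state))
            result = method(self, *args, **kwargs)
            # Only reached when the method returned without raising.
            self.state = target
            return result
        return wrapper
    return decorator


class Container(object):
    DISCONNECTED = 'disconnected'
    CONNECTED = 'connected'

    def __init__(self):
        self.state = self.DISCONNECTED

    @transition(source=DISCONNECTED, target=CONNECTED)
    def disconnected(self):
        pass  # placeholder: the real method attaches to the container

    @transition(source=CONNECTED, target=CONNECTED)
    def connected(self):
        pass  # placeholder: the real method processes log output
```

Calling connected() on a fresh Container raises TransitionNotAllowed and leaves the state untouched; calling disconnected() first moves the machine to CONNECTED, after which connected() is allowed.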

Structure

The state machine also provides a nice framework for defining methods that each perform one specific task. Adding functionality is a matter of adding states to the machine and wiring the transitions appropriately. For example, let's check whether the connection is still alive before attempting to read output (only the modified code is shown):

class Container(object):
    PROCESS_LOGS = 'process_logs'
    [...]

    @transition(state_machine, source=CONNECTED, target=PROCESS_LOGS)
    def connected(self):
        if not self.container_logs.is_connected:
            raise Disconnected

    @transition(state_machine, source=PROCESS_LOGS, target=CONNECTED)
    def process_logs(self):
        for log in self.container_logs.get_logs():
            self.do_something(log)

Notice process_logs() transitions back to CONNECTED, creating a loop between these two methods.
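
One way to see the loop is to write the wiring down as plain data and walk it. The sketch below is only an illustration: the TRANSITIONS dict and the trace() helper are hypothetical and not part of the gist, and they cover only the success path of each method.

```python
# Success-path wiring of the three states, written as a lookup table.
TRANSITIONS = {
    'disconnected': 'connected',
    'connected': 'process_logs',
    'process_logs': 'connected',
}


def trace(start, steps):
    """Return the list of states visited, starting at start, after steps transitions."""
    path = [start]
    for _ in range(steps):
        path.append(TRANSITIONS[path[-1]])
    return path


print(trace('disconnected', 4))
```

After the first transition the trace alternates between connected and process_logs, which is the loop described above.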

When something goes wrong

The @transition decorator handles moving between states when the function call succeeds. The same concept can be used to move to another state when an exception occurs. The connected() method raises an exception if the connection to the container is severed, at which point the state machine should return to the DISCONNECTED state. The @exception_transition decorator takes care of this. In this example reraise is set to False so that the exception is not passed up the stack:

    @exception_transition((Disconnected,), target=DISCONNECTED, reraise=False)
    @transition(state_machine, source=CONNECTED, target=PROCESS_LOGS)
    def connected(self):
        if not self.container_logs.is_connected:
            raise Disconnected
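
As a rough sketch of the idea, an exception-driven transition can be a thin try/except wrapper. This is a simplified stand-in rather than the gist's code: the default value of reraise is an assumption, and the Container here sets its state by hand instead of stacking @transition.

```python
import functools


class Disconnected(Exception):
    """Raised when the connection to the container is lost."""


def exception_transition(exceptions, target, reraise=True):
    """Hypothetical sketch: on one of exceptions, move to target."""
    def decorator(method):
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            try:
                return method(self, *args, **kwargs)
            except exceptions:
                self.state = target
                if reraise:
                    raise  # pass the original exception up the stack
        return wrapper
    return decorator


class Container(object):
    def __init__(self, is_connected):
        self.state = 'connected'
        self.is_connected = is_connected

    @exception_transition((Disconnected,), target='disconnected', reraise=False)
    def connected(self):
        if not self.is_connected:
            raise Disconnected
        self.state = 'process_logs'  # stands in for the @transition decorator
```

With reraise=False the caller never sees Disconnected; it only observes that the machine is back in the disconnected state and can try to reconnect.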

Perpetual motion

You've likely noticed that the code follows the convention of giving each method the same name as the state it handles. This convention makes it easy for a loop method to look up and call the appropriate method for the current state. The fsm.StateMachine.loop() function does the heavy lifting:

from fsm import StateMachine

class Container(StateMachine):
    [...]

    def __init__(self, container_id):
        super(Container, self).__init__(self.DISCONNECTED)
        [...]

    def run(self):
        while True:
            self.loop(target_state=self.CONNECTED)
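
The gist's loop() is not reproduced in this post, so the following is only one plausible reading of it: look up the method named after the current state, call it, and stop once target_state is reached. The stripped-down StateMachine and the hand-written state changes below are assumptions made for illustration.

```python
class StateMachine(object):
    """Hypothetical minimal base class; the real one lives in the gist."""

    def __init__(self, state):
        self.state = state

    def loop(self, target_state):
        # Always take at least one step, then stop on reaching target_state.
        while True:
            handler = getattr(self, self.state)  # method named after the state
            handler()
            if self.state == target_state:
                return


class Container(StateMachine):
    def __init__(self):
        super(Container, self).__init__('disconnected')
        self.calls = []  # records which handlers ran, for demonstration

    def disconnected(self):
        self.calls.append('disconnected')
        self.state = 'connected'

    def connected(self):
        self.calls.append('connected')
        self.state = 'process_logs'

    def process_logs(self):
        self.calls.append('process_logs')
        self.state = 'connected'
```

Under this reading, the first loop(target_state='connected') call runs disconnected() once, and each later call runs connected() followed by process_logs(), which is why run() can simply call loop() forever.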

Wrapping things up

The final state machine looks like this:

[Figure: Final State Machine]

And the complete code:

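# Assumes the gist is saved as fsm.py and exports all four names.
# attach_to_container() and do_something() are not shown in this post.
from fsm import State, StateMachine, exception_transition, transition


class Disconnected(Exception):
    """Raised when the connection to the container is lost."""
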

class Container(StateMachine):
    DISCONNECTED = 'disconnected'
    CONNECTED = 'connected'
    PROCESS_LOGS = 'process_logs'

    state_machine = State(DISCONNECTED)

    def __init__(self, container_id):
        super(Container, self).__init__(state=self.DISCONNECTED)

        self.container_id = container_id

    @transition(state_machine, source=DISCONNECTED, target=CONNECTED)
    def disconnected(self):
        self.container_logs = attach_to_container(self.container_id)
    
    @exception_transition((Disconnected,), target=DISCONNECTED, reraise=False)
    @transition(state_machine, source=CONNECTED, target=PROCESS_LOGS)
    def connected(self):
        if not self.container_logs.is_connected:
            raise Disconnected

    @transition(state_machine, source=PROCESS_LOGS, target=CONNECTED)
    def process_logs(self):
        for log in self.container_logs.get_logs():
            self.do_something(log)

    def run(self):
        while True:
            self.loop(target_state=self.CONNECTED)

More on state machines

I may not be looking hard enough, but good posts on state machines are few and far between. Shopify is on the bandwagon and the Lamson Project describes how they are used within Lamson.