Concepts of state machines and tutorials on using state machines in Python
What is a state machine?
An overly precise description of a state machine is that it is a directed graph consisting of a set of nodes and a corresponding set of transition functions. The machine "runs" by responding to a series of events. Each event lies in the domain of the transition function belonging to the "current" node, and the range of that function is a subset of the nodes. The function returns the "next" (possibly the same) node. At least one of the nodes must be an end state. When an end state is reached, the machine stops.
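The definition above can be made concrete with a small sketch. The turnstile states, the event names, and the run() helper are all hypothetical and chosen only for illustration: each node has a transition function that takes an event and returns the name of the next node, and the machine stops when it reaches an end state.

```python
# Each node has a transition function: it receives one event and
# returns the name of the next node (possibly the same one).
def locked(event):
    return "UNLOCKED" if event == "coin" else "LOCKED"

def unlocked(event):
    if event == "push":
        return "LOCKED"
    if event == "off":
        return "DONE"          # DONE is the end state
    return "UNLOCKED"

TRANSITIONS = {"LOCKED": locked, "UNLOCKED": unlocked}
END_STATES = {"DONE"}

def run(events, start="LOCKED"):
    """Feed events to the current node's function until an end state is reached."""
    node = start
    visited = [node]
    for event in events:
        node = TRANSITIONS[node](event)
        visited.append(node)
        if node in END_STATES:
            break              # events after the end state are never consumed
    return visited

path = run(["push", "coin", "coin", "push", "coin", "off", "coin"])
print(" -> ".join(path))
```

Note that the last "coin" event is never processed, because the machine has already stopped.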
However, an abstract mathematical description (like the one I just gave) does not really show in what circumstances a state machine can solve a practical programming problem. Another approach is to define a state machine as any program in an imperative programming language, where the nodes are the source code lines. From a practical point of view this definition, although accurate, is just as useless as the first one. (It does not necessarily hold for declarative, functional, or constraint-based languages such as Haskell, Scheme, or Prolog.)
Let's try a discussion better suited to actual tasks. Logically, every regular expression is equivalent to a state machine, and the parser for every regular expression implements that state machine. In fact, most programmers write state machines without really thinking about it.
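To illustrate the equivalence, here is a minimal sketch of a hand-written state machine for the pattern ab*c, compared against the standard re module. The state names and the table layout are assumptions made for this example, and the sketch makes no claim about how re works internally. It only shows that the two accept the same strings.

```python
import re

# Transition table for the pattern ab*c.
# A missing (state, character) entry means the input is rejected.
DFA = {
    ("START", "a"): "SEEN_A",
    ("SEEN_A", "b"): "SEEN_A",
    ("SEEN_A", "c"): "ACCEPT",
}

def dfa_match(text):
    """Return True if the whole of text matches ab*c."""
    state = "START"
    for ch in text:
        state = DFA.get((state, ch))
        if state is None:
            return False
    return state == "ACCEPT"

samples = ["ac", "abbbc", "abca", "bc", "ab", ""]
# Each entry: (sample, verdict of the hand-written machine, verdict of re)
results = [(s, dfa_match(s), re.fullmatch(r"ab*c", s) is not None)
           for s in samples]
for sample, by_dfa, by_re in results:
    print(repr(sample), by_dfa, by_re)
```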
In what follows, we will work toward a practical, heuristic definition of a state machine. Generally, we have different ways of responding to a finite set of events. In some cases, the response depends only on the event itself. In other cases, the appropriate action depends on previous events.
The state machines discussed in this article are high-level machines, intended to demonstrate a programming solution to a class of problems. If it makes sense to discuss your programming problem in terms of categories of behavior in response to events, your solution is likely to be an explicit state machine.
Text processing state machine
The programming problem most likely to call for an explicit state machine is processing text files. Processing a text file usually means reading a unit of information (typically a character or a line) and then performing the appropriate operation on the unit just read. In some cases this processing is "stateless" (that is, each unit contains enough information to determine exactly what to do with it). In other cases, even if the text file is not completely stateless, the data has only limited context (for example, the operation depends on little more than the line number). In many other common text processing problems, however, the input file is highly "stateful." The meaning of each piece of data depends on the strings that precede it (and perhaps on those that follow it). Reports, mainframe data feeds, human-readable text, programming source files, and other kinds of text files are stateful. A simple example is a line of code that might appear in a Python source file:
myObject = SomeClass(this, that, other)
This line means something different if it happens to be surrounded by the following lines:
"""How to use SomeClass:
myObject = SomeClass(this, that, other)
"""
We need to know that we are in a "block quote" state to determine that this line of code is part of a comment rather than a Python operation.
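The idea of a "block quote" state can be sketched with a single flag. This is a deliberately simplified illustration, not a real Python tokenizer: the label_lines() helper is hypothetical, it recognizes only triple double-quotes, and it assumes every occurrence of the delimiter toggles the state.

```python
def label_lines(source):
    """Label each line 'code' or 'string' using one piece of state.

    Simplifying assumption: only triple double-quotes are recognized,
    and each occurrence of the delimiter toggles the state.
    """
    in_block = False
    labels = []
    for line in source.splitlines():
        count = line.count('"""')
        # A line belongs to the string if we are already inside one
        # or if the line itself contains a delimiter.
        labels.append("string" if in_block or count else "code")
        if count % 2 == 1:
            in_block = not in_block
    return labels

sample = '''myObject = SomeClass(this, that, other)
"""How to use SomeClass:
myObject = SomeClass(this, that, other)
"""
'''
labels = label_lines(sample)
for label, line in zip(labels, sample.splitlines()):
    print("%-6s | %s" % (label, line))
```

The first and third lines of the sample are identical text, yet they receive different labels because of the state carried over from earlier lines.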
When not to use a state machine
When you start writing a processor for any stateful text file, ask yourself what types of input items you expect to find in the file. Each type of input item is a candidate state. There should be only a few such types. If the number is large or indeterminate, a state machine is probably not the right solution. (In that case, some kind of database solution may be more suitable.)
Also consider whether you need a state machine at all. In many cases it is best to start with a simpler method. You may find that even though a text file is stateful, there is a simple way to read it in blocks (where each block holds one type of input value). In fact, a state machine is worth implementing only when the transitions between text types require some computation based on the content within a single state block.
The following simple example shows when a state machine is called for. Consider two rules for dividing a list of numbers into blocks. Under the first rule, a zero in the list marks a break between blocks. Under the second rule, a break occurs when the running total of the elements in a block exceeds 100. Because the second rule uses an accumulator variable to decide whether the threshold has been reached, you cannot see the boundaries of the sublists "at a glance." The second rule is therefore a better fit for a mechanism resembling a state machine.
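The two rules can be sketched as follows. The function names and the sample data are made up for this illustration, and the second function assumes that the element which pushes the total past the limit stays in the block it closes. The first rule needs no memory beyond the block being built, while the second must carry a running total from item to item.

```python
def split_on_zero(numbers):
    """Rule 1: a zero marks a break; each item alone tells us what to do."""
    blocks, current = [], []
    for n in numbers:
        if n == 0:
            blocks.append(current)
            current = []
        else:
            current.append(n)
    blocks.append(current)
    return blocks

def split_on_total(numbers, limit=100):
    """Rule 2: break once the running total of the block exceeds limit."""
    blocks, current, total = [], [], 0
    for n in numbers:
        current.append(n)
        total += n                     # state carried between items
        if total > limit:
            blocks.append(current)
            current, total = [], 0
    if current:                        # keep any unfinished final block
        blocks.append(current)
    return blocks

data = [40, 50, 0, 30, 80, 0, 10]
by_zero = split_on_zero(data)
by_total = split_on_total(data)
print(by_zero)
print(by_total)
```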
A Windows-style .ini file is an example of a text file that is somewhat stateful but not well suited to processing with a state machine. This type of file contains section headers, comments, and a number of assignments. For example:
; set the colorscheme and userlevel
[colorscheme]
background=red
foreground=blue
title=green

[userlevel]
login=2
title=1
Our example has no real meaning, but it shows some interesting features of the .ini format.
- In one sense, the type of each line is determined by its first character (a semicolon, a left square bracket, or a letter).
- In another sense, the format is "stateful," because the keyword "title" has an independent meaning in each section in which it appears.
You could write a text processor with a COLORSCHEME state and a USERLEVEL state that still processes the assignments within each state. However, that does not seem to be the right way to solve this problem. For example, you can use Python code like the following to pick out just the natural blocks in this text file:
Python code for processing the .ini file
txt = open('hypothetical.ini').read()
sects = txt.split('[')
for sect in sects:
    # do something with sect, like get its name
    # (the stuff up to ']') and read its assignments
    pass
Or, if you prefer, you can use a single current_section variable to keep track of where you are:
Python code for processing the .ini file
for line in open('hypothetical.ini').readlines():
    if line[0] == '[':
        # drop the leading '[' and the trailing ']' plus newline
        current_section = line[1:-2]
    elif line[0] == ';':
        pass    # ignore comments
    else:
        apply_value(current_section, line)
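For readers who want something they can run directly, the current_section approach can be sketched in a self-contained form. The parse_ini() helper and the nested dictionary that stands in for apply_value() are assumptions made for this illustration. The sketch also assumes that every assignment appears after a section header.

```python
INI_TEXT = """\
; set the colorscheme and userlevel
[colorscheme]
background=red
foreground=blue
title=green

[userlevel]
login=2
title=1
"""

def parse_ini(text):
    """Collect assignments into {section: {key: value}}."""
    settings = {}
    current_section = None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";"):
            continue                       # skip blank lines and comments
        if line.startswith("["):
            current_section = line[1:-1]   # text between the brackets
            settings[current_section] = {}
        else:
            key, value = line.split("=", 1)
            settings[current_section][key] = value
    return settings

settings = parse_ini(INI_TEXT)
print(settings["colorscheme"]["title"], settings["userlevel"]["title"])
```

The key "title" keeps a separate value in each section, which is all the "state" this format requires.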
When to use a state machine
Now that we have decided not to use a state machine when the text file is "too simple," let's look at a case where a state machine is needed. A recent article in this column discussed the utility Txt2Html, which converts "smart ASCII" (including this article) to HTML. Let's recap.
"Smart ASCII" is a text format that uses a few spacing conventions to distinguish types of text blocks, such as headers, regular text, quotations, and code samples. Although a reader or author can easily see and analyze the transitions between these block types, there is no simple way for a computer to split a "smart ASCII" file into its text blocks. Unlike the .ini file example, the block types can appear in any order. There is no single delimiter that separates blocks in all cases (a blank line usually separates text blocks, but a blank line inside a code sample does not necessarily end the code sample, and text blocks need not be separated by blank lines). Since each text block must be reformatted differently to produce correct HTML output, a state machine seems a natural solution.
The general behavior of the Txt2Html reader is as follows:
- Start in the initial state.
- Read a line of input.
- Depending on the input and the current state, either transition to a new state or process the line in the manner appropriate to the current state.
This example is about the simplest case you will encounter, but it illustrates the pattern we have described:
A simple state machine input loop in Python
global state, blocks, bl_num, newblock

#-- Initialize the globals
state = "HEADER"
blocks = [""]
bl_num = 0
newblock = 1

for line in fhin.readlines():
    if state == "HEADER":
        # blank line means new block of header
        if blankln.match(line): newblock = 1
        elif textln.match(line): startText(line)
        elif codeln.match(line): startCode(line)
        else:
            if newblock: startHead(line)
            else: blocks[bl_num] = blocks[bl_num] + line
    elif state == "TEXT":
        # blank line means new block of text
        if blankln.match(line): newblock = 1
        elif headln.match(line): startHead(line)
        elif codeln.match(line): startCode(line)
        else:
            if newblock: startText(line)
            else: blocks[bl_num] = blocks[bl_num] + line
    elif state == "CODE":
        # blank line does not change state
        if blankln.match(line): blocks[bl_num] = blocks[bl_num] + line
        elif headln.match(line): startHead(line)
        elif textln.match(line): startText(line)
        else: blocks[bl_num] = blocks[bl_num] + line
    else:
        raise ValueError("unexpected input block state: " + state)
The source file from which this code is taken can be downloaded with Txt2Html (see references). Note that the variable state is declared global and that its value is changed in functions such as startText(). The transition conditions, such as textln.match(), are regular expression patterns, but they could just as well be custom functions. The actual formatting is done later in the program. The state machine only parses the text file into labeled blocks in the blocks list.
Abstract state machine class
It is easy to use Python to implement an abstract state machine in both form and function. Doing so makes the state machine model of the program stand out more than the simple conditional block in the previous example (where, at first glance, the conditions look no different from any others). In addition, the following class and its associated handlers do a good job of isolating the behavior that belongs to each state. In many cases, this improves encapsulation and readability.
File: statemachine.py
class InitializationError(Exception):
    """Raised when the machine is run before it is fully set up."""

class StateMachine:
    def __init__(self):
        self.handlers = {}
        self.startState = None
        self.endStates = []

    def add_state(self, name, handler, end_state=0):
        name = name.upper()
        self.handlers[name] = handler
        if end_state:
            self.endStates.append(name)

    def set_start(self, name):
        self.startState = name.upper()

    def run(self, cargo):
        try:
            handler = self.handlers[self.startState]
        except KeyError:
            raise InitializationError("must call .set_start() before .run()")
        if not self.endStates:
            raise InitializationError("at least one state must be an end_state")
        while 1:
            # each handler returns the next state's name and the cargo to pass on
            (newState, cargo) = handler(cargo)
            if newState.upper() in self.endStates:
                break
            else:
                handler = self.handlers[newState.upper()]
The StateMachine class is really all that an abstract state machine needs. Because passing function objects is so simple in Python, this class takes very few lines compared with similar classes in other languages.
To actually use the StateMachine class, you need to create a handler for each state you want to use. A handler must follow a pattern: it processes events in a loop until it is time to transition to another state. At that point, the handler should return a tuple consisting of the new state's name and any cargo the new state's handler needs.
Using cargo as a variable in the StateMachine class encapsulates the data a state handler needs (the handler does not have to name its argument cargo). A state handler uses cargo to pass along whatever the next handler requires, so the new handler can take over where the previous one left off. Cargo typically includes a file handle, which lets the next handler read more data from the point where the previous handler stopped. Cargo might also be a database connection, a complex class instance, or a list of several items.
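As an illustration of cargo that carries a file handle, here is a minimal sketch. The handler names, the HEADER/BODY/DONE states, and the sample input are hypothetical, and the small run() function is only a condensed stand-in for the dispatch loop of StateMachine.run(), included so that the block runs on its own. The cargo is a tuple of a file-like object and a list of results.

```python
import io

def read_header(cargo):
    """Consume lines up to the first blank line, then hand off to BODY."""
    fh, seen = cargo
    for line in fh:
        if not line.strip():
            return ("BODY", (fh, seen))
        seen.append(("header", line.strip()))
    return ("DONE", (fh, seen))

def read_body(cargo):
    """Continue reading from the same handle where read_header stopped."""
    fh, seen = cargo
    for line in fh:
        seen.append(("body", line.strip()))
    return ("DONE", (fh, seen))

HANDLERS = {"HEADER": read_header, "BODY": read_body}

def run(start, cargo, end_states=("DONE",)):
    # Condensed stand-in for the dispatch loop in StateMachine.run()
    state = start
    while state not in end_states:
        state, cargo = HANDLERS[state](cargo)
    return cargo

fh = io.StringIO("Subject: hi\nFrom: me\n\nfirst line\nsecond line\n")
_, seen = run("HEADER", (fh, []))
for kind, text in seen:
    print(kind, text)
```

Because the handle travels inside the cargo, the body handler picks up at exactly the line where the header handler stopped.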
Now let's look at a test sample. In this example (outlined in the code that follows), cargo is simply a number that is repeatedly fed back into an iterative function. As long as val stays within a certain range, the next value of val is always math_func(val). Once the function returns a value outside that range, the value is handed to a different handler, or the state machine exits on reaching an end state that does no work. The example illustrates one point: events do not have to be input events. They can also be computational events (although this is rare). The state handlers differ only in the marker they use when printing the events they process. The function is simple enough that a state machine is not really needed, but it illustrates the concept well. The code may be easier to understand than the explanation!
File: statemachine_test.py
from math import sin
from statemachine import StateMachine

def ones_counter(val):
    print("ONES State: ", end=" ")
    while 1:
        if val <= 0 or val >= 30:
            newState = "Out_of_Range"; break
        elif 20 <= val < 30:
            newState = "TWENTIES"; break
        elif 10 <= val < 20:
            newState = "TENS"; break
        else:
            print(" @ %2.1f+" % val, end=" ")
        val = math_func(val)
    print(" >>")
    return (newState, val)

def tens_counter(val):
    print("TENS State: ", end=" ")
    while 1:
        if val <= 0 or val >= 30:
            newState = "Out_of_Range"; break
        elif 1 <= val < 10:
            newState = "ONES"; break
        elif 20 <= val < 30:
            newState = "TWENTIES"; break
        else:
            print(" #%2.1f+" % val, end=" ")
        val = math_func(val)
    print(" >>")
    return (newState, val)

def twenties_counter(val):
    print("TWENTIES State:", end=" ")
    while 1:
        if val <= 0 or val >= 30:
            newState = "Out_of_Range"; break
        elif 1 <= val < 10:
            newState = "ONES"; break
        elif 10 <= val < 20:
            newState = "TENS"; break
        else:
            print(" *%2.1f+" % val, end=" ")
        val = math_func(val)
    print(" >>")
    return (newState, val)

def math_func(n):
    # feed the value back through a bounded but irregular function
    return abs(sin(n)) * 31

if __name__ == "__main__":
    m = StateMachine()
    m.add_state("ONES", ones_counter)
    m.add_state("TENS", tens_counter)
    m.add_state("TWENTIES", twenties_counter)
    m.add_state("OUT_OF_RANGE", None, end_state=1)
    m.set_start("ONES")
    m.run(1)