Ragel is a finite state machine compiler: it compiles a state machine described with regular expressions into a parser written in a conventional host language (C, C++, D, Java, Ruby, etc.).
Writing all kinds of FSMs with Ragel is simple and convenient, and it is often used to build syntax recognizers.
Ragel State Machine Compiler
A C-language implementation example:
#include <stdio.h>
#include <string.h>

%%{
    machine foo;    # FSM name

    # define the actions
    action res_true  { res = 1; }
    action res_false { res = 0; }
    action res_err   { res = -1; }

    # FSM entry point
    main := ( 'true' 0 @res_true | 'false' 0 @res_false | any @res_err );

    write data;     # write the FSM data
}%%

int getres(char *pbuf)
{
    int res;
    char *p = pbuf;                   // p points to the start of the data the FSM must process
    char *pe = p + strlen(pbuf) + 1;  // pe points just past the terminating '\0'
    int cs;                           // cs holds the current state of the running FSM

    // write the initialization code
    %% write init;
    // write the execution code
    %% write exec;

    return res;
}

int main()
{
    char buf[1024];  // buffer size is an assumption; the original value was lost

    // stop at end of input instead of looping forever on EOF
    while (scanf("%1023s", buf) == 1) {
        printf("res=%d\n", getres(buf));
    }
    return 0;
}
Compile
The code above cannot be compiled directly with GCC. It must first be translated into C code with Ragel and then compiled into an executable with GCC.
ragel -o main.c main.rl
gcc -o test main.c
The example above converts the strings "true" and "false" to the C values 1 and 0; if the input is neither "true" nor "false", the result is -1.
Execution results
Input
true false TrueFalse
Output
res=1
res=0
res=-1
Basic syntax
A multi-line FSM definition starts with %%{ and ends with }%%. A single-line FSM definition starts with %% at the beginning of a line.
machine foo; sets the name of the state machine.
action defines an action, which holds the code to execute when the match reaches that point.
The code above has 3 actions, res_true, res_false and res_err, which produce the result.
main := regular expression; defines the FSM entry point; matching starts here.
In the code above, ('true' 0 @res_true | 'false' 0 @res_false | any @res_err) means:
(on a successful match of "true\0", execute the action res_true) or (on a successful match of "false\0", execute the action res_false) or (on a successful match of any single character, execute the action res_err)
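To make the alternation more concrete, here is a minimal hand-written sketch in plain C of how that expression behaves on a NUL-terminated string. It is not the code Ragel generates; the function name getres_sketch and the flags in_true / in_false are hypothetical, and the sketch assumes the any alternative fires res_err on the first character before a completed "true\0" or "false\0" overwrites the result.

```c
#include <string.h>

/* Hand-written illustration of the alternation; NOT Ragel's output. */
int getres_sketch(const char *s)
{
    static const char t[] = "true";   /* sizeof counts the trailing '\0' */
    static const char f[] = "false";
    size_t n = strlen(s) + 1;         /* scan the terminator too, like pe */
    int res = -1;                     /* `any @res_err` fires on the first byte */
    int in_true = 1, in_false = 1;    /* alternatives that are still alive */

    for (size_t i = 0; i < n; i++) {
        in_true  = in_true  && i < sizeof t && s[i] == t[i];
        in_false = in_false && i < sizeof f && s[i] == f[i];
        if (!in_true && !in_false)
            break;                    /* no transition left: the machine stops */
        if (in_true && i == sizeof t - 1)
            res = 1;                  /* saw '\0' right after "true" */
        if (in_false && i == sizeof f - 1)
            res = 0;                  /* saw '\0' right after "false" */
    }
    return res;
}
```

Calling getres_sketch("true") yields 1, getres_sketch("false") yields 0, and any other input, including a prefix such as "tru" or a longer word such as "truex", yields -1.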
any is a Ragel built-in machine. The built-in machines are:
keyword | description
any | Any character.
ascii | ASCII characters. 0..127
extend | ASCII extended characters. Signed -128..127 or unsigned 0..255
alpha | Letters. [A-Za-z]
digit | Digits. [0-9]
alnum | Letters and digits. [A-Za-z0-9]
lower | Lowercase letters. [a-z]
upper | Uppercase letters. [A-Z]
xdigit | Hexadecimal digits. [0-9A-Fa-f]
cntrl | Control characters. 0..31
graph | Graphical (visible) characters. [!-~]
print | Printable characters. [ -~]
punct | Non-alphanumeric graphical characters. [!-/:-@[-`{-~]
space | Whitespace characters. [ \t\v\f\n\r]
zlen | Zero-length (empty) string. ""
empty | Empty set; matches nothing. ^any
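Several of these classes look like the character tests in C's <ctype.h>. As a small sanity check of one range from the table, the sketch below compares the punct range with ispunct() over the ASCII codes. It assumes an ASCII platform and the default "C" locale, and the helper names in_punct_range and punct_mismatches are hypothetical.

```c
#include <ctype.h>

/* Returns 1 if c lies in the range the table gives for punct. */
int in_punct_range(int c)
{
    return (c >= '!' && c <= '/') || (c >= ':' && c <= '@') ||
           (c >= '[' && c <= '`') || (c >= '{' && c <= '~');
}

/* Counts ASCII codes on which the table's range and ispunct() disagree. */
int punct_mismatches(void)
{
    int bad = 0;
    for (int c = 0; c < 128; c++)
        if (in_punct_range(c) != (ispunct(c) != 0))
            bad++;
    return bad;
}
```

Under those assumptions punct_mismatches() should report no disagreement. The same kind of loop could be used to compare other rows of the table with their <ctype.h> counterparts.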
%% write data; writes the static data the FSM needs at run time. It can be placed anywhere, but it must come before %% write exec.
%% write init; writes the FSM initialization code. It is placed inside a function, and int cs must be defined before it.
%% write exec; writes the FSM execution code. It is placed inside a function, and char *p and char *pe must be defined before it.
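The roles of the three write statements, and of cs, p and pe, can be pictured with a tiny hand-written state machine in plain C. This is only an illustrative sketch, not what Ragel emits: the demo_ names are hypothetical, and the machine (one or more decimal digits) was chosen just to keep the loop short.

```c
#include <string.h>

/* Counterpart of "write data": constants that describe the machine. */
enum { demo_error = 0, demo_start = 1, demo_first_final = 2 };

/* Returns 1 if buf consists of one or more decimal digits, else 0. */
int demo_run(const char *buf)
{
    const char *p  = buf;                /* current input position */
    const char *pe = buf + strlen(buf);  /* one past the last input byte */
    int cs;                              /* current state */

    cs = demo_start;                     /* counterpart of "write init" */

    /* Counterpart of "write exec": consume input until pe or an error. */
    for (; p != pe && cs != demo_error; p++) {
        if (*p >= '0' && *p <= '9')
            cs = demo_first_final;       /* a digit keeps the machine accepting */
        else
            cs = demo_error;             /* anything else is a dead end */
    }
    return cs >= demo_first_final;       /* did we stop in a final state? */
}
```

The sketch shows why the declarations must precede the generated code: the initialization assigns cs, and the execution loop reads p and pe.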
Conclusion
Ragel makes it easy to replace the usual if / else if chains, and the resulting code is efficient, but it can make the generated source code and the program much larger (19K generated 30M of source), so weigh this before using it.
More detailed usage instructions are in the user manual, which can be downloaded from the Ragel State Machine Compiler website.