We all know that graminit.c defines the data that actually encodes Python's syntax rules, and that it is built from a few core types. This article introduces the four types involved.
grammar.h
graminit.c defines the DFAs (deterministic finite automata) that encode Python's syntax rules. For background on DFAs, see Compilers: Principles, Techniques, and Tools by Alfred V. Aho et al. To define these DFAs, graminit.c uses several types from grammar.h: arc, state, dfa, and grammar.
A label defines the symbol attached to an edge, that is, to a transition from one state to another. The symbol can be a non-terminal or a terminal. Every label is attached to one or more edges. lb_type gives the type of the symbol, for example the terminal NAME, which denotes an identifier, or the non-terminal stmt, which denotes a statement.
lb_str holds the text of a specific symbol. For example, label (NAME, "if") means that when the parser is in a given state and encounters the identifier 'if', it moves to another state. If the label is a non-terminal, things are more complex: the parser has to jump to the DFA that corresponds to that non-terminal. For more information, see a compiler textbook.
- /* A label of an arc */
- typedef struct {
- int lb_type;
- char *lb_str;
- } label;
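To make the role of lb_type and lb_str concrete, here is a minimal sketch of how a token could be compared against a label. It is an illustration rather than the parser's actual code: the helper names label_is_terminal and label_matches, the table demo_labels, and the value DEMO_STMT are hypothetical, and the numeric values of NAME and NT_OFFSET are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same layout as the label struct shown above. */
typedef struct {
    int   lb_type;
    char *lb_str;
} label;

/* Assumed values for illustration only: terminal types are taken to lie
   below NT_OFFSET and non-terminal types at or above it. DEMO_STMT is a
   hypothetical stand-in for a non-terminal such as stmt. */
enum { NAME = 1, NT_OFFSET = 256, DEMO_STMT = NT_OFFSET + 1 };

/* Nonzero if the label names a terminal. */
int label_is_terminal(const label *lb)
{
    assert(lb != NULL);
    return lb->lb_type < NT_OFFSET;
}

/* Nonzero if a token (type, text) matches the label. A NULL lb_str is
   treated as "any text of this type"; otherwise the text must be equal. */
int label_matches(const label *lb, int type, const char *text)
{
    assert(lb != NULL && text != NULL);
    if (lb->lb_type != type)
        return 0;
    return lb->lb_str == NULL || strcmp(lb->lb_str, text) == 0;
}

/* A tiny hypothetical label table. */
label demo_labels[] = {
    {NAME, NULL},       /* any identifier */
    {NAME, "if"},       /* the identifier "if" */
    {DEMO_STMT, NULL},  /* a non-terminal */
};
```

With this table, label_matches(&demo_labels[1], NAME, "if") is nonzero, while the same call with "spam" returns 0. A non-terminal label never matches a token directly, which is why the parser has to switch to another DFA in that case.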
The DFAs that encode Python's syntax rules are defined in graminit.c. Within a DFA, an arc represents the arc (edge) from one state to another. a_lbl identifies the label attached to the arc, and a_arrow records the arc's target state. Because each arc is stored with the state it leaves, the arc does not need to record its source state.
- /* An arc from one state to another */
- typedef struct {
- short a_lbl; /* Label of this arc */
- short a_arrow; /* State where this arc goes to */
- } arc;
A state is a node in a DFA. Each state records the set of edges leaving it and stores them in s_arc. The other members, s_lower, s_upper, s_accel, and s_accept, hold the accelerator for the state, which is described later. Note that the accelerator information is not defined in graminit.c; it is computed at runtime.
- /* A state in a DFA */
- typedef struct {
- int s_narcs;
- arc *s_arc; /* Array of arcs */
- /* Optional accelerators */
- int s_lower; /* Lowest label index */
- int s_upper; /* Highest label index */
- int *s_accel; /* Accelerator */
- int s_accept; /* Nonzero for accepting state */
- } state;
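The relationship between a state and its arcs can be sketched as a simple lookup: given a label, scan s_arc for a matching arc and follow a_arrow. This is only an illustrative sketch. The function state_next and the data demo_arcs and demo_state are hypothetical, a_lbl is assumed here to be an index into a label table, and the accelerator fields are left empty.

```c
#include <assert.h>
#include <stddef.h>

/* Same layouts as the arc and state structs shown above. */
typedef struct {
    short a_lbl;    /* Label of this arc */
    short a_arrow;  /* State where this arc goes to */
} arc;

typedef struct {
    int  s_narcs;
    arc *s_arc;     /* Array of arcs */
    int  s_lower;
    int  s_upper;
    int *s_accel;
    int  s_accept;
} state;

/* Return the target state index for the given label index,
   or -1 if no arc leaving this state carries that label. */
int state_next(const state *s, int ilabel)
{
    int i;
    assert(s != NULL);
    for (i = 0; i < s->s_narcs; i++) {
        if (s->s_arc[i].a_lbl == ilabel)
            return s->s_arc[i].a_arrow;
    }
    return -1;
}

/* Hypothetical state with two outgoing arcs:
   label 3 leads to state 1, label 7 leads to state 2. */
arc   demo_arcs[] = { {3, 1}, {7, 2} };
state demo_state  = { 2, demo_arcs, 0, 0, NULL, 0 };
```

Here state_next(&demo_state, 7) yields 2, and a label with no matching arc yields -1. A linear scan like this is the kind of work a precomputed accelerator table is meant to avoid.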
The dfa structure records the initial state d_initial and the array d_state of all its states. d_first records the first set of the non-terminal that the DFA represents: when the parser encounters a terminal in this first set, it needs to jump to this DFA. d_first is used later when the accelerators are computed.
- /* A DFA */
- typedef struct {
- int d_type; /* Non-terminal this represents */
- char *d_name; /* For printing */
- int d_initial; /* Initial state */
- int d_nstates;
- state *d_state; /* Array of states */
- bitset d_first;
- } dfa;
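As a rough illustration of how these structures fit together, the sketch below builds a toy DFA for the pattern NAME (',' NAME)* and runs a sequence of label indices through it. Everything in it is hypothetical: the label indices L_NAME and L_COMMA, the d_type value 300, the name "name_list", and the function dfa_accepts. The bitset typedef is an assumed stand-in, s_accept is filled in by hand, and arcs labeled with non-terminals (which would require switching to another DFA) are not handled.

```c
#include <assert.h>
#include <stddef.h>

typedef char *bitset;  /* assumption: stand-in for the real bitset type */

typedef struct { short a_lbl; short a_arrow; } arc;

typedef struct {
    int s_narcs; arc *s_arc;
    int s_lower; int s_upper; int *s_accel; int s_accept;
} state;

typedef struct {
    int    d_type;     /* Non-terminal this represents */
    char  *d_name;     /* For printing */
    int    d_initial;  /* Initial state */
    int    d_nstates;
    state *d_state;    /* Array of states */
    bitset d_first;
} dfa;

/* Hypothetical label indices for the toy grammar. */
enum { L_NAME = 1, L_COMMA = 2 };

/* State 0 --NAME--> state 1 (accepting); state 1 --','--> state 0. */
arc   arcs_0[] = { {L_NAME, 1} };
arc   arcs_1[] = { {L_COMMA, 0} };
state demo_states[] = {
    {1, arcs_0, 0, 0, NULL, 0},
    {1, arcs_1, 0, 0, NULL, 1},
};
dfa demo_dfa = { 300, "name_list", 0, 2, demo_states, NULL };

/* Return 1 if the label sequence drives the DFA from d_initial
   to an accepting state, 0 otherwise. */
int dfa_accepts(const dfa *d, const int *labels, int n)
{
    int cur, i, j;
    assert(d != NULL && (labels != NULL || n == 0));
    cur = d->d_initial;
    for (i = 0; i < n; i++) {
        const state *s = &d->d_state[cur];
        int next = -1;
        for (j = 0; j < s->s_narcs; j++) {
            if (s->s_arc[j].a_lbl == labels[i]) {
                next = s->s_arc[j].a_arrow;
                break;
            }
        }
        if (next < 0)
            return 0;  /* no arc for this label: reject */
        cur = next;
    }
    return d->d_state[cur].s_accept != 0;
}
```

With this toy DFA, the sequence NAME ',' NAME is accepted, while NAME ',' (a trailing comma) and an empty sequence are rejected because they do not end in an accepting state.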
grammar represents the entire Python syntax and records all of the DFAs and all of the labels. g_start is the start symbol of the Python grammar, which is normally single_input. However, the actual start symbol can be specified when the parser is created; it can be any one of single_input, file_input, or eval_input.