When you run Python, the interpreter does not read the Grammar file directly to perform syntax analysis. Instead, it uses the data structures in graminit.c/graminit.h. If you are interested in what actually happens when Python runs, our other articles cover that in more detail.
As mentioned above, the Python source tree contains a Grammar directory with a single file, Grammar, which defines all of Python's syntax in a BNF-style notation. Take the if statement as an example:
    if_stmt: 'if' test ':' suite ('elif' test ':' suite)* ['else' ':' suite]
This rule reads as follows: an if statement consists of the if keyword, a logical expression (test), ':', and a statement block (suite), followed by zero or more elif clauses of the same form, and it optionally ends with an else clause (the square brackets mark the else clause as optional). The if_stmt on the far left indicates that this rule defines the non-terminal if_stmt, and the ':' after it separates that name from its definition on the right.
1. Content inside '' quotation marks is a literal string; 'if' stands for the characters if.
2. Plain identifiers are non-terminals, that is, names that appear on the left side of a rule. if_stmt, test, and suite are all non-terminals and can be expanded into the sequence on the right side of their rules.
3. () parentheses are a grouping operator: whatever they enclose is treated as a single expression.
4. * means zero or more. For example, ('elif' test ':' suite)* in if_stmt means that one if statement can contain zero or more elif clauses.
5. + means one or more.
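To make the notation concrete, here is a minimal sketch of a hand-written recognizer for the if_stmt rule. It is only an illustration and not how Python itself parses code: the function name match_if_stmt is made up, and test and suite are treated as single placeholder tokens instead of being expanded by their own rules.

```python
def match_if_stmt(tokens):
    """Check a token list against:
    'if' test ':' suite ('elif' test ':' suite)* ['else' ':' suite]

    Assumption: "test" and "suite" are placeholder tokens standing in
    for whole sub-derivations of those non-terminals.
    """
    pos = 0

    def expect(kind):
        # Consume one token if it matches, otherwise leave pos unchanged.
        nonlocal pos
        if pos < len(tokens) and tokens[pos] == kind:
            pos += 1
            return True
        return False

    def clause(keyword):
        # keyword test ':' suite
        return (expect(keyword) and expect("test")
                and expect(":") and expect("suite"))

    # The mandatory leading 'if' clause.
    if not clause("if"):
        return False

    # ( ... )* : zero or more 'elif' clauses become a loop.
    while pos < len(tokens) and tokens[pos] == "elif":
        if not clause("elif"):
            return False

    # [ ... ] : the optional 'else' clause becomes a single conditional.
    if pos < len(tokens) and tokens[pos] == "else":
        if not (expect("else") and expect(":") and expect("suite")):
            return False

    # Every token must have been consumed.
    return pos == len(tokens)


if __name__ == "__main__":
    print(match_if_stmt(["if", "test", ":", "suite"]))
    print(match_if_stmt(["if", "test", ":", "suite",
                         "else", ":", "suite"]))
```

The grouping in () shows up as the shared clause helper, the * as a while loop, and the [] as a plain if.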
However, the Grammar file is more than a reference document. When Python runs, it does use the content of the Grammar file for syntax analysis, only indirectly.
Python PGEN
Both Makefile.pre.in and Parser/grammar.mak contain the following code:
##########################################################################
# Grammar
GRAMMAR_H=	$(srcdir)/Include/graminit.h
GRAMMAR_C=	$(srcdir)/Python/graminit.c
GRAMMAR_INPUT=	$(srcdir)/Grammar/Grammar

##########################################################################
# Parser
PGEN=		Parser/pgen$(EXE)

POBJS=		\
		Parser/acceler.o \
		Parser/grammar1.o \
		Parser/listnode.o \
		Parser/node.o \
		Parser/parser.o \
		Parser/parsetok.o \
		Parser/bitset.o \
		Parser/metagrammar.o \
		Parser/firstsets.o \
		Parser/grammar.o \
		Parser/pgen.o

PARSER_OBJS=	$(POBJS) Parser/myreadline.o Parser/tokenizer.o

PGOBJS=		\
		Objects/obmalloc.o \
		Python/mysnprintf.o \
		Parser/tokenizer_pgen.o \
		Parser/printgrammar.o \
		Parser/pgenmain.o

PGENOBJS=	$(PGENMAIN) $(POBJS) $(PGOBJS)

##########################################################################
# Special rules for object files
$(GRAMMAR_H) $(GRAMMAR_C): $(PGEN) $(GRAMMAR_INPUT)
	-$(PGEN) $(GRAMMAR_INPUT) $(GRAMMAR_H) $(GRAMMAR_C)

$(PGEN):	$(PGENOBJS)
	$(CC) $(OPT) $(LDFLAGS) $(PGENOBJS) $(LIBS) -o $(PGEN)
This code builds pgen and then runs it with Grammar as input to generate graminit.h and graminit.c. pgen is the tool that generates Python's syntax analysis data: it analyzes Grammar and produces the corresponding graminit.c/graminit.h. The data structures in those two files are then used for syntax analysis when Python runs. The implementation of pgen itself is not covered in this article.
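To give a feel for what "data structures used for syntax analysis" can look like, here is a minimal sketch in Python. It assumes the generated data amounts to state-transition tables, one per grammar rule. The table below was written by hand for if_stmt purely for illustration; it is not copied from graminit.c, and the names IF_STMT_DFA, ACCEPTING, and accepts are made up for this example.

```python
# Hypothetical transition table for the rule
#   if_stmt: 'if' test ':' suite ('elif' test ':' suite)* ['else' ':' suite]
# Each state maps an input label to the next state.
# Assumption: "test" and "suite" are treated as single labels here.
IF_STMT_DFA = {
    0: {"if": 1},
    1: {"test": 2},
    2: {":": 3},
    3: {"suite": 4},
    4: {"elif": 1, "else": 5},  # loop back for elif, or move on to else
    5: {":": 6},
    6: {"suite": 7},
    7: {},                      # nothing may follow the else clause
}

# States in which the rule may legally end.
ACCEPTING = {4, 7}


def accepts(labels, dfa=IF_STMT_DFA, accepting=ACCEPTING, start=0):
    """Walk the table one label at a time; no grammar text is consulted."""
    state = start
    for label in labels:
        arcs = dfa[state]
        if label not in arcs:
            return False  # no arc for this label: syntax error
        state = arcs[label]
    return state in accepting


if __name__ == "__main__":
    print(accepts(["if", "test", ":", "suite"]))
    print(accepts(["if", "test", ":"]))
```

The point of the sketch is the division of labour: the table is pure data that a tool could produce ahead of time from a grammar, while the code that walks it stays small and does not change when the grammar does.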