Have you come across the CST, one of the intermediate results that syntax analysis produces, and wondered how it is actually used? This article introduces Python's CST and the code related to it.
Like the AST (abstract syntax tree), Python's CST (concrete syntax tree) is an intermediate result of syntax analysis. The difference is that the CST corresponds directly to the matching process of the parser: it is generated as the grammar rules are matched and carries a large amount of redundant information. The AST skips that intermediate information and corresponds directly to the actual semantics, that is, to the result of the analysis. An example makes this clearer:
Suppose the source consists of the single expression a.
Its Python CST looks like this (-> points from a parent node to its child node):
file_input -> stmt -> simple_stmt -> small_stmt ->
expr_stmt -> testlist -> test -> or_test -> and_test ->
not_test -> comparison -> expr -> xor_expr ->
and_expr -> shift_expr -> arith_expr -> term ->
factor -> power -> atom -> (NAME, "a")
The AST is:
(stmt_ty, Expr_kind) -> (expr_ty, Name_kind) -> ("a")
As we can see, the CST records the entire process of parsing a, from file_input down to the final NAME. Every derivation step becomes a node of the tree, and most of that information is of no further use. The AST is much simpler and more direct: it says that a is an expression statement (a stands alone as a statement), and that its content is a name (an identifier) whose value is "a". Python's syntax analysis produces the CST, not the AST; afterwards Python calls PyAST_FromNode to convert the CST into the AST.
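The AST side of this comparison can be checked from Python itself with the standard ast module. The sketch below parses the one-line source a and walks from the statement down to the name. The helper name describe_chain is made up for this illustration, and the Module node at the top is the wrapper that ast.parse returns; it sits above the statement node discussed here.

```python
import ast

def describe_chain(source):
    """Return the node type names from the module root down to the
    expression of the first statement, plus that expression node."""
    tree = ast.parse(source)
    stmt = tree.body[0]   # the single statement in the source
    expr = stmt.value     # the expression wrapped by the expression statement
    names = [type(tree).__name__, type(stmt).__name__, type(expr).__name__]
    return names, expr

chain, name = describe_chain("a")
print(" -> ".join(chain))  # Module -> Expr -> Name
print(name.id)             # a
```

Three nodes are enough to describe the statement here, against the twenty-one steps of the CST path listed earlier.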
A CST node is represented by a struct called node, which is defined in node.h:
typedef struct _node {
    short n_type;
    char *n_str;
    int n_lineno;
    int n_col_offset;
    int n_nchildren;
    struct _node *n_child;
} node;
Field          Description
n_type         Node type. Terminals are defined in token.h, non-terminals in graminit.h.
n_str          The string content corresponding to the node
n_lineno       The corresponding line number
n_col_offset   The column number
n_nchildren    The number of child nodes
n_child        Dynamically allocated array of child nodes
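To make the fields more concrete, here is a small Python model of the struct, used to rebuild the path for the source a. This is only an illustrative stand-in, not CPython code: the class Node, the list CHAIN and the helpers build_chain and depth are all invented for this sketch, node types are stored as names rather than the numeric codes the C struct uses, and only the single path shown earlier is modelled (sibling tokens that the actual tree would also hold are left out).

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    """Illustrative Python stand-in for the C node struct."""
    n_type: str                    # the C struct stores a short; names are used here for readability
    n_str: Optional[str] = None    # token text, set only on terminals
    n_lineno: int = 1
    n_col_offset: int = 0
    n_child: List["Node"] = field(default_factory=list)

    @property
    def n_nchildren(self):
        # The C struct stores this count explicitly; here it is derived.
        return len(self.n_child)

# Non-terminals on the path from file_input to atom for the source "a"
CHAIN = ["file_input", "stmt", "simple_stmt", "small_stmt", "expr_stmt",
         "testlist", "test", "or_test", "and_test", "not_test", "comparison",
         "expr", "xor_expr", "and_expr", "shift_expr", "arith_expr", "term",
         "factor", "power", "atom"]

def build_chain(names, leaf):
    """Link one node per name, parent to child, ending in the given leaf."""
    current = leaf
    for name in reversed(names):
        current = Node(name, n_child=[current])
    return current

def depth(node):
    """Count the nodes on the path that follows first children down to a leaf."""
    count = 1
    while node.n_child:
        node = node.n_child[0]
        count += 1
    return count

root = build_chain(CHAIN, Node("NAME", n_str="a"))
print(root.n_type, root.n_nchildren, depth(root))  # file_input 1 21
```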
Python provides the following functions and macros for working with the CST; they are also defined in node.h:
PyAPI_FUNC(node *) PyNode_New(int type);
PyAPI_FUNC(int) PyNode_AddChild(node *n, int type,
                                char *str, int lineno, int col_offset);
PyAPI_FUNC(void) PyNode_Free(node *n);

/* Node access functions */
#define NCH(n)          ((n)->n_nchildren)
#define CHILD(n, i)     (&(n)->n_child[i])
#define RCHILD(n, i)    (CHILD(n, NCH(n) + i))
#define TYPE(n)         ((n)->n_type)
#define STR(n)          ((n)->n_str)

/* Assert that the type of a node is what we expect */
#define REQ(n, type)    assert(TYPE(n) == (type))

PyAPI_FUNC(void) PyNode_ListTree(node *);
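The accessor macros are easiest to understand by using them on a small tree. The following sketch mirrors them as plain Python functions over a minimal stand-in class. Everything here is an assumption made for illustration: the Node class, add_child (which returns the new child, unlike PyNode_AddChild, which returns an int) and leaf_strings are hypothetical helpers, and the example tree for a + b is heavily flattened compared with what the parser would build. Note that RCHILD counts from the end, so its index is negative.

```python
class Node:
    """Minimal Python stand-in for the C node struct (illustration only)."""
    def __init__(self, n_type, n_str=None, n_lineno=1, n_col_offset=0):
        self.n_type = n_type
        self.n_str = n_str
        self.n_lineno = n_lineno
        self.n_col_offset = n_col_offset
        self.n_child = []

def add_child(n, n_type, n_str=None, lineno=1, col_offset=0):
    """Append a child to n and return it (hypothetical helper)."""
    child = Node(n_type, n_str, lineno, col_offset)
    n.n_child.append(child)
    return child

# Python counterparts of the accessor macros
def NCH(n):
    return len(n.n_child)

def CHILD(n, i):
    return n.n_child[i]

def RCHILD(n, i):
    # i is negative: -1 selects the last child
    return CHILD(n, NCH(n) + i)

def TYPE(n):
    return n.n_type

def STR(n):
    return n.n_str

def REQ(n, expected):
    assert TYPE(n) == expected, (TYPE(n), expected)

def leaf_strings(n):
    """Collect the strings of all childless nodes, left to right."""
    if NCH(n) == 0:
        return [STR(n)]
    result = []
    for i in range(NCH(n)):
        result.extend(leaf_strings(CHILD(n, i)))
    return result

# Hypothetical, heavily flattened tree for "a + b"
root = Node("arith_expr")
add_child(root, "NAME", "a", col_offset=0)
add_child(root, "PLUS", "+", col_offset=2)
add_child(root, "NAME", "b", col_offset=4)

REQ(root, "arith_expr")
print(NCH(root), STR(CHILD(root, 0)), STR(RCHILD(root, -1)))  # 3 a b
print(leaf_strings(root))                                      # ['a', '+', 'b']
```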
To sum up, Python's CST and AST are both intermediate results of syntax analysis: the CST records every step of the grammar match, including a great deal of redundant information, while the AST keeps only what corresponds to the actual semantics.