The previous article introduced the implementation of lexical analysis for the TINY language. This article describes the implementation of the TINY parser.
1 Grammar of the TINY language
The grammar of TINY is defined in BNF.
From the grammar definition, we can see that the TINY language has the following features:
1 A program has 5 kinds of statements: if statements, repeat statements, read statements, write statements, and assign statements.
2 The if statement uses end as its terminating symbol, and both the if statement and the repeat statement allow a statement sequence as their body.
3 Input and output start with the reserved words read and write. The read statement reads only one variable at a time, and the write statement writes only one expression at a time.
2 Syntax tree structure of the TINY compiler
TINY has two main kinds of structures: statements and expressions. There are 5 classes of statements (if statement, repeat statement, assign statement, read statement, and write statement) and 3 classes of expressions (operator expressions, constant expressions, and identifier expressions). A syntax tree node is therefore classified first by whether it is a statement or an expression, and then by the kind of statement or expression.
A tree node can have up to 3 children (all three are needed only by an if statement with an else part).
Statements in a sequence are linked through the sibling field rather than through a child field, i.e. the only physical link from a parent to its children is the link to the leftmost child. The children are then linked from left to right in a standard linked list. These links are called sibling links, to distinguish them from parent-child links.
The picture on the left shows a sibling link, and the picture on the right shows a parent-child link.
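The left-to-right links between statements can be illustrated with a small sketch. The `Node` type, its `id` field, and the helpers `append_sibling` and `sequence_length` are hypothetical names used only for this illustration; the real node type carries more fields.

```c
#include <stddef.h>

/* Hypothetical, stripped-down node: only the sibling link matters here. */
typedef struct Node {
    int id;                /* illustrative payload, not part of the real node */
    struct Node *sibling;  /* next statement in the same sequence */
} Node;

/* Append `stmt` to the end of the sibling list that starts at `head`.
   Returns the head of the resulting list. */
Node *append_sibling(Node *head, Node *stmt)
{
    Node *t = head;
    if (t == NULL)
        return stmt;          /* empty sequence: the new statement is the head */
    while (t->sibling != NULL)
        t = t->sibling;       /* walk to the last statement */
    t->sibling = stmt;
    return head;
}

/* Count the statements in a sequence by following sibling links. */
int sequence_length(const Node *head)
{
    int n = 0;
    for (; head != NULL; head = head->sibling)
        n++;
    return n;
}
```

Because a parent points only at the first statement, a sequence of any length fits into a single child slot; the cost is that appending walks the whole list.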
The C declaration of a TINY syntax tree node is as follows:
    /**************************************************/
    /***********   Syntax tree for parsing ************/
    /**************************************************/

    typedef enum {StmtK,ExpK} NodeKind;
    typedef enum {IfK,RepeatK,AssignK,ReadK,WriteK} StmtKind;
    typedef enum {OpK,ConstK,IdK} ExpKind;

    /* ExpType is used for type checking */
    typedef enum {Void,Integer,Boolean} ExpType;

    #define MAXCHILDREN 3

    typedef struct treeNode
       { struct treeNode * child[MAXCHILDREN];
         struct treeNode * sibling;
         int lineno;
         NodeKind nodekind;
         union { StmtKind stmt; ExpKind exp; } kind;
         union { TokenType op;
                 int val;
                 char * name; } attr;
         ExpType type; /* for type checking of exps */
       } TreeNode;
    /**************************************************/
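To show how nodes of this shape might be created, here is a minimal sketch. The helpers `alloc_node`, `make_stmt`, `make_exp`, and `free_tree` are hypothetical and are not the project's own node constructors, which this article does not show. `TokenType` is replaced by a plain `int` stand-in because the scanner header is not reproduced here.

```c
#include <stdlib.h>

#define MAXCHILDREN 3
typedef int TokenType;   /* stand-in: the real TokenType comes from the scanner */

typedef enum {StmtK, ExpK} NodeKind;
typedef enum {IfK, RepeatK, AssignK, ReadK, WriteK} StmtKind;
typedef enum {OpK, ConstK, IdK} ExpKind;
typedef enum {Void, Integer, Boolean} ExpType;

typedef struct treeNode {
    struct treeNode *child[MAXCHILDREN];
    struct treeNode *sibling;
    int lineno;
    NodeKind nodekind;
    union { StmtKind stmt; ExpKind exp; } kind;
    union { TokenType op; int val; char *name; } attr;
    ExpType type;
} TreeNode;

/* Shared allocation step: all links start out as NULL. */
static TreeNode *alloc_node(NodeKind nodekind, int lineno)
{
    TreeNode *t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    for (int i = 0; i < MAXCHILDREN; i++)
        t->child[i] = NULL;
    t->sibling = NULL;
    t->lineno = lineno;
    t->nodekind = nodekind;
    t->attr.name = NULL;
    t->type = Void;
    return t;
}

/* First classify as statement, then record which statement it is. */
TreeNode *make_stmt(StmtKind kind, int lineno)
{
    TreeNode *t = alloc_node(StmtK, lineno);
    if (t != NULL)
        t->kind.stmt = kind;
    return t;
}

/* First classify as expression, then record which expression it is. */
TreeNode *make_exp(ExpKind kind, int lineno)
{
    TreeNode *t = alloc_node(ExpK, lineno);
    if (t != NULL)
        t->kind.exp = kind;
    return t;
}

/* Free a node, its children, and its siblings.
   Strings stored in attr.name are not freed by this sketch. */
void free_tree(TreeNode *t)
{
    while (t != NULL) {
        TreeNode *next = t->sibling;
        for (int i = 0; i < MAXCHILDREN; i++)
            free_tree(t->child[i]);
        free(t);
        t = next;
    }
}
```

The two unions mirror the two-step classification: `nodekind` tells which member of `kind` is valid, and the kind in turn tells which member of `attr` is meaningful.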
The structure of the syntax tree is drawn as follows: a statement node is represented by a rectangular box, and an expression node by a round or oval box.
Taking the factorial program in the TINY language as an example again, the syntax tree of the following TINY program is given.
    { Sample program
      in TINY language -
      computes factorial
    }
    read x; { input an integer }
    if 0 < x then { don't compute if x <= 0 }
      fact := 1;
      repeat
        fact := fact * x;
        x := x - 1
      until x = 0;
      write fact  { output factorial of x }
    end
3 Using Yacc to generate the TINY parser
The source code is as follows. It corresponds to the BNF grammar of TINY given in the first section.
    %{
    #define YYPARSER /* distinguishes Yacc output from other code files */

    #include "globals.h"
    #include "util.h"
    #include "scan.h"
    #include "parse.h"

    #define YYSTYPE TreeNode *
    static char * savedName;     /* for use in assignments */
    static int savedLineNo;      /* ditto */
    static TreeNode * savedTree; /* stores syntax tree for later return */
    %}

    %token IF THEN ELSE END REPEAT UNTIL READ WRITE
    %token ID NUM
    %token ASSIGN EQ LT PLUS MINUS TIMES OVER LPAREN RPAREN SEMI
    %token ERROR

    %% /* Grammar for TINY */

    program     : stmt_seq
                     { savedTree = $1; }
                ;
    stmt_seq    : stmt_seq SEMI stmt
                     { YYSTYPE t = $1;
                       if (t != NULL)
                       { while (t->sibling != NULL)
                           t = t->sibling;
                         t->sibling = $3;
                         $$ = $1; }
                       else $$ = $3;
                     }
                | stmt { $$ = $1; }
                ;
    stmt        : if_stmt { $$ = $1; }
                | repeat_stmt { $$ = $1; }
                | assign_stmt { $$ = $1; }
                | read_stmt { $$ = $1; }
                | write_stmt { $$ = $1; }
                | error { $$ = NULL; }
                ;
    if_stmt     : IF exp THEN stmt_seq END
                     { $$ = newStmtNode(IfK);
                       $$->child[0] = $2;
                       $$->child[1] = $4;
                     }
                | IF exp THEN stmt_seq ELSE stmt_seq END
                     { $$ = newStmtNode(IfK);
                       $$->child[0] = $2;
                       $$->child[1] = $4;
                       $$->child[2] = $6;
                     }
                ;
    repeat_stmt : REPEAT stmt_seq UNTIL exp
                     { $$ = newStmtNode(RepeatK);
                       $$->child[0] = $2;
                       $$->child[1] = $4;
                     }
                ;
    assign_stmt : ID { savedName = copyString(tokenString);
                       savedLineNo = lineno; }
                  ASSIGN exp
                     { $$ = newStmtNode(AssignK);
                       $$->child[0] = $4;
                       $$->attr.name = savedName;
                       $$->lineno = savedLineNo;
                     }
                ;
    read_stmt   : READ ID
                     { $$ = newStmtNode(ReadK);
                       $$->attr.name = copyString(tokenString);
                     }
                ;
    write_stmt  : WRITE exp
                     { $$ = newStmtNode(WriteK);
                       $$->child[0] = $2;
                     }
                ;
    exp         : simple_exp LT simple_exp
                     { $$ = newExpNode(OpK);
                       $$->child[0] = $1;
                       $$->child[1] = $3;
                       $$->attr.op = LT;
                     }
                | simple_exp EQ simple_exp
                     { $$ = newExpNode(OpK);
                       $$->child[0] = $1;
                       $$->child[1] = $3;
                       $$->attr.op = EQ;
                     }
                | simple_exp { $$ = $1; }
                ;
    simple_exp  : simple_exp PLUS term
                     { $$ = newExpNode(OpK);
                       $$->child[0] = $1;
                       $$->child[1] = $3;
                       $$->attr.op = PLUS;
                     }
                | simple_exp MINUS term
                     { $$ = newExpNode(OpK);
                       $$->child[0] = $1;
                       $$->child[1] = $3;
                       $$->attr.op = MINUS;
                     }
                | term { $$ = $1; }
                ;
    term        : term TIMES factor
                     { $$ = newExpNode(OpK);
                       $$->child[0] = $1;
                       $$->child[1] = $3;
                       $$->attr.op = TIMES;
                     }
                | term OVER factor
                     { $$ = newExpNode(OpK);
                       $$->child[0] = $1;
                       $$->child[1] = $3;
                       $$->attr.op = OVER;
                     }
                | factor { $$ = $1; }
                ;
    factor      : LPAREN exp RPAREN
                     { $$ = $2; }
                | NUM
                     { $$ = newExpNode(ConstK);
                       $$->attr.val = atoi(tokenString);
                     }
                | ID
                     { $$ = newExpNode(IdK);
                       $$->attr.name = copyString(tokenString);
                     }
                | error { $$ = NULL; }
                ;

    %%

    int yyerror(char * message)
    { fprintf(listing,"Syntax error at line %d: %s\n",lineno,message);
      fprintf(listing,"Current token: ");
      printToken(yychar,tokenString);
      Error = TRUE;
      return 0;
    }

    TreeNode * parse(void)
    { yyparse();
      return savedTree;
    }
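In the left-recursive rules for additive and multiplicative expressions, each new operator node takes the tree built so far as child 0, so a chain such as 8 - 3 - 2 groups to the left. The sketch below imitates that behaviour by hand with a loop. It is not generated parser code, and the `Exp` type, the fixed node pool, and the functions `new_exp`, `fold_left`, and `eval` are hypothetical names chosen for the illustration.

```c
#include <stddef.h>

typedef struct Exp {
    char op;                   /* '+' or '-' for operator nodes, 0 for constants */
    int val;                   /* value of a constant node */
    struct Exp *left, *right;  /* play the role of child[0] and child[1] */
} Exp;

/* Small fixed pool so the sketch needs no dynamic allocation. */
#define POOL_SIZE 32
static Exp pool[POOL_SIZE];
static int pool_used = 0;

static Exp *new_exp(char op, int val, Exp *left, Exp *right)
{
    if (pool_used == POOL_SIZE)
        return NULL;           /* pool exhausted */
    Exp *e = &pool[pool_used++];
    e->op = op;
    e->val = val;
    e->left = left;
    e->right = right;
    return e;
}

/* Build the tree for vals[0] ops[0] vals[1] ops[1] ... so that the tree
   built so far always becomes the left child of the next operator node. */
Exp *fold_left(const int *vals, const char *ops, int nvals)
{
    if (nvals < 1)
        return NULL;
    Exp *tree = new_exp(0, vals[0], NULL, NULL);
    for (int i = 1; i < nvals && tree != NULL; i++) {
        Exp *rhs = new_exp(0, vals[i], NULL, NULL);
        tree = (rhs != NULL) ? new_exp(ops[i - 1], 0, tree, rhs) : NULL;
    }
    return tree;
}

/* Evaluate the tree bottom-up; `e` must not be NULL. */
int eval(const Exp *e)
{
    if (e->op == 0)
        return e->val;
    int l = eval(e->left);
    int r = eval(e->right);
    return e->op == '+' ? l + r : l - r;
}
```

With left grouping, 8 - 3 - 2 is read as (8 - 3) - 2; a right-recursive rule would instead group it as 8 - (3 - 2).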
4 Build and run
Download the full code, then follow the steps below to build and run it.
Step 1 Enter the following on the command line:
$ ./build.sh
Step 2 Modify the generated y.tab.c code.
The y.tab.c generated by Yacc calls the yylex() function to get the next token. This call needs to be replaced with the getToken() function, built with Lex, that we provided in the previous article.
Find this line in y.tab.c:
yychar = yylex ();
Replace it with:
yychar = getToken ();
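As an alternative to editing the generated file by hand, a small adapter function named yylex can forward to getToken(), so that the call inside yyparse() works unchanged. The sketch below is an assumption about how this could be done, not part of the downloadable code. The token codes and the stub getToken() here are stand-ins so that the example is self-contained; in the real build, getToken() is the scanner from the previous article. Note that Yacc expects yylex() to return 0 at end of input.

```c
/* Stand-in token codes; in the real parser these come from the %token lines. */
enum { ENDFILE_TOK = 0, READ_TOK = 258, ID_TOK = 259, SEMI_TOK = 260 };

/* Stub scanner, standing in for the real getToken().
   It replays a fixed token stream for `read x;` and then reports end of input. */
static const int demo_tokens[] = { READ_TOK, ID_TOK, SEMI_TOK };
static unsigned demo_pos = 0;

int getToken(void)
{
    if (demo_pos < sizeof demo_tokens / sizeof demo_tokens[0])
        return demo_tokens[demo_pos++];
    return ENDFILE_TOK;
}

/* The adapter itself: yyparse() calls yylex(), which forwards to getToken(). */
int yylex(void)
{
    return getToken();
}
```

Placed in the user code section of the Yacc source, such an adapter would survive regeneration of y.tab.c, whereas a manual edit has to be repeated after every Yacc run.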
Step 3 Make and run
You can build and run the program from the command line, for example with the following commands:
$ make
$ ./tiny.out sample.tny
The syntax tree is printed when the program runs. The relationships between nodes are shown by indentation.
    TINY COMPILATION: sample.tny

    Syntax tree:
      Read: x
      If
        Op: <
          Const: 0
          Id: x
        Assign to: fact
          Const: 1
        Repeat
          Assign to: fact
            Op: *
              Id: fact
              Id: x
          Assign to: x
            Op: -
              Id: x
              Const: 1
          Op: =
            Id: x
            Const: 0
        Write
          Id: fact
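The indentation in the printed tree can be produced by a recursive traversal that indents children one level deeper than their parent and keeps siblings at the same level. The following is a minimal sketch of that idea, not the project's own printing routine. The `Node` type, the `INDENT` width of 2, and the function `render_tree` are assumptions made for the illustration, and the output goes into a caller-supplied buffer instead of a listing file.

```c
#include <stdio.h>
#include <string.h>

#define MAXCHILDREN 3
#define INDENT 2   /* spaces added per tree level; an assumption of this sketch */

typedef struct Node {
    const char *label;                 /* text printed for this node */
    struct Node *child[MAXCHILDREN];
    struct Node *sibling;
} Node;

/* Append the lines for `t`, its children, and its siblings to `buf`,
   starting at offset `len`. `cap` is the buffer size and must be at least 1.
   Returns the new length; on truncation the buffer is full and cap - 1
   is returned. */
size_t render_tree(const Node *t, int depth, char *buf, size_t cap, size_t len)
{
    for (; t != NULL; t = t->sibling) {
        /* "%*s" with an empty string prints `depth` spaces. */
        int n = snprintf(buf + len, cap - len, "%*s%s\n", depth, "", t->label);
        if (n < 0 || (size_t)n >= cap - len)
            return cap - 1;
        len += (size_t)n;
        /* Children are one level deeper; siblings stay at this depth. */
        for (int i = 0; i < MAXCHILDREN; i++)
            len = render_tree(t->child[i], depth + INDENT, buf, cap, len);
    }
    return len;
}
```

Calling it on a Read node whose sibling is a Write node with one Id child yields the Read and Write lines at the same depth and the Id line one level deeper.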
Comparing this output with the pictures in the second section shows how the syntax tree structure maps onto the data structure designed for the TINY syntax tree.
5 Summary
This article described the implementation of the TINY parser. The next article will cover the semantic analysis of the TINY language, including the construction of the symbol table and the type-checking algorithm.
Copyright notice: This is an original article by the blog author and may not be reproduced without permission.