The previous article introduced the implementation of lexical analysis for the TINY language; this article introduces the implementation of its syntax parser.
1 Grammar of the TINY language
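The grammar of TINY in BNF is as follows:
program -> stmt-sequence
stmt-sequence -> stmt-sequence ; statement | statement
statement -> if-stmt | repeat-stmt | assign-stmt | read-stmt | write-stmt
if-stmt -> if exp then stmt-sequence end
         | if exp then stmt-sequence else stmt-sequence end
repeat-stmt -> repeat stmt-sequence until exp
assign-stmt -> identifier := exp
read-stmt -> read identifier
write-stmt -> write exp
exp -> simple-exp < simple-exp | simple-exp = simple-exp | simple-exp
simple-exp -> simple-exp + term | simple-exp - term | term
term -> term * factor | term / factor | factor
factor -> ( exp ) | number | identifier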
As can be seen from the grammar, the TINY language has the following characteristics:
1 A program has 5 kinds of statements: if statements, repeat statements, read statements, write statements, and assign statements.
2 The if statement uses end as its terminating symbol, and both the if statement and the repeat statement allow a statement sequence as their body.
3 Input and output begin with the reserved words read and write. The read statement reads only one variable at a time, and the write statement writes only one expression at a time.
2 Syntax tree structure of the TINY compiler
TINY has two basic kinds of structures: statements and expressions. There are 5 kinds of statements (if, repeat, assign, read, and write statements) and 3 kinds of expressions (operator expressions, constant expressions, and identifier expressions). A syntax tree node is therefore classified first by whether it is a statement or an expression, and then by the kind of statement or expression.
A tree node can have up to 3 child structures (all 3 are needed only for an if statement with an else part). Statement sequences are linked through a sibling field rather than a child field, i.e. the only physical link from a parent to its children is the link to the leftmost child. The children are then linked from left to right in a standard linked list; these links are called sibling links to distinguish them from parent-child links.
The picture on the left shows a sibling link, and the picture on the right shows a parent-child link.
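The difference between the two kinds of links can be sketched in a few lines of C. This is only an illustration under assumptions: the Node type is a reduced stand-in that keeps just the link fields, and the function names sequence_length and count_nodes are hypothetical, not part of the TINY sources.

```c
#include <assert.h>
#include <stddef.h>

#define MAXCHILDREN 3

/* Reduced stand-in for the TINY tree node: link fields only. */
typedef struct Node {
    struct Node *child[MAXCHILDREN]; /* parent-child links */
    struct Node *sibling;            /* sibling link to the next statement */
} Node;

/* Count the statements of one sequence by following sibling links only. */
int sequence_length(const Node *first)
{
    int n = 0;
    for (const Node *p = first; p != NULL; p = p->sibling)
        n++;
    return n;
}

/* Count every node reachable from a sequence: each node in the sequence
   plus, recursively, everything below its child links. */
int count_nodes(const Node *first)
{
    int n = 0;
    for (const Node *p = first; p != NULL; p = p->sibling) {
        assert(p->sibling != p); /* a node must not be its own sibling */
        n++;
        for (int i = 0; i < MAXCHILDREN; i++)
            n += count_nodes(p->child[i]);
    }
    return n;
}
```

For a sequence of three statements where the first one has a single child, sequence_length would report 3 and count_nodes would report 4, because the child is reached through a parent-child link and not through a sibling link.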
The C declaration of a TINY syntax tree node is as follows:
/**************************************************/
/***********   Syntax tree for parsing ************/
/**************************************************/

typedef enum {StmtK,ExpK} NodeKind;
typedef enum {IfK,RepeatK,AssignK,ReadK,WriteK} StmtKind;
typedef enum {OpK,ConstK,IdK} ExpKind;

/* ExpType is used for type checking */
typedef enum {Void,Integer,Boolean} ExpType;

#define MAXCHILDREN 3

typedef struct treeNode
   { struct treeNode * child[MAXCHILDREN];
     struct treeNode * sibling;
     int lineno;
     NodeKind nodekind;
     union { StmtKind stmt; ExpKind exp;} kind;
     union { TokenType op;
             int val;
             char * name; } attr;
     ExpType type; /* for type checking of exps */
   } TreeNode;
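To show how such nodes might be created, here is a minimal sketch of node constructors. It rests on assumptions: the node is reduced (the attr union and the type field are left out), the line number is passed as a parameter instead of being read from the scanner, and the names new_node, new_stmt_node and new_exp_node are hypothetical. The grammar actions in the next section call helpers named newStmtNode and newExpNode, whose real definitions are not shown in this article.

```c
#include <assert.h>
#include <stdlib.h>

#define MAXCHILDREN 3

typedef enum { StmtK, ExpK } NodeKind;
typedef enum { IfK, RepeatK, AssignK, ReadK, WriteK } StmtKind;
typedef enum { OpK, ConstK, IdK } ExpKind;

/* Reduced node: attr and type are omitted to keep the sketch short. */
typedef struct treeNode {
    struct treeNode *child[MAXCHILDREN];
    struct treeNode *sibling;
    int lineno;
    NodeKind nodekind;
    union { StmtKind stmt; ExpKind exp; } kind;
} TreeNode;

/* Allocate a node with all links cleared; returns NULL if out of memory.
   The kind field is filled in by the two wrappers below. */
TreeNode *new_node(NodeKind nodekind, int lineno)
{
    assert(lineno >= 0); /* line numbers are never negative */
    TreeNode *t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    for (int i = 0; i < MAXCHILDREN; i++)
        t->child[i] = NULL;
    t->sibling = NULL;
    t->lineno = lineno;
    t->nodekind = nodekind;
    return t;
}

/* Create a statement node of the given kind. */
TreeNode *new_stmt_node(StmtKind kind, int lineno)
{
    TreeNode *t = new_node(StmtK, lineno);
    if (t != NULL)
        t->kind.stmt = kind;
    return t;
}

/* Create an expression node of the given kind. */
TreeNode *new_exp_node(ExpKind kind, int lineno)
{
    TreeNode *t = new_node(ExpK, lineno);
    if (t != NULL)
        t->kind.exp = kind;
    return t;
}
```

Clearing every child and sibling pointer at creation time matters because the parser and any later tree walk rely on NULL to mark a missing child or the end of a statement sequence.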
The figure below shows the structure of a syntax tree, with rectangular boxes representing statement nodes and round or oval boxes representing expression nodes. It is the syntax tree of the TINY factorial program, which again serves as the example:
{ Sample program
  in TINY language -
  computes factorial
}
read x; { input an integer }
if 0 < x then { don't compute if x <= 0 }
  fact := 1;
  repeat
    fact := fact * x;
    x := x - 1
  until x = 0;
  write fact { output factorial of x }
end
3 Using Yacc to generate the TINY parser
The source code is as follows; it corresponds to the BNF grammar of TINY given in section 1.
%{
#define YYPARSER /* distinguishes Yacc output from other code files */

#include "globals.h"
#include "util.h"
#include "scan.h"
#include "parse.h"

#define YYSTYPE TreeNode *
static char * savedName; /* for use in assignments */
static int savedLineNo;  /* ditto */
static TreeNode * savedTree; /* stores syntax tree for later return */

%}

%token IF THEN ELSE END REPEAT UNTIL READ WRITE
%token ID NUM
%token ASSIGN EQ LT PLUS MINUS TIMES OVER LPAREN RPAREN SEMI
%token ERROR

%% /* Grammar for TINY */

program     : stmt_seq
                 { savedTree = $1; }
            ;
stmt_seq    : stmt_seq SEMI stmt
                 { /* append the new statement at the end of the sibling list */
                   YYSTYPE t = $1;
                   if (t != NULL)
                   { while (t->sibling != NULL)
                        t = t->sibling;
                     t->sibling = $3;
                     $$ = $1; }
                   else $$ = $3;
                 }
            | stmt  { $$ = $1; }
            ;
stmt        : if_stmt { $$ = $1; }
            | repeat_stmt { $$ = $1; }
            | assign_stmt { $$ = $1; }
            | read_stmt { $$ = $1; }
            | write_stmt { $$ = $1; }
            | error  { $$ = NULL; }
            ;
if_stmt     : IF exp THEN stmt_seq END
                 { $$ = newStmtNode(IfK);
                   $$->child[0] = $2;
                   $$->child[1] = $4;
                 }
            | IF exp THEN stmt_seq ELSE stmt_seq END
                 { $$ = newStmtNode(IfK);
                   $$->child[0] = $2;
                   $$->child[1] = $4;
                   $$->child[2] = $6;
                 }
            ;
repeat_stmt : REPEAT stmt_seq UNTIL exp
                 { $$ = newStmtNode(RepeatK);
                   $$->child[0] = $2;
                   $$->child[1] = $4;
                 }
            ;
assign_stmt : ID { /* save the name now: tokenString changes as parsing goes on */
                   savedName = copyString(tokenString);
                   savedLineNo = lineno; }
              ASSIGN exp
                 { $$ = newStmtNode(AssignK);
                   $$->child[0] = $4;
                   $$->attr.name = savedName;
                   $$->lineno = savedLineNo;
                 }
            ;
read_stmt   : READ ID
                 { $$ = newStmtNode(ReadK);
                   $$->attr.name = copyString(tokenString);
                 }
            ;
write_stmt  : WRITE exp
                 { $$ = newStmtNode(WriteK);
                   $$->child[0] = $2;
                 }
            ;
exp         : simple_exp LT simple_exp
                 { $$ = newExpNode(OpK);
                   $$->child[0] = $1;
                   $$->child[1] = $3;
                   $$->attr.op = LT;
                 }
            | simple_exp EQ simple_exp
                 { $$ = newExpNode(OpK);
                   $$->child[0] = $1;
                   $$->child[1] = $3;
                   $$->attr.op = EQ;
                 }
            | simple_exp { $$ = $1; }
            ;
simple_exp  : simple_exp PLUS term
                 { $$ = newExpNode(OpK);
                   $$->child[0] = $1;
                   $$->child[1] = $3;
                   $$->attr.op = PLUS;
                 }
            | simple_exp MINUS term
                 { $$ = newExpNode(OpK);
                   $$->child[0] = $1;
                   $$->child[1] = $3;
                   $$->attr.op = MINUS;
                 }
            | term { $$ = $1; }
            ;
term        : term TIMES factor
                 { $$ = newExpNode(OpK);
                   $$->child[0] = $1;
                   $$->child[1] = $3;
                   $$->attr.op = TIMES;
                 }
            | term OVER factor
                 { $$ = newExpNode(OpK);
                   $$->child[0] = $1;
                   $$->child[1] = $3;
                   $$->attr.op = OVER;
                 }
            | factor { $$ = $1; }
            ;
factor      : LPAREN exp RPAREN
                 { $$ = $2; }
            | NUM
                 { $$ = newExpNode(ConstK);
                   $$->attr.val = atoi(tokenString);
                 }
            | ID { $$ = newExpNode(IdK);
                   $$->attr.name = copyString(tokenString);
                 }
            | error { $$ = NULL; }
            ;

%%

int yyerror(char * message)
{ fprintf(listing,"Syntax error at line %d: %s\n",lineno,message);
  fprintf(listing,"Current token: ");
  printToken(yychar,tokenString);
  Error = TRUE;
  return 0;
}

TreeNode * parse(void)
{ yyparse();
  return savedTree;
}
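The action attached to the stmt_seq rule is the place where sibling links are created. The following C sketch restates that action as a standalone function so that its behavior can be tried in isolation. It is an illustration under assumptions: the Node type is reduced to an id and a sibling pointer, and the name append_stmt is hypothetical. The parameter seq plays the role of $1 and stmt the role of $3.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced node: the id field exists only to tell nodes apart. */
typedef struct Node {
    int id;
    struct Node *sibling;
} Node;

/* Append stmt to the end of the sequence starting at seq and return the
   head of the resulting sequence. A NULL seq corresponds to a prefix that
   produced no node (for example after a syntax error), in which case the
   new statement becomes the whole sequence. */
Node *append_stmt(Node *seq, Node *stmt)
{
    if (seq == NULL)
        return stmt;
    Node *t = seq;
    while (t->sibling != NULL)   /* walk to the last statement */
        t = t->sibling;
    assert(t != stmt);           /* linking a node to itself would form a cycle */
    t->sibling = stmt;
    return seq;
}
```

Because the rule is left recursive, the sequence is walked from its head on every reduction, so building a sequence of n statements costs on the order of n squared steps. For programs of the size TINY targets this is unlikely to matter.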
4 Running the program
Download the complete runnable code and run it with the following steps.
Step 1: Enter the following on the command line
$ ./build.sh
Step 2: Modify the generated y.tab.c code.
The y.tab.c file generated by Yacc calls the yylex() function to get tokens; this call needs to be replaced with the getToken() function of the Lex-generated scanner provided in the previous article. Find the following in y.tab.c:
yychar = yylex();
and replace it with:
yychar = getToken();
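A possible alternative to editing the generated file by hand, sketched here under assumptions: since the generated parser only needs some function named yylex, a small wrapper that forwards to getToken() could be supplied instead. In the sketch below, the TokenType values and the stub getToken() are stand-ins that make the example self-contained; in the real project both come from the scanner of the previous article. The sketch also assumes that the end-of-file token has the value 0, which is the value a Yacc parser treats as end of input.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in token type; ENDFILE is deliberately the first value, i.e. 0. */
typedef enum { ENDFILE, READ, ID, SEMI } TokenType;

/* Canned token stream for the statement "read x;" (illustration only). */
static const TokenType canned[] = { READ, ID, SEMI, ENDFILE };
static size_t next_token = 0;

/* Stub scanner: hands out the canned tokens, then ENDFILE forever. */
TokenType getToken(void)
{
    size_t last = sizeof canned / sizeof canned[0] - 1;
    assert(next_token <= last);
    TokenType t = canned[next_token];
    if (next_token < last)
        next_token++;
    return t;
}

/* The wrapper itself: the parser keeps calling yylex(),
   and yylex() simply forwards to the scanner. */
int yylex(void)
{
    return (int)getToken();
}
```

In a real grammar file the wrapper would have to be visible before the generated yyparse() uses it, for example by placing it, or at least its prototype, in the %{ ... %} section. With that in place, regenerating y.tab.c would no longer undo a manual edit.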
Step 3: Make and run
Execute the program by entering the following commands on the command line
$ make
$ ./tiny.out sample.tny
The syntax tree is printed after the program runs, and the relationships between the nodes are shown by indentation.
TINY COMPILATION: sample.tny

Syntax tree:
  Read: x
  If
    Op: <
      Const: 0
      Id: x
    Assign to: fact
      Const: 1
    Repeat
      Assign to: fact
        Op: *
          Id: fact
          Id: x
      Assign to: x
        Op: -
          Id: x
          Const: 1
      Op: =
        Id: x
        Const: 0
    Write
      Id: fact
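The indentation in this listing follows directly from the two kinds of links: children are printed one level deeper than their parent, while siblings stay at the same level. A minimal sketch of such a printer is given below. It is not the printing routine of the TINY sources. The Node type uses a ready-made label instead of the kind and attr fields, the output goes into a caller-supplied string buffer so that the result is easy to inspect, and the name format_tree is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAXCHILDREN 3

/* Reduced node: a ready-made label replaces the kind and attr fields. */
typedef struct Node {
    const char *label;
    struct Node *child[MAXCHILDREN];
    struct Node *sibling;
} Node;

/* Append the text of the sequence starting at t to buf, which must already
   hold a NUL-terminated string and has room for size bytes in total.
   Each level is indented by two spaces; output that does not fit is cut off. */
void format_tree(const Node *t, int depth, char *buf, size_t size)
{
    assert(depth >= 0);
    for (; t != NULL; t = t->sibling) {
        size_t used = strlen(buf);
        snprintf(buf + used, size - used, "%*s%s\n", depth * 2, "", t->label);
        for (int i = 0; i < MAXCHILDREN; i++)
            format_tree(t->child[i], depth + 1, buf, size);
    }
}
```

Starting the call at depth 1 reproduces the two-space indentation of the top-level statements seen in the listing above.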
Compare this output with the figure in section 2 to see how the syntax tree is structured and how it maps onto the design of the TINY syntax tree data structure.
5 Summary
This article mainly introduced the implementation of the TINY language parser. The next article will introduce the semantic analysis of TINY, mainly the construction of the symbol table and the type-checking algorithm.