Write your own TINY compiler: implementing the language parser

Source: Internet
Author: User

The previous article introduced the implementation of lexical analysis for the TINY language; this article introduces the implementation of its syntax parser.

1 Grammar of the TINY language

The grammar of TINY in BNF is as follows:
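The grammar figure did not survive at this point. As a stand-in, the rules below are transcribed from the Yacc specification in section 3, with token names replaced by concrete symbols. The symbols :=, <, =, * and - appear in the sample program; writing PLUS as + and OVER as / is an assumption based on the token names.

```text
program     -> stmt-sequence
stmt-sequence -> stmt-sequence ; statement | statement
statement   -> if-stmt | repeat-stmt | assign-stmt | read-stmt | write-stmt
if-stmt     -> if exp then stmt-sequence end
             | if exp then stmt-sequence else stmt-sequence end
repeat-stmt -> repeat stmt-sequence until exp
assign-stmt -> identifier := exp
read-stmt   -> read identifier
write-stmt  -> write exp
exp         -> simple-exp < simple-exp | simple-exp = simple-exp | simple-exp
simple-exp  -> simple-exp + term | simple-exp - term | term
term        -> term * factor | term / factor | factor
factor      -> ( exp ) | number | identifier
```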

As can be seen from the grammar definition, the TINY language has the following characteristics:

1 A program has 5 kinds of statements: if statements, repeat statements, read statements, write statements, and assign statements.
2 The if statement uses end as its terminating symbol, and both the if statement and the repeat statement allow a statement sequence as their body.
3 Input and output begin with the reserved words read and write. A read statement reads only one variable at a time, and a write statement writes only one expression at a time.

2 Syntax tree structure of the TINY compiler

TINY has two basic kinds of structures: statements and expressions. There are 5 kinds of statements (if, repeat, assign, read, and write statements) and 3 kinds of expressions (operator expressions, constant expressions, and identifier expressions). A syntax tree node is therefore first classified by whether it is a statement or an expression, and then classified again by the kind of statement or expression.
A tree node can have up to 3 child structures (all three are needed only for an if statement with an else part). Statements in a sequence are linked through a sibling field rather than a child field. In the sibling representation, the only physical link from a parent to its children is to the leftmost child; the children are then linked from left to right in a standard linked list. These links are called sibling links, to distinguish them from parent-child links.

In the figure, the picture on the left shows sibling links, and the picture on the right shows parent-child links.

The C declaration of a TINY syntax tree node is as follows:

/**************************************************/
/***********   Syntax tree for parsing ************/
/**************************************************/

typedef enum {StmtK,ExpK} NodeKind;
typedef enum {IfK,RepeatK,AssignK,ReadK,WriteK} StmtKind;
typedef enum {OpK,ConstK,IdK} ExpKind;

/* ExpType is used for type checking */
typedef enum {Void,Integer,Boolean} ExpType;

#define MAXCHILDREN 3

typedef struct treeNode
   { struct treeNode * child[MAXCHILDREN];
     struct treeNode * sibling;
     int lineno;
     NodeKind nodekind;
     union { StmtKind stmt; ExpKind exp; } kind;
     union { TokenType op;
             int val;
             char * name; } attr;
     ExpType type; /* for type checking of exps */
   } TreeNode;

/**************************************************/
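To make the declaration more concrete, here is a minimal sketch of how such nodes might be allocated and traversed. The names newStmtNode and newExpNode match the helpers called from the Yacc actions in section 3, but their bodies here are assumptions: this version takes the line number as a parameter, and TokenType is cut down to two values so the sketch compiles on its own. countNodes and buildDecrement are hypothetical helpers added only for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

#define MAXCHILDREN 3

typedef enum { StmtK, ExpK } NodeKind;
typedef enum { IfK, RepeatK, AssignK, ReadK, WriteK } StmtKind;
typedef enum { OpK, ConstK, IdK } ExpKind;
typedef enum { Void, Integer, Boolean } ExpType;
/* Assumption: a reduced TokenType; the real one comes from the scanner. */
typedef enum { MINUS, TIMES } TokenType;

typedef struct treeNode {
    struct treeNode *child[MAXCHILDREN];
    struct treeNode *sibling;
    int lineno;
    NodeKind nodekind;
    union { StmtKind stmt; ExpKind exp; } kind;
    union { TokenType op; int val; char *name; } attr;
    ExpType type;
} TreeNode;

/* Allocate a node with all links cleared (hypothetical shared helper). */
static TreeNode *newNode(NodeKind nk, int lineno) {
    TreeNode *t = malloc(sizeof(TreeNode));
    if (t == NULL) { fprintf(stderr, "out of memory\n"); exit(1); }
    for (int i = 0; i < MAXCHILDREN; i++) t->child[i] = NULL;
    t->sibling = NULL;
    t->lineno = lineno;
    t->nodekind = nk;
    t->type = Void;
    return t;
}

TreeNode *newStmtNode(StmtKind k, int lineno) {
    TreeNode *t = newNode(StmtK, lineno);
    t->kind.stmt = k;
    return t;
}

TreeNode *newExpNode(ExpKind k, int lineno) {
    TreeNode *t = newNode(ExpK, lineno);
    t->kind.exp = k;
    return t;
}

/* Count nodes: recurse into each child, iterate along the sibling chain. */
int countNodes(const TreeNode *t) {
    int n = 0;
    for (; t != NULL; t = t->sibling) {
        n++;
        for (int i = 0; i < MAXCHILDREN; i++)
            n += countNodes(t->child[i]);
    }
    return n;
}

/* Build the tree for the statement "x := x - 1" by hand. */
TreeNode *buildDecrement(void) {
    TreeNode *assign = newStmtNode(AssignK, 1);
    TreeNode *minus = newExpNode(OpK, 1);
    TreeNode *id = newExpNode(IdK, 1);
    TreeNode *one = newExpNode(ConstK, 1);
    assign->attr.name = "x";
    minus->attr.op = MINUS;
    id->attr.name = "x";
    one->attr.val = 1;
    minus->child[0] = id;    /* left operand  */
    minus->child[1] = one;   /* right operand */
    assign->child[0] = minus;
    return assign;
}
```

Calling countNodes(buildDecrement()) visits the assign node, the operator node, and its two operands. The same loop shape (iterate over siblings, recurse into children) is what any later pass over the tree would use.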

The structure of the syntax tree is shown below, with rectangular boxes representing statement nodes and round or oval boxes representing expression nodes. Taking the TINY factorial program as the example again, here is the program whose syntax tree is shown.

{ Sample program
  in TINY language -
  computes factorial
}
read x; { input an integer }
if 0 < x then { don't compute if x <= 0 }
  fact := 1;
  repeat
    fact := fact * x;
    x := x - 1
  until x = 0;
  write fact  { output factorial of x }
end

3 Using Yacc to generate the TINY parser

The Yacc source is as follows; it corresponds to the BNF grammar of TINY given in section 1.

%{
#define YYPARSER /* distinguishes Yacc output from other code files */

#include "globals.h"
#include "util.h"
#include "scan.h"
#include "parse.h"

#define YYSTYPE TreeNode *
static char * savedName; /* for use in assignments */
static int savedLineNo;  /* ditto */
static TreeNode * savedTree; /* stores syntax tree for later return */
%}

%token IF THEN ELSE END REPEAT UNTIL READ WRITE
%token ID NUM
%token ASSIGN EQ LT PLUS MINUS TIMES OVER LPAREN RPAREN SEMI
%token ERROR

%% /* Grammar for TINY */

program     : stmt_seq { savedTree = $1; }
            ;
stmt_seq    : stmt_seq SEMI stmt
                 { YYSTYPE t = $1;
                   if (t != NULL)
                   { /* append the new statement at the end of the sibling list */
                     while (t->sibling != NULL)
                        t = t->sibling;
                     t->sibling = $3;
                     $$ = $1; }
                   else $$ = $3;
                 }
            | stmt { $$ = $1; }
            ;
stmt        : if_stmt { $$ = $1; }
            | repeat_stmt { $$ = $1; }
            | assign_stmt { $$ = $1; }
            | read_stmt { $$ = $1; }
            | write_stmt { $$ = $1; }
            | error { $$ = NULL; }
            ;
if_stmt     : IF exp THEN stmt_seq END
                 { $$ = newStmtNode(IfK);
                   $$->child[0] = $2;
                   $$->child[1] = $4;
                 }
            | IF exp THEN stmt_seq ELSE stmt_seq END
                 { $$ = newStmtNode(IfK);
                   $$->child[0] = $2;
                   $$->child[1] = $4;
                   $$->child[2] = $6;
                 }
            ;
repeat_stmt : REPEAT stmt_seq UNTIL exp
                 { $$ = newStmtNode(RepeatK);
                   $$->child[0] = $2;
                   $$->child[1] = $4;
                 }
            ;
assign_stmt : ID { /* save the name before the scanner overwrites tokenString */
                   savedName = copyString(tokenString);
                   savedLineNo = lineno; }
              ASSIGN exp
                 { $$ = newStmtNode(AssignK);
                   $$->child[0] = $4;
                   $$->attr.name = savedName;
                   $$->lineno = savedLineNo;
                 }
            ;
read_stmt   : READ ID
                 { $$ = newStmtNode(ReadK);
                   $$->attr.name = copyString(tokenString);
                 }
            ;
write_stmt  : WRITE exp
                 { $$ = newStmtNode(WriteK);
                   $$->child[0] = $2;
                 }
            ;
exp         : simple_exp LT simple_exp
                 { $$ = newExpNode(OpK);
                   $$->child[0] = $1;
                   $$->child[1] = $3;
                   $$->attr.op = LT;
                 }
            | simple_exp EQ simple_exp
                 { $$ = newExpNode(OpK);
                   $$->child[0] = $1;
                   $$->child[1] = $3;
                   $$->attr.op = EQ;
                 }
            | simple_exp { $$ = $1; }
            ;
simple_exp  : simple_exp PLUS term
                 { $$ = newExpNode(OpK);
                   $$->child[0] = $1;
                   $$->child[1] = $3;
                   $$->attr.op = PLUS;
                 }
            | simple_exp MINUS term
                 { $$ = newExpNode(OpK);
                   $$->child[0] = $1;
                   $$->child[1] = $3;
                   $$->attr.op = MINUS;
                 }
            | term { $$ = $1; }
            ;
term        : term TIMES factor
                 { $$ = newExpNode(OpK);
                   $$->child[0] = $1;
                   $$->child[1] = $3;
                   $$->attr.op = TIMES;
                 }
            | term OVER factor
                 { $$ = newExpNode(OpK);
                   $$->child[0] = $1;
                   $$->child[1] = $3;
                   $$->attr.op = OVER;
                 }
            | factor { $$ = $1; }
            ;
factor      : LPAREN exp RPAREN { $$ = $2; }
            | NUM
                 { $$ = newExpNode(ConstK);
                   $$->attr.val = atoi(tokenString);
                 }
            | ID
                 { $$ = newExpNode(IdK);
                   $$->attr.name = copyString(tokenString);
                 }
            | error { $$ = NULL; }
            ;

%%

int yyerror(char * message)
{ fprintf(listing,"Syntax error at line %d: %s\n",lineno,message);
  fprintf(listing,"Current token: ");
  printToken(yychar,tokenString);
  Error = TRUE;
  return 0;
}

TreeNode * parse(void)
{ yyparse();
  return savedTree;
}
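The action attached to the stmt_seq rule is the one place where sibling links are created, so it is worth looking at in isolation. The sketch below restates that logic as an ordinary C function on a cut-down node type. Node, appendSibling and listLength are hypothetical names introduced for this illustration and are not part of the project's source.

```c
#include <stddef.h>

/* Reduced node: only the link that the stmt_seq action touches. */
typedef struct node {
    int id;                 /* hypothetical payload */
    struct node *sibling;
} Node;

/* Mirrors the stmt_seq action: walk to the end of the existing
 * sequence, hook the new statement on, and return the head. */
Node *appendSibling(Node *seq, Node *stmt) {
    Node *t = seq;
    if (t == NULL)
        return stmt;        /* nothing usable so far: result is the new statement */
    while (t->sibling != NULL)
        t = t->sibling;
    t->sibling = stmt;
    return seq;             /* the head of the list never changes */
}

/* Length of a sibling chain. */
int listLength(const Node *n) {
    int len = 0;
    for (; n != NULL; n = n->sibling)
        len++;
    return len;
}
```

Because the rule is left-recursive, each reduction appends one statement to an already built list, and every append walks the list from its head. For the short statement sequences of typical TINY programs this cost is negligible; keeping a tail pointer would be one way to avoid the repeated walk.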

4 Running the program

Download the full runnable code and run it with the following steps.
Step 1: Run the build script on the command line

$ ./build.sh

Step 2: Modify the generated y.tab.c
The y.tab.c generated by Yacc calls the yylex() function to fetch tokens. This call needs to be replaced with the getToken() function of the Lex-generated scanner provided in the previous article. In y.tab.c, find

yychar = yylex ();

and replace it with

yychar = getToken ();
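Hand-editing generated code has to be repeated every time Yacc is rerun. A common alternative, sketched below, is to define yylex() yourself as a thin wrapper around getToken() in the final user-code section of the .y file. Whether this fits the build script used here is an assumption. The getToken() stub and its token values are placeholders so the sketch compiles on its own; in the real project the scanner supplies getToken().

```c
/* Assumption: tokens are plain ints, with 0 meaning end of input. */
typedef int TokenType;

/* Stub standing in for the scanner's getToken(); hypothetical token codes. */
static const TokenType demoTokens[] = { 258, 259, 0 };
static int demoPos = 0;

TokenType getToken(void) {
    TokenType t = demoTokens[demoPos];
    if (t != 0)
        demoPos++;          /* stay on the end marker once it is reached */
    return t;
}

/* The wrapper itself. Declaring it static keeps it local to the parser's
 * translation unit, so it does not clash with the yylex() that a
 * Lex-generated scanner defines in its own file. */
static int yylex(void) {
    return getToken();
}
```

With such a wrapper in place, the generated line yychar = yylex(); can stay untouched, because the parser's call resolves to the wrapper.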

Step 3: Make and run
Build and run the program by entering the following commands on the command line

$ make
$ ./tiny.out sample.tny

The syntax tree is printed after the program runs; the relationships between nodes are shown by indentation.

TINY COMPILATION: sample.tny
Syntax tree:
  Read: x
  If
    Op: <
      Const: 0
      Id: x
    Assign to: fact
      Const: 1
    Repeat
      Assign to: fact
        Op: *
          Id: fact
          Id: x
      Assign to: x
        Op: -
          Id: x
          Const: 1
      Op: =
        Id: x
        Const: 0
    Write
      Id: fact
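The indentation in this listing follows directly from the tree shape: each level of child nesting adds two spaces, while siblings stay at the same depth. The sketch below shows one way such a printer could work. It is not the project's own printing routine; PNode and formatTree are hypothetical names, each node carries a ready-made label, and the output goes into a buffer rather than to the listing file.

```c
#include <stdio.h>
#include <string.h>

#define MAXCHILDREN 3

/* Hypothetical print-only node: a label plus the two kinds of links. */
typedef struct pnode {
    const char *label;
    struct pnode *child[MAXCHILDREN];
    struct pnode *sibling;
} PNode;

/* Append an indented listing of the tree to buf (capacity cap bytes).
 * buf must already hold a NUL-terminated string, possibly empty. */
void formatTree(const PNode *t, int indent, char *buf, size_t cap) {
    for (; t != NULL; t = t->sibling) {          /* siblings: same depth */
        size_t used = strlen(buf);
        if (used < cap)
            snprintf(buf + used, cap - used, "%*s%s\n", indent, "", t->label);
        for (int i = 0; i < MAXCHILDREN; i++)    /* children: two spaces deeper */
            formatTree(t->child[i], indent + 2, buf, cap);
    }
}
```

Building a Write node whose single child is an Op: * node with the operands Id: fact and Id: x, and formatting it with an initial indent of 4, reproduces the shape of the last part of the listing above.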

Compare this output with the figure in section 2 to see how the printed structure corresponds to the data structure design of the TINY syntax tree.

5 Summary

This article has introduced the implementation of the TINY parser. The next article will introduce semantic analysis for the TINY language, mainly the construction of the symbol table and the type-checking algorithm.
