Compiler Principles in Plain Language: A C Implementation of a Simplified Parser

Last Update:2018-07-29 Source: Internet

Author: User

Introduction

Earlier sections introduced compiler preprocessing, lexical analysis, and the lexical analyzer, and also touched on the task and process of parsing.

The input to the parser is the sequence of tokens. Following the language's grammar (its productions) and using finite state machine theory, the parser builds an abstract syntax tree, which is then traversed to generate the intermediate code, that is, the three-address code. This section takes an experimental approach to look at how a parser is implemented internally.

5.1 Experiment Description

Write a recursive descent parser that performs syntax checking and structural analysis on the token sequence produced by lexical analysis.

The recursive descent parser is written in C and parses the simple language defined below.

5.1.1 Grammar of the simple language to be analyzed

The grammar in extended BNF is as follows:

⑴<program> ::= begin <statement string> end

⑵<statement string> ::= <statement> { ; <statement> }

⑶<statement> ::= <assignment statement>

⑷<assignment statement> ::= ID := <expression>

⑸<expression> ::= <term> { + <term> | - <term> }

⑹<term> ::= <factor> { * <factor> | / <factor> }

⑺<factor> ::= ID | NUM | ( <expression> )
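The way productions ⑸ to ⑺ translate into code can be sketched in isolation: each nonterminal becomes one function, and each `{ ... }` repetition becomes a `while` loop. The sketch below is only an illustration under stated assumptions, not the program of this experiment: it works directly on a character string instead of a token sequence, leaves out ID, and evaluates the expression so that the effect of the grammar on operator precedence is visible. The names `eval_expression`, `cur`, and `failed` are hypothetical.

```c
#include <ctype.h>

static const char *cur;   /* hypothetical cursor into the text being parsed */
static int failed;        /* set to 1 as soon as the text stops matching */

static int expression(void);

static void skip_spaces(void)
{
    while (*cur == ' ')
        cur++;
}

/* <factor> ::= NUM | ( <expression> )   (ID is left out of this sketch) */
static int factor(void)
{
    int value = 0;
    skip_spaces();
    if (isdigit((unsigned char)*cur)) {
        while (isdigit((unsigned char)*cur))
            value = value * 10 + (*cur++ - '0');
    } else if (*cur == '(') {
        cur++;
        value = expression();
        skip_spaces();
        if (*cur == ')')
            cur++;
        else
            failed = 1;           /* missing ')' */
    } else {
        failed = 1;               /* neither a number nor '(' */
    }
    return value;
}

/* <term> ::= <factor> { * <factor> | / <factor> } */
static int term(void)
{
    int value = factor();
    skip_spaces();
    while (!failed && (*cur == '*' || *cur == '/')) {
        char op = *cur++;
        int rhs = factor();
        if (op == '*')
            value *= rhs;
        else if (rhs != 0)
            value /= rhs;
        else
            failed = 1;           /* division by zero */
        skip_spaces();
    }
    return value;
}

/* <expression> ::= <term> { + <term> | - <term> } */
static int expression(void)
{
    int value = term();
    skip_spaces();
    while (!failed && (*cur == '+' || *cur == '-')) {
        char op = *cur++;
        int rhs = term();
        value = (op == '+') ? value + rhs : value - rhs;
        skip_spaces();
    }
    return value;
}

/* Returns 1 and stores the value if text is a valid expression, else 0. */
int eval_expression(const char *text, int *result)
{
    cur = text;
    failed = 0;
    *result = expression();
    skip_spaces();
    return !failed && *cur == '\0';
}
```

Because `expression` calls `term`, and `term` calls `factor`, multiplication and division bind more tightly than addition and subtraction without any explicit precedence table.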

5.1.2 Experiment Requirements

The input token string ends with "#". If it is a grammatically correct sentence, print "success"; otherwise print "error".

For example:

Input: begin a:=9; x:=2*3; b:=a+x end #

Output: success

Input: x:=a+b*c end #

Output: error

5.2 C Language Implementation

The core idea is to start from the start state and, following the grammar productions, analyze the input step by step until the analysis is complete. If a state does not match, there is a syntax error and the analysis stops. A real parser should of course have an error recovery mechanism so that it can go on to discover further syntax errors, that is, report multiple syntax errors in one pass. Note that parsing depends on lexical analysis, so the code below also contains the lexical analyzer from the previous section.
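The error recovery mentioned above is often done in "panic mode": after an error, the parser discards tokens until it reaches a point where parsing can safely resume, such as the next `;`. The following is a minimal sketch of that idea, not part of this experiment's program. It assumes the tokens have already been scanned into an array of codes terminated by 0 (the code for `#`), reuses the token codes of the scanner in this article, and simplifies a statement to `ID := operand { operator operand }` without parentheses. The names `synchronize` and `count_statement_errors` are hypothetical.

```c
#include <stdio.h>

/* Token codes, matching the syn values used by the scanner in this article. */
enum { T_EOF = 0, T_END = 6, T_ID = 10, T_NUM = 11,
       T_PLUS = 13, T_MINUS = 14, T_STAR = 15, T_SLASH = 16,
       T_ASSIGN = 18, T_SEMI = 26 };

static const int *tok;   /* hypothetical cursor into a pre-scanned token array */

static int at_boundary(void)
{
    return *tok == T_SEMI || *tok == T_END || *tok == T_EOF;
}

/* Panic mode: discard tokens up to the next statement boundary. */
static void synchronize(void)
{
    while (!at_boundary())
        tok++;
}

static int is_operand(int t)  { return t == T_ID || t == T_NUM; }
static int is_operator(int t) { return t >= T_PLUS && t <= T_SLASH; }

/* Simplified statement: ID := operand { operator operand }.
   Returns 1 if well formed; never advances past the terminating 0. */
static int statement(void)
{
    if (*tok != T_ID) return 0;
    tok++;
    if (*tok != T_ASSIGN) return 0;
    tok++;
    if (!is_operand(*tok)) return 0;
    tok++;
    while (is_operator(*tok)) {
        tok++;
        if (!is_operand(*tok)) return 0;
        tok++;
    }
    return 1;
}

/* Parses statements separated by ';' and returns the number of bad ones. */
int count_statement_errors(const int *tokens)
{
    int errors = 0, index = 1;
    tok = tokens;
    for (;;) {
        if (!statement() || !at_boundary()) {
            printf("syntax error in statement %d\n", index);
            errors++;
            synchronize();        /* resume at the next ';', 'end' or '#' */
        }
        if (*tok != T_SEMI)
            break;
        tok++;                    /* consume ';' */
        if (*tok == T_END)
            break;                /* tolerate a trailing ';' before 'end' */
        index++;
    }
    return errors;
}
```

With this approach one pass over `a:=1; b:=+; c:=2 3; d:=a*b end #` reports both faulty statements instead of stopping at the first. The complete program for the experiment, which reports errors but does not recover from them, is listed next.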

#include <ctype.h>
#include <stdio.h>
#include <string.h>

char prog[100], token[8], ch;
/* Reserved words; the token code of rwtab[n] is n + 1. */
char *rwtab[6] = {"begin", "if", "then", "while", "do", "end"};
int syn, p, m, n, sum;   /* syn: current token code; p: read position in prog */
int kk;                  /* set to 1 once a syntax error has been found */

void factor(void);
void expression(void);
void yucu(void);         /* parses a statement string */
void term(void);
void statement(void);
void lrparser(void);
void scaner(void);

int main(void)
{
    p = kk = 0;
    printf("\nplease input a string (end with '#'):\n");
    /* Read characters up to and including '#', without overrunning prog. */
    while (p < (int)sizeof(prog) - 1 && scanf("%c", &ch) == 1) {
        prog[p++] = ch;
        if (ch == '#')
            break;
    }
    if (p == 0 || prog[p - 1] != '#')
        prog[p++] = '#';          /* make sure the scanner sees a terminator */
    p = 0;
    scaner();                     /* read the first token */
    lrparser();
    return 0;
}

/* <program> ::= begin <statement string> end */
void lrparser(void)
{
    if (syn == 1) {               /* begin */
        scaner();                 /* read the next token */
        yucu();
        if (syn == 6) {           /* end */
            scaner();
            if ((syn == 0) && (kk == 0))
                printf("success!\n");
        } else {
            if (kk != 1)
                printf("the string hasn't got an 'end'!\n");
            kk = 1;
        }
    } else {
        printf("haven't got a 'begin'!\n");
        kk = 1;
    }
}

/* <statement string> ::= <statement> { ; <statement> } */
void yucu(void)
{
    statement();
    while (syn == 26) {           /* ';' */
        scaner();
        if (syn != 6)             /* a ';' directly before 'end' is accepted */
            statement();
    }
}

/* <statement> ::= ID := <expression> */
void statement(void)
{
    if (syn == 10) {              /* ID */
        scaner();
        if (syn == 18) {          /* ':=' */
            scaner();
            expression();
        } else {
            printf("the sign ':=' is wrong!\n");
            kk = 1;
        }
    } else {
        printf("wrong sentence!\n");
        kk = 1;
    }
}

/* <expression> ::= <term> { + <term> | - <term> } */
void expression(void)
{
    term();
    while ((syn == 13) || (syn == 14)) {
        scaner();
        term();
    }
}

/* <term> ::= <factor> { * <factor> | / <factor> } */
void term(void)
{
    factor();
    while ((syn == 15) || (syn == 16)) {
        scaner();
        factor();
    }
}

/* <factor> ::= ID | NUM | ( <expression> ) */
void factor(void)
{
    if ((syn == 10) || (syn == 11)) {
        scaner();
    } else if (syn == 27) {       /* '(' */
        scaner();
        expression();
        if (syn == 28) {          /* ')' */
            scaner();
        } else {
            printf("the error on '('\n");
            kk = 1;
        }
    } else {
        printf("the expression error!\n");
        kk = 1;
    }
}

/* Lexical analyzer: reads the next token from prog and sets syn
   (plus token for words and sum for numbers). */
void scaner(void)
{
    sum = 0;
    for (m = 0; m < 8; m++)
        token[m] = '\0';
    m = 0;
    ch = prog[p++];
    while (ch == ' ' || ch == '\t' || ch == '\n' || ch == '\r')
        ch = prog[p++];           /* skip white space between tokens */
    if (isalpha((unsigned char)ch)) {
        while (isalnum((unsigned char)ch)) {
            if (m < 7)            /* keep at most 7 characters plus '\0' */
                token[m++] = ch;
            ch = prog[p++];
        }
        p--;
        syn = 10;                 /* identifier, unless it is a reserved word */
        token[m] = '\0';
        for (n = 0; n < 6; n++)
            if (strcmp(token, rwtab[n]) == 0) {
                syn = n + 1;
                break;
            }
    } else if (isdigit((unsigned char)ch)) {
        while (isdigit((unsigned char)ch)) {
            sum = sum * 10 + ch - '0';
            ch = prog[p++];
        }
        p--;
        syn = 11;                 /* number */
    } else {
        switch (ch) {
        case '<':
            ch = prog[p++];
            if (ch == '>') syn = 21;
            else if (ch == '=') syn = 22;
            else { syn = 20; p--; }
            break;
        case '>':
            ch = prog[p++];
            if (ch == '=') syn = 24;
            else { syn = 23; p--; }
            break;
        case ':':
            ch = prog[p++];
            if (ch == '=') syn = 18;
            else { syn = 17; p--; }
            break;
        case '+': syn = 13; break;
        case '-': syn = 14; break;
        case '*': syn = 15; break;
        case '/': syn = 16; break;
        case '(': syn = 27; break;
        case ')': syn = 28; break;
        case '=': syn = 25; break;
        case ';': syn = 26; break;
        case '#': syn = 0; break;
        default:  syn = -1; break;   /* unrecognized character */
        }
    }
}
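The listing compares syn against bare integers, which makes the parser hard to follow. As a reading aid, the codes assigned by the scanner can be gathered into one lookup table. The sketch below lists only codes that appear in the listing; the `TokenInfo` type and the `token_text` function are hypothetical names introduced for illustration.

```c
#include <stddef.h>

typedef struct {
    int syn;            /* token code set by the scanner */
    const char *text;   /* the lexeme, or the token class for ID and NUM */
} TokenInfo;

static const TokenInfo token_table[] = {
    {0, "#"}, {1, "begin"}, {2, "if"}, {3, "then"}, {4, "while"},
    {5, "do"}, {6, "end"}, {10, "ID"}, {11, "NUM"},
    {13, "+"}, {14, "-"}, {15, "*"}, {16, "/"},
    {17, ":"}, {18, ":="}, {20, "<"}, {21, "<>"}, {22, "<="},
    {23, ">"}, {24, ">="}, {25, "="}, {26, ";"}, {27, "("}, {28, ")"}
};

/* Returns the text for a token code, or "unknown" for any other value
   (including -1, which the scanner uses for unrecognized characters). */
const char *token_text(int syn)
{
    size_t i;
    for (i = 0; i < sizeof token_table / sizeof token_table[0]; i++)
        if (token_table[i].syn == syn)
            return token_table[i].text;
    return "unknown";
}
```

A call such as `token_text(syn)` could be added to the error messages so that they name the offending token.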

5.3 Summary

The core of syntax analysis is this: starting from the start state and using finite state machine theory, analyze the states according to the language's grammar productions to obtain the syntax tree. Intermediate code (three-address code) is then generated in preparation for the later assembly stage. This section covers only the state analysis, but that already goes a long way toward understanding how a parser works. Readers can draw the detailed flow chart of the code themselves; the insight gained from doing so is hard to put into words.
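To give an idea of the three-address code step that this section leaves out, the same recursive descent structure can emit one instruction per operator instead of only checking syntax. The sketch below is an assumption-laden illustration rather than a continuation of the experiment's program: it reads the expression from a string, assumes the input is well formed (no error reporting), and names its temporaries t1, t2, and so on. The names `three_address_code`, `Operand`, and `emit` are hypothetical.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

typedef struct { char name[16]; } Operand;

static const char *src;   /* hypothetical cursor over the expression text */
static char out[512];     /* generated code is appended here */
static int temp_count;

static Operand gen_expression(void);

static void skip(void) { while (*src == ' ') src++; }

/* Append "tN = a op b" to out and return tN, which holds the result. */
static Operand emit(Operand a, char op, Operand b)
{
    Operand t;
    size_t used = strlen(out);
    snprintf(t.name, sizeof t.name, "t%d", ++temp_count);
    snprintf(out + used, sizeof out - used, "%s = %s %c %s\n",
             t.name, a.name, op, b.name);
    return t;
}

/* <factor> ::= ID | NUM | ( <expression> ) */
static Operand gen_factor(void)
{
    Operand r = { "" };
    size_t len = 0;
    skip();
    if (*src == '(') {
        src++;
        r = gen_expression();
        skip();
        if (*src == ')') src++;
    } else {
        while (isalnum((unsigned char)*src) && len < sizeof r.name - 1)
            r.name[len++] = *src++;
        r.name[len] = '\0';
    }
    return r;
}

/* <term> ::= <factor> { * <factor> | / <factor> } */
static Operand gen_term(void)
{
    Operand left = gen_factor();
    skip();
    while (*src == '*' || *src == '/') {
        char op = *src++;
        Operand right = gen_factor();
        left = emit(left, op, right);
        skip();
    }
    return left;
}

/* <expression> ::= <term> { + <term> | - <term> } */
static Operand gen_expression(void)
{
    Operand left = gen_term();
    skip();
    while (*src == '+' || *src == '-') {
        char op = *src++;
        Operand right = gen_term();
        left = emit(left, op, right);
        skip();
    }
    return left;
}

/* Translates "target := expression" into three-address code. */
const char *three_address_code(const char *target, const char *expression)
{
    Operand result;
    size_t used;
    src = expression;
    out[0] = '\0';
    temp_count = 0;
    result = gen_expression();
    used = strlen(out);
    snprintf(out + used, sizeof out - used, "%s = %s\n", target, result.name);
    return out;
}
```

For the assignment `b:=a+x*2`, calling `three_address_code("b", "a+x*2")` yields the three lines `t1 = x * 2`, `t2 = a + t1`, and `b = t2`.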

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
