Introduction
Earlier sections introduced compiler preprocessing, lexical analysis, and the lexical analyzer, and touched on the task and process of parsing.
The input to the parser is the sequence of tokens produced by the lexical analyzer. Following the grammar of the language (in its extended form) and drawing on finite state machine theory, the parser generates an abstract syntax tree, which is then traversed to produce the intermediate code, that is, three-address code. This section takes an experimental approach to look at how a parser works internally.
5.1 Experimental Description
The goal is to write a recursive descent parser that performs the grammar check and structural analysis of the token sequence supplied by the lexical analyzer.
The recursive descent parser is written in C and parses a simple language.
5.1.1 The syntax of a simple language to be analyzed
The grammar in extended BNF (EBNF) is as follows:
⑴ <program> ::= begin <statement string> end
⑵ <statement string> ::= <statement> { ; <statement> }
⑶ <statement> ::= <assignment statement>
⑷ <assignment statement> ::= ID := <expression>
⑸ <expression> ::= <term> { + <term> | - <term> }
⑹ <term> ::= <factor> { * <factor> | / <factor> }
⑺ <factor> ::= ID | NUM | ( <expression> )
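In a recursive descent parser each grammar rule typically becomes one function, and an EBNF repetition `{ ... }` becomes a `while` loop. The sketch below illustrates that mapping for rules ⑸ to ⑺ only. It is a simplification rather than the program of this section: it assumes every token is a single character (a letter stands for ID, a digit for NUM), and the names `accepts_expression`, `parse_expression`, `parse_term`, `parse_factor`, and `cur` are made up for the illustration.

```c
#include <ctype.h>

/* Hypothetical helper state: a cursor into the text being checked. */
static const char *cur;

static int parse_expression(void);

/* <factor> ::= ID | NUM | ( <expression> )
   Assumption: ID is one letter, NUM is one digit. */
static int parse_factor(void)
{
    if (isalpha((unsigned char)*cur) || isdigit((unsigned char)*cur)) {
        cur++;
        return 1;
    }
    if (*cur == '(') {
        cur++;
        if (!parse_expression())
            return 0;
        if (*cur != ')')
            return 0;
        cur++;
        return 1;
    }
    return 0;
}

/* <term> ::= <factor> { * <factor> | / <factor> } */
static int parse_term(void)
{
    if (!parse_factor())
        return 0;
    while (*cur == '*' || *cur == '/') {   /* the { ... } repetition */
        cur++;
        if (!parse_factor())
            return 0;
    }
    return 1;
}

/* <expression> ::= <term> { + <term> | - <term> } */
static int parse_expression(void)
{
    if (!parse_term())
        return 0;
    while (*cur == '+' || *cur == '-') {   /* the { ... } repetition */
        cur++;
        if (!parse_term())
            return 0;
    }
    return 1;
}

/* Returns 1 if the whole string is an <expression>, 0 otherwise. */
int accepts_expression(const char *text)
{
    cur = text;
    return parse_expression() && *cur == '\0';
}
```

Under these assumptions, `accepts_expression("(a+2)*3")` returns 1, while `accepts_expression("a+*b")` returns 0 because no factor follows the `+`.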
5.1.2 Experiment Requirements
Input a token string terminated by "#". If it is a grammatically correct sentence, print "success"; otherwise print "error".
For example:
Input: begin a:=9; x:=2*3; b:=a+x end #
Output: success
Input: x:=a+b*c end #
Output: error
5.2 C Language Implementation
The core idea is to start from the initial state and analyze the input step by step according to the extended grammar until the analysis is complete. If a state mismatch (that is, a syntax error) occurs, the analysis stops. A practical parser would also include an error recovery mechanism so that it can go on to discover further syntax errors and report several of them in one run. Note that parsing depends on lexical analysis, so this code also contains the lexical analyzer from the previous section.
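One common form of error recovery is panic mode: after reporting an error, the parser discards tokens until it reaches a synchronizing token and then resumes. The sketch below is only an illustration of that idea and is not part of the program in this section. It assumes the tokens have already been scanned into an array terminated by the code 0, it reuses the numeric token codes of the listing, it simplifies a statement to `ID := operand { + operand }`, and the names `count_statement_errors`, `parse_statement`, `synchronize`, and `tok` are hypothetical.

```c
/* Token codes as used by the listing in this section:
   0 '#', 6 end, 10 ID, 11 NUM, 13 '+', 18 ':=', 26 ';'. */
enum { T_EOF = 0, T_END = 6, T_ID = 10, T_NUM = 11,
       T_PLUS = 13, T_ASSIGN = 18, T_SEMI = 26 };

static const int *tok;   /* hypothetical cursor into a pre-scanned token array */

/* Panic mode: discard tokens until a synchronizing token is reached. */
static void synchronize(void)
{
    while (*tok != T_SEMI && *tok != T_END && *tok != T_EOF)
        tok++;
}

/* Simplified statement: ID := operand { + operand }.
   Returns 1 on success, 0 at the first mismatch. */
static int parse_statement(void)
{
    if (*tok != T_ID) return 0;
    tok++;
    if (*tok != T_ASSIGN) return 0;
    tok++;
    if (*tok != T_ID && *tok != T_NUM) return 0;
    tok++;
    while (*tok == T_PLUS) {
        tok++;
        if (*tok != T_ID && *tok != T_NUM) return 0;
        tok++;
    }
    return 1;
}

/* Parses a statement string and returns the number of faulty statements.
   The token array must end with T_EOF. */
int count_statement_errors(const int *tokens)
{
    int errors = 0;
    tok = tokens;
    for (;;) {
        int ok = parse_statement();
        if (!ok || (*tok != T_SEMI && *tok != T_END && *tok != T_EOF)) {
            errors++;
            synchronize();      /* resume at the next ';', 'end' or '#' */
        }
        if (*tok != T_SEMI)
            break;
        tok++;                  /* consume ';' and try the next statement */
    }
    return errors;
}
```

Choosing `;` and `end` as synchronizing tokens means one bad statement costs at most the rest of that statement, so a later error in another statement is still found in the same run.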
#include <stdio.h>
#include <string.h>

char prog[100], token[8], ch;   /* input buffer, current word, current character */
const char *rwtab[6] = {"begin", "if", "then", "while", "do", "end"};   /* reserved words */
int syn, p, m, n, sum;          /* syn: code of the current token; p: read position in prog */
int kk;                         /* error flag, set to 1 once an error has been reported */

void factor(void);
void expression(void);
void yucu(void);
void term(void);
void statement(void);
void lrparser(void);
void scaner(void);

int main(void)
{
    int c;
    p = kk = 0;
    printf("\nplease input a string (end with '#'):\n");
    /* Read up to and including '#'; stop early on EOF or when the buffer is full. */
    while (p < (int)sizeof prog - 1 && (c = getchar()) != EOF) {
        prog[p++] = (char)c;
        if (c == '#')
            break;
    }
    if (p == 0 || prog[p - 1] != '#')
        prog[p++] = '#';            /* the scanner relies on the end marker */
    p = 0;
    scaner();                       /* read the first token */
    lrparser();
    return 0;
}

/* <program> ::= begin <statement string> end */
void lrparser(void)
{
    if (syn == 1) {                 /* begin */
        scaner();                   /* read the next token */
        yucu();
        if (syn == 6) {             /* end */
            scaner();
            if ((syn == 0) && (kk == 0))
                printf("success!\n");
            else if (kk == 0) {
                printf("error: unexpected text after 'end'!\n");
                kk = 1;
            }
        } else {
            if (kk != 1)
                printf("error: missing 'end'!\n");
            kk = 1;
        }
    } else {
        printf("error: missing 'begin'!\n");
        kk = 1;
    }
}

/* <statement string> ::= <statement> { ; <statement> } */
void yucu(void)
{
    statement();
    while (syn == 26) {             /* ';' */
        scaner();                   /* read the next token */
        if (syn != 6)               /* tolerate a ';' directly before end */
            statement();
    }
}

/* <statement> ::= ID := <expression> */
void statement(void)
{
    if (syn == 10) {                /* ID */
        scaner();
        if (syn == 18) {            /* ':=' */
            scaner();
            expression();
        } else {
            printf("error: ':=' expected!\n");
            kk = 1;
        }
    } else {
        printf("error: wrong statement!\n");
        kk = 1;
    }
}

/* <expression> ::= <term> { + <term> | - <term> } */
void expression(void)
{
    term();
    while ((syn == 13) || (syn == 14)) {
        scaner();                   /* read the next token */
        term();
    }
}

/* <term> ::= <factor> { * <factor> | / <factor> } */
void term(void)
{
    factor();
    while ((syn == 15) || (syn == 16)) {
        scaner();                   /* read the next token */
        factor();
    }
}

/* <factor> ::= ID | NUM | ( <expression> ) */
void factor(void)
{
    if ((syn == 10) || (syn == 11)) {
        scaner();
    } else if (syn == 27) {         /* '(' */
        scaner();
        expression();
        if (syn == 28) {            /* ')' */
            scaner();
        } else {
            printf("error: ')' expected!\n");
            kk = 1;
        }
    } else {
        printf("error: wrong expression!\n");
        kk = 1;
    }
}

/* Lexical analyzer: reads the next token from prog and stores its code in syn. */
void scaner(void)
{
    sum = 0;
    for (m = 0; m < 8; m++)
        token[m] = '\0';
    m = 0;
    ch = prog[p++];
    while (ch == ' ' || ch == '\t' || ch == '\n' || ch == '\r')
        ch = prog[p++];
    if ((ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z')) {
        /* identifier or reserved word; characters beyond the 7th are dropped */
        while ((ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z') ||
               (ch >= '0' && ch <= '9')) {
            if (m < (int)sizeof token - 1)
                token[m++] = ch;
            ch = prog[p++];
        }
        p--;
        syn = 10;
        token[m] = '\0';
        for (n = 0; n < 6; n++)
            if (strcmp(token, rwtab[n]) == 0) {
                syn = n + 1;
                break;
            }
    } else if (ch >= '0' && ch <= '9') {
        /* unsigned integer constant */
        while (ch >= '0' && ch <= '9') {
            sum = sum * 10 + ch - '0';
            ch = prog[p++];
        }
        p--;
        syn = 11;
    } else {
        switch (ch) {
        case '<':
            ch = prog[p++];
            if (ch == '>') syn = 21;
            else if (ch == '=') syn = 22;
            else { syn = 20; p--; }
            break;
        case '>':
            ch = prog[p++];
            if (ch == '=') syn = 24;
            else { syn = 23; p--; }
            break;
        case ':':
            ch = prog[p++];
            if (ch == '=') syn = 18;
            else { syn = 17; p--; }
            break;
        case '+': syn = 13; break;
        case '-': syn = 14; break;
        case '*': syn = 15; break;
        case '/': syn = 16; break;
        case '(': syn = 27; break;
        case ')': syn = 28; break;
        case '=': syn = 25; break;
        case ';': syn = 26; break;
        case '#': syn = 0; break;
        default: syn = -1; break;
        }
    }
}
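The listing identifies every token class by a bare integer stored in `syn`, which makes conditions such as `syn==18` hard to read. As a reading aid, the codes can be given names. The numeric values below are read off the listing; the enumerator names and the helper `token_name` are my own and do not appear in the original program.

```c
/* Names for the token codes used by the listing (names are hypothetical). */
enum token_code {
    TOK_ERROR = -1,                 /* unrecognized character */
    TOK_EOF = 0,                    /* '#' end marker */
    TOK_BEGIN = 1, TOK_IF = 2, TOK_THEN = 3,
    TOK_WHILE = 4, TOK_DO = 5, TOK_END = 6,
    TOK_ID = 10, TOK_NUM = 11,
    TOK_PLUS = 13, TOK_MINUS = 14, TOK_STAR = 15, TOK_SLASH = 16,
    TOK_COLON = 17, TOK_ASSIGN = 18,
    TOK_LT = 20, TOK_NE = 21, TOK_LE = 22,
    TOK_GT = 23, TOK_GE = 24, TOK_EQ = 25,
    TOK_SEMI = 26, TOK_LPAREN = 27, TOK_RPAREN = 28
};

/* Returns a printable spelling for a token code, or "?" if it is unknown. */
const char *token_name(int code)
{
    switch (code) {
    case TOK_EOF:    return "#";
    case TOK_BEGIN:  return "begin";
    case TOK_IF:     return "if";
    case TOK_THEN:   return "then";
    case TOK_WHILE:  return "while";
    case TOK_DO:     return "do";
    case TOK_END:    return "end";
    case TOK_ID:     return "ID";
    case TOK_NUM:    return "NUM";
    case TOK_PLUS:   return "+";
    case TOK_MINUS:  return "-";
    case TOK_STAR:   return "*";
    case TOK_SLASH:  return "/";
    case TOK_COLON:  return ":";
    case TOK_ASSIGN: return ":=";
    case TOK_LT:     return "<";
    case TOK_NE:     return "<>";
    case TOK_LE:     return "<=";
    case TOK_GT:     return ">";
    case TOK_GE:     return ">=";
    case TOK_EQ:     return "=";
    case TOK_SEMI:   return ";";
    case TOK_LPAREN: return "(";
    case TOK_RPAREN: return ")";
    default:         return "?";
    }
}
```

With these names a test such as `syn==18` could be written as `syn == TOK_ASSIGN`, and `token_name(syn)` could be used in error messages to show which token was actually found.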
5.3 Summary
The core of syntax analysis is this: starting from the initial state and drawing on finite state machine theory, the parser analyzes the input state by state according to the extended grammar of the language in order to obtain the syntax tree. Intermediate code (three-address code) is then generated in preparation for the later translation to assembly. The code in this section performs only the state analysis, that is, it checks the syntax without building a tree, but it is still a great help in understanding how a parser works. Readers can draw the flow chart of the code themselves; what is learned by doing so is hard to put into words.
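To give an idea of the step that follows the syntax check, the sketch below emits three-address code for an expression while it is being parsed. It does not build an explicit syntax tree and is not part of the program in this section. It assumes single-character operands (one letter or one digit) and no whitespace, and the names `gen_three_address`, `gen_expr`, `gen_term`, `gen_factor`, and `emit` are hypothetical.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

static const char *src;     /* cursor into the expression text */
static char *out;           /* position where the next instruction is written */
static size_t out_left;     /* space left in the output buffer */
static int temp_count;      /* number of temporaries created so far */
static int failed;          /* set on a syntax error or a full buffer */

/* Appends one instruction "dst = a op b" to the output buffer. */
static void emit(const char *dst, const char *a, char op, const char *b)
{
    int len = snprintf(out, out_left, "%s = %s %c %s\n", dst, a, op, b);
    if (len < 0 || (size_t)len >= out_left) { failed = 1; return; }
    out += len;
    out_left -= (size_t)len;
}

static void gen_expr(char *place);

/* factor: one letter or digit, or a parenthesized expression */
static void gen_factor(char *place)
{
    place[0] = '\0';
    if (isalnum((unsigned char)*src)) {
        place[0] = *src++;
        place[1] = '\0';
    } else if (*src == '(') {
        src++;
        gen_expr(place);
        if (*src == ')') src++; else failed = 1;
    } else {
        failed = 1;
    }
}

/* term: factor { (*|/) factor }; each operator yields a new temporary */
static void gen_term(char *place)
{
    gen_factor(place);
    while (!failed && (*src == '*' || *src == '/')) {
        char op = *src++, rhs[16], tmp[16];
        gen_factor(rhs);
        if (failed) return;
        snprintf(tmp, sizeof tmp, "t%d", ++temp_count);
        emit(tmp, place, op, rhs);
        strcpy(place, tmp);
    }
}

/* expression: term { (+|-) term } */
static void gen_expr(char *place)
{
    gen_term(place);
    while (!failed && (*src == '+' || *src == '-')) {
        char op = *src++, rhs[16], tmp[16];
        gen_term(rhs);
        if (failed) return;
        snprintf(tmp, sizeof tmp, "t%d", ++temp_count);
        emit(tmp, place, op, rhs);
        strcpy(place, tmp);
    }
}

/* Writes the three-address code for text into buf; returns 1 on success. */
int gen_three_address(const char *text, char *buf, size_t cap)
{
    char place[16];
    if (cap == 0) return 0;
    src = text; out = buf; out_left = cap; temp_count = 0; failed = 0;
    buf[0] = '\0';
    gen_expr(place);
    return !failed && *src == '\0';
}
```

For the input `a+b*c` the buffer receives the two instructions `t1 = b * c` and `t2 = a + t1`, which reflects the higher precedence of `*` that the grammar encodes by placing it in the term rule.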