For this semester's Compiler Principles course project, my topic was lexical, syntax, and semantic analysis of arithmetic expressions. My skills were fairly weak at the time, so I implemented only the recursive descent method.
First, the user enters an arithmetic expression, which can contain basic operators, parentheses, numbers, and user-defined variables.
Lexical analysis checks that the words and variables are well formed; syntax analysis checks that the arithmetic expression is syntactically correct and outputs the productions that form its syntax tree; semantic analysis outputs the quadruples (four-tuple intermediate code).
Final result:
[Figures: example input; lexical analysis results; syntax analysis results; semantic analysis results]
The syntax of an arithmetic expression is as follows:
〈unsigned integer〉 ::= 〈digit〉{〈digit〉}
〈identifier〉 ::= 〈letter〉{〈letter〉|〈digit〉}
〈expression〉 ::= [+|-]〈term〉{〈addition operator〉〈term〉}
〈term〉 ::= 〈factor〉{〈multiplication operator〉〈factor〉}
〈factor〉 ::= 〈identifier〉|〈unsigned integer〉|'('〈expression〉')'
〈addition operator〉 ::= +|-
〈multiplication operator〉 ::= *|/
Note:
# Identifiers start with a letter and contain only letters and digits.
# Letters include both uppercase and lowercase letters.
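The two lexical rules above can be checked with regular expressions. This is only a small sketch: the names IDENTIFIER, UNSIGNED_INT, and classify are illustrative and do not appear in the original program, and the return values 1 and 2 are assumed to mirror the category codes defined in the lexical analysis section.

```python
import re

# Illustrative patterns for the two word classes (names are assumptions).
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9]*")   # a letter, then letters or digits
UNSIGNED_INT = re.compile(r"[0-9]+")               # one or more digits


def classify(word):
    """Return 1 for an identifier, 2 for an unsigned integer, otherwise None."""
    if IDENTIFIER.fullmatch(word):
        return 1
    if UNSIGNED_INT.fullmatch(word):
        return 2
    return None


print(classify("abc1"))  # 1
print(classify("123"))   # 2
print(classify("1a"))    # None
```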
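The grammar in symbolic form:
identifier: an identifier; digit: a number (unsigned integer); M: the expression; E: the expression without its leading sign; T: a term; F: a factor
M -> +E | -E | E
E -> E+T | E-T | T
T -> T*F | T/F | F
F -> (E) | identifier | digit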
Eliminating left recursion gives the improved grammar (& denotes the empty string):
1. M -> +E | -E | E
2. E -> T E~
3. E~ -> +T E~ | -T E~ | &
4. T -> F T~
5. T~ -> *F T~ | /F T~ | &
6. F -> (E) | identifier | digit
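The rewrite from E -> E+T | E-T | T to rules 2 and 3 follows the standard pattern for removing immediate left recursion: A -> A a | b becomes A -> b A~ and A~ -> a A~ | &. Here is a rough sketch of that transformation. The function name eliminate_left_recursion and the representation of a production body as a list of symbols are assumptions made for illustration; the original program has the grammar hard-coded instead.

```python
def eliminate_left_recursion(head, bodies, empty="&"):
    """Rewrite A -> A a | b as A -> b A~ and A~ -> a A~ | &.

    head is the nonterminal name; each body is a list of grammar symbols.
    Only immediate left recursion is handled.
    """
    tail = head + "~"
    # Bodies that start with the head itself, with that leading symbol removed.
    recursive = [body[1:] for body in bodies if body and body[0] == head]
    # Bodies that do not start with the head.
    others = [body for body in bodies if not body or body[0] != head]
    if not recursive:
        return {head: bodies}
    return {
        head: [body + [tail] for body in others],
        tail: [body + [tail] for body in recursive] + [[empty]],
    }


rules = eliminate_left_recursion("E", [["E", "+", "T"], ["E", "-", "T"], ["T"]])
for head, bodies in rules.items():
    print(head, "->", " | ".join(" ".join(body) for body in bodies))
# E -> T E~
# E~ -> + T E~ | - T E~ | &
```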
1. Lexical Analysis
Word Category Definition
Operator: ( ) + - * /    Category code: 3
Identifier: 〈letter〉{〈letter〉|〈digit〉}    Category code: 1
Unsigned integer: 〈digit〉{〈digit〉}    Category code: 2
Design Ideas
Read the input string one character at a time and use the DFA to determine the type of each word. Each word and its category code is stored in the symbol table dictionary, and the word itself is pushed onto the word stack.
1. If a letter is received, the word is an identifier; keep accepting letters and digits until a character that is neither a letter nor a digit appears.
2. If a digit is received at the start of a word (at the beginning of the input or right after an operator), the word is an unsigned integer; keep accepting digits until a non-digit character appears.
3. If an operator is received, record it and continue processing.
[Figure: a simple sketch of the DFA]
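One way to picture the DFA is as a transition table. The sketch below is a rough illustration under stated assumptions, not the program listed at the end of this post: the state numbers reuse the category codes (1 for identifier, 2 for unsigned integer), state 0 is an assumed start state, and the names char_class, TRANSITIONS, and tokenize are made up for this example.

```python
LETTERS = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
DIGITS = "0123456789"
OPERATORS = "+-*/()"


def char_class(ch):
    """Map one character to the input class used by the DFA."""
    if ch in LETTERS:
        return "letter"
    if ch in DIGITS:
        return "digit"
    if ch in OPERATORS:
        return "operator"
    return None


# States: 0 = start of a word, 1 = inside an identifier, 2 = inside an integer.
TRANSITIONS = {
    (0, "letter"): 1,
    (0, "digit"): 2,
    (1, "letter"): 1,
    (1, "digit"): 1,
    (2, "digit"): 2,
}


def tokenize(text):
    """Return a list of (word, category code) pairs, or raise ValueError."""
    tokens = []
    state, start = 0, 0
    for i, ch in enumerate(text):
        kind = char_class(ch)
        if kind is None:
            raise ValueError("illegal character: " + repr(ch))
        if kind == "operator":
            if state != 0:                      # an operator ends the current word
                tokens.append((text[start:i], state))
            tokens.append((ch, 3))
            state, start = 0, i + 1
            continue
        next_state = TRANSITIONS.get((state, kind))
        if next_state is None:                  # only (2, "letter") is missing
            raise ValueError("a number cannot contain letters")
        state = next_state
    if state != 0:                              # flush the last word
        tokens.append((text[start:], state))
    return tokens


print(tokenize("a1+23"))  # [('a1', 1), ('+', 3), ('23', 2)]
```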
Data Structure
Symbol table: dic={}
Word stack: table=[] (the words of the input, in order)
2. Syntax analysis
A handler procedure is designed for each nonterminal in the grammar, and each handler follows the order in which its productions accept input. Each time the program selects a production, the production is saved so that it can be printed. If all the words in the word stack are accepted, the syntax is correct; otherwise a syntax error is reported.
Data Structure
dic={} # symbol table
table=[] # Word stack
wenfa=[] # productions used to derive the input string
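The idea of one handler per nonterminal can be shown in a compact form. The following is a minimal sketch rather than the program listed at the end of this post: it assumes the words arrive as (text, category code) pairs, uses nested functions instead of a class, and the name parse and the returned list of productions are illustrative choices.

```python
def parse(tokens):
    """Check a list of (text, category) pairs against the grammar.

    Returns the productions used, in order, or raises SyntaxError.
    """
    tokens = tokens + [("#", 0)]    # end marker, never consumed
    pos = 0
    used = []

    def peek():
        return tokens[pos][0]

    def advance():
        nonlocal pos
        pos += 1

    def m():                        # M -> +E | -E | E
        if peek() in ("+", "-"):
            used.append("M -> " + peek() + "E")
            advance()
        else:
            used.append("M -> E")
        e()

    def e():                        # E -> T E~
        used.append("E -> T E~")
        t()
        e_tail()

    def e_tail():                   # E~ -> +T E~ | -T E~ | &
        if peek() in ("+", "-"):
            used.append("E~ -> " + peek() + "T E~")
            advance()
            t()
            e_tail()
        else:
            used.append("E~ -> &")

    def t():                        # T -> F T~
        used.append("T -> F T~")
        f()
        t_tail()

    def t_tail():                   # T~ -> *F T~ | /F T~ | &
        if peek() in ("*", "/"):
            used.append("T~ -> " + peek() + "F T~")
            advance()
            f()
            t_tail()
        else:
            used.append("T~ -> &")

    def f():                        # F -> (E) | identifier | digit
        text, category = tokens[pos]
        if text == "(":
            used.append("F -> (E)")
            advance()
            e()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
        elif category == 1:
            used.append("F -> identifier " + text)
            advance()
        elif category == 2:
            used.append("F -> digit " + text)
            advance()
        else:
            raise SyntaxError("unexpected word " + repr(text))

    m()
    if peek() != "#":               # some words were left unread
        raise SyntaxError("unexpected word " + repr(peek()))
    return used


for production in parse([("a", 1), ("+", 3), ("2", 2)]):
    print(production)
```

Raising SyntaxError at the point of failure keeps each handler short; the caller decides how to report the error.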
3. Semantic analysis and intermediate code generation
Design Ideas
Here I again use recursive descent. Instead of reusing the output of the syntax analysis, I start from the lexical analysis results and design a procedure for each nonterminal. As soon as a procedure has enough operands to form a quadruple, it outputs the quadruple, assigns the result to a temporary variable, and passes that variable to its parent.
Data Structure
table=[] # Word stack
siyuan=[] # quadruples
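To make the quadruple idea concrete, here is a small sketch of how temporaries can be generated while descending. It is a simplified illustration under several assumptions: the input is a list of plain strings that has already passed syntax analysis, the repetition in E~ and T~ is written as a loop, and the names quadruples, emit, expression, term, and factor are invented for this example. Each quadruple is emitted as soon as both operands are known, so a-b-c is computed as (a-b)-c.

```python
def quadruples(tokens):
    """Translate a syntactically valid word list into (op, arg1, arg2, result) tuples."""
    tokens = tokens + ["#"]          # end marker
    pos = 0
    quads = []
    temp_count = 0

    def emit(op, left, right):
        """Append one quadruple and return the temporary that holds its result."""
        nonlocal temp_count
        temp_count += 1
        result = "T" + str(temp_count)
        quads.append((op, left, right, result))
        return result

    def expression():
        nonlocal pos
        left = term()
        while tokens[pos] in ("+", "-"):
            op = tokens[pos]
            pos += 1
            right = term()
            left = emit(op, left, right)   # the result becomes the next left operand
        return left

    def term():
        nonlocal pos
        left = factor()
        while tokens[pos] in ("*", "/"):
            op = tokens[pos]
            pos += 1
            right = factor()
            left = emit(op, left, right)
        return left

    def factor():
        nonlocal pos
        word = tokens[pos]
        pos += 1
        if word == "(":
            value = expression()
            pos += 1                       # skip the closing ')'
            return value
        return word                        # identifier or unsigned integer

    # Top level: an optional leading sign, then the final assignment to "out".
    if tokens[pos] in ("+", "-"):
        sign = tokens[pos]
        pos += 1
        value = expression()
        quads.append((sign, "0", value, "out"))
    else:
        value = expression()
        quads.append(("=", value, "0", "out"))
    return quads


for quad in quadruples(["a", "-", "b", "-", "c"]):
    print(quad)
# ('-', 'a', 'b', 'T1')
# ('-', 'T1', 'c', 'T2')
# ('=', 'T2', '0', 'out')
```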
Source:
# -*- coding: utf-8 -*-
import sys

letter = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
number = '0123456789'
operater = '+-*/()'

dic = {}      # symbol table
table = []    # word stack
wenfa = []    # productions used to derive the input string
siyuan = []   # quadruples

##################################### Lexical analysis #####################################
def cifa(string):  # lexical analysis
    print()
    m = 0      # start index of the current word
    state = 0  # 1: identifier  2: digit string  3: operator
    for i in range(len(string)):
        if string[i] in operater:  # operator
            if state == 1:  # the word before the operator is an identifier
                print(string[m:i], 'is an identifier, category code: 1')
                dic[string[m:i]] = 1
                table.append(string[m:i])
            elif state == 2:  # the word before the operator is a number
                print(string[m:i], 'is a number, category code: 2')
                dic[string[m:i]] = 2
                table.append(string[m:i])
            m = i + 1
            state = 3
            print(string[i], 'is an operator, category code: 3')
            dic[string[i]] = 3
            table.append(string[i])
        elif string[i] in number:  # digit
            if i == m:  # first character of a word: it is an unsigned integer
                state = 2
        elif string[i] in letter:  # letter
            if state == 2:  # an unsigned integer cannot contain letters
                print('Lexical analysis error: a number cannot contain letters')
                sys.exit(0)
            if i == m:  # first character of a word: it is an identifier
                state = 1
        else:  # none of the above, so the character is illegal
            print('Lexical analysis error: illegal character')
            sys.exit(0)
    if state == 1:  # the string ends with an identifier
        print(string[m:], 'is an identifier, category code: 1')
        dic[string[m:]] = 1
        table.append(string[m:])
    elif state == 2:  # the string ends with an unsigned integer
        print(string[m:], 'is an unsigned integer, category code: 2')
        dic[string[m:]] = 2
        table.append(string[m:])
    table.append('#')  # end marker
    print('Word stack:', table, '\nLexical analysis passed')

##################################### Syntax analysis #####################################
# Basic grammar:
#   M  -> +E | -E | E
#   E  -> T E1
#   E1 -> +T E1 | -T E1 | &
#   T  -> F T1
#   T1 -> *F T1 | /F T1 | &
#   F  -> (E) | identifier | digit
class yufa():  # syntax analysis
    def __init__(self):
        self.i = 0  # stack pointer
        try:  # any exception raised while parsing means a syntax error
            self.m()
        except Exception:
            print('\nSyntax analysis detected an error')
            sys.exit(0)

    def m(self):  # procedure for M
        if table[self.i] == '+':
            self.i += 1
            wenfa.append('M -> +E')
            self.e()
        elif table[self.i] == '-':
            self.i += 1
            wenfa.append('M -> -E')
            self.e()
        else:
            wenfa.append('M -> E')
            self.e()
        if self.i != len(table) - 1:  # words are left over, so the syntax is wrong
            print("\nSyntax analysis detected an error: '(' must be preceded by an operator")
            sys.exit(0)
        else:  # everything was accepted: print the productions used
            print('\nThe productions used for the string are:')
            for i in wenfa:
                print(i)
            print('Syntax is correct')

    def e(self):  # procedure for E
        wenfa.append('E -> TE1')
        self.t()
        self.e1()

    def e1(self):  # procedure for E1
        if table[self.i] == '+':
            self.i += 1
            wenfa.append('E1 -> +TE1')
            self.t()
            self.e1()
        elif table[self.i] == '-':
            self.i += 1
            wenfa.append('E1 -> -TE1')
            self.t()
            self.e1()
        else:
            wenfa.append('E1 -> &')

    def t(self):  # procedure for T
        wenfa.append('T -> FT1')
        self.f()
        self.t1()

    def t1(self):  # procedure for T1
        if table[self.i] == '*':
            self.i += 1
            wenfa.append('T1 -> *FT1')
            self.f()
            self.t1()
        elif table[self.i] == '/':
            self.i += 1
            wenfa.append('T1 -> /FT1')
            self.f()
            self.t1()
        else:
            wenfa.append('T1 -> &')

    def f(self):  # procedure for F
        if table[self.i] == '(':
            wenfa.append('F -> (E)')
            self.i += 1
            self.e()
            if table[self.i] != ')':
                raise Exception
            self.i += 1
        elif dic[table[self.i]] == 1:
            wenfa.append('F -> Identifier ' + str(table[self.i]))
            self.i += 1
        elif dic[table[self.i]] == 2:
            wenfa.append('F -> Digit ' + str(table[self.i]))
            self.i += 1
        else:
            raise Exception  # none of the alternatives match

##################################### Semantic analysis #####################################
class yuyi:
    def __init__(self):
        print('\nSemantic analysis results (quadruples):')
        self.i = 0     # stack pointer
        self.flag = 0  # number of temporary variables T created so far
        self.m()
        for i in siyuan:  # print the quadruples
            print(i)

    def gen(self, op, left, right):  # emit one quadruple, return its temporary
        self.flag += 1
        temp = 'T' + str(self.flag)
        siyuan.append('(' + op + ',' + left + ',' + right + ',' + temp + ')')
        return temp

    def m(self):  # procedure for M
        if table[self.i] == '+':
            self.i += 1
            ret1 = self.e()
            siyuan.append('(+,0,' + ret1 + ',out)')
        elif table[self.i] == '-':
            self.i += 1
            ret2 = self.e()
            siyuan.append('(-,0,' + ret2 + ',out)')
        else:
            ret3 = self.e()
            siyuan.append('(=,' + ret3 + ',0,out)')

    def e(self):  # procedure for E
        return self.e1(self.t())

    def e1(self, left):  # procedure for E1; left is the operand parsed so far
        if table[self.i] in ('+', '-'):
            op = table[self.i]
            self.i += 1
            right = self.t()
            # Emit before recursing so that a-b-c is computed as (a-b)-c.
            return self.e1(self.gen(op, left, right))
        return left  # E1 -> &: pass the operand back to the parent

    def t(self):  # procedure for T
        return self.t1(self.f())

    def t1(self, left):  # procedure for T1; left is the operand parsed so far
        if table[self.i] in ('*', '/'):
            op = table[self.i]
            self.i += 1
            right = self.f()
            return self.t1(self.gen(op, left, right))
        return left  # T1 -> &: pass the operand back to the parent

    def f(self):  # procedure for F
        if table[self.i] == '(':
            self.i += 1
            ret1 = self.e()
            self.i += 1  # skip ')'
            return str(ret1)
        else:  # identifier or integer: pass it to the parent
            temp = self.i
            self.i += 1
            return table[temp]

##################################### Main program #####################################
if __name__ == '__main__':
    string = input('please input expression:')
    cifa(string)
    yufa()
    yuyi()
Lexical, syntax, and semantic analysis of arithmetic expressions in Python (an application of compiler principles)