While browsing the JavaEye blog channel, I came across Neuronr's blog posts about his compiler construction practice course and found them interesting, so I decided to build the same compiler in parallel using different tools. The assignment requires C; I will use some more convenient languages instead.
This article describes the Jerry language with an ANTLR 3.1 grammar and experiments in ANTLRWorks, using the generated parser to obtain the parse tree of Jerry program code.
Anyone who has looked at parser generators will not find ANTLR an unfamiliar name, but here is a brief introduction anyway. ANTLR is being used more and more to generate parsers. To name a few examples, the XRuby project and Jython currently use it, and the SapphireSteel Ruby and ActionScript IDEs also use ANTLR to generate their compiler front ends.
ANTLR (ANother Tool for Language Recognition) is a language tool developed by Professor Terence Parr to help build parsers, compilers, interpreters, and other language-related programs. It evolved from PCCTS and generates recursive-descent parsers that use the LL(*) algorithm. Its grammar notation resembles common EBNF (Extended Backus–Naur Form) and is easy to understand and maintain.
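To make "recursive descent" a little more concrete, here is a minimal hand-written sketch in Python. It is not ANTLR output. It assumes a toy grammar invented for this illustration (`sum : product (('+' | '-') product)*` and `product : NUMBER ('*' NUMBER)*`), and all class and function names are hypothetical. Each rule becomes one function, and each EBNF `( ... )*` becomes a loop driven by one token of lookahead. LL(*) goes further by looking ahead as far as it needs to choose an alternative.

```python
import re

# Hypothetical toy grammar (not part of Jerry):
#   sum     : product ( ('+' | '-') product )*
#   product : NUMBER ( '*' NUMBER )*

def tokenize(text):
    """Split the input into integer and operator tokens, ignoring whitespace."""
    tokens = re.findall(r"\d+|[+\-*]", text)
    # Reject input containing characters outside the toy grammar.
    if "".join(tokens) != re.sub(r"\s+", "", text):
        raise SyntaxError("unexpected character in input")
    return tokens

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        """Return the next token without consuming it (one token of lookahead)."""
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def match_number(self):
        tok = self.peek()
        if tok is None or not tok.isdigit():
            raise SyntaxError(f"expected NUMBER, got {tok!r}")
        self.pos += 1
        return int(tok)

    def parse_product(self):
        # product : NUMBER ( '*' NUMBER )*
        value = self.match_number()
        while self.peek() == "*":
            self.pos += 1
            value *= self.match_number()
        return value

    def parse_sum(self):
        # sum : product ( ('+' | '-') product )*
        value = self.parse_product()
        while self.peek() in ("+", "-"):
            op = self.tokens[self.pos]
            self.pos += 1
            rhs = self.parse_product()
            value = value + rhs if op == "+" else value - rhs
        return value

def evaluate(text):
    """Parse the whole input as a `sum` and return its value."""
    parser = Parser(tokenize(text))
    result = parser.parse_sum()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return result
```

This sketch evaluates the expression directly instead of building a parse tree, which keeps it short. A parser generated by ANTLR follows the same rule-per-method shape but is produced from the grammar file.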
After reading the compiler construction practice [task assignment], I figured the grammar should look roughly like this:
Jerry.g (ANTLR 3.1 syntax):
grammar Jerry;

program
    :   statementList EOF
    ;

statementList
    :   statement*
    ;

statement
    :   expressionStatement
    |   variableDeclaration
    |   blockStatement
    |   ifStatement
    |   whileStatement
    |   breakStatement
    |   readStatement
    |   writeStatement
    ;

expressionStatement
    :   expression SEMICOLON
    ;

variableDeclaration
    :   typeSpecifier Identifier ( LBRACK Integer RBRACK )* initializer?
        ( COMMA Identifier ( LBRACK Integer RBRACK )* initializer? )*
        SEMICOLON
    ;

typeSpecifier
    :   INT | REAL
    ;

initializer
    :   EQ ( expression | arrayLiteral )
    ;

arrayLiteral
    :   LBRACE
        ( expression | arrayLiteral ) ( COMMA ( expression | arrayLiteral ) )*
        RBRACE
    ;

blockStatement
    :   LBRACE statementList RBRACE
    ;

ifStatement
    :   IF LPAREN expression RPAREN statement ( ELSE statement )?
    ;

whileStatement
    :   WHILE LPAREN expression RPAREN statement
    ;

breakStatement
    :   BREAK SEMICOLON
    ;

readStatement
    :   READ variableAccess SEMICOLON
    ;

writeStatement
    :   WRITE expression SEMICOLON
    ;

variableAccess
    :   Identifier ( LBRACK Integer RBRACK )*
    ;

expression
    :   assignmentExpression
    |   logicalOrExpression
    ;

assignmentExpression
    :   variableAccess EQ expression
    ;

logicalOrExpression
    :   logicalAndExpression ( OROR logicalAndExpression )*
    ;

logicalAndExpression
    :   relationalExpression ( ANDAND relationalExpression )*
    ;

relationalExpression
    :   additiveExpression ( relationalOperator additiveExpression )?
    |   BANG relationalExpression
    ;

additiveExpression
    :   multiplicativeExpression ( additiveOperator multiplicativeExpression )*
    ;

multiplicativeExpression
    :   primaryExpression ( multiplicativeOperator primaryExpression )*
    ;

primaryExpression
    :   variableAccess
    |   Integer
    |   RealNumber
    |   LPAREN expression RPAREN
    |   SUB primaryExpression
    ;

relationalOperator
    :   LT | GT | EQEQ | LE | GE | NE
    ;

additiveOperator
    :   ADD | SUB
    ;

multiplicativeOperator
    :   MUL | DIV
    ;

// lexer rules

LPAREN  :   '('
    ;

RPAREN  :   ')'
    ;

LBRACK  :   '['
    ;

RBRACK  :   ']'
    ;

LBRACE  :   '{'
    ;

RBRACE  :   '}'
    ;

COMMA   :   ','
    ;

SEMICOLON
    :   ';'
    ;

ADD :   '+'
    ;

SUB :   '-'
    ;

MUL :   '*'
    ;

DIV :   '/'
    ;

EQEQ    :   '=='
    ;

NE  :   '!='
    ;

LT  :   '<'
    ;

LE  :   '<='
    ;

GT  :   '>'
    ;

GE  :   '>='
    ;

BANG    :   '!'
    ;

ANDAND  :   '&&'
    ;

OROR    :   '||'
    ;

EQ  :   '='
    ;

IF  :   'if'
    ;

ELSE    :   'else'
    ;

WHILE   :   'while'
    ;

BREAK   :   'break'
    ;

READ    :   'read'
    ;

WRITE   :   'write'
    ;

INT :   'int'
    ;

REAL    :   'real'
    ;

Identifier
    :   LetterOrUnderscore ( LetterOrUnderscore | Digit )*
    ;

Integer :   Digit+
    ;

RealNumber
    :   Digit+ '.' Digit+
    ;

fragment
Digit   :   '0'..'9'
    ;

fragment
LetterOrUnderscore
    :   Letter | '_'
    ;

fragment
Letter  :   ( 'a'..'z' | 'A'..'Z' )
    ;

WS  :   ( ' ' | '\t' | '\r' | '\n' )+ { $channel = HIDDEN; }
    ;

Comment
    :   '/*' ( options { greedy = false; } : . )* '*/' { $channel = HIDDEN; }
    ;

LineComment
    :   '//' ~( '\n' | '\r' )* '\r'? '\n' { $channel = HIDDEN; }
    ;
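The lexer rules above can be approximated by hand to see which tokens reach the parser. The following is a rough Python sketch built on the standard `re` module. It is not the lexer that ANTLR generates. It assumes the keywords are lowercase (`if`, `while`, and so on) and reuses the token names from the grammar; `tokenize`, `TOKEN_SPEC`, `KEYWORDS` and `HIDDEN` are names made up for this illustration.

```python
import re

# Keywords are recognised by first matching an Identifier and then looking it up.
KEYWORDS = {"if": "IF", "else": "ELSE", "while": "WHILE", "break": "BREAK",
            "read": "READ", "write": "WRITE", "int": "INT", "real": "REAL"}

# Python tries alternatives in order, so longer patterns precede their prefixes
# (Comment before DIV, RealNumber before Integer, EQEQ before EQ, and so on).
TOKEN_SPEC = [
    ("Comment",     r"/\*.*?\*/"),          # non-greedy, like greedy = false
    ("LineComment", r"//[^\r\n]*\r?\n"),
    ("WS",          r"[ \t\r\n]+"),
    ("RealNumber",  r"[0-9]+\.[0-9]+"),
    ("Integer",     r"[0-9]+"),
    ("Identifier",  r"[A-Za-z_][A-Za-z_0-9]*"),
    ("EQEQ", r"=="), ("NE", r"!="), ("LE", r"<="), ("GE", r">="),
    ("ANDAND", r"&&"), ("OROR", r"\|\|"),
    ("LPAREN", r"\("), ("RPAREN", r"\)"),
    ("LBRACK", r"\["), ("RBRACK", r"\]"),
    ("LBRACE", r"\{"), ("RBRACE", r"\}"),
    ("COMMA", r","), ("SEMICOLON", r";"),
    ("ADD", r"\+"), ("SUB", r"-"), ("MUL", r"\*"), ("DIV", r"/"),
    ("LT", r"<"), ("GT", r">"), ("BANG", r"!"), ("EQ", r"="),
]

# re.DOTALL lets '.' inside Comment span several lines.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC),
                    re.DOTALL)

# Token types the grammar sends to the hidden channel; this sketch drops them.
HIDDEN = {"WS", "Comment", "LineComment"}

def tokenize(source):
    """Return a list of (token_type, text) pairs for the visible tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind in HIDDEN:
            continue
        if kind == "Identifier":
            kind = KEYWORDS.get(text, kind)
        tokens.append((kind, text))
    return tokens
```

Calling `tokenize("int i = 0; // counter\n")` returns only the INT, Identifier, EQ, Integer and SEMICOLON tokens. The whitespace and the comment are left out, which mirrors how the parser never sees tokens that the grammar puts on the hidden channel.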