In the previous article we used ANTLR to describe the basic grammar of the Jerry language, and used ANTLRWorks to inspect the parse tree that grammar produces for sample code. As noted there, however, the resulting parse tree carries a lot of redundant information that is useless for subsequent processing. We need to strip out that redundancy to obtain an abstract syntax tree (AST).
Building on the previous grammar, this article adds tree rewrite rules so that the parser produces an abstract syntax tree instead of the parse tree ANTLR generates by default.
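To make the difference concrete, here is a small hand-built sketch in Java for the Jerry statement `write x;`. It does not call ANTLR: `Node`, `wrap`, `parseTree` and `ast` are hypothetical names invented for this illustration, and the parse tree is only an approximation of what ANTLRWorks would display. The AST shape follows the `writeStatement` and `variableAccess` rules in this article's grammar.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hand-built illustration only: Node is a hypothetical stand-in for a tree
// node, not ANTLR's CommonTree, and both trees below are typed in by hand.
final class RewriteSketch {

    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label, Node... kids) {
            this.label = label;
            children.addAll(Arrays.asList(kids));
        }

        // LISP-style rendering: a leaf prints its label,
        // an inner node prints "(label child ...)".
        String toStringTree() {
            if (children.isEmpty()) {
                return label;
            }
            StringBuilder sb = new StringBuilder("(").append(label);
            for (Node child : children) {
                sb.append(' ').append(child.toStringTree());
            }
            return sb.append(')').toString();
        }

        // Total number of nodes in this subtree.
        int size() {
            int n = 1;
            for (Node child : children) {
                n += child.size();
            }
            return n;
        }
    }

    // Wraps `inner` in one single-child node per rule name, outermost name first.
    static Node wrap(Node inner, String... rules) {
        Node n = inner;
        for (int i = rules.length - 1; i >= 0; i--) {
            n = new Node(rules[i], n);
        }
        return n;
    }

    // Approximate parse tree for "write x;": every rule on the way down is a node.
    static Node parseTree() {
        Node operand = wrap(new Node("x"),
                "expression", "logicalOrExpression", "logicalAndExpression",
                "relationalExpression", "additiveExpression",
                "multiplicativeExpression", "primaryExpression", "variableAccess");
        return new Node("statement",
                new Node("writeStatement", new Node("write"), operand, new Node(";")));
    }

    // AST for the same statement: the keyword is the root, the semicolon and the
    // pass-through rule nodes are gone, and an imaginary token tags the access.
    static Node ast() {
        return new Node("write", new Node("SIMPLE_VAR_ACCESS", new Node("x")));
    }

    public static void main(String[] args) {
        System.out.println(parseTree().toStringTree());
        System.out.println(ast().toStringTree()); // (write (SIMPLE_VAR_ACCESS x))
        System.out.println(parseTree().size() + " nodes vs " + ast().size()); // 13 nodes vs 3
    }
}
```

The long chain of single-child rule nodes is exactly the redundancy the rewrite rules are meant to remove; a later processing stage only needs the three nodes of the AST.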
The source code and runtime library referred to in this article are packaged in the attachment. If you would rather not copy and paste, download the attached version directly and use ANTLRWorks to view and edit the grammar file.
The modified grammar file is as follows:
Jerry.g (ANTLR 3.1 grammar file, with Java as the target language)
grammar Jerry;

options {
    language = Java;
    output = AST;
    ASTLabelType = CommonTree;
}

tokens {
    // imaginary tokens
    VAR_DECL;
    SIMPLE_TYPE;
    ARRAY_TYPE;
    ARRAY_LITERAL;
    SIMPLE_VAR_ACCESS;
    ARRAY_VAR_ACCESS;
    UNARY_MINUS;
    BLOCK;
    EXPR_STMT;
}

// parser rules

program
    :   statementList EOF!
        {
            System.out.println(
                null == $statementList.tree ?
                "null" :
                $statementList.tree.toStringTree());
        }
    ;

statementList
    :   statement*
    ;

statement
    :   expressionStatement
    |   variableDeclaration
    |   blockStatement
    |   ifStatement
    |   whileStatement
    |   breakStatement
    |   readStatement
    |   writeStatement
    ;

expressionStatement
    :   expression SEMICOLON
            -> ^( EXPR_STMT expression )
    ;

variableDeclaration
    :   typeSpecifier
        ( Identifier
            (   -> ^( VAR_DECL ^( SIMPLE_TYPE typeSpecifier ) Identifier )
            |   ( LBRACK Integer RBRACK )+
                    -> ^( VAR_DECL ^( ARRAY_TYPE typeSpecifier Integer+ ) Identifier )
            |   EQ expression
                    -> ^( VAR_DECL ^( SIMPLE_TYPE typeSpecifier ) Identifier expression )
            |   ( LBRACK Integer RBRACK )+ EQ arrayLiteral
                    -> ^( VAR_DECL ^( ARRAY_TYPE typeSpecifier Integer+ ) Identifier arrayLiteral )
            )
        )
        ( COMMA id=Identifier
            (   -> $variableDeclaration ^( VAR_DECL ^( SIMPLE_TYPE typeSpecifier ) $id )
            |   ( LBRACK dim1+=Integer RBRACK )+
                    -> $variableDeclaration ^( VAR_DECL ^( ARRAY_TYPE typeSpecifier $dim1+ ) $id )
            |   EQ exp=expression
                    -> $variableDeclaration ^( VAR_DECL ^( SIMPLE_TYPE typeSpecifier ) $id $exp )
            |   ( LBRACK dim2+=Integer RBRACK )+ EQ al=arrayLiteral
                    -> $variableDeclaration ^( VAR_DECL ^( ARRAY_TYPE typeSpecifier $dim2+ ) $id $al )
            )
            // reset the += lists so dimensions from one declarator do not leak into the next
            { if (null != $dim1) $dim1.clear(); if (null != $dim2) $dim2.clear(); }
        )*
        SEMICOLON
    ;

typeSpecifier
    :   INT | REAL
    ;

arrayLiteral
    :   LBRACE
            arrayLiteralElement ( COMMA arrayLiteralElement )*
        RBRACE
            -> ^( ARRAY_LITERAL arrayLiteralElement+ )
    ;

arrayLiteralElement
    :   expression
    |   arrayLiteral
    ;

blockStatement
    :   LBRACE statementList RBRACE
            -> ^( BLOCK statementList )
    ;

ifStatement
    :   IF^ LPAREN! expression RPAREN! statement ( ELSE! statement )?
    ;

whileStatement
    :   WHILE^ LPAREN! expression RPAREN! statement
    ;

breakStatement
    :   BREAK SEMICOLON!
    ;

readStatement
    :   READ^ variableAccess SEMICOLON!
    ;

writeStatement
    :   WRITE^ expression SEMICOLON!
    ;

variableAccess
    :   Identifier
        (   -> ^( SIMPLE_VAR_ACCESS Identifier )
        |   ( LBRACK Integer RBRACK )+
                -> ^( ARRAY_VAR_ACCESS Identifier Integer+ )
        )
    ;

expression
    :   assignmentExpression
    |   logicalOrExpression
    ;

assignmentExpression
    :   variableAccess EQ^ expression
    ;

logicalOrExpression
    :   logicalAndExpression ( OROR^ logicalAndExpression )*
    ;

logicalAndExpression
    :   relationalExpression ( ANDAND^ relationalExpression )*
    ;

relationalExpression
    :   additiveExpression ( relationalOperator^ additiveExpression )?
    |   BANG^ relationalExpression
    ;

additiveExpression
    :   multiplicativeExpression ( additiveOperator^ multiplicativeExpression )*
    ;

multiplicativeExpression
    :   primaryExpression ( multiplicativeOperator^ primaryExpression )*
    ;

primaryExpression
    :   variableAccess
    |   Integer
    |   RealNumber
    |   LPAREN! expression RPAREN!
    |   MINUS primaryExpression
            -> ^( UNARY_MINUS primaryExpression )
    ;

relationalOperator
    :   LT | GT | EQEQ | LE | GE | NE
    ;

additiveOperator
    :   PLUS | MINUS
    ;

multiplicativeOperator
    :   MUL | DIV
    ;

// lexer rules

LPAREN
    :   '('
    ;

RPAREN
    :   ')'
    ;

LBRACK
    :   '['
    ;

RBRACK
    :   ']'
    ;

LBRACE
    :   '{'
    ;

RBRACE
    :   '}'
    ;

COMMA
    :   ','
    ;

SEMICOLON
    :   ';'
    ;

PLUS
    :   '+'
    ;

MINUS
    :   '-'
    ;

MUL
    :   '*'
    ;

DIV
    :   '/'
    ;

EQEQ
    :   '=='
    ;

NE
    :   '!='
    ;

LT
    :   '<'
    ;

LE
    :   '<='
    ;

GT
    :   '>'
    ;

GE
    :   '>='
    ;

BANG
    :   '!'
    ;

ANDAND
    :   '&&'
    ;

OROR
    :   '||'
    ;

EQ
    :   '='
    ;

IF
    :   'if'
    ;

ELSE
    :   'else'
    ;

WHILE
    :   'while'
    ;

BREAK
    :   'break'
    ;

READ
    :   'read'
    ;

WRITE
    :   'write'
    ;

INT
    :   'int'
    ;

REAL
    :   'real'
    ;

Identifier
    :   LetterOrUnderscore ( LetterOrUnderscore | Digit )*
    ;

Integer
    :   Digit+
    ;

RealNumber
    :   Digit+ '.' Digit+
    ;

fragment
Digit
    :   '0'..'9'
    ;

fragment
LetterOrUnderscore
    :   Letter | '_'
    ;

fragment
Letter
    :   ( 'a'..'z' | 'A'..'Z' )
    ;

WS
    :   ( ' ' | '\t' | '\r' | '\n' )+ { $channel = HIDDEN; }
    ;

Comment
    :   '/*' ( options { greedy = false; } : . )* '*/' { $channel = HIDDEN; }
    ;

LineComment
    :   '//' ~( '\n' | '\r' )* '\r'? '\n' { $channel = HIDDEN; }
    ;
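
Several expression rules in the grammar, such as additiveExpression and multiplicativeExpression, have the shape "operand ( operator^ operand )*". The `^` suffix makes the matched operator the root of the tree built so far, so a chain of operators yields a left-associative tree. The following Java sketch imitates that behaviour by hand. It is not the code ANTLR generates: `Node` and `buildAdditive` are hypothetical names, and the input is assumed to be a well-formed, already tokenized list that alternates operands and operators.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: Node is a hypothetical stand-in for ANTLR's CommonTree, and
// buildAdditive is a hand-written imitation of the loop behind `( operator^ operand )*`.
final class RootOperatorSketch {

    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label) {
            this.label = label;
        }

        // LISP-style rendering: a leaf prints its label,
        // an inner node prints "(label child ...)".
        String toStringTree() {
            if (children.isEmpty()) {
                return label;
            }
            StringBuilder sb = new StringBuilder("(").append(label);
            for (Node child : children) {
                sb.append(' ').append(child.toStringTree());
            }
            return sb.append(')').toString();
        }
    }

    // Mirrors: operand ( operator^ operand )*
    // `tokens` alternates operands and operators, e.g. [a, -, b, +, c].
    static Node buildAdditive(List<String> tokens) {
        Node root = new Node(tokens.get(0));                 // first operand
        for (int i = 1; i + 1 < tokens.size(); i += 2) {
            Node operator = new Node(tokens.get(i));
            operator.children.add(root);                      // tree so far becomes the left child
            operator.children.add(new Node(tokens.get(i + 1))); // next operand becomes the right child
            root = operator;                                  // the operator is promoted to root
        }
        return root;
    }

    public static void main(String[] args) {
        Node tree = buildAdditive(Arrays.asList("a", "-", "b", "+", "c"));
        System.out.println(tree.toStringTree()); // (+ (- a b) c)
    }
}
```

Because each new operator takes the previous tree as its first child, `a - b + c` groups as `(a - b) + c`. The assignmentExpression rule instead recurses into expression on the right-hand side, which makes chained assignments group from the right.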