The Definitive ANTLR 4 Reference, 2nd Edition, Chapter 4: Study Notes, Part 1
Matching an arithmetic expression language
This example uses only the basic arithmetic operators (add, subtract, multiply, divide), parenthesized expressions, integers, and variables. A typical input looks like this:
193
a = 5
b = 6
a+b*2
(1+2)
A program in this expression language is a sequence of statements, each terminated by a newline. A statement is an expression, an assignment, or a blank line. The following ANTLR grammar parses such statements and expressions.
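grammar Expr;

// A program is one or more statements.
prog:   stat+ ;

stat:   expr NEWLINE
    |   ID '=' expr NEWLINE
    |   NEWLINE
    ;

expr:   expr ('*'|'/') expr
    |   expr ('+'|'-') expr
    |   INT
    |   ID
    |   '(' expr ')'
    ;

ID  :   [a-zA-Z]+ ;         // identifiers
INT :   [0-9]+ ;            // integers
NEWLINE : '\r'? '\n' ;      // end of statement
WS  :   [ \t]+ -> skip ;    // discard spaces and tabs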
First, some key concepts of ANTLR grammars:
- A grammar consists of a set of rules that describe the language. These include lexer (token) rules and parser (syntax) rules.
- Parser rule names start with a lowercase letter, such as prog and stat.
- Lexer rule names start with an uppercase letter, such as ID : [a-zA-Z]+ ;.
- The | operator separates the alternatives of a rule, and parentheses group symbols into subrules. For example, ('*'|'/') matches either a multiplication sign or a division sign.
In a lexer rule such as WS : [ \t]+ -> skip ;, the -> skip directive tells the lexer to match the whitespace characters and then discard them, so the parser never sees them.
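The effect of -> skip can be imitated with a few lines of plain Java. The sketch below is a hand-written stand-in for the lexer rules ID, INT, NEWLINE, and WS, built on java.util.regex; it is not the code ANTLR generates, and the class name LexerSketch and the token spellings such as ID(a) are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hand-written imitation of the lexer rules; NOT the code ANTLR generates.
final class LexerSketch {
    // One named group per token type, mirroring the lexer rules of the grammar.
    private static final Pattern TOKEN = Pattern.compile(
        "(?<ID>[a-zA-Z]+)"
        + "|(?<INT>[0-9]+)"
        + "|(?<NEWLINE>\\r?\\n)"
        + "|(?<WS>[ \\t]+)"
        + "|(?<OP>[-+*/=()])");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            m.region(pos, input.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("unexpected character at index " + pos);
            }
            pos = m.end();
            if (m.group("WS") != null) {
                continue; // same effect as "-> skip": matched, then thrown away
            }
            if (m.group("ID") != null) {
                tokens.add("ID(" + m.group() + ")");
            } else if (m.group("INT") != null) {
                tokens.add("INT(" + m.group() + ")");
            } else if (m.group("NEWLINE") != null) {
                tokens.add("NEWLINE");
            } else {
                tokens.add("'" + m.group() + "'"); // literal operator or parenthesis
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [ID(a), '=', INT(5), NEWLINE]
        System.out.println(tokenize("a = 5\n"));
    }
}
```

The spaces around the equals sign are consumed by the WS branch but never reach the token list, which is what the parser relies on when the grammar uses -> skip.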
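Another important feature of ANTLR 4 is that it handles most left-recursive rules: direct left recursion, as in the expr rule whose first two alternatives begin with expr itself, is supported, while indirect left recursion is not. When alternatives are ambiguous, ANTLR favors the one listed first, so in this grammar multiplication and division bind more tightly than addition and subtraction.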
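To see what the order of the expr alternatives means in practice, here is a small hand-written evaluator that uses precedence climbing. It is only a sketch under assumptions: the class name PrecedenceSketch and its methods are invented for this note, variables are left out, and the parser ANTLR generates from the grammar is structured differently. The precedence table simply mirrors the grammar, where the '*' and '/' alternative comes before the '+' and '-' alternative.

```java
// Hand-written precedence-climbing evaluator for the expr rule; an illustration only.
final class PrecedenceSketch {
    private final String src;
    private int pos = 0;

    private PrecedenceSketch(String text) {
        this.src = text.replaceAll("[ \\t]+", ""); // drop whitespace, like the WS rule
    }

    static int eval(String text) {
        PrecedenceSketch p = new PrecedenceSketch(text);
        int value = p.expr(1);
        if (p.pos != p.src.length()) {
            throw new IllegalArgumentException("unexpected input at index " + p.pos);
        }
        return value;
    }

    // Earlier alternative in the grammar = higher number here = binds tighter.
    private static int precedence(char op) {
        switch (op) {
            case '*': case '/': return 2;
            case '+': case '-': return 1;
            default: return 0; // not a binary operator
        }
    }

    // Parse one operand, then absorb operators whose precedence is at least minPrec.
    private int expr(int minPrec) {
        int left = primary();
        while (pos < src.length()) {
            char op = src.charAt(pos);
            int prec = precedence(op);
            if (prec == 0 || prec < minPrec) break;
            pos++;
            int right = expr(prec + 1); // +1 makes the operators left-associative
            switch (op) {
                case '*': left *= right; break;
                case '/': left /= right; break;
                case '+': left += right; break;
                default:  left -= right; break;
            }
        }
        return left;
    }

    // INT | '(' expr ')'
    private int primary() {
        if (pos < src.length() && src.charAt(pos) == '(') {
            pos++;
            int value = expr(1);
            if (pos >= src.length() || src.charAt(pos) != ')') {
                throw new IllegalArgumentException("expected ')' at index " + pos);
            }
            pos++;
            return value;
        }
        int start = pos;
        while (pos < src.length() && src.charAt(pos) >= '0' && src.charAt(pos) <= '9') pos++;
        if (start == pos) {
            throw new IllegalArgumentException("expected INT at index " + pos);
        }
        return Integer.parseInt(src.substring(start, pos));
    }

    public static void main(String[] args) {
        System.out.println(eval("1+2*3"));   // prints 7, because '*' binds tighter than '+'
        System.out.println(eval("(1+2)*3")); // prints 9, parentheses override precedence
    }
}
```

With the grammar itself none of this bookkeeping is written by hand: listing the '*' and '/' alternative first is enough to express the same precedence.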
After writing the grammar file, right-click it and choose Generate ANTLR ... The XxParser and XxLexer files (for this grammar, ExprParser and ExprLexer) are then generated in the specified directory. With these files you can carry out the parsing task.
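import org.antlr.v4.runtime.ANTLRInputStream;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.tree.ParseTree;

public class Main {
    public static void main(String[] args) {
        // Every stat ends with NEWLINE, so the input must end with '\n'.
        String expr = "(1+2*3/2-7)\n";
        // Character stream that feeds the lexer. ANTLRInputStream is deprecated
        // since ANTLR 4.7; CharStreams.fromString(expr) is its replacement.
        ANTLRInputStream input = new ANTLRInputStream(expr);
        ExprLexer lexer = new ExprLexer(input);
        // Token stream that connects the lexer to the parser.
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        ExprParser parser = new ExprParser(tokens);
        // Start parsing at the prog rule.
        ParseTree tree = parser.prog();
        // Print the parse tree in LISP-style text form.
        System.out.println(tree.toStringTree(parser));
    }
}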
The code first creates a character input stream, which serves as the input to the lexer. It then creates the lexer (ExprLexer), the token stream (CommonTokenStream), and the parser (ExprParser). The CommonTokenStream connects the lexer to the parser: the lexer fills it with tokens and the parser reads from it. Finally, the code prints the parse tree.
Importing grammars
A grammar written entirely in one file becomes hard to manage, so it can be split into several files. One way is to separate the parser rules from the lexer rules. The lexer rules go into a file that starts with lexer grammar, and the main grammar file pulls them in with the import keyword.
The following grammar moves the lexer rules into the file CommonLexerRules.g4 and imports them with import.
Expr.g4 file
grammar Expr;
import CommonLexerRules; // pulls in all rules from CommonLexerRules.g4

prog:   stat+ ;

stat:   expr NEWLINE
    |   ID '=' expr NEWLINE
    |   NEWLINE
    ;

expr:   expr ('*'|'/') expr
    |   expr ('+'|'-') expr
    |   INT
    |   ID
    |   '(' expr ')'
    ;
CommonLexerRules.g4 file
lexer grammar CommonLexerRules;

ID  :   [a-zA-Z]+ ;         // identifiers
INT :   [0-9]+ ;            // integers
NEWLINE : '\r'? '\n' ;      // end of statement
WS  :   [ \t]+ -> skip ;    // discard spaces and tabs