I. JavaCC
JavaCC (Java Compiler Compiler) is a parser generator for Java. It generates LL parsers, which handle a narrower range of grammars than LR or LALR parser generators, but it supports token lookahead of arbitrary length.
Installation process:
I downloaded the zip archive from GitHub, then installed Ant, and then built JavaCC with Ant.
1. First download Ant, then extract it with tar -zxvf apache-ant....tar.gz. The ant executable is in the bin directory of the extracted tree.
2. Download JavaCC from GitHub, go to the unpacked directory, and run xxxxxx/ant. The javacc.jar package then appears in the target directory.
3. The jar package can now be turned into an executable file as follows:
First create a shell script:
    #!/bin/sh
    MYSELF=`which "$0" 2>/dev/null`
    [ $? -gt 0 -a -f "$0" ] && MYSELF="./$0"
    java=java
    if test -n "$JAVA_HOME"; then
        java="$JAVA_HOME/bin/java"
    fi
    exec "$java" $java_args -cp $MYSELF "$@"
    exit 1
Name it stub.sh, then run the following in the directory that contains the jar package: cat stub.sh javacc.jar > javacc && chmod +x javacc. This produces an executable file. Because the stub puts the jar on the classpath with -cp, you need to pass the main class name javacc as the first argument when processing a .jj file, like this: javacc javacc Adder.jj
II. Grammar description file
1. Introduction
The grammar description file for JavaCC is a file with the extension .jj. In general, its contents take the following form:
    options {
        JavaCC options
    }
    PARSER_BEGIN(ParserClassName)
    package package name;
    import library name;
    public class ParserClassName {
        arbitrary Java code
    }
    PARSER_END(ParserClassName)
    scanner description
    parser description
Like Java, JavaCC defines the contents of the parser in a single class, so the contents of that class are written between PARSER_BEGIN and PARSER_END.
2. Example
The following is the grammar description file for a parser that parses the addition of two positive integers and evaluates it.
    options {
        STATIC = false;
    }

    PARSER_BEGIN(Adder)
    import java.io.*;

    class Adder {
        // Evaluate every command-line argument as one expression.
        public static void main(String[] args) {
            for (String arg : args) {
                try {
                    System.out.println(evaluate(arg));
                } catch (ParseException ex) {
                    System.err.println(ex.getMessage());
                }
            }
        }

        // Create a parser over the string and call the rule method expr().
        public static long evaluate(String src) throws ParseException {
            Reader reader = new StringReader(src);
            return new Adder(reader).expr();
        }
    }
    PARSER_END(Adder)

    SKIP: { <[" ", "\t", "\r", "\n"]> }

    TOKEN: {
        <INTEGER: (["0"-"9"])+>
    }

    long expr():
    {
        Token x, y;
    }
    {
        x=<INTEGER> "+" y=<INTEGER> <EOF>
        {
            return Long.parseLong(x.image) + Long.parseLong(y.image);
        }
    }
The options block sets the STATIC option to false. If the option is true, all members and methods generated by JavaCC are defined as static, and the resulting parser cannot be used in a multithreaded environment. For that reason this option should always be set to false. (The default value of STATIC is true.)
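The effect of static members can be illustrated without JavaCC. The following is a minimal sketch, not the code JavaCC generates: StaticCursor and InstanceCursor are hypothetical classes invented for this illustration. They show that when the read position is static, two parser-like objects used in alternation disturb each other, while per-instance state keeps them independent.

```java
// Illustration only: hypothetical classes, not JavaCC output.
class StaticStateDemo {
    // The cursor position is static, so every instance shares it.
    static class StaticCursor {
        static int pos = 0;
        final String input;
        StaticCursor(String input) { this.input = input; }
        char next() { return input.charAt(pos++); }
    }

    // The cursor position belongs to each instance.
    static class InstanceCursor {
        int pos = 0;
        final String input;
        InstanceCursor(String input) { this.input = input; }
        char next() { return input.charAt(pos++); }
    }

    // Read one character from each of two inputs in alternation.
    static String interleaveStatic() {
        StaticCursor.pos = 0;
        StaticCursor a = new StaticCursor("12");
        StaticCursor b = new StaticCursor("34");
        return "" + a.next() + b.next();
    }

    static String interleaveInstance() {
        InstanceCursor a = new InstanceCursor("12");
        InstanceCursor b = new InstanceCursor("34");
        return "" + a.next() + b.next();
    }

    public static void main(String[] args) {
        System.out.println(interleaveStatic());   // prints 14: b skipped its first character
        System.out.println(interleaveInstance()); // prints 13: each cursor starts at 0
    }
}
```

With threads the interference would be nondeterministic rather than repeatable, which is why a parser with static state is unsafe to share.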
The part from PARSER_BEGIN(Adder) to PARSER_END(Adder) is the definition of the parser class. Members and methods that need to be defined in the parser class are also written here. The main function is defined here so that the Adder class can be run on its own.
The SKIP and TOKEN sections then define the scanner. SKIP specifies that spaces, tabs, and line breaks are skipped. TOKEN specifies that runs of digits are scanned and turned into INTEGER tokens.
The part from long expr ... to the end defines the parser in the narrow sense. It parses the token sequence and performs the associated actions.
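To make the division between scanner and parser concrete, here is a hand-written sketch of the same logic in plain Java. It is not what JavaCC generates, and the names HandWrittenAdder, scan, and expr are chosen only for this illustration. It assumes the scanner skips spaces, tabs, and line breaks, and that the only rule is INTEGER "+" INTEGER followed by end of input.

```java
import java.util.ArrayList;
import java.util.List;

// Hand-written illustration of the Adder grammar; not JavaCC output.
class HandWrittenAdder {
    // Scanner: skip whitespace, emit INTEGER tokens and the "+" token.
    static List<String> scan(String src) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (c == ' ' || c == '\t' || c == '\r' || c == '\n') {
                i++; // corresponds to SKIP
            } else if (c >= '0' && c <= '9') {
                int start = i;
                while (i < src.length() && src.charAt(i) >= '0' && src.charAt(i) <= '9') {
                    i++;
                }
                tokens.add(src.substring(start, i)); // corresponds to <INTEGER>
            } else if (c == '+') {
                tokens.add("+");
                i++;
            } else {
                throw new IllegalArgumentException("Unexpected character '" + c + "' at " + i);
            }
        }
        return tokens;
    }

    static boolean isInteger(String token) {
        char c = token.charAt(0);
        return c >= '0' && c <= '9';
    }

    // Parser: expr := INTEGER "+" INTEGER EOF, returning the sum.
    static long expr(List<String> tokens) {
        if (tokens.size() != 3
                || !isInteger(tokens.get(0))
                || !tokens.get(1).equals("+")
                || !isInteger(tokens.get(2))) {
            throw new IllegalArgumentException("Expected INTEGER \"+\" INTEGER");
        }
        return Long.parseLong(tokens.get(0)) + Long.parseLong(tokens.get(2));
    }

    static long evaluate(String src) {
        return expr(scan(src));
    }

    public static void main(String[] args) {
        System.out.println(evaluate("1 + 2")); // prints 3
    }
}
```

The grammar file states the same two layers declaratively, and JavaCC produces the scanning and parsing code from it.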
3. Running JavaCC
To process Adder.jj with JavaCC (demo1.jj in the original figure), use the javacc command. With the executable built above, for example:
    javacc javacc Adder.jj
Running the above command generates Adder.java and other auxiliary classes.
To compile the generated Adder.java, you only need the javac command:
    javac Adder.java
This generates the Adder.class file. The Adder class takes expressions from the command-line arguments and evaluates them, so you can pass an expression on the command line, for example:
    java Adder "1+2"
III. Starting the parser generated by JavaCC
Now let us look at the code of the main function. The main function treats each command-line argument string as an expression to compute and passes it to the evaluate method in turn.
The evaluate method creates an object instance of the Adder class and has that Adder object compute (parse) the parameter string src.
Running a parser class generated by JavaCC takes the following 2 steps:
- Create an object instance of the parser class
- Call the method of that object that has the same name as the construct to be parsed
Step 1: The following four constructors are defined by default in a parser generated by JavaCC 4.0. Here Parser stands for the name of the generated parser class, and ParserTokenManager for the token manager class generated alongside it.
- Parser(InputStream s)
- Parser(InputStream s, String encoding)
- Parser(Reader r)
- Parser(ParserTokenManager tm)
The 1st constructor builds the parser from an InputStream object. It cannot set the encoding of the input, so it cannot handle Chinese characters and the like.
The 2nd constructor takes the encoding of the input in addition to the InputStream object. To parse strings or comments that contain Chinese, you must use the 2nd or 3rd constructor.
The 3rd constructor parses whatever the Reader object reads.
The 4th constructor takes the scanner (token manager) as a parameter.
Step 2: After the parser object is created, use it to call the method with the same name as the grammar rule to be parsed. Here the expr method of the Adder object is called, which starts the parse, and the semantic value is returned once parsing finishes normally.
IV. Handling Chinese
To enable JavaCC to handle Chinese, first set the UNICODE_INPUT option to true in the options block of the grammar description file:
    options { STATIC = false; DEBUG_PARSER = true; UNICODE_INPUT = true; JDK_VERSION = "1.5"; }
With this option, input characters are converted to Unicode before processing. Leave UNICODE_INPUT set to false only when the input is limited to characters in the ASCII range.
It is also necessary to use the 2nd or 3rd constructor so that the appropriate encoding is applied to the input.
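The reason the encoding matters can be shown with standard Java I/O alone. The sketch below is an illustration with assumed names (EncodingDemo, readAll); it does not call a JavaCC parser. It decodes the same UTF-8 bytes once with the correct charset and once with ISO-8859-1, which is the kind of mismatch that corrupts Chinese text before a parser ever sees it. A Reader built this way is what would be passed to the Reader constructor.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustration only: shows how the charset chosen for a Reader affects the decoded text.
class EncodingDemo {
    // Decode every byte of the stream with the given charset.
    static String readAll(InputStream in, Charset charset) {
        try (Reader reader = new InputStreamReader(in, charset)) {
            StringBuilder sb = new StringBuilder();
            int ch;
            while ((ch = reader.read()) != -1) {
                sb.append((char) ch);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u6587"; // the two characters of the word "Chinese"
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);

        String good = readAll(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8);
        String bad = readAll(new ByteArrayInputStream(utf8), StandardCharsets.ISO_8859_1);

        System.out.println(good.equals(text)); // prints true
        System.out.println(bad.length());      // prints 6: one char per byte, text is garbled
    }
}
```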