From getting started to getting out of JavaCC, getting started with JavaCC

Source: Internet
Author: User

I. JavaCC

JavaCC is a compiler-compiler (parser generator) for Java. It generates LL parsers. The range of grammars it can handle is fairly narrow, but it supports lookahead over an unbounded number of tokens.

Installation Process:

I downloaded the zip package from GitHub, installed Ant, and built JavaCC with Ant.

1. First download Ant and unpack it with tar -zxvf apache-ant....tar.gz. The ant executable is in the bin directory of the unpacked archive.

2. Download JavaCC from GitHub, go to the unpacked directory, and run xxxxxx/ant. The javacc.jar package is then produced in the target directory.

3. You can now turn the jar package into an executable file as follows:

First, create a shell script:

#!/bin/sh
MYSELF=`which "$0" 2>/dev/null`
[ $? -gt 0 -a -f "$0" ] && MYSELF="./$0"
java=java
if test -n "$JAVA_HOME"; then
    java="$JAVA_HOME/bin/java"
fi
exec "$java" $java_args -cp $MYSELF "$@"
exit 1

Name it stub.sh, then in the directory containing the jar package run cat stub.sh javacc.jar > javacc && chmod +x javacc. The result is an executable file. Because the stub only puts the file on the classpath and forwards its arguments to java, the main class name javacc must be passed as the first argument when processing a .jj file, like this: ./javacc javacc Adder.jj

II. Syntax description file

1. Introduction

The syntax description file of JavaCC is a file with the extension .jj. In general, its content has the following format:

options {
    JavaCC options
}

PARSER_BEGIN(parser class name)
package name;
import library name;

public class parser class name {
    any Java code
}
PARSER_END(parser class name)

scanner description
parser description

As in Java, JavaCC defines the parser as a single class. The content of that class is written between PARSER_BEGIN and PARSER_END.

2. Example

The following code is a syntax description file of the parser that parses positive integer addition operations and performs calculations.

options {
    STATIC = false;
}

PARSER_BEGIN(Adder)
import java.io.*;

class Adder {
    public static void main(String[] args) {
        // Treat every command-line argument as one formula.
        for (String arg : args) {
            try {
                System.out.println(evaluate(arg));
            } catch (ParseException ex) {
                System.err.println(ex.getMessage());
            }
        }
    }

    public static long evaluate(String src) throws ParseException {
        Reader reader = new StringReader(src);
        return new Adder(reader).expr();
    }
}
PARSER_END(Adder)

// Scanner: skip whitespace, recognize runs of digits.
SKIP: { <[" ", "\t", "\r", "\n"]> }

TOKEN: {
    <INTEGER: (["0"-"9"])+>
}

// Parser: INTEGER "+" INTEGER followed by end of input.
long expr():
{
    Token x, y;
}
{
    x=<INTEGER> "+" y=<INTEGER> <EOF>
    {
        return Long.parseLong(x.image) + Long.parseLong(y.image);
    }
}

The options block sets the STATIC option to false. When this option is true, all members and methods generated by JavaCC are defined as static, and the generated parser cannot be used in a multi-threaded environment. For that reason it is best to always set it to false. (The default value of STATIC is true.)
The span from PARSER_BEGIN(Adder) to PARSER_END(Adder) defines the parser class. Any members and methods you want in the parser class are also written here. A main method is defined here so that the Adder class can be run on its own.
The SKIP and TOKEN sections that follow define the scanner. SKIP says that spaces, tabs, and line breaks are skipped. TOKEN says that a run of digits is scanned and turned into an INTEGER token.
The last part, starting at long expr(), defines the parser in the narrow sense. It parses the token sequence and performs an action on it, here adding the two integers.
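To make the division of labor between the scanner and the parser concrete, here is a minimal hand-written sketch of what the Adder grammar describes. It is not the code JavaCC generates; the class name AdderSketch and its helper methods are made up for illustration, and errors are reported with IllegalArgumentException instead of the generated ParseException.

```java
// Hand-written stand-in for the Adder grammar (illustration only;
// the code JavaCC generates is different and far more general).
final class AdderSketch {
    private final String src;
    private int pos = 0;

    AdderSketch(String src) {
        this.src = src;
    }

    // Corresponds to SKIP: spaces, tabs, and line breaks are ignored.
    private void skipWhitespace() {
        while (pos < src.length() && " \t\r\n".indexOf(src.charAt(pos)) >= 0) {
            pos++;
        }
    }

    // Corresponds to TOKEN <INTEGER>: one or more digits.
    private String integer() {
        skipWhitespace();
        int start = pos;
        while (pos < src.length() && src.charAt(pos) >= '0' && src.charAt(pos) <= '9') {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("integer expected at position " + pos);
        }
        return src.substring(start, pos);
    }

    // Consumes one literal character such as "+".
    private void expect(char c) {
        skipWhitespace();
        if (pos >= src.length() || src.charAt(pos) != c) {
            throw new IllegalArgumentException("'" + c + "' expected at position " + pos);
        }
        pos++;
    }

    // Corresponds to the rule: x=<INTEGER> "+" y=<INTEGER> <EOF>
    long expr() {
        String x = integer();
        expect('+');
        String y = integer();
        skipWhitespace();
        if (pos != src.length()) {
            throw new IllegalArgumentException("end of input expected at position " + pos);
        }
        return Long.parseLong(x) + Long.parseLong(y);
    }

    static long evaluate(String src) {
        return new AdderSketch(src).expr();
    }

    public static void main(String[] args) {
        System.out.println(evaluate("1 + 5")); // prints 6
    }
}
```

The sketch mixes scanning and parsing in one class for brevity; JavaCC keeps them apart by generating a separate token manager class for the scanner.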

3. Run JavaCC.

To process Adder.jj with JavaCC (the file is called demo1.jj in the original screenshot), run the javacc command with the grammar file as its argument.

Running that command generates Adder.java and other helper classes.
To compile the generated Adder.java, the javac command is all you need: javac Adder.java

This produces the Adder.class file. The Adder class takes formulas from the command-line arguments and computes them, so you can pass a formula on the command line and run it, for example java Adder "1+5".

III. Starting the parser generated by JavaCC

Let us walk through the code of the main function. It treats each command-line argument as a formula to compute and passes them one by one to the evaluate method.
The evaluate method creates an instance of the Adder class and has that object compute (parse) the argument string src.
Running a parser class generated by JavaCC takes two steps:

  1. Create an instance of the parser class
  2. On that instance, call the method with the same name as the syntax rule to be parsed

On the first step: a parser generated by JavaCC 4.0 defines four constructors by default.

  1. Parser(InputStream s)
  2. Parser(InputStream s, String encoding)
  3. Parser(Reader r)
  4. Parser(ParserTokenManager tm)

The first constructor builds the parser from an InputStream object. It cannot set the encoding of the input, so it cannot handle Chinese characters.
The second constructor takes the encoding of the input in addition to the InputStream object. To parse input containing Chinese strings or comments, you must use the second or third constructor.
The third constructor parses the content read from a Reader object.
The fourth constructor takes the scanner (token manager) as its parameter. The token manager class is named after the parser, for example AdderTokenManager.
Once the parser has been created, use the instance to call the method with the same name as the syntax rule to be parsed. Here the expr method of the Adder object is called. Parsing starts with that call, and when it finishes normally the semantic value is returned.
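Because a parser built with STATIC = false keeps its state in an ordinary object, the two steps above (create an instance, call the rule method) can be carried out independently in several threads. The sketch below illustrates that pattern with a hand-written class; InstanceStateSketch, sum, and sumAll are hypothetical names, not part of JavaCC, and the tiny summing logic only stands in for a generated parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Stand-in for a parser generated with STATIC = false: all state is per instance.
final class InstanceStateSketch {
    private final String src;
    private int pos = 0; // with static fields, concurrent parses would share this position

    InstanceStateSketch(String src) {
        this.src = src;
    }

    // Sums integers separated by '+', e.g. "1+2+3" (no whitespace handling, for brevity).
    long sum() {
        long total = 0;
        while (pos < src.length()) {
            int start = pos;
            while (pos < src.length() && Character.isDigit(src.charAt(pos))) {
                pos++;
            }
            total += Long.parseLong(src.substring(start, pos));
            if (pos < src.length()) {
                pos++; // skip the '+'
            }
        }
        return total;
    }

    // Step 1 and step 2 are performed once per input, each on its own instance.
    static List<Long> sumAll(List<String> inputs) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (String input : inputs) {
                Callable<Long> task = () -> new InstanceStateSketch(input).sum();
                futures.add(pool.submit(task));
            }
            List<Long> results = new ArrayList<>();
            for (Future<Long> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Each task owns its own object, so no synchronization is needed. A parser whose fields were static could not be used this way.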

IV. Chinese processing

To have JavaCC handle Chinese characters, first set the UNICODE_INPUT option to true in the options block of the syntax description file:

options {
    STATIC = false;
    DEBUG_PARSER = true;
    UNICODE_INPUT = true;
    JDK_VERSION = "1.5";
}

With this setting the input characters are converted to Unicode before they are processed. When the UNICODE_INPUT option is false, only characters in the ASCII range can be processed.
You also need to use the second or third constructor so that an appropriate encoding is applied to the input.
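As a rough illustration of why the encoding matters, the sketch below decodes the same bytes through a Reader once with the matching charset and once with the wrong one. The class EncodedReaderSketch and its methods are hypothetical helpers, not part of JavaCC; a Reader created as in open is the kind of object that could be handed to the Parser(Reader r) constructor. The Chinese text is written with Unicode escapes so the source file itself stays ASCII.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EncodedReaderSketch {
    // Wrap a byte stream in a Reader that decodes it with an explicit charset.
    static Reader open(InputStream in, Charset charset) {
        return new InputStreamReader(in, charset);
    }

    // Read all characters from the bytes through such a Reader.
    static String decode(byte[] bytes, Charset charset) {
        try (Reader reader = open(new ByteArrayInputStream(bytes), charset)) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[256];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String formula = "1+2 \u52a0\u6cd5"; // a formula followed by two Chinese characters
        byte[] utf8 = formula.getBytes(StandardCharsets.UTF_8);
        // Decoded with the charset the bytes were written in: the text survives.
        System.out.println(decode(utf8, StandardCharsets.UTF_8).equals(formula));
        // Decoded as ASCII: the Chinese characters are lost.
        System.out.println(decode(utf8, StandardCharsets.US_ASCII).equals(formula));
    }
}
```

The first line printed is true and the second is false, which is the same loss a parser would suffer if its input were decoded with the wrong encoding.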
