Do It Yourself: Writing a json2xml Tool

Source: Internet
Author: User
Tags lexer

Project Address: Json2xml

What is ANTLR

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator that can be used to read, process, execute, and translate structured text and binary files. It is widely used in both academia and industry, and it is the foundation of many languages, tools, and frameworks.
Today we use it to implement a Go version of a json2xml converter.

The role of ANTLR

The syntax description of a language is called a grammar. Given a grammar, ANTLR generates a parser for that language, and the parser automatically builds a parse tree (AST) from its input. ANTLR can also generate a tree walker that traverses this tree, which greatly reduces the cost of hand-coding a parser.

Practice begins

Next, let's implement a tool, using json2xml as the example.

Installation

Using macOS as an example:

brew install antlr

Writing the JSON grammar

    // Derived from http://json.org
    grammar JSON;

    json
        : object
        | array
        ;

    object
        : '{' pair (',' pair)* '}'    # AnObject
        | '{' '}'                     # NullObject
        ;

    array
        : '[' value (',' value)* ']'  # ArrayOfValues
        | '[' ']'                     # NullArray
        ;

    pair
        : STRING ':' value
        ;

    value
        : STRING    # String
        | NUMBER    # Atom
        | object    # ObjectValue
        | array     # ArrayValue
        | 'true'    # Atom
        | 'false'   # Atom
        | 'null'    # Atom
        ;

    LCURLY : '{' ;
    LBRACK : '[' ;

    STRING : '"' (ESC | ~["\\])* '"' ;

    fragment ESC     : '\\' (["\\/bfnrt] | UNICODE) ;
    fragment UNICODE : 'u' HEX HEX HEX HEX ;
    fragment HEX     : [0-9a-fA-F] ;

    NUMBER
        : '-'? INT '.' INT EXP?   // 1.35, 1.35E-9, 0.3, -4.5
        | '-'? INT EXP            // 1e10 -3e4
        | '-'? INT                // -3, 45
        ;

    fragment INT : '0' | '1'..'9' '0'..'9'* ;  // no leading zeros
    fragment EXP : [Ee] [+\-]? INT ;           // \- since - means "range" inside [...]

    WS : [ \t\n\r]+ -> skip ;
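To get a feel for what the NUMBER rule above accepts, it can be cross-checked against an equivalent regular expression. The following is only a sketch using Go's standard regexp package, not part of the generated lexer; the names intPart, numberRE, and isNumber are made up for this illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// intPart mirrors the INT fragment: a single 0, or a non-zero digit
// followed by any number of digits (no leading zeros).
const intPart = `(?:0|[1-9][0-9]*)`

// numberRE folds the three NUMBER alternatives into one pattern:
// optional minus, INT, optional ".INT", optional exponent.
var numberRE = regexp.MustCompile(
	`^-?` + intPart + `(?:\.` + intPart + `)?(?:[Ee][+-]?` + intPart + `)?$`)

// isNumber reports whether s, taken as a whole, matches the NUMBER rule.
func isNumber(s string) bool { return numberRE.MatchString(s) }

func main() {
	samples := []string{"1.35", "1.35E-9", "0.3", "-4.5", "1e10", "-3e4", "-3", "01", "1.", "+1"}
	for _, s := range samples {
		fmt.Printf("%-8s %v\n", s, isNumber(s))
	}
}
```

Because the fraction part reuses INT, the rule as written does not treat an input such as 1.05 as a single NUMBER; the regular expression reproduces that behaviour.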

The file above is written in the ANTLR4 grammar format.

    • The ANTLR4 grammar syntax is fairly straightforward:

      • A file starts with the grammar keyword, and the grammar name must match the file name
      • Parser rule names must start with a lowercase letter
      • Lexer rule names must start with an uppercase letter
      • The pipe symbol | separates the alternatives of a rule, and parentheses group symbols into a sub-rule
    • Several terms are involved:

      • Language: a language is a set of valid statements; statements consist of phrases, phrases consist of sub-phrases, and so on down to individual words.
      • Grammar: a grammar defines the syntax rules of a language; each rule in the grammar describes a phrase structure.
      • Parse tree: the hierarchical structure of a statement, represented as a tree. The root node corresponds to the name of a grammar rule, and the leaf nodes represent the symbols (tokens) of the statement.
      • Lexer (lexical analyzer): breaks the input character sequence into a series of tokens (lexical symbols).
      • Parser: checks whether the structure of a statement conforms to the grammar. The process is similar to walking through a maze, usually by comparing and matching at each fork.
      • Top-down parser: one way of implementing a parser, in which each grammar rule corresponds to a function in the parser.
      • Lookahead: the parser uses lookahead to make decisions, specifically by comparing the upcoming input symbols with the symbols that can start each alternative.
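The terms above (lexer, top-down parser, lookahead) can be made concrete with a small hand-written sketch. It is not what ANTLR generates. It uses a made-up two-rule grammar for nested integer lists, and every name in it (token, lex, parser, value, list, parse) is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// token is one lexical symbol; kind is "INT", "[", "]", "," or "EOF".
type token struct{ kind, text string }

// lex breaks the character sequence into tokens and skips whitespace.
func lex(input string) []token {
	var toks []token
	rs := []rune(input)
	for i := 0; i < len(rs); {
		switch r := rs[i]; {
		case unicode.IsSpace(r):
			i++
		case unicode.IsDigit(r):
			start := i
			for i < len(rs) && unicode.IsDigit(rs[i]) {
				i++
			}
			toks = append(toks, token{"INT", string(rs[start:i])})
		case strings.ContainsRune("[],", r):
			toks = append(toks, token{string(r), string(r)})
			i++
		default:
			panic(fmt.Errorf("unexpected character %q", r))
		}
	}
	return append(toks, token{"EOF", ""})
}

// parser is a top-down (recursive-descent) parser: one method per grammar rule.
type parser struct {
	toks []token
	pos  int
}

// peek returns the current token without consuming it: one token of lookahead.
func (p *parser) peek() token { return p.toks[p.pos] }

// expect consumes the current token, or panics if it has the wrong kind.
func (p *parser) expect(kind string) token {
	t := p.peek()
	if t.kind != kind {
		panic(fmt.Errorf("expected %q, got %q", kind, t.kind))
	}
	p.pos++
	return t
}

// value : INT | list ;
func (p *parser) value() string {
	switch p.peek().kind { // the lookahead token selects the alternative
	case "INT":
		return "(value " + p.expect("INT").text + ")"
	case "[":
		return "(value " + p.list() + ")"
	}
	panic(fmt.Errorf("unexpected token %q", p.peek().kind))
}

// list : '[' (value (',' value)*)? ']' ;
func (p *parser) list() string {
	p.expect("[")
	parts := []string{"list"}
	if p.peek().kind != "]" {
		parts = append(parts, p.value())
		for p.peek().kind == "," {
			p.expect(",")
			parts = append(parts, p.value())
		}
	}
	p.expect("]")
	return "(" + strings.Join(parts, " ") + ")"
}

// parse lexes, runs the start rule, and converts syntax panics into errors.
func parse(input string) (tree string, err error) {
	defer func() {
		if r := recover(); r != nil {
			tree, err = "", r.(error)
		}
	}()
	p := &parser{toks: lex(input)}
	tree = p.value()
	p.expect("EOF")
	return tree, nil
}

func main() {
	fmt.Println(parse("[1, [2, 3]]"))
}
```

The parse tree is returned as a parenthesized string purely to keep the sketch short. A generated ANTLR parser builds context objects instead, but the division of labour between lexer, rule functions, and lookahead is the same idea.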

Generating the base parser code

$ antlr4 -Dlanguage=Go -package json2xml JSON.g4
This uses ANTLR to generate the base code, with Go as the target language and json2xml as the package name.

The resulting files are as follows:

    $ tree
    ├── JSON.g4
    ├── JSON.interp             # parser intermediate file
    ├── JSON.tokens             # parser token file
    ├── JSONLexer.interp        # lexer intermediate file
    ├── JSONLexer.tokens        # lexer token file
    ├── json_base_listener.go   # default (base) listener implementation
    ├── json_lexer.go           # lexer
    ├── json_listener.go        # abstract listener interface
    └── json_parser.go          # parser

Implementing the parser (listener example)

    package main

    import (
        "fmt"
        "io/ioutil"
        "log"
        "os"
        "strings"
        "testing"

        "c2j/parser/json2xml"

        "github.com/antlr/antlr4/runtime/Go/antlr"
    )

    func init() {
        log.SetFlags(log.LstdFlags | log.Lshortfile)
    }

    type j2xConvert struct {
        *json2xml.BaseJSONListener
        xml map[antlr.Tree]string
    }

    func NewJ2xConvert() *j2xConvert {
        return &j2xConvert{
            &json2xml.BaseJSONListener{},
            make(map[antlr.Tree]string),
        }
    }

    func (j *j2xConvert) setXML(ctx antlr.Tree, s string) {
        j.xml[ctx] = s
    }

    func (j *j2xConvert) getXML(ctx antlr.Tree) string {
        return j.xml[ctx]
    }

    // j2xConvert methods
    func (j *j2xConvert) ExitJson(ctx *json2xml.JsonContext) {
        j.setXML(ctx, j.getXML(ctx.GetChild(0)))
    }

    func (j *j2xConvert) stripQuotes(s string) string {
        if s == "" || !strings.Contains(s, "\"") {
            return s
        }
        return s[1 : len(s)-1]
    }

    func (j *j2xConvert) ExitAnObject(ctx *json2xml.AnObjectContext) {
        sb := strings.Builder{}
        sb.WriteString("\n")
        for _, p := range ctx.AllPair() {
            sb.WriteString(j.getXML(p))
        }
        j.setXML(ctx, sb.String())
    }

    func (j *j2xConvert) ExitNullObject(ctx *json2xml.NullObjectContext) {
        j.setXML(ctx, "")
    }

    func (j *j2xConvert) ExitArrayOfValues(ctx *json2xml.ArrayOfValuesContext) {
        sb := strings.Builder{}
        sb.WriteString("\n")
        for _, p := range ctx.AllValue() {
            sb.WriteString("<element>")
            sb.WriteString(j.getXML(p))
            sb.WriteString("</element>")
            sb.WriteString("\n")
        }
        j.setXML(ctx, sb.String())
    }

    func (j *j2xConvert) ExitNullArray(ctx *json2xml.NullArrayContext) {
        j.setXML(ctx, "")
    }

    func (j *j2xConvert) ExitPair(ctx *json2xml.PairContext) {
        tag := j.stripQuotes(ctx.STRING().GetText())
        v := ctx.Value()
        r := fmt.Sprintf("<%s>%s</%s>\n", tag, j.getXML(v), tag)
        j.setXML(ctx, r)
    }

    func (j *j2xConvert) ExitObjectValue(ctx *json2xml.ObjectValueContext) {
        j.setXML(ctx, j.getXML(ctx.Object()))
    }

    func (j *j2xConvert) ExitArrayValue(ctx *json2xml.ArrayValueContext) {
        j.setXML(ctx, j.getXML(ctx.Array()))
    }

    func (j *j2xConvert) ExitAtom(ctx *json2xml.AtomContext) {
        j.setXML(ctx, ctx.GetText())
    }

    func (j *j2xConvert) ExitString(ctx *json2xml.StringContext) {
        j.setXML(ctx, j.stripQuotes(ctx.GetText()))
    }

    func TestJSON2XMLVisitor(t *testing.T) {
        f, err := os.Open("testdata/json2xml/t.json")
        if err != nil {
            panic(err)
        }
        defer f.Close()

        content, err := ioutil.ReadAll(f)
        if err != nil {
            panic(err)
        }

        // Set up the input
        is := antlr.NewInputStream(string(content))

        // Create the lexer
        lexer := json2xml.NewJSONLexer(is)
        stream := antlr.NewCommonTokenStream(lexer, antlr.LexerDefaultTokenChannel)

        // Create the parser and build the tree
        p := json2xml.NewJSONParser(stream)
        p.BuildParseTrees = true
        tree := p.Json()

        // Finally, walk the AST
        j2x := NewJ2xConvert()
        antlr.ParseTreeWalkerDefault.Walk(j2x, tree)
        log.Println(j2x.getXML(tree))
    }

The code above is fairly simple; the comments explain each step.

The general flow is as follows:

    • Create the input stream
    • Create the lexer
    • Create a token stream, which stores the tokens produced by the lexer
    • Create the parser, which processes the tokens
    • Start parsing from a grammar rule (here, json)
    • Finally, traverse the AST with the default walker

Where are the intermediate parameters and results stored? Simply define a map keyed by tree node:

xml map[antlr.Tree]string
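The idea of annotating tree nodes through a map can be shown without the ANTLR runtime. The sketch below is a simplified stand-in: node, annotations, exit, and walk are hypothetical names, and a node pointer plays the role that antlr.Tree plays as the map key in the real code.

```go
package main

import (
	"fmt"
	"strings"
)

// node is a hypothetical stand-in for a parse-tree node.
type node struct {
	tag      string  // element name
	text     string  // leaf text, used only when there are no children
	children []*node // child nodes, in order
}

// annotations stores one partial result per node, keyed by node pointer.
type annotations map[*node]string

// exit runs after all children of n have been processed (post-order),
// so their partial results are already in the map.
func (a annotations) exit(n *node) {
	var sb strings.Builder
	sb.WriteString("<" + n.tag + ">")
	if len(n.children) == 0 {
		sb.WriteString(n.text)
	}
	for _, c := range n.children {
		sb.WriteString(a[c])
	}
	sb.WriteString("</" + n.tag + ">")
	a[n] = sb.String()
}

// walk visits the tree depth-first and calls exit on the way back up.
func walk(a annotations, n *node) {
	for _, c := range n.children {
		walk(a, c)
	}
	a.exit(n)
}

func main() {
	root := &node{tag: "user", children: []*node{
		{tag: "name", text: "gopher"},
		{tag: "age", text: "10"},
	}}
	a := annotations{}
	walk(a, root)
	fmt.Println(a[root])
}
```

Keeping results in a side map means the tree nodes themselves need no extra fields, which matters when the node types are generated code that should not be edited by hand.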

Listener and visitor

ANTLR can generate code for two traversal mechanisms. The listener is generated by default; to generate a visitor, pass the additional -visitor option.
The difference between the two is that listener methods are called automatically by the walker object that ANTLR provides, whereas in visitor mode the methods must explicitly call visit to reach the child nodes. If you forget the call, the corresponding subtree is not visited.

Summary

ANTLR is a powerful tool that lets common parsing work be done with much less effort. It also separates the parsing process from the rest of the program, which leaves plenty of flexibility in how the parse results are used.
