As I said in the previous blog post, I decided to develop a better configurable lightweight parser to replace the previous, outdated version (mainly because of GacUI). Before getting into this article, I would like to recommend the book "Language Implementation Patterns". It really is a good book, and I wish I had come across it sooner.
In fact, I have been thinking about how to develop parsers since 2007. Back then I was not yet very skilled with C++, so I inevitably did some silly things, but the overall ideas are still usable. Since then I have kept implementing various languages as practice, so it was inevitable that I would build a configurable parser for myself. Hence:
First edition: http://hi.baidu.com/geniusvczh/archive/tag/syngram%E6%97%A5%E5%BF%97
Second edition: http://www.cppblog.com/vczh/archive/2009/04/06/79122.html
Third edition: http://www.cppblog.com/vczh/archive/2009/12/13/103101.html
There is also a tutorial for the third edition: http://www.cppblog.com/vczh/archive/2010/04/28/113836.html
All of the parsers above are meant to let you quickly build a handy, purpose-specific parser in C++ by directly describing the grammar and some semantic actions, and the third edition is the one I have been using until now. As for why I need a new one, that is, the fourth edition, I already explained that in the previous article.
Development of the fourth edition started several days ago. If you care about the progress, you can download the code from the GacUI CodePlex page and read the source files under common\source\parsing. The corresponding unit tests are in common\unittest\unittest\testparsing.cpp.
So today I'm going to talk about constructing the syntax tree.
Anyone who has written a parser in C++ knows that building the syntax tree and the symbol table for semantic analysis is extremely tedious, and that it is easy to end up writing crap without noticing. But having written countless syntax trees and built countless symbol tables, I have accumulated plenty of crap, er, I mean experience, so I do have some ways of handling this.
Before introducing the method, let me say first that doing everything below by hand would drive anyone crazy. So for this configurable parser I have decided to damn well write a tool that generates the C++ code for the syntax tree.
A syntax tree is really just a big pile of classes that inherit from one another. What all mature syntax tree structures have in common is not how their members are laid out, but that they come with a visitor pattern mechanism. As for what the visitor pattern is, please refer to "Design Patterns"; I won't waste words on it here. This configurable parser comes with a description syntax. In other words, as with ANTLR or YACC, you first write the syntax tree structure and the grammar rules in a text file, and then my tool generates an in-memory parser for you, along with the C++ declaration and implementation files for the syntax tree. The description syntax looks roughly like this, using the arithmetic expression structure that everyone knows all too well:
    class Expression
    {
    }

    class NumberExpression : Expression
    {
        token value;
    }

    class BinaryExpression : Expression
    {
        enum BinaryOperator
        {
            Add,
            Sub,
            Mul,
            Div,
        }

        Expression firstOperand;
        Expression secondOperand;
        BinaryOperator binaryOperator;
    }

    class FunctionExpression : Expression
    {
        token functionName;
        Expression[] arguments;
    }

    token NAME = "[a-zA-Z_]/w*";
    token NUMBER = "/d+(./d+)";
    token ADD = "/+";
    token SUB = "-";
    token MUL = "/*";
    token DIV = "//";
    token LEFT = "/(";
    token RIGHT = "/)";
    token COMMA = ",";

    rule NumberExpression Number
        = NUMBER : value;

    rule FunctionExpression Call
        = NAME : functionName "(" [ Exp : arguments { "," Exp : arguments } ] ")";

    rule Expression Factor
        = !Number | !Call;

    rule Expression Term
        = !Factor;
        = Term : firstOperand "*" Factor : secondOperand as BinaryExpression with { binaryOperator = "Mul" };
        = Term : firstOperand "/" Factor : secondOperand as BinaryExpression with { binaryOperator = "Div" };

    rule Expression Exp
        = !Term;
        = Exp : firstOperand "+" Term : secondOperand as BinaryExpression with { binaryOperator = "Add" };
        = Exp : firstOperand "-" Term : secondOperand as BinaryExpression with { binaryOperator = "Sub" };
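To make the "pile of classes plus a visitor" idea concrete, here is a rough hand-written sketch of the kind of C++ that the class declarations above could map to. It is not the output of the actual tool. The names IVisitor, Accept, EvalVisitor, Evaluate, MakeNumber, MakeBinary and BuildSample are invented for this illustration, and storing a token as a std::string is an assumption as well.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch: mirrors the description above, not the tool's real output.
struct NumberExpression;
struct BinaryExpression;
struct FunctionExpression;

struct Expression {
    // One Visit overload per concrete node type.
    struct IVisitor {
        virtual ~IVisitor() = default;
        virtual void Visit(NumberExpression& node) = 0;
        virtual void Visit(BinaryExpression& node) = 0;
        virtual void Visit(FunctionExpression& node) = 0;
    };
    virtual ~Expression() = default;
    virtual void Accept(IVisitor& visitor) = 0;
};

struct NumberExpression : Expression {
    std::string value;  // assumption: "token value" kept as the raw token text
    void Accept(IVisitor& visitor) override { visitor.Visit(*this); }
};

struct BinaryExpression : Expression {
    enum class BinaryOperator { Add, Sub, Mul, Div };
    std::unique_ptr<Expression> firstOperand;
    std::unique_ptr<Expression> secondOperand;
    BinaryOperator binaryOperator = BinaryOperator::Add;
    void Accept(IVisitor& visitor) override { visitor.Visit(*this); }
};

struct FunctionExpression : Expression {
    std::string functionName;
    std::vector<std::unique_ptr<Expression>> arguments;
    void Accept(IVisitor& visitor) override { visitor.Visit(*this); }
};

// A visitor that evaluates the tree; function calls are left out of this sketch.
struct EvalVisitor : Expression::IVisitor {
    double result = 0;
    void Visit(NumberExpression& node) override { result = std::stod(node.value); }
    void Visit(BinaryExpression& node) override {
        assert(node.firstOperand && node.secondOperand);
        node.firstOperand->Accept(*this);
        const double left = result;
        node.secondOperand->Accept(*this);
        const double right = result;
        switch (node.binaryOperator) {
            case BinaryExpression::BinaryOperator::Add: result = left + right; break;
            case BinaryExpression::BinaryOperator::Sub: result = left - right; break;
            case BinaryExpression::BinaryOperator::Mul: result = left * right; break;
            case BinaryExpression::BinaryOperator::Div: result = left / right; break;
        }
    }
    void Visit(FunctionExpression&) override {
        throw std::runtime_error("function calls are not evaluated in this sketch");
    }
};

double Evaluate(Expression& expr) {
    EvalVisitor visitor;
    expr.Accept(visitor);
    return visitor.result;
}

std::unique_ptr<Expression> MakeNumber(std::string text) {
    auto node = std::make_unique<NumberExpression>();
    node->value = std::move(text);
    return node;
}

std::unique_ptr<Expression> MakeBinary(BinaryExpression::BinaryOperator op,
                                       std::unique_ptr<Expression> first,
                                       std::unique_ptr<Expression> second) {
    auto node = std::make_unique<BinaryExpression>();
    node->binaryOperator = op;
    node->firstOperand = std::move(first);
    node->secondOperand = std::move(second);
    return node;
}

// Builds by hand the tree that the rules above would produce for "1 + 2 * 3".
std::unique_ptr<Expression> BuildSample() {
    using Op = BinaryExpression::BinaryOperator;
    return MakeBinary(Op::Add, MakeNumber("1"),
                      MakeBinary(Op::Mul, MakeNumber("2"), MakeNumber("3")));
}
```

Calling Evaluate(*BuildSample()) walks the tree through the visitor. New operations such as pretty-printing or type checking can be added as further visitors without touching the node classes, which is the main reason for attaching a visitor to a syntax tree in the first place.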