The previous post discussed constructing a syntax tree. A friend then asked me in a message why the parser has to produce a syntax tree instead of letting the user decide what to do. Let me answer that question first.
1. In most cases you really do need a syntax tree.
2. If you want to return a result directly, just write a visitor and run it over the syntax tree. Apart from the automatically generated code (which nobody writes by hand anyway, so its cost does not matter), the amount of code is basically the same.
3. Building in the syntax tree makes the grammar itself simpler to describe. If the programmer had to write the grammar and then write complete semantic functions to build the syntax tree, the majority of cases (those that need a syntax tree) would become particularly complex, while the few cases that do not need one would gain nothing.
Tools like YACC have no built-in notion of a syntax tree and make you write it yourself; isn't that exactly what makes them hard to use?
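To illustrate point 2, here is a minimal sketch of a visitor that walks a syntax tree and returns a result directly. All names (Expr, NumberExpr, BinaryExpr, EvalVisitor, EvaluateSample) are made up for this example and are not part of the parser described in this series.

```cpp
#include <memory>
#include <utility>

struct NumberExpr;
struct BinaryExpr;

// Hypothetical visitor interface: one Visit overload per node type.
struct ExprVisitor {
    virtual ~ExprVisitor() = default;
    virtual void Visit(const NumberExpr& node) = 0;
    virtual void Visit(const BinaryExpr& node) = 0;
};

// Hypothetical syntax tree base class.
struct Expr {
    virtual ~Expr() = default;
    virtual void Accept(ExprVisitor& visitor) const = 0;
};

struct NumberExpr : Expr {
    double value;
    explicit NumberExpr(double v) : value(v) {}
    void Accept(ExprVisitor& visitor) const override { visitor.Visit(*this); }
};

struct BinaryExpr : Expr {
    char op;
    std::unique_ptr<Expr> left, right;
    BinaryExpr(char o, std::unique_ptr<Expr> l, std::unique_ptr<Expr> r)
        : op(o), left(std::move(l)), right(std::move(r)) {}
    void Accept(ExprVisitor& visitor) const override { visitor.Visit(*this); }
};

// The visitor that "returns the result directly": it evaluates the tree.
struct EvalVisitor : ExprVisitor {
    double result = 0;

    void Visit(const NumberExpr& node) override { result = node.value; }

    void Visit(const BinaryExpr& node) override {
        node.left->Accept(*this);
        const double l = result;
        node.right->Accept(*this);
        const double r = result;
        switch (node.op) {
            case '+': result = l + r; break;
            case '-': result = l - r; break;
            case '*': result = l * r; break;
            case '/': result = l / r; break;
            default:  result = 0;     break;  // unknown operator in this sketch
        }
    }
};

double Evaluate(const Expr& expr) {
    EvalVisitor visitor;
    expr.Accept(visitor);
    return visitor.result;
}

// Builds the tree for (1 + 2) * 3 by hand and evaluates it.
double EvaluateSample() {
    auto sum = std::make_unique<BinaryExpr>(
        '+', std::make_unique<NumberExpr>(1), std::make_unique<NumberExpr>(2));
    BinaryExpr product('*', std::move(sum), std::make_unique<NumberExpr>(3));
    return Evaluate(product);
}
```

In a real setup the node classes and the visitor interface would be the generated part, so the only code written by hand is EvalVisitor, which is about as long as the equivalent semantic actions would have been.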
Now to the main topic. This article is about constructing symbol tables, and constructing a symbol table well is troublesome. I have tried many approaches, including strongly typed symbol tables, weakly typed symbol tables, and map-based symbol tables, and finally settled on essentially the same structure as IDiaSymbol in the DIA SDK that ships with Visual Studio for reading PDB files (http://msdn.microsoft.com/en-us/library/w0edf0x4.aspx): there is only one symbol class for all symbols, and it carries every member any kind of symbol might need.
Why did I end up with this design? Because in semantic analysis, the most frequent operation is not constructing the symbol table but querying it. If the symbol table is strongly typed, for example one class for types, one for variables, one for functions and so on, you need casts everywhere, and I have found no good way to keep the strong typing without casts appearing throughout the code.
Why does the visitor pattern solve this problem for the syntax tree but not for the symbol table? Because we usually process a syntax tree recursively, and that is not how a symbol table is used. In a given context we already know what kind of symbol we have (for example, when we query the type of a variable, the returned symbol must be a type). Having to cast at that point is just wasted code. The visitor pattern does not fit this situation either: if you force it, the semantic analysis code gets scattered across so many places that readability is almost lost. This is a trade-off, and it is worth thinking over for yourself.
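As a rough illustration of the single-class idea, here is a simplified sketch. The Symbol class, its Kind values, and the helper functions are hypothetical and far smaller than a real symbol table; the point is only that a query such as "what is the type of this variable" needs no cast.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical single symbol class: every kind of symbol uses it.
class Symbol {
public:
    enum Kind { GlobalScope, TypeDef, VariableDef, FunctionDef };

    Symbol(Kind kind, std::string name,
           Symbol* descriptor = nullptr, Symbol* parent = nullptr)
        : kind_(kind), name_(std::move(name)),
          descriptor_(descriptor), parent_(parent) {}

    Kind GetKind() const { return kind_; }
    const std::string& GetName() const { return name_; }
    // Meaning depends on the kind: a variable's type, a function's return type.
    Symbol* GetDescriptor() const { return descriptor_; }
    Symbol* GetParent() const { return parent_; }

    // Creates a child symbol owned by this one.
    Symbol* Add(Kind kind, const std::string& name, Symbol* descriptor = nullptr) {
        children_.push_back(std::make_unique<Symbol>(kind, name, descriptor, this));
        Symbol* child = children_.back().get();
        byName_[name] = child;
        return child;
    }

    // Looks a name up in this scope, then in each enclosing scope.
    Symbol* Find(const std::string& name) {
        for (Symbol* scope = this; scope; scope = scope->parent_) {
            auto it = scope->byName_.find(name);
            if (it != scope->byName_.end()) return it->second;
        }
        return nullptr;
    }

private:
    Kind kind_;
    std::string name_;
    Symbol* descriptor_;
    Symbol* parent_;
    std::vector<std::unique_ptr<Symbol>> children_;
    std::map<std::string, Symbol*> byName_;
};

// Query: the type name of a variable. The descriptor is already a Symbol,
// so nothing like dynamic_cast<TypeSymbol*>(...) is needed.
std::string TypeNameOfVariable(Symbol& scope, const std::string& name) {
    Symbol* variable = scope.Find(name);
    if (!variable || variable->GetKind() != Symbol::VariableDef) return "";
    Symbol* type = variable->GetDescriptor();
    return type ? type->GetName() : "";
}

// Builds "int main() { int x; }" as symbols and queries the type of x.
std::string TypeNameDemo() {
    Symbol global(Symbol::GlobalScope, "");
    Symbol* intType = global.Add(Symbol::TypeDef, "int");
    Symbol* function = global.Add(Symbol::FunctionDef, "main", intType);
    function->Add(Symbol::VariableDef, "x", intType);
    return TypeNameOfVariable(*function, "x");
}
```

With one class per symbol kind, GetDescriptor would have to return a common base class, and every call site like TypeNameOfVariable would cast the result back to the type class even though the context already guarantees what it is.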
After such a long paragraph, what does this look like in practice? Let's look at the symbol table for the grammar rules themselves. Since this new configurable parser also generates a parser by parsing a grammar written as text, it goes through the same stages as a compiler, so it must have a symbol table:
class ParsingSymbol : public Object
{
public:
    enum SymbolType
    {
        Global,
        EnumType,
        ClassType,      // descriptor == base type
        ArrayType,      // descriptor == element type
        TokenType,
        EnumItem,       // descriptor == parent
        ClassField,     // descriptor == field type
        TokenDef,       // descriptor == token type
        RuleDef,        // descriptor == rule type
    };
public:
    ~ParsingSymbol();

    ParsingSymbolManager*   GetManager();
    SymbolType              GetType();
    const WString&          GetName();
    vint                    GetSubSymbolCount();
    ParsingSymbol*          GetSubSymbol(vint index);
    ParsingSymbol*          GetSubSymbolByName(const WString& name);
    ParsingSymbol*          GetDescriptorSymbol();
    ParsingSymbol*          GetParentSymbol();
    bool                    IsType();
    ParsingSymbol*          SearchClassSubSymbol(const WString& name);
    ParsingSymbol*          SearchCommonBaseClass(ParsingSymbol* classType);
};
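The declaration above does not show how the two Search functions work. The following is only a guess at what they might do, based on the comment that a class symbol's descriptor is its base type. MiniSymbol is a hypothetical stand-in for ParsingSymbol, and the free functions take the class as an explicit argument instead of being member functions.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for ParsingSymbol, reduced to what this sketch needs.
struct MiniSymbol {
    std::string name;
    MiniSymbol* descriptor = nullptr;               // for a class: its base class
    std::map<std::string, MiniSymbol*> subSymbols;  // for a class: its fields
};

// Assumed behavior: look a field up in the class, then in each base class.
MiniSymbol* SearchClassSubSymbol(MiniSymbol* classType, const std::string& name) {
    for (MiniSymbol* current = classType; current; current = current->descriptor) {
        auto it = current->subSymbols.find(name);
        if (it != current->subSymbols.end()) return it->second;
    }
    return nullptr;
}

// Assumed behavior: the nearest class that both a and b are or derive from.
MiniSymbol* SearchCommonBaseClass(MiniSymbol* a, MiniSymbol* b) {
    for (MiniSymbol* x = a; x; x = x->descriptor) {
        for (MiniSymbol* y = b; y; y = y->descriptor) {
            if (x == y) return x;
        }
    }
    return nullptr;
}

// Two classes deriving from Expression share Expression as their common base.
std::string CommonBaseDemo() {
    MiniSymbol expression, number, binary;
    expression.name = "Expression";
    number.name = "NumberExpression";
    number.descriptor = &expression;
    binary.name = "BinaryExpression";
    binary.descriptor = &expression;
    MiniSymbol* common = SearchCommonBaseClass(&number, &binary);
    return common ? common->name : "";
}

// A field declared on the base class is found through the derived class.
bool InheritedFieldDemo() {
    MiniSymbol expression, number, field;
    expression.name = "Expression";
    field.name = "position";
    expression.subSymbols["position"] = &field;
    number.name = "NumberExpression";
    number.descriptor = &expression;
    return SearchClassSubSymbol(&number, "position") == &field;
}
```

Both functions follow the descriptor pointer without any cast, which is the practical payoff of keeping a single symbol class.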