Compiler Development Series -- Ocelot language 1. Abstract syntax tree
Starting today, I am studying and developing my own programming language, Ocelot. It takes the "self-made compiler" project as its starting point, and I will keep improving and optimizing its features.
The front end of the compiler is simple and will not be studied in depth: a ready-made tool, JavaCC, generates the parser that produces the abstract syntax tree. The abstract syntax tree is the key to generating intermediate code, and the intermediate code is the key to generating back-end code.
The entire compiler is written in Java. Its main job is to perform semantic analysis and optimization on the abstract syntax tree produced by the JavaCC-generated parser and then to emit optimized assembly code. An assembler turns the assembly code into machine code, and a link command produces a Linux executable file that can be run directly on Linux.
The language accepted by the compiler is basically C with some syntax removed, that is, a simplified version of C. The original project performs no optimization. I want to add optimization to the original project and support garbage collection. Some of this is just for fun.
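The stages described above can be summarized as a chain in which each step consumes what the previous one produced. The following is only a minimal sketch: the enum, its stage names, and the "checked syntax tree" label are assumptions made for illustration and are not taken from the project.

```java
public class PipelineSketch {
    /** Hypothetical stage list: what each step consumes and what it produces. */
    enum Stage {
        PARSE("source file", "abstract syntax tree"),
        SEMANTIC_ANALYSIS("abstract syntax tree", "checked syntax tree"),
        IR_GENERATION("checked syntax tree", "intermediate code"),
        CODE_GENERATION("intermediate code", "assembly code"),
        ASSEMBLE("assembly code", "machine code"),
        LINK("machine code", "Linux executable");

        final String input;
        final String output;

        Stage(String input, String output) {
            this.input = input;
            this.output = output;
        }
    }

    /** Returns true when every stage consumes exactly what the previous stage produced. */
    public static boolean isChained() {
        Stage[] stages = Stage.values();
        for (int i = 1; i < stages.length; i++) {
            if (!stages[i].input.equals(stages[i - 1].output)) {
                return false;
            }
        }
        return true;
    }

    /** Renders one line per stage in the form "NAME: input -> output". */
    public static String describe() {
        StringBuilder sb = new StringBuilder();
        for (Stage s : Stage.values()) {
            sb.append(s.name()).append(": ")
              .append(s.input).append(" -> ").append(s.output).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

The first three stages run inside the Java compiler itself; the last two are delegated to an external assembler and linker.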
The abstract syntax tree and its nodes all inherit from the Node class. This section describes the hierarchy of Node classes.
A simple hello-world demo is used to show the structure of the abstract syntax tree. The demo is as follows:
int main(int argc, char **argv)
{
    int i, j = 5;
    if (i) {
        return (j * 1 - j);
    }
    else {
        exit(1);
    }
}
Running the compiler on the demo produces the following abstract syntax tree:
<AST> (G:\Compilation Principle\homemade compiler\source code\test\hello.cb:1)
variables:
functions:
    <DefinedFunction> (G:\Compilation Principle\homemade compiler\source code\test\hello.cb:1)
    name: "main"
    isPrivate: false
    params:
        parameters:
            <Parameter> (G:\Compilation Principle\homemade compiler\source code\test\hello.cb:1)
            name: "argc"
            typeNode: int
            <Parameter> (G:\Compilation Principle\homemade compiler\source code\test\hello.cb:1)
            name: "argv"
            typeNode: char **
    body:
        <BlockNode> (G:\Compilation Principle\homemade compiler\source code\test\hello.cb:2)
        variables:
            <DefinedVariable> (G:\Compilation Principle\homemade compiler\source code\test\hello.cb:3)
            name: "i"
            isPrivate: false
            typeNode: int
            initializer: null
            <DefinedVariable> (G:\Compilation Principle\homemade compiler\source code\test\hello.cb:3)
            name: "j"
            isPrivate: false
            typeNode: int
            initializer:
                <IntegerLiteralNode> (G:\Compilation Principle\homemade compiler\source code\test\hello.cb:3)
                typeNode: int
                value: 5
        stmts:
            <IfNode> (G:\Compilation Principle\homemade compiler\source code\test\hello.cb:4)
            cond:
                <VariableNode> (G:\Compilation Principle\homemade compiler\source code\test\hello.cb:4)
                name: "i"
            thenBody:
                <BlockNode> (G:\Compilation Principle\homemade compiler\source code\test\hello.cb:4)
                variables:
                stmts:
                    <ReturnNode> (G:\Compilation Principle\homemade compiler\source code\test\hello.cb:5)
                    expr:
                        <BinaryOpNode> (G:\Compilation Principle\homemade compiler\source code\test\hello.cb:5)
                        operator: "-"
                        left:
                            <BinaryOpNode> (G:\Compilation Principle\homemade compiler\source code\test\hello.cb:5)
                            operator: "*"
                            left:
                                <VariableNode> (G:\Compilation Principle\homemade compiler\source code\test\hello.cb:5)
                                name: "j"
                            right:
                                <IntegerLiteralNode> (G:\Compilation Principle\homemade compiler\source code\test\hello.cb:5)
                                typeNode: int
                                value: 1
                        right:
                            <VariableNode> (G:\Compilation Principle\homemade compiler\source code\test\hello.cb:5)
                            name: "j"
            elseBody:
                <BlockNode> (G:\Compilation Principle\homemade compiler\source code\test\hello.cb:7)
                variables:
                stmts:
                    <ExprStmtNode> (G:\Compilation Principle\homemade compiler\source code\test\hello.cb:8)
                    expr:
                        <FuncallNode> (G:\Compilation Principle\homemade compiler\source code\test\hello.cb:8)
                        expr:
                            <VariableNode> (G:\Compilation Principle\homemade compiler\source code\test\hello.cb:8)
                            name: "exit"
                        args:
                            <IntegerLiteralNode> (G:\Compilation Principle\homemade compiler\source code\test\hello.cb:8)
                            typeNode: int
                            value: 1
1. Names in angle brackets, such as <AST> and <DefinedFunction>, are the class names of the nodes.
2. The text in parentheses to the right of the class name, such as (G:\Compilation Principle\homemade compiler\source code\test\hello.cb:1), is the file name and line number of the source code that the node corresponds to.
3. Indentation indicates that a node is referenced by the node above it.
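To make these three points concrete, here is a minimal sketch in Java of how such a dump could be produced for the expression j * 1 - j from the demo. The node class names come from the dump above. Everything else (the Dumper helper, the member, dumpMembers and dumpDemoExpression methods, and the shortened file name hello.cb) is an assumption made for illustration, not the project's actual code.

```java
public class AstDumpSketch {
    /** Hypothetical helper: collects output lines and tracks the indentation depth. */
    static final class Dumper {
        private final StringBuilder out = new StringBuilder();
        private int depth = 0;

        void line(String text) {
            for (int i = 0; i < depth; i++) {
                out.append("    ");
            }
            out.append(text).append('\n');
        }

        /** Prints "name: value" for a plain member. */
        void member(String name, String value) {
            line(name + ": " + value);
        }

        /** Prints "name:" and then the referenced node one level deeper (point 3). */
        void member(String name, Node child) {
            line(name + ":");
            depth++;
            child.dump(this);
            depth--;
        }

        @Override
        public String toString() {
            return out.toString();
        }
    }

    /** Base class: every node records where in the source it came from. */
    static abstract class Node {
        final String file;
        final int lineNo;

        Node(String file, int lineNo) {
            this.file = file;
            this.lineNo = lineNo;
        }

        void dump(Dumper d) {
            // Point 1: class name in angle brackets. Point 2: file name and line number.
            d.line("<" + getClass().getSimpleName() + "> (" + file + ":" + lineNo + ")");
            dumpMembers(d);
        }

        abstract void dumpMembers(Dumper d);
    }

    static final class VariableNode extends Node {
        final String name;

        VariableNode(String file, int lineNo, String name) {
            super(file, lineNo);
            this.name = name;
        }

        @Override
        void dumpMembers(Dumper d) {
            d.member("name", "\"" + name + "\"");
        }
    }

    static final class IntegerLiteralNode extends Node {
        final long value;

        IntegerLiteralNode(String file, int lineNo, long value) {
            super(file, lineNo);
            this.value = value;
        }

        @Override
        void dumpMembers(Dumper d) {
            d.member("typeNode", "int");
            d.member("value", Long.toString(value));
        }
    }

    static final class BinaryOpNode extends Node {
        final String operator;
        final Node left;
        final Node right;

        BinaryOpNode(String file, int lineNo, String operator, Node left, Node right) {
            super(file, lineNo);
            this.operator = operator;
            this.left = left;
            this.right = right;
        }

        @Override
        void dumpMembers(Dumper d) {
            d.member("operator", "\"" + operator + "\"");
            d.member("left", left);
            d.member("right", right);
        }
    }

    /** Builds the tree for (j * 1 - j) on line 5 of the demo and returns its dump. */
    public static String dumpDemoExpression() {
        String f = "hello.cb";
        Node product = new BinaryOpNode(f, 5, "*",
                new VariableNode(f, 5, "j"), new IntegerLiteralNode(f, 5, 1));
        Node expr = new BinaryOpNode(f, 5, "-", product, new VariableNode(f, 5, "j"));
        Dumper d = new Dumper();
        expr.dump(d);
        return d.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpDemoExpression());
    }
}
```

Because each node prints only its own header and members and delegates child nodes to the Dumper, the nesting depth in the output follows the reference structure of the tree directly.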