In just a few years, the LLVM platform has changed the direction of many programming languages and spurred the emergence of a large number of distinctive new ones. It deserves to be called the king of compiler architectures, and it won the 2012 ACM Software System Award. (Preface)
Copyright notice: this is an original article by West Wind Getaway. If you reproduce it, please credit the source: Westerly World, http://blog.csdn.net/xfxyy_sxfancy
Storage and reading of variables
Variables are at the core of a programming language, and it is with good reason that a compiler is often described as a symbol-processing tool. A stack-based symbol table makes it easy to record variables and syntax symbols during compilation, as we saw in the previous section. So is there another simple way to implement variable access?
LLVM built-in symbol table
In fact, LLVM provides its own internal symbol table. It differs from ours: its symbols are scoped by function, so names inside a function are local symbols and names outside are global symbols. This table exists mainly so that LLVM can look up its own low-level IR elements, so its functionality is limited.
For example, given the following IR:
define void @print(i64 %k1) {
entry:
    ...
}
We can find the element k1 through the symbol table.
Getting hold of this symbol table is simple: as long as you have a BasicBlock, you can obtain a pointer to the table:
BasicBlock* bb = context->getNowBlock();
ValueSymbolTable* st = bb->getValueSymbolTable();
Value* v = st->lookup(value);
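To make the function-scoped lookup rule concrete, here is a minimal sketch in plain C++ that does not depend on LLVM. ToyValue, ToySymbolTable, ToyFunction, and print_function_knows are hypothetical names invented for this illustration. They only mimic the "lookup returns null when the name is unknown" behaviour described above and are not LLVM's classes.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for llvm::Value: just a named entity.
struct ToyValue {
    std::string name;
};

// Hypothetical stand-in for a per-function symbol table.
class ToySymbolTable {
public:
    void insert(const std::string& name) { table_[name] = ToyValue{name}; }

    // Returns a pointer to the entry, or nullptr if the name is unknown.
    const ToyValue* lookup(const std::string& name) const {
        auto it = table_.find(name);
        return it == table_.end() ? nullptr : &it->second;
    }

private:
    std::map<std::string, ToyValue> table_;
};

// Each function owns its own table, so its names are local to it.
struct ToyFunction {
    std::string name;
    ToySymbolTable symbols;
};

// Models `define void @print(i64 %k1)` and looks a name up in its table.
inline bool print_function_knows(const std::string& symbol) {
    ToyFunction print;
    print.name = "print";
    print.symbols.insert("k1");
    return print.symbols.lookup(symbol) != nullptr;
}
```

As in the article's code, the caller has to check the lookup result for null before using it.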
Allocating variable space on the stack: the AllocaInst instruction
AllocaInst is a standard LLVM instruction that allocates space on the stack. You do not need to manage stack growth yourself; LLVM handles it and returns a pointer to the allocated space.
Do not assume that this instruction can allocate heap memory dynamically. Heap memory is actually allocated by calling malloc.
%k = alloca i64
In the instruction above, %k is not an i64 value but a pointer to the allocated type, here a pointer to i64.
The C++ interface for this instruction is easy to use:
AllocaInst *alloc = new AllocaInst(t, var_name, context->getNowBlock());
Here t is the type to allocate, var_name is the name given to the instruction's result ('k' above), and the last argument is the BasicBlock into which the instruction is inserted.
The returned instruction represents the pointer to k.
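The idea that an alloca hands back the address of a fresh slot, while the stack bookkeeping stays hidden, can be sketched with a toy model. ToyFrame, alloca_i64, and demo_alloca are hypothetical names for this illustration. The "address" is simply an index into a vector, which is an assumption of the sketch and not how LLVM represents pointers.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of one function's stack frame.
class ToyFrame {
public:
    // Models `%name = alloca i64`: reserves one 64-bit slot and returns
    // its address (here, an index into the frame). The caller never
    // adjusts a stack pointer itself.
    std::size_t alloca_i64() {
        slots_.push_back(0);
        return slots_.size() - 1;
    }

    std::size_t size() const { return slots_.size(); }

private:
    // All slots disappear together when the frame is destroyed, which
    // mirrors stack storage being released when the function returns.
    std::vector<std::int64_t> slots_;
};

// Allocates two slots and returns the distance between their addresses.
inline std::size_t demo_alloca() {
    ToyFrame frame;
    std::size_t k = frame.alloca_i64();
    std::size_t j = frame.alloca_i64();
    return j - k;  // the two slots are distinct and adjacent
}
```

Because the slots live inside the frame object, nothing here outlives the function, which is the contrast with heap memory obtained through malloc.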
Storage of variables
In LLVM, storing to a variable requires a pointer to the destination address. Note that it must be a pointer, not a value.
Prototype:
StoreInst(Value *Val, Value *Ptr, BasicBlock *InsertAtEnd)
Examples of Use:
new StoreInst(value2, value1, false, context->getNowBlock());
Here value1 is the destination pointer and value2 is the value to store. The false argument means the store is not volatile. This parameter corresponds to the volatile keyword in C, whose main purpose is to stop the compiler from optimizing away repeated reads. Under normal optimization, a variable that is read several times without an intervening write is assumed to yield the same value each time, which does not hold when another thread or a hardware interrupt can change it.
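For readers less familiar with the C keyword that this flag mirrors, here is a small C++ sketch. device_status and read_status_twice are hypothetical names, and the "device" is only imagined. The point is that both reads of a volatile object must actually be performed.

```cpp
#include <cstdint>

// Hypothetical status word that something outside the normal control
// flow (for example an interrupt handler) might change at any time.
volatile std::uint32_t device_status = 0;

// Reads the status word twice. Because the object is volatile, the
// compiler must perform both reads and may not reuse the first result.
inline std::uint32_t read_status_twice() {
    std::uint32_t first = device_status;
    std::uint32_t second = device_status;
    return first + second;
}
```

Without volatile, an optimizer would be free to read the variable once and double the result. Note that in C++ volatile on its own does not synchronize threads; std::atomic is the usual tool for that.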
Reading of variables
To read a variable, use the load instruction:
LoadInst(Value *Ptr, const Twine &NameStr, bool isVolatile, unsigned Align, BasicBlock *InsertAtEnd)
Examples of Use:
new LoadInst(v, "", false, bb);
We will not consider memory alignment for now, although Clang does emit an explicit alignment on its loads and stores, for example 4 bytes for a 32-bit int. Note that the load instruction also takes a pointer as its operand, and what it returns is a value.
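The pointer-in, value-out shape of load and the value-plus-pointer shape of store can be seen in ordinary C++ pointer code. The sketch below shows roughly what a statement such as k = k + 1 turns into once k lives in a stack slot. The function names are hypothetical, and the IR in the comments is illustrative text written for this example, not actual compiler output.

```cpp
#include <cstdint>

// Each line corresponds to one IR instruction, shown in the comment.
inline std::int64_t increment_through_pointer(std::int64_t* k_ptr) {
    std::int64_t loaded = *k_ptr;   // %v  = load i64, i64* %k  (pointer in, value out)
    std::int64_t sum = loaded + 1;  // %v1 = add i64 %v, 1      (operates on values)
    *k_ptr = sum;                   // store i64 %v1, i64* %k   (value and pointer in)
    return sum;
}

inline std::int64_t demo_increment() {
    std::int64_t k = 41;            // plays the role of `%k = alloca i64`
    return increment_through_pointer(&k);
}
```

The arithmetic in the middle never touches the pointer, which is why most code generation works with values and only assignment needs the address.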
Create an assignment statement
An assignment statement is actually rather awkward: the left-hand side, the target being assigned to, must be a pointer (an address), while the right-hand side must be a value that has already been obtained. Most of our other operations, such as arithmetic and function calls, work on values.
First we need a way to obtain the value of a variable. Because this is so general, we put it in the code generation of the IDNode node:
Value* IDNode::codeGen(CodeGenContext* context) {
    BasicBlock* bb = context->getNowBlock();
    ValueSymbolTable* st = bb->getValueSymbolTable();
    Value* v = st->lookup(value);
    if (v == NULL || v->hasName() == false) {
        errs() << "undeclared variable " << value << "\n";
        return NULL;
    }
    Value* load = new LoadInst(v, "", false, bb);
    return load;
}
Here value is a member variable of our class that records the variable's name.
However, an assignment sometimes needs the pointer rather than the value, so we now extend this function to return the symbol's pointer when it is the target of an assignment:
Value* IDNode::codeGen(CodeGenContext* context) {
    BasicBlock* bb = context->getNowBlock();
    ValueSymbolTable* st = bb->getValueSymbolTable();
    Value* v = st->lookup(value);
    if (v == NULL || v->hasName() == false) {
        errs() << "undeclared variable " << value << "\n";
        return NULL;
    }
    // A flag in the context class records whether we are currently storing or loading.
    if (context->isSave()) return v;
    Value* load = new LoadInst(v, "", false, bb);
    return load;
}
Then, at the call site, we only need to do this:
static Value* opt2_macro(CodeGenContext* context, Node* node) {
    std::string opt = node->getStr();
    Node* op1 = (node = node->getNext());
    if (node == NULL) return NULL;
    Node* op2 = (node = node->getNext());
    if (node == NULL) return NULL;
    if (opt == "=") {
        // These two setIsSave calls make the node generated between them
        // return its pointer instead of the loaded value.
        context->setIsSave(true);
        Value* ans1 = op1->codeGen(context);
        context->setIsSave(false);
        Value* ans2 = op2->codeGen(context);
        return new StoreInst(ans2, ans1, false, context->getNowBlock());
    }
    ...
}
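The save-flag pattern can be tried out in isolation with a toy model that needs no LLVM. ToyContext, ToyResult, gen_identifier, gen_assign, and demo_assign are hypothetical names for this sketch. Variables are plain map entries here rather than stack slots, which is a simplifying assumption.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical context: variable storage plus the "save" flag.
struct ToyContext {
    std::map<std::string, std::int64_t> slots;
    bool is_save = false;
};

// Result of generating an identifier: either an address or a value.
struct ToyResult {
    std::int64_t* address = nullptr;
    std::int64_t value = 0;
};

// Returns the slot's address when the flag is set, otherwise its value.
// Note: operator[] creates a missing entry, whereas the article's version
// reports an undeclared variable instead.
inline ToyResult gen_identifier(ToyContext& ctx, const std::string& name) {
    std::int64_t& slot = ctx.slots[name];
    ToyResult result;
    if (ctx.is_save) {
        result.address = &slot;  // assignment target: hand back the pointer
    } else {
        result.value = slot;     // everything else: perform the load
    }
    return result;
}

// Models `lhs = rhs`: the flag is set only while the target is generated.
inline void gen_assign(ToyContext& ctx, const std::string& lhs,
                       const std::string& rhs) {
    ctx.is_save = true;
    ToyResult target = gen_identifier(ctx, lhs);
    ctx.is_save = false;
    ToyResult source = gen_identifier(ctx, rhs);
    *target.address = source.value;  // the store
}

inline std::int64_t demo_assign() {
    ToyContext ctx;
    ctx.slots["a"] = 0;
    ctx.slots["b"] = 7;
    gen_assign(ctx, "a", "b");
    return ctx.slots["a"];
}
```

One thing the sketch makes visible is that the flag must be cleared before the right-hand side is generated; otherwise the source would also come back as an address instead of a value.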
We could also implement a separate function dedicated to returning the pointer, but since the two functions would be so similar, we would rather not add one.
We will handle it this way for now. Once the overall structure is more complete, there should be a better way to implement it.
LLVM, King of Compiler Architectures (10): Variable Storage and Reading