Compiler Principles: Code Generation

Last Update:2018-07-26 Source: Internet

Author: User

Tags generator

Target code generation is the last stage of compilation. It follows lexical analysis, syntax analysis, semantic analysis, and intermediate code optimization, and it produces the code that actually runs on the target machine. The task of this phase is to convert the intermediate code into machine language or assembly language for a particular machine; the component that does this is called the code generator.

1. The relationship between program portability and compiler module design
Compilers are organized into multiple stages and modules for essentially two reasons.
First, code reuse: increase the portability of applications as far as possible without increasing the programmer's workload.
(Machine instruction sets and hardware structures differ from machine to machine, and to get the most out of a program the software has to be optimized for the characteristics of the hardware as far as possible. The development of AI chips reflects the same idea from the other direction: to speed up deep learning code, chips are customized for a specific program architecture. The move from CPU to GPU happened because the GPU suits the many simple, repeated operations of deep learning. The move from CPU to FPGA for matrix operations, led by Facebook, is another example of binding hardware to specific software computing requirements. The most extreme case is Google's TPU, a chip customized for Google's TensorFlow deep learning framework that eliminates all unnecessary hardware and is reported to run about 200 times faster than a CPU.)

This brings us back to the "silver bullet" of software: when a problem cannot be solved directly, add a layer of abstraction. Java improves the portability of code in exactly this way, by introducing the JVM to separate the application layer from the specific hardware layer.

Figure 1. Java: multiple target machines
Java programs can obviously share one front-end suite, consisting of lexical analysis, syntax analysis, semantic analysis, and the intermediate code optimization engine. For each machine model, a custom target code generator can be supported and updated through official channels, and assembled into the back-end suite that the machine needs. This gives programmers as much freedom as possible and greatly improves code portability. Compare a DLL written in C++: even when it is written with great care, and even when it follows the complex COM rules, it may still fail to port because of differences between compiler versions.
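The split between one shared front end and several machine-specific back ends can be sketched roughly as follows. This is only a toy illustration: the tuple-based intermediate form, the function names, and both "instruction sets" are invented here and do not correspond to the JVM or to any real machine.

```python
# Toy sketch: one shared front end, two target-specific back ends.
# The IR format and both instruction sets are made up for illustration.

def front_end(source):
    """Shared front end: turn 'x = a + b' into IR tuples (dest, op, left, right)."""
    dest, expr = [part.strip() for part in source.split("=")]
    left, op, right = expr.split()
    return [(dest, op, left, right)]

def backend_register_machine(ir):
    """Back end for an imaginary one-register machine."""
    names = {"+": "ADD", "-": "SUB"}
    out = []
    for dest, op, left, right in ir:
        out += [f"LD R0, {left}", f"{names[op]} R0, {right}", f"ST R0, {dest}"]
    return out

def backend_stack_machine(ir):
    """Back end for an imaginary stack machine."""
    names = {"+": "add", "-": "sub"}
    out = []
    for dest, op, left, right in ir:
        out += [f"push {left}", f"push {right}", names[op], f"pop {dest}"]
    return out

ir = front_end("x = a + b")          # analysis is done once
for line in backend_register_machine(ir):
    print(line)
for line in backend_stack_machine(ir):
    print(line)
```

Supporting a new machine in this sketch means writing one more back-end function; the front end and the source program stay unchanged.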

Second, reuse of machine-specific optimizers and generators: target code generation and optimization for a specific machine is extremely complex, so once a successful implementation exists it should be reused as much as possible.

The code generation phase is closely tied to the structure of the specific computer (instruction format, word length, the number and type of registers), to the semantics of its instructions, and to the operating system. The semantics of high-level languages are complex, and the diversity of computer hardware makes the theory of code generation hard to study and even harder to put into practice. A good back-end code generator is therefore rare, and once one exists it should be independent, so that it can be assembled into the build of many other compilers.

Figure 2. Multiple languages, one target machine

This is the case of a fixed computer. When the target computer is fixed and there are several high-level languages, the intermediate language should be designed to fully reflect the characteristics of that computer; it is called MSIL (Machine Specific Intermediate Language). Every compiler for this machine generates MSIL in its analysis phase. When implementing a compiler, as much of the work as possible is placed in the code generation stage, that is, in the translation from MSIL to the target program, to lighten the translation work in the analysis phase of each language. No matter how many high-level languages there are, code generation from MSIL to the target program only has to be implemented once.

Of course, it is exactly this organization that makes building a compiler, once work that took a whole team, no longer out of reach. It also makes it possible for businesses to define a language for their own industry (a Domain Specific Language) according to their needs and priorities.

2. Code generation optimization: register allocation
After all of the above, the point to take away is that a code generator is a back-end package customized for a target machine platform. This is why the industry's mainstream computers are designed to industry standards such as x86 and x86-64. If your company designs its own hardware system, then even if you can design and manufacture it, can you be sure that a downstream supplier will build something like a code generator for it?

The quality of the generated target code is measured in two ways: space usage and execution efficiency. Both involve many details bound to specific hardware, most of which are difficult and not general. The rules for using registers are among the few general techniques, so register allocation is a good way to study how target code is optimized and generated.

Q: Why should code generation make full use of registers?
A: When the value of a variable is in a register, a reference to the variable can take the value directly from the register. This reduces the number of memory accesses and so increases execution speed. Making full use of registers is therefore an important way to improve the efficiency of the target code.

Q: What are the principles of register allocation?
A: (1) Keep values in registers as long as it is logically valid: when generating target code for a variable, try to keep the value of the variable, or the result of a calculation, in a register until there are not enough registers left to allocate.
(2) Synchronize with memory at basic block exits: at the exit of a basic block, store the values of variables in memory. A basic block may have several successors or several predecessors, and in different predecessors the same variable may be held in a different register R, or may not have been assigned a value at all. The contents of the registers should therefore be written to memory before the exit, so that on entry to a basic block the values of all variables are in memory.
(3) Release promptly: a register held by a variable that is no longer referenced in the basic block should be released as early as possible, to improve register utilization.

Cache replacement commonly uses the LRU policy, which is classic "linear thinking": what was used in the recent past will be used again in the near future. Register allocation cannot rely on this. As anyone who reads code knows, variable use within a small local scope does follow the principle of locality, but once the scope widens even slightly to a larger set of variables, the order of use looks mostly unordered. Register allocation therefore takes a blunter approach: traverse the target code once in advance, collect how each variable is used, and record it in a data structure called the "information chain". This is not the backward-looking inference of LRU but a kind of "cheating" by looking ahead. Since the work is done once and maximizes register efficiency every time the target code runs, it is well worth the cost.
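The difference between the backward-looking and the forward-looking policy can be illustrated with a small simulation. This is a rough sketch under simplifying assumptions: the reference sequence is made up, each reference needs its variable in one of two registers, and the function names are invented for this example.

```python
def count_loads(refs, num_regs, choose_victim):
    """Count how often a variable must be loaded because it is not in a register."""
    regs = []                      # variables currently held in registers
    loads = 0
    for pos, var in enumerate(refs):
        if var in regs:
            continue               # value already in a register, no memory access
        loads += 1
        if len(regs) == num_regs:  # no free register: release one
            regs.remove(choose_victim(regs, refs, pos))
        regs.append(var)
    return loads

def lru_victim(regs, refs, pos):
    """Backward-looking: release the variable used least recently."""
    past = refs[:pos]
    return min(regs, key=lambda v: max(i for i, r in enumerate(past) if r == v))

def furthest_next_use_victim(regs, refs, pos):
    """Forward-looking: release the variable whose next use is furthest away."""
    future = refs[pos + 1:]
    def next_use(v):
        return future.index(v) if v in future else len(future)
    return max(regs, key=next_use)

refs = list("abcabdab")            # made-up sequence of variable references
print(count_loads(refs, 2, lru_victim))                # 8 loads
print(count_loads(refs, 2, furthest_next_use_victim))  # 6 loads
```

On this particular sequence the forward-looking policy needs fewer loads, but it only works because the whole sequence is known in advance, which is exactly what scanning the basic block ahead of time provides.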

Computing the pending-use information chain of variables
From the principles above, registers are allocated per basic block. The basic block is the smallest unit of program flow, and it is at its boundaries that registers and memory have to be synchronized, so register allocation only needs to examine the code of the current basic block.

First, two information chains are kept for every variable: the pending-use information chain (where the variable is used next) and the active-variable information chain (whether its value is still needed).

For ease of handling, assume that all variables in a basic block are active at its exit. Temporary variables within the basic block are handled in one of two ways:
a) In general, temporary variables within the basic block are considered inactive at the exit.
b) If the intermediate code generation algorithm allows certain temporary variables to be referenced outside the basic block, those temporary variables are also active.

To compute the pending-use information chain of the variables in a basic block (a stack seems to me a better fit for updating this chain), the steps are as follows:

① Initialize the "pending-use information" and "active information" columns in the symbol table for each basic block. Set every "pending-use information" column to "no pending use", and set each "active information" column to "active" or "inactive" according to whether the variable is active at the basic block exit. Here variables are assumed to be active and temporary variables inactive.
② Process the quadruples one by one, working backwards from the basic block exit to the basic block entry. For each quadruple i: A := B op C, perform the following steps in order:
a) Attach the pending-use and active information of variable A in the symbol table to quadruple i.
b) Set the pending-use and active information of variable A in the symbol table to "no pending use" and "inactive". The value assigned to A in i can only be referenced by quadruples after i, so for the quadruples before i, A is inactive and has no pending use.
c) Attach the pending-use and active information of B and C in the symbol table to quadruple i.
d) Set the pending-use information of B and C in the symbol table to i and their active information to "active".

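For example:

(1) T := A - B
(2) U := A - C
(3) V := T + U
(4) D := V + U

After the information chains are attached, the quadruples are marked as follows. Each variable is followed by its pending-use information, either the number of the quadruple that uses it next or F for none, and its active information, L for active or F for inactive:

(1) T[(3)L] := A[(2)L] - B[FL]
(2) U[(3)L] := A[FL] - C[FL]
(3) V[(4)L] := T[FF] + U[(4)L]
(4) D[FL] := V[FF] + U[FF]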
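The backward scan described in steps ① and ② can be sketched as follows, applied to the four quadruples of the example. This is a minimal sketch: the function name, the tuple layout of a quadruple, and the output format are assumptions made here, and operands are assumed to be variables only (no constants).

```python
def annotate_block(quads, live_on_exit):
    """Backward scan attaching pending-use and active info to each quadruple.

    quads: list of (result, left, op, right); quadruples are numbered from 1.
    live_on_exit: variables assumed active at the basic block exit.
    Returns, per quadruple, the info strings for (A, B, C), for example
    "(3)L" = next used in quadruple 3 and active, "FF" = no use and inactive.
    """
    names = {v for a, b, _, c in quads for v in (a, b, c)}
    # Step 1: symbol table, variable -> (pending use, active); None = no pending use.
    table = {v: (None, v in live_on_exit) for v in names}

    def fmt(info):
        use, active = info
        return ("F" if use is None else f"({use})") + ("L" if active else "F")

    annotated = [None] * len(quads)
    for i in range(len(quads), 0, -1):          # step 2: from block exit to entry
        a, b, _, c = quads[i - 1]
        info_a = fmt(table[a])                  # a) attach info of A
        table[a] = (None, False)                # b) before i, A is unused and inactive
        info_b, info_c = fmt(table[b]), fmt(table[c])   # c) attach info of B and C
        table[b] = table[c] = (i, True)         # d) B and C are used at i
        annotated[i - 1] = (info_a, info_b, info_c)
    return annotated

block = [("T", "A", "-", "B"), ("U", "A", "-", "C"),
         ("V", "T", "+", "U"), ("D", "V", "+", "U")]
for number, info in enumerate(annotate_block(block, {"A", "B", "C", "D"}), start=1):
    print(number, info)
```

With A, B, C, and D treated as active at the exit and T, U, and V as inactive temporaries, the printed information matches the marks shown above.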

With this information attached to every quadruple, when an expression is processed and no free register is available, the generator can look at the pending-use information of the variables currently held in registers, pick the variable whose next use is furthest away, and release its register. The register allocation algorithm for a quadruple i: A := B op C is:

① If the current value of B is in some register Ri, Ri holds only the value of B, and either B is the same identifier as A or B is no longer referenced after this quadruple, select Ri as the required register R and go to ④;
② If there is a register that has not been allocated, select one such Ri as the required register R and go to ④;
③ Select an Ri from the allocated registers as the required register R. The selection principle is that the values of the variables occupying Ri are also in main memory, or are referenced furthest away in the basic block. So that the variables held in Ri also have their values in memory, first make the following adjustments. For each variable M in RVALUE[Ri], if M is not A and AVALUE[M] does not contain M:
a) Generate the target code ST Ri, M, which stores the value of the variable M (which is not A) from Ri into memory;
b) If M is not B, set AVALUE[M] = {M}; otherwise set AVALUE[M] = {M, Ri};
c) Remove M from RVALUE[Ri];
④ Return R.
Here RVALUE[Ri] records the variables whose values register Ri currently holds, and AVALUE[M] records where the current value of M is stored (registers, memory, or both).
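One possible reading of steps ① to ④ is sketched below. It is a simplification, not a full code generator: the function name get_reg, the dictionaries standing in for RVALUE and AVALUE, and the next_use table are assumptions made for this sketch, and step c) is done by emptying the chosen register in one go.

```python
def get_reg(a, b, rvalue, avalue, next_use, code):
    """Pick a register for quadruple A := B op C (simplified sketch).

    rvalue: register -> set of variables whose value it holds
    avalue: variable -> set of places holding its value (its own name = memory)
    next_use: variable -> number of the next quadruple using it (absent = none)
    code: list that receives any generated store instructions
    """
    # Step 1: reuse the register holding only B if B is overwritten or unused later.
    for reg, held in rvalue.items():
        if held == {b} and (a == b or next_use.get(b) is None):
            return reg
    # Step 2: take a register that has not been allocated yet.
    for reg, held in rvalue.items():
        if not held:
            return reg
    # Step 3: release a register. Prefer one whose variables are all in memory,
    # then the one whose variables are needed furthest in the future.
    def score(reg):
        held = rvalue[reg]
        in_memory = all(v in avalue[v] for v in held)
        soonest = min(float("inf") if next_use.get(v) is None else next_use[v]
                      for v in held)
        return (in_memory, soonest)
    reg = max(rvalue, key=score)
    for m in sorted(rvalue[reg]):
        if m == a:
            continue                              # A is about to be redefined
        if m not in avalue[m]:
            code.append(f"ST {reg}, {m}")         # a) save M to memory
        avalue[m] = {m, reg} if m == b else {m}   # b) update where M lives
    rvalue[reg] = set()                           # c) the register is now free
    return reg                                    # step 4

rvalue = {"R0": {"T"}, "R1": {"U"}}
avalue = {"T": {"R0"}, "U": {"R1"}}
code = []
print(get_reg("W", "A", rvalue, avalue, {"T": 3, "U": 5}, code), code)
```

In the usage lines both registers are occupied and neither value is in memory, so the sketch releases R1, whose variable U is needed later than T, and emits one store for it.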

Even simple register allocation, then, needs a fair number of data structures and a fair amount of processing time. From this glimpse alone it is clear that the code generator accounts for a very complex part of a compiler's work.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
