There is a lot to say about how code is compiled. Start with how a computer runs an application: in essence, the application uses machine instructions, through the operating system, to manipulate the hardware and complete its tasks. Code written by a developer cannot be understood by the computer directly, so it must first be compiled into machine instructions that the computer can execute, and the operating system then allocates hardware resources such as memory and processor time to those instructions. The following is a brief walk through preprocessing, compilation, assembly, and linking.
2.1 Compilation preprocessing
At this stage, macro definitions are expanded, header files are included recursively, and all preprocessing directives that begin with # are processed.
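As a small illustration of what the preprocessor does, the sketch below combines a function-like macro with a conditional block. The names SQUARE, BUFFER_SIZE and ScaledArea are hypothetical and are not part of the example project discussed later.

```cpp
// Hypothetical macros, used only to illustrate preprocessing.
#define SQUARE(x) ((x) * (x))   // function-like macro, expanded textually

#ifndef BUFFER_SIZE             // conditional compilation: define only if absent
#define BUFFER_SIZE 16
#endif

// Unless BUFFER_SIZE was defined earlier, the return statement reads
// "return ((side) * (side)) + 16;" after preprocessing.
int ScaledArea(int side)
{
    return SQUARE(side) + BUFFER_SIZE;
}
```

With GCC, running only the preprocessor (for example `g++ -E`) prints the expanded text, which makes the substitution visible.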
2.2 Compilation phase
The source code is read as a character stream and processed through lexical analysis, syntax analysis, semantic analysis, and other stages. When compilation is complete, intermediate code is generated.
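To make the lexical analysis step more concrete, here is a minimal sketch of a tokenizer that cuts a character stream into tokens. It is an illustration only: the function name Tokenize is an assumption, and a real compiler's lexer also handles comments, string literals, multi-character operators, and much more.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Minimal lexer sketch: splits source text into identifier, integer,
// and single-character punctuation tokens. Whitespace is skipped.
std::vector<std::string> Tokenize(const std::string& source)
{
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < source.size()) {
        unsigned char c = static_cast<unsigned char>(source[i]);
        if (std::isspace(c)) {
            ++i;                                   // skip whitespace
        } else if (std::isalpha(c) || c == '_') {
            std::size_t start = i;                 // identifier or keyword
            while (i < source.size() &&
                   (std::isalnum(static_cast<unsigned char>(source[i])) ||
                    source[i] == '_')) {
                ++i;
            }
            tokens.push_back(source.substr(start, i - start));
        } else if (std::isdigit(c)) {
            std::size_t start = i;                 // integer literal
            while (i < source.size() &&
                   std::isdigit(static_cast<unsigned char>(source[i]))) {
                ++i;
            }
            tokens.push_back(source.substr(start, i - start));
        } else {
            tokens.push_back(std::string(1, source[i]));  // punctuation or operator
            ++i;
        }
    }
    return tokens;
}
```

For the input `int result = Add(3, 4);` this sketch yields ten tokens: `int`, `result`, `=`, `Add`, `(`, `3`, `,`, `4`, `)`, `;`. Syntax analysis would then build a tree from this token sequence.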
2.3 Assembly
The assembler translates the intermediate code produced by the compiler into machine instructions that the computer can execute, producing object code (relocatable object code).
2.4 Linking
The linker combines the object files with library files (*.lib) and resource files (*.rec) to produce the final executable *.exe file.
2.5 Relocation Issues
As an example, suppose we have two header files, Function1.h and Function2.h, and two source files, Function1.cpp and Function2.cpp. The content of Function1.h is as follows:
Function1.h
#ifndef _function1_h
#define _function1_h
extern int g_val;        // declaration only; the definition is in Function1.cpp
int Add(int m, int n);   // function declaration
#endif
Function1.cpp
#include "Function1.h"
int g_val = 10;          // definition of the global variable
int Add(int m, int n)
{
    return m + n;
}
Function2.cpp contains the main function:
#include "Function1.h"
int main()
{
    int l_valfri = 3;
    int l_valsec = 4;
    g_val = 14;
    int result = Add(l_valfri, l_valsec);
    return 0;
}
When the compiler compiles Function2.cpp, how does it resolve the external symbol g_val and the external function Add? This is where the symbol table in the relocatable object file comes in.
A relocatable object file contains a symbol table that records each symbol and its entry address. If the definition of a symbol is found during compilation, its entry address is written into the symbol table; otherwise the address is left unresolved and deferred to the link stage. The two examples below show the structure of the symbol table.
The symbol table in the relocatable object file generated from Function1.cpp is as follows:
Variable name | Memory address
g_val | 0x100
Add | 0x200
Why can memory addresses be assigned to the symbols g_val and Add? Because their definitions are found in Function1.cpp at compile time, so explicit addresses can be assigned.
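One way to picture such a symbol table is as a list of name and address entries. The sketch below is a simplified model, not the layout of any real object file format: the types Symbol and SymbolTable and the helpers MakeFunction1Table and FindSymbol are hypothetical, and the addresses are the illustrative values from the table above.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified model of one symbol table entry.
struct Symbol {
    std::string name;
    std::uint32_t address;  // illustrative address; 0x00 while unresolved
    bool resolved;          // true once a real address is known
};

using SymbolTable = std::vector<Symbol>;

// Symbol table for the object file generated from Function1.cpp.
// Both symbols are defined in that file, so both have addresses.
SymbolTable MakeFunction1Table()
{
    return {{"g_val", 0x100, true}, {"Add", 0x200, true}};
}

// Returns a pointer to the entry named `name`, or nullptr if it is absent.
const Symbol* FindSymbol(const SymbolTable& table, const std::string& name)
{
    for (const Symbol& sym : table) {
        if (sym.name == name) {
            return &sym;
        }
    }
    return nullptr;
}
```

Looking up `Add` in this table returns the entry with address 0x200, while looking up a name that was never recorded, such as `main`, returns nullptr.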
Now look at the symbol table in the relocatable object file generated from Function2.cpp:
Variable name | Memory address
g_val | 0x00
Add | 0x00
Why does this happen? During compilation, the declarations of these symbols are visible but their definitions are not, so the compiler leaves their addresses pending.
After the include file is expanded, Function2.cpp looks roughly like this. Clearly it contains only the declarations of the symbols, not their definitions.
#ifndef _function1_h
#define _function1_h
extern int g_val;        // declaration only, no definition
int Add(int m, int n);   // declaration only, no definition
#endif
int main()
{
    int l_valfri = 3;
    int l_valsec = 4;
    g_val = 14;
    int result = Add(l_valfri, l_valsec);
    return 0;
}
These symbols are entered in the symbol table first, but no memory addresses are associated with them until the link stage.
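The difference between a declaration and a definition can be sketched in a single file. In the sketch below, the function UseExternals is a hypothetical stand-in for main in Function2.cpp: it compiles with only the declarations in scope, and the definitions appear later. In the real project those definitions live in a separate translation unit, which is why the addresses stay open until linking.

```cpp
// Declarations: they tell the compiler the names and types,
// but do not reserve storage or provide a function body.
extern int g_val;
int Add(int m, int n);

// Hypothetical stand-in for main in Function2.cpp. It compiles even
// though no definition of g_val or Add has been seen yet.
int UseExternals()
{
    g_val = 14;
    return Add(3, 4);
}

// Definitions: in the article's project these live in Function1.cpp.
int g_val = 10;

int Add(int m, int n)
{
    return m + n;
}
```

Calling UseExternals() returns 7 and leaves g_val set to 14.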
Relocation occurs when the object files are linked. During the link stage, the linker examines the symbol tables. When it finds unresolved addresses in the symbol table of the object file generated from Function2.cpp, it searches all the other object files until it finds the real addresses in the symbol table of the object file generated from Function1.cpp. It then updates the symbol table of the object file generated from Function2.cpp, writing in the addresses of the symbols that had not yet been resolved.
Symbol table of Function2.obj after the update:
Variable name | Memory address
g_val | 0x100
Add | 0x200
When every symbol has a valid memory address, relocation in the link stage is complete.
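The resolution step described above can be sketched as a two-pass procedure: first collect every symbol that has a definition, then patch the entries that are still unresolved. This is a simplified model under stated assumptions, not how any particular linker is implemented. The types Symbol and ObjectFile and the function ResolveSymbols are hypothetical, and duplicate definitions are not checked.

```cpp
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical, simplified model of one symbol table entry.
struct Symbol {
    std::string name;
    std::uint32_t address;  // illustrative address; 0x00 while unresolved
    bool resolved;          // true once a real address is known
};

// Hypothetical model of a relocatable object file: a name plus its symbols.
struct ObjectFile {
    std::string name;
    std::vector<Symbol> symbols;
};

// Pass 1 collects the address of every resolved symbol.
// Pass 2 writes those addresses into the entries that are still unresolved.
// Throws std::runtime_error if a symbol has no definition anywhere.
void ResolveSymbols(std::vector<ObjectFile>& objects)
{
    std::map<std::string, std::uint32_t> definitions;
    for (const ObjectFile& obj : objects) {
        for (const Symbol& sym : obj.symbols) {
            if (sym.resolved) {
                definitions[sym.name] = sym.address;
            }
        }
    }
    for (ObjectFile& obj : objects) {
        for (Symbol& sym : obj.symbols) {
            if (sym.resolved) {
                continue;
            }
            auto it = definitions.find(sym.name);
            if (it == definitions.end()) {
                throw std::runtime_error("unresolved external symbol: " + sym.name);
            }
            sym.address = it->second;
            sym.resolved = true;
        }
    }
}
```

Given one object file whose table holds g_val at 0x100 and Add at 0x200, and a second whose entries for the same names are unresolved at 0x00, ResolveSymbols fills in 0x100 and 0x200 in the second table. If no object file defines a symbol, the sketch throws, which corresponds to the unresolved external symbol error a real linker reports.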