Translated from: http://www.ibm.com/developerworks/cn/linux/l-dynlink/#ibm-pcon
Basic principles of linkers and loaders
A program must be linked and loaded, in addition to being compiled, before it can run in memory. From a programmer's point of view, the advantage of these two steps is that you can use meaningful names such as printf and errno directly in your program, without explicitly specifying their addresses in the standard C library. Of course, compilers and assemblers made a revolutionary contribution by freeing programmers from the nightmare of programming directly with addresses: they allow programmers to name functions and variables with meaningful symbols, which greatly improves the correctness and readability of programs. However, with the popularity of languages such as C that support separate compilation, a complete program is often divided into several parts that are developed in parallel, with the modules communicating through function interfaces or global variables. This raises a question: the compiler can only convert symbol names to addresses within a single module, so who resolves the symbols that cross module boundaries? For example, a user program that calls printf and the standard C library that implements printf are clearly two different modules. In fact, this work is done by the linker.
To link different modules together, the linker has two main tasks to perform: symbol resolution and relocation.
Symbol resolution: when a module uses a function or global variable that is not defined in that module, the compiler marks all such functions and global variables in the generated symbol table. It is the linker's responsibility to find their definitions in the other modules; if no suitable definition is found, or the definition is not unique, symbol resolution fails.
Relocation: the compiler typically uses zero-based relative addresses when generating an object file. During linking, however, the linker starts at a specified address and lays the object files out one after another in the order in which they are given. Besides assembling the object files, relocation accomplishes two more tasks: it generates the final symbol table, and it modifies certain locations in the code sections; all the locations that need modification are recorded in a relocation table generated by the compiler.
A simple example will make these concepts clear. Suppose we have a program consisting of two parts: the main function in m.c calls the function sum implemented in f.c:
/* m.c */
int i = 1;
int j = 2;
extern int sum();

void main()
{
    int s;
    s = sum(i, j);
}

/* f.c */
int sum(int i, int j)
{
    return i + j;
}
In Linux, we use GCC to compile the two source files into object files:

$ gcc -c m.c f.c
Let's take a look at the symbol table and relocation table generated during compilation, using objdump:
$ objdump -x m.o
...
SYMBOL TABLE:
...
00000000 g     O .data  00000004 i
00000004 g     O .data  00000004 j
00000000 g     F .text  00000021 main
00000000       *UND*    00000000 sum

RELOCATION RECORDS FOR [.text]:
OFFSET   TYPE        VALUE
00000007 R_386_32    j
0000000d R_386_32    i
00000013 R_386_PC32  sum
First, notice that sum is labeled UND (undefined) in the symbol table: it is not defined in m.o, so later on the symbol-resolution step of ld (the Linux linker) will search the other modules for a definition of the function sum. In addition, the relocation table contains three records indicating three locations in the code section that must be modified during relocation, at offsets 0x7, 0xd and 0x13. Here is a more intuitive way to look at these three locations:
$ objdump -dx m.o
...
Disassembly of section .text:

00000000 <main>:
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   83 ec 04                sub    $0x4,%esp
   6:   a1 00 00 00 00          mov    0x0,%eax
            7: R_386_32 j
   b:   50                      push   %eax
   c:   a1 00 00 00 00          mov    0x0,%eax
            d: R_386_32 i
  11:   50                      push   %eax
  12:   e8 fc ff ff ff          call   13 <main+0x13>
            13: R_386_PC32 sum
  17:   83 c4 08                add    $0x8,%esp
  1a:   89 c0                   mov    %eax,%eax
  1c:   89 45 fc                mov    %eax,0xfffffffc(%ebp)
  1f:   c9                      leave
  20:   c3                      ret
Take sum as an example. The call to the function sum is made through the call instruction, which uses IP-relative addressing. As you can see, in the object file m.o the call instruction sits at zero-based relative address 0x12, where e8 is the call opcode and the 4 bytes starting at 0x13 hold the offset of sum relative to the instruction following the call (the add). Obviously this offset cannot be known before linking, so the bytes at 0x13 will have to be modified later. Why, then, do they currently hold 0xfffffffc (note that Intel CPUs are little endian)? This is probably for safety: 0xfffffffc is the two's complement representation of -4 (the reader can check with p/x -4 in gdb), and since the call instruction itself occupies 5 bytes, the offset in a call instruction can never legitimately be -4. Let's see what this offset in the call instruction has been changed to after relocation:
$ gcc m.o f.o
$ objdump -dj .text a.out | less

Disassembly of section .text:
...
080482c4 <main>:
        ...
 80482d6:  e8 0d 00 00 00     call   80482e8 <sum>
 80482db:  83 c4 08           add    $0x8,%esp
        ...
080482e8 <sum>:
        ...
You can see that after relocation the offset in the call instruction has been changed to 0x0000000d, and a simple calculation confirms it: 0x080482e8 - 0x080482db = 0xd. With relocation done, the final executable program has been generated.
After the executable is generated, the next step is to load it into memory and run it. Under Linux the C compiler proper is cc1, the assembler is as and the linker is ld, but there is no single program corresponding to the concept of a loader. In fact, loading an executable into memory is implemented by the execve(2) system call. In simple terms, loading a program consists of the following steps: read the header of the executable to determine its file format and the size of its address space; divide the address space into segments; read the executable into the segments of the address space and establish the mapping between virtual and physical addresses; zero the BSS; create the stack segment; set up the program arguments, environment variables and other information the process needs to run; and start execution.
The history of linking and loading technology
Before a program can be loaded into memory and run, it must go through three stages: compilation, linking and loading. Familiar as these concepts are, they have undergone several major changes over the course of operating system development. Broadly speaking, this history can be divided into three phases:
1. Static linking, static loading
This method was adopted first; it is simple and requires no extra support from the operating system. Languages like C have supported separate compilation from early on: the different modules of a program can be developed in parallel and then compiled independently into their object files. Once all the object files are available, static linking with static loading means linking them all into one executable image, and then loading that entire image into memory at once when the process is created. For a simple example, suppose we have developed two programs, Prog1 and Prog2. Prog1 consists of main1.c, utilities.c and errhdl1.c, corresponding to the program's main framework, some common auxiliary functions (which act as a library), and the error-handling part; the three parts are compiled into the object files main1.o, utilities.o and errhdl1.o. Similarly, Prog2 consists of main2.c, utilities.c and errhdl2.c, compiled into main2.o, utilities.o and errhdl2.o. Note that Prog1 and Prog2 use the same common auxiliary module utilities.o. With static linking and static loading, memory and hard disk usage while running the two programs is as shown in Figure 1:
Figure 1 Using static linking, static loading methods, memory and hard disk usage while running Prog1 and Prog2
You can see, first of all, that although the two programs share utilities, this is not reflected in the executable images saved on the hard disk: utilities.o is linked into the executable image of every program that uses it. The same holds for memory: the operating system loads the entire executable image into memory when the process is created, before the process can begin to run. As mentioned earlier, this method keeps the operating system's implementation very simple, but its drawbacks are obvious. Since both programs use the same utilities.o, keeping a single copy of utilities.o on the hard disk should suffice; and if a program encounters no errors while running, its error-handling code need never be loaded into memory. Static linking with static loading therefore wastes both hard disk space and memory. Because memory was a precious resource in early systems, the latter was the more lethal problem.
2. Static linking, dynamic loading
Since static linking with static loading does more harm than good, let's look at how people solved the problem. Memory shortage was the more pressing issue in early systems, so the first thing people tried to fix was the low efficiency of memory use, which led to the idea of dynamic loading. The idea is very simple: a function is loaded into memory only when it is called. All modules are stored on disk in a relocatable load format. First the main program is loaded into memory and starts running. When a module needs to call a function in another module, it first checks whether the module containing the called function has already been loaded. If not, the relocating link-loader loads that module into memory and updates the program's address table to reflect the change. Then control is transferred to the called function in the newly loaded module.
The advantage of dynamic loading is that a module that is never used is never loaded. If a program contains a lot of code that handles low-probability events, such as error-handling functions, this approach is clearly effective: even though the whole program may be large, the part actually used (and therefore loaded into memory) may be quite small.
Returning to the two programs above, suppose an error occurs while Prog1 runs and no error occurs while Prog2 runs. With static linking and dynamic loading, memory and hard disk usage while running the two programs is as shown in Figure 2:
As you can see, when a program has many modules like the error handler, static linking with dynamic loading shows a considerable advantage in memory efficiency. People had moved closer to the ideal, but the problem was not fully solved: memory efficiency improved, while hard disk usage did not.
3. Dynamic linking, dynamic loading
With static linking and dynamic loading, it might seem that only hard disk usage remains inefficient; in fact, memory usage was still not fully efficient either. In Figure 2, since both programs use the same utilities.o, the ideal situation is to keep only one copy of utilities.o in the system, whether in memory or on the hard disk. This is what led people to dynamic linking.
With dynamic linking, a stub is placed in the program image at each point where a library function is called. A stub is a small piece of code that locates the appropriate library if it has already been loaded into memory; if the required library is not in memory, the stub indicates how to load the library containing the function.
When a stub is executed, it first checks whether the required function is already in memory, and loads it if not. Either way, the stub is eventually replaced by the address of the called function, so the next time this code is reached, the library function is called directly, without the extra overhead of dynamic linking. As a result, all processes using the same library run with the same single copy of that library.
Let's look at memory and hard disk usage when Prog1 and Prog2 are run using dynamic linking and dynamic loading (see Figure 3). We again assume that an error occurs while Prog1 runs and none occurs while Prog2 runs.
Figure 3 Using dynamic linking, dynamic loading methods, memory and hard disk usage while running Prog1 and Prog2
In the figure, there is only one copy of utilities.o on the hard disk and one in memory; the two processes share it by mapping their addresses onto the same copy. This feature of dynamic linking is essential for library upgrades, such as bug fixes. When a library is upgraded to a new version, all programs that use it automatically use the new version. Without dynamic linking, all such programs would have to be relinked to gain access to the new library. To prevent programs from accidentally using an incompatible new version, both the program and the library typically carry their own version information. Several versions of a library may be present in memory at once, but each program uses the version information to determine which one it should use. Minor changes to a library leave its version number unchanged; major changes increment it. Therefore, if a new library contains incompatible changes, only programs compiled against the new library are affected; programs linked before the new library was installed continue to use the previous one. Such a system is called a shared library system.
Implementation of dynamic link under Linux
The libraries we use when programming in Linux (libc, Qt, and so on) all come in two versions, a dynamic-link library and a static-link library, and when gcc compiles and links without the -static option it uses the system's dynamic libraries by default. Most books give only a general introduction to how dynamic libraries work; here the author shows the reader how this technology is implemented under Linux, using disassembled code from an actual system.
Here is the simplest C program hello.c:
#include <stdio.h>
int main ()
{
printf ("Hello, world\n");
return 0;
}
Under Linux we can compile it into an executable file a.out using GCC:

$ gcc hello.c
The program uses printf, which is located in the standard C library. If you compile with gcc without -static, the default is to link against libc.so, that is, to link the standard C library dynamically. In gdb, you can see that the compiled printf corresponds to the following code:
$ gdb -q a.out
(gdb) disassemble printf
Dump of assembler code for function printf:
0x8048310 <printf>:     jmp    *0x80495a4
0x8048316 <printf+6>:   push   $0x18
0x804831b <printf+11>:  jmp    0x80482d0 <_init+48>
This is the stub that the books, and the discussion above, mention; it is obviously not the real printf function. The job of this stub code is to find the real printf in libc.so.
(gdb) x/w 0x80495a4
0x80495a4 <_GLOBAL_OFFSET_TABLE_+24>: 0x08048316
You can see that the word at 0x80495a4 holds 0x08048316, which is precisely the address of the push $0x18 instruction, so the first jmp accomplishes nothing; it behaves like the no-op instruction nop. That, of course, is only true the first time we call printf; its real role shows up in later calls to printf. The target of the second jmp instruction lies in the PLT, the Procedure Linkage Table, whose contents can be viewed with the objdump command. We are interested in the following two instructions, which affect the program's control flow: