Http://zhiwei.li/text/2009/04/elf%E7%9A%84got%E5%92%8Cplt%E4%BB%A5%E5%8F%8Apic/
Shared libraries in ELF format use position-independent code (PIC) so that code and data references do not depend on the load address, and programs can be loaded at any location in the address space. PIC does not use absolute addresses in jump and branch instructions. Instead, it sets up a global offset table (GOT) in the data segment of the ELF executable image that holds the addresses of the global variables and functions the module references.
Global variables and global functions defined outside the module are addressed indirectly: the GOT entry holds the address. Static variables and static functions inside the module are referenced by their offset from the start of the GOT, which serves as the base. This works because, wherever the program is loaded, the distance between the module's static variables and functions and the GOT is fixed and is already known at link time. In this way PIC uses the GOT to obtain the absolute addresses of variables and functions, turning position-independent references into absolute locations.
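The two addressing styles can be imitated in plain C. The following is only a sketch under stated assumptions: `struct module_image`, `local_address`, and `read_external` are hypothetical names, and the struct stands in for a module whose GOT and static data sit at a fixed distance from each other.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical module image: a GOT followed by one module-local variable. */
struct module_image {
    int *got[2];      /* GOT slots: absolute addresses, filled in at load time */
    int  static_var;  /* module-local data at a fixed distance from the GOT */
};

/* Link-time constant: distance from the GOT base to static_var. */
#define STATIC_VAR_GOTOFF \
    (offsetof(struct module_image, static_var) - \
     offsetof(struct module_image, got))

/* Module-local access: GOT base plus a constant offset, no table lookup. */
static int *local_address(struct module_image *m)
{
    char *got_base = (char *)m->got;
    return (int *)(got_base + STATIC_VAR_GOTOFF);
}

/* External access: load the absolute address from a GOT slot, then
   dereference it. */
static int read_external(const struct module_image *m, size_t slot)
{
    assert(m->got[slot] != NULL);
    return *m->got[slot];
}
```

Copying the struct to another address models loading the module somewhere else: `local_address` still finds `static_var`, because only the GOT base changed and the offset did not.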
PIC code has no relocation entries in the code segment; the actual relocation entries are confined to the data segment, where the GOT lives. The relocation types found in a shared object file are R_386_RELATIVE, R_386_GLOB_DAT, and R_386_JMP_SLOT. When the dynamic linker maps a shared library or module, they are used to relocate pointer-typed static data, the addresses of global variable symbols, and the addresses of global function symbols, respectively.
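As a rough illustration of what each of the three relocation types computes, here is a minimal sketch. The type numbers match the i386 System V ABI supplement; the function `reloc_value` and its parameter names are hypothetical and are not taken from any dynamic linker.

```c
#include <assert.h>
#include <stdint.h>

/* i386 relocation type numbers from the System V ABI supplement,
   redefined here so the sketch builds without <elf.h>. */
enum { R_386_GLOB_DAT = 6, R_386_JMP_SLOT = 7, R_386_RELATIVE = 8 };

/* Value to store at the relocated location.
   load_bias: difference between actual and link-time address of the module
   sym_addr:  resolved run-time address of the symbol (unused for RELATIVE)
   addend:    the word already stored at the location (used for RELATIVE) */
static uint32_t reloc_value(int type, uint32_t load_bias,
                            uint32_t sym_addr, uint32_t addend)
{
    switch (type) {
    case R_386_RELATIVE:            /* pointer to module-local data: B + A */
        return load_bias + addend;
    case R_386_GLOB_DAT:            /* GOT slot for a global variable: S */
    case R_386_JMP_SLOT:            /* GOT slot for a global function: S.
                                       Under lazy binding the slot is only
                                       rebased at load time and gets S on
                                       the first call. */
        return sym_addr;
    default:
        assert(!"relocation type not covered by this sketch");
        return 0;
    }
}
```

For example, a module linked at address 0 and mapped at 0x40000000 turns a stored pointer 0x1234 into 0x40001234 under R_386_RELATIVE.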
1.2 PLT table
The procedure linkage table (PLT) redirects position-independent function calls to absolute locations. Programs that are dynamically linked through the PLT support lazy binding. Every dynamically linked program and every shared library has its own PLT. Each PLT entry is a small piece of code that corresponds to one global function referenced by the running module. A call to such a function is turned into a call to its PLT entry.
Each PLT entry corresponds to one GOT entry, and executing the PLT entry jumps to the address stored in that GOT entry. The initial value of the GOT entry is the address of the push instruction in PLTn (the instruction right after the jmp), so the first jump has no effect and simply falls through. Once the symbol has been resolved, the GOT entry holds the symbol's real address. When mapping a shared library, the dynamic linker sets two special values in the GOT: it stores the address of the library's mapping information structure, link_map, at GOT+4 (that is, GOT[1]), and the address of the dynamic linker's symbol resolution function, _dl_runtime_resolve, at GOT+8 (that is, GOT[2]).
PLT0 is a special entry used to enter the dynamic linker. The first call to a function is forwarded to PLT0, which finally jumps to the address stored in GOT[2] to run the symbol resolution function. When resolution finishes, the symbol's actual address is saved in the corresponding GOT entry, so later calls jump straight to the real function without running the resolver again.
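Lazy binding can be imitated with an ordinary function pointer. The sketch below is not how the PLT is implemented (the real thing is a few machine instructions per entry); all names in it (`got_square`, `lazy_stub`, `resolve_square`, `call_square`) are hypothetical and only mirror the roles described above.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*func_t)(int);

static int resolve_count;          /* how many times the resolver ran */

static int real_square(int x) { return x * x; }

static int lazy_stub(int x);       /* plays the role of PLTn's push + jmp PLT0 */

/* One GOT slot; it starts out pointing back into the "PLT". */
static func_t got_square = lazy_stub;

/* Stand-in for _dl_runtime_resolve: find the symbol, patch the GOT slot. */
static func_t resolve_square(void)
{
    resolve_count++;
    got_square = real_square;
    return got_square;
}

static int lazy_stub(int x)
{
    /* Resolve first, then transfer to the real function. */
    return resolve_square()(x);
}

/* What a call site does: an indirect jump through the GOT slot. */
static int call_square(int x)
{
    assert(got_square != NULL);
    return got_square(x);
}
```

The first `call_square` goes through the stub and runs the resolver once; every later call reaches `real_square` directly through the patched slot.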
When the operating system runs the program, it first maps the interpreter, that is, the dynamic linker ld.so, at a suitable address and then starts it. ld.so first initializes itself, then looks up each required library by the path name given in the executable's table of dynamic library dependencies, and loads it into memory.
Linux manages and controls the loading of all dynamic libraries through a global linked list of library mapping information structures, struct link_map. Loading a dynamic library essentially means mapping the library file into memory, filling in its mapping information structure, and adding that structure to the linked list. struct link_map describes how a shared object file is mapped. It is used internally by the dynamic linker at run time and keeps track of the loaded libraries and the symbols in them.
link_map uses the doubly linked list pointers l_next and l_prev to chain together all shared libraries loaded in the process. When the dynamic linker needs to look up a symbol, it can walk the list forward or backward and search each library on the list in turn. The head of the link_map list is reachable from the second entry (GOT[1]) of the global offset table of each executable image: to look up a symbol, the linker reads the link_map node address from GOT[1] and searches along the link_map nodes.
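A forward walk over such a list might look like the sketch below. `struct mini_link_map` is a hypothetical, heavily trimmed stand-in for glibc's struct link_map: only l_addr, l_name, l_next, and l_prev correspond to real fields, while the flat `syms` array is an assumption that replaces the real hash and symbol tables.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified symbol: name plus link-time value (hypothetical layout). */
struct sym { const char *name; unsigned long value; };

struct mini_link_map {
    unsigned long l_addr;                   /* load bias of this object */
    const char *l_name;                     /* file name of the object */
    struct mini_link_map *l_next, *l_prev;  /* chain of loaded objects */
    const struct sym *syms;                 /* simplified symbol table */
    size_t nsyms;
};

/* Walk the chain from `head` and return the run-time address of the first
   definition of `name`, or 0 if no loaded object defines it. */
static unsigned long lookup_symbol(const struct mini_link_map *head,
                                   const char *name)
{
    assert(name != NULL);
    for (const struct mini_link_map *m = head; m != NULL; m = m->l_next)
        for (size_t i = 0; i < m->nsyms; i++)
            if (strcmp(m->syms[i].name, name) == 0)
                return m->l_addr + m->syms[i].value;
    return 0;
}
```

The search order follows the list order, so an earlier object's definition wins over a later one with the same name.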
Mapping a dynamic library takes three steps:
(1) The dynamic linker calls __mmap to map all PT_LOAD segments of the dynamic library as a whole:
l->l_map_start = (ElfW(Addr)) __mmap((void *) 0, maplength, prot,
                                     MAP_COPY | MAP_FILE, fd, mapoff);
The return value l->l_map_start is the virtual address of the actual mapping. It need not equal the virtual address requested by the segment header member p_vaddr. This does not affect position-independent code, but the location information for the data segment and in the link_map structure must be corrected. l->l_addr is the difference between the actual mapping address and the originally requested one. It is used to correct all other location information: adding l_addr to an original virtual address yields the virtual address actually loaded.
(2) Once the shared file is mapped, the dynamic linker processes the library's PT_DYNAMIC segment and records the addresses of the dynamic linking information, such as the hash table, symbol table, string table, relocation table, and PLT relocation table, in the l_info array of link_map. l_info is one of the most important fields of link_map; almost everything related to dynamic link management goes through it. The dynamic linker also loads every library that the current shared library depends on.
(3) Because the actual mapping address may differ from the requested virtual address, the dynamic library and its dependencies must be relocated. First, the second and third GOT entries (GOT[1] and GOT[2]) of the library are set:
Elf32_Addr *got =
    (Elf32_Addr *) lmap->l_info[DT_PLTGOT]->d_un.d_ptr;
got[1] = (Elf32_Addr) lmap;                  /* link_map of this library */
got[2] = (Elf32_Addr) &_dl_runtime_resolve;  /* lazy-binding resolver */
Then all relocation entries of the dynamic library are processed, adding the correction value l_addr to the offset given in each entry. The dynamic entry DT_REL gives the address of the relocation table and DT_RELSZ gives its total size in bytes; dividing that by the entry size in DT_RELENT yields the number of relocation entries.
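The three steps can be sketched together on simulated data. This is a minimal illustration, not glibc code: `Dyn32`, `Rel32`, `struct mini_map`, `load_bias`, `fill_info`, and `relocate` are hypothetical names, the dynamic entry is flattened to a single `d_val` field, the tag and type numbers match the ELF specification, and `mem` is an array that plays the part of the mapped image. Only R_386_RELATIVE entries are applied.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Trimmed 32-bit ELF records and tag values (numbers as in the ELF
   specification), redefined so the sketch is self-contained. */
typedef struct { int32_t d_tag; uint32_t d_val; } Dyn32;
typedef struct { uint32_t r_offset; uint32_t r_info; } Rel32;
enum { DT_NULL = 0, DT_PLTGOT = 3, DT_REL = 17, DT_RELSZ = 18,
       DT_RELENT = 19, DT_NUM_SKETCH = 20, R_386_RELATIVE = 8 };

struct mini_map {
    uint32_t l_addr;                    /* step 1: actual minus requested */
    const Dyn32 *l_info[DT_NUM_SKETCH]; /* step 2: entries indexed by tag */
};

/* Step 1: the load bias is the actual mapping address minus p_vaddr. */
static uint32_t load_bias(uint32_t map_start, uint32_t p_vaddr)
{
    return map_start - p_vaddr;
}

/* Step 2: index the PT_DYNAMIC entries by tag for later lookups. */
static void fill_info(struct mini_map *m, const Dyn32 *dyn)
{
    for (; dyn->d_tag != DT_NULL; dyn++)
        if (dyn->d_tag > 0 && dyn->d_tag < DT_NUM_SKETCH)
            m->l_info[dyn->d_tag] = dyn;
}

/* Step 3: apply the R_386_RELATIVE entries and return how many were
   applied. `mem` simulates the mapped image; its first word lives at
   run-time address `map_start`. */
static size_t relocate(const struct mini_map *m, const Rel32 *rel,
                       uint32_t *mem, uint32_t map_start)
{
    assert(m->l_info[DT_RELSZ] && m->l_info[DT_RELENT]);
    /* Entry count = table size in bytes / size of one entry. */
    size_t count = m->l_info[DT_RELSZ]->d_val / m->l_info[DT_RELENT]->d_val;
    size_t applied = 0;
    for (size_t i = 0; i < count; i++) {
        if ((rel[i].r_info & 0xff) != R_386_RELATIVE)
            continue;                   /* symbol-based types omitted here */
        uint32_t where = rel[i].r_offset + m->l_addr;  /* corrected address */
        mem[(where - map_start) / 4] += m->l_addr;     /* rebase the pointer */
        applied++;
    }
    return applied;
}
```

With a segment linked at 0x1000 and mapped at 0x40001000, `load_bias` gives 0x40000000, and a stored pointer 0x1008 at link-time address 0x1004 becomes 0x40001008 after `relocate`.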
When mapping is complete, the dynamic linker calls the initialization functions provided by the shared library and by all of its dependencies.
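One common way for a library to provide such an initialization function is the `constructor` function attribute, a GCC and Clang extension that is not part of standard C. The sketch below assumes one of those compilers; `lib_ready`, `lib_init`, and `lib_answer` are hypothetical names.

```c
#include <assert.h>

static int lib_ready;   /* set by the initializer below */

/* GCC/Clang extension: a function marked `constructor` is run by the
   startup or dynamic loading code before main() is entered, or before
   dlopen() returns when the library is loaded at run time. */
static __attribute__((constructor)) void lib_init(void)
{
    lib_ready = 1;
}

/* Library entry point that relies on initialization having happened. */
int lib_answer(void)
{
    assert(lib_ready);
    return 42;
}
```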
The procedure linkage table lets an infected file call external functions. This is much better than modifying the LD_PRELOAD environment variable to redirect calls, first of all because no environment variable has to be modified.
Procedure linkage table (PLT)
In an ELF file, the global offset table (GOT) converts position-independent addresses into absolute addresses, and the procedure linkage table plays a similar role: it redirects position-independent function calls to absolute addresses. The link editor cannot resolve transfers of control, such as function calls, from one executable or shared object to another. It therefore arranges for the program to transfer control to entries in the procedure linkage table (PLT). On System V, procedure linkage tables reside in shared text, but they use addresses in the private global offset table. The dynamic linker (for example, ld-2.2.2.so) determines the absolute addresses of the destinations and modifies the in-memory image of the global offset table accordingly. The dynamic linker can thus redirect the entries without compromising the position independence or the shareability of the program text. Executable files and shared object files have separate procedure linkage tables.
ELF dynamic libraries are independent of their memory location: a library can be loaded anywhere in memory without affecting its behavior. This is called position independence. To build such a library, pass the -fPIC option to the compiler so that the generated object files are position independent and use as few absolute addresses as possible for variable references. When a library is compiled as position-independent code, the compiler reserves one register to point to the global offset table (GOT), which leaves one fewer register for code optimization. Even in the worst case, however, the performance loss is only about 3%, and in other cases it is far less.
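As a small illustration, here is a one-file library together with a typical GCC build command. The file name `counter.c`, the library name `libcounter.so`, and the identifiers are hypothetical; the comments describe how i386 PIC code usually reaches each kind of variable, which may differ on other architectures.

```c
/* counter.c: hypothetical one-file library.
 *
 * Build as position-independent code and link into a shared object:
 *     gcc -fPIC -shared -o libcounter.so counter.c
 */
#include <assert.h>

int counter_value;      /* exported global: reached through a GOT entry */
static int step = 1;    /* module-local: reached relative to the GOT base */

/* Advance the counter by `step` and return the new value. */
int counter_next(void)
{
    assert(step > 0);
    counter_value += step;
    return counter_value;
}
```

Compiling the same file with and without -fPIC and comparing the output of `gcc -S` is a simple way to see the extra GOT-relative addressing the option introduces.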