Abstract: This article introduces the advantages of dynamic link libraries and describes in detail how the compiler, linker, and loader of a Linux system on the x86 architecture use several kinds of relocation to implement them.
Key words: dynamic link library; Linux; relocation
The implementation mechanism of DLL under Linux
[Abstract] In this paper we discuss the advantages of dynamic linking. We also show in detail how the compiler, linker, and loader implement this feature by using several kinds of relocation on current Linux systems, especially on the x86 architecture.
[Keywords] dynamic link library; DLL; Linux; relocation
The concept of a dynamic link library is similar on Linux and Windows, but the implementation mechanism differs. Linux introduces the concepts of the GOT and the PLT and combines them with several kinds of relocation to achieve position-independent code ("floating code"), which gives better sharing. This article discusses these techniques in detail, one by one.
This article focuses on the x86 architecture because (1) of the architectures on which Linux runs, x86 is the most widespread, and (2) the Windows operating system on this architecture is widely known, which makes the corresponding Linux concepts easy to understand by analogy.
The following table lists corresponding Windows and Linux terms:
Windows                        Linux
Dynamic link library (DLL)     shared object
Object file (.OBJ)             object file (name usually ends in .o)
Executable file (.EXE)         executable (no specific file-name suffix)
Linker (link.exe)              link editor (ld)
Loader (exec/loader)           dynamic linker (ld-linux.so)
Segment                        section
Some terms have specific meanings in this article and need to be defined:
Compilation unit: a C source file. Compiling it produces one object file.
Module: a dynamic link library or an executable file (short for "running module").
Automatic variables: objects of the C auto storage class, that is, ordinary local variables.
Static variables and functions: objects declared with the C static keyword.
Global variables and functions: objects with the C extern storage class, the default at file scope.
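The terms above can be illustrated with a small compilation unit. This is only a sketch; the names call_count, running_total, twice, record, and calls_so_far are invented for the example and do not come from any library.

```c
/* One compilation unit: this file would compile to one object file. */

static int call_count;   /* static variable: visible only in this unit */
int running_total;       /* global variable: visible to other units */

/* Static function: can be called only from this compilation unit. */
static int twice(int x)
{
    return 2 * x;
}

/* Global function: other compilation units and modules may call it. */
int record(int x)
{
    int scaled = twice(x);   /* automatic variable: lives on the stack */

    call_count++;
    running_total += scaled;
    return running_total;
}

/* Global function that exposes the static counter, for illustration. */
int calls_so_far(void)
{
    return call_count;
}
```

Calling record(2) and then record(3) leaves running_total at 10. Only running_total, record, and calls_so_far would be candidates for the dynamic symbol handling discussed later; call_count and twice never leave the compilation unit.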
1 Advantages of dynamic link libraries
Programming generally involves several steps: editing, compiling, linking, loading, and running. Common code that needs to be reused is compiled into object files in advance and stored in a "library". When the library is linked with the object files of a user program, the linker selects the code the program needs from the library and copies it into the generated executable. Such a library is called a static library; its defining feature is that the executable contains a complete copy of the library code it uses. Obviously, when a static library is used by several programs, redundant copies exist both on disk and in memory.
A dynamic link library overcomes this defect. When it is linked with the object files of a user program, the linker only records a tag stating that the program needs this library at run time; it does not copy the library code into the executable. Only when the executable runs does the loader use this tag to check whether the library has already been loaded into memory by another executable. If it is already in memory, it does not need to be loaded from disk again, and the existing copy is simply shared. In this way there is only ever one copy of the code on disk and in memory, which is the advantage over a static library.
2 An important feature of Linux dynamic link libraries: position-independent code
On Windows, a base address must be specified when a dynamic link library is built. When an application runs, the library is loaded at this address if possible. If the address is already occupied, the library has to be loaded elsewhere in the address space, and the code and data in the library must then be fixed up, that is, relocated. After such relocation the instances of the library in different processes differ from one another and can no longer be shared. To avoid this defect, the Windows system libraries are all assigned non-overlapping addresses. Products of other software vendors, however, inevitably end up using overlapping addresses, so part of the benefit of dynamic link libraries is lost.
To share better, Linux adopts a different strategy: position-independent code (PIC), called "floating code" in this article. Specifically, control-transfer instructions use offsets relative to the current program counter (IP), and the addresses of variables and functions referenced in the code are offsets relative to some base address. In short, the code never references an absolute address. As a result, wherever in the address space the library is loaded, the code works correctly without being fixed up. Since there is only one copy of the code, it is easy to share.
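The idea of "never reference an absolute address" can be sketched in plain C. The structure module_image and the functions make_image and module_greeting are hypothetical stand-ins for a loaded module; they only model base-plus-offset addressing and are not how the toolchain represents a library.

```c
#include <stddef.h>

/* Hypothetical "module image": every internal reference is stored as an
 * offset from the start of the image, never as an absolute address. */
struct module_image {
    size_t greeting_offset;   /* offset of the greeting from the image start */
    char   data[24];
};

/* Build the image once, the way a linker computes offsets once. */
struct module_image make_image(void)
{
    struct module_image img = { offsetof(struct module_image, data), "hello" };
    return img;
}

/* Position-independent access: address = load base + fixed offset. */
const char *module_greeting(const struct module_image *base)
{
    return (const char *)base + base->greeting_offset;
}
```

A byte-for-byte copy of the image placed at a different address still yields the right string from module_greeting, because nothing inside the image depends on where it was placed. Had the image stored a pointer instead of an offset, each copy would need its own fix-up.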
Note that sharing here means that the code segment and the read-only data segment of a dynamic link library have a single image in memory. Another common meaning of "shared data segment" is a storage area (possibly dynamically allocated) that several processes can read and write in order to implement inter-process communication (IPC). That second meaning of sharing is outside the scope of this article.
3 Implementation mechanism of Linux dynamic link libraries: relocation
3.1 Relocation overview
Position-independent code is implemented through relocation. Relocations can be classified by several criteria:
-- By the location where the relocation occurs: code segment (.text) relocation and data segment (.data) relocation.
-- By the time at which it occurs: link-time relocation and load-time relocation (the latter is also called dynamic relocation). The two steps are not always both needed. For example, position-independent code must not contain dynamic relocations in the code segment; the solution is to move the items that need dynamic relocation into the data segment and reference them from the code segment.
-- By the object that the relocation refers to: data references and function references. If static data or a static function is referenced, the linker optimizes the generated code and removes the dynamic relocation.
-- By name: Linux on the x86 architecture uses several relocation types, named with the prefix "R_386_" followed by one of 32, GOT32, PLT32, COPY, GLOB_DAT, JMP_SLOT, RELATIVE, GOTOFF, GOTPC. Each type has a specific meaning.
The most important of these classifications is by location, and the following sections use it as the main thread to introduce the relocation types one by one. First, two key concepts are introduced: the GOT and the PLT.
3.2 The GOT
Each entry in the GOT (Global Offset Table) holds the address of a global variable or function referenced by this module. The GOT can be used to reference global variables and functions indirectly, or its base address can serve as a reference point from which static variables and functions are addressed by offset.
Because the loader does not load a module at a fixed address, its absolute address and its position relative to other modules differ between the address spaces of different processes. This difference is reflected in the GOT: every module in every process has its own GOT, so GOTs cannot be shared between processes.
On the x86 architecture, the base address of the GOT of the current module is always kept in the %ebx register. The compiler generates a short code sequence at function entry to initialize %ebx. This step is necessary: if the function is called from another module, %ebx still holds the GOT address of the caller's module, and using it to reference global variables and functions would of course go wrong.
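The split between shared code and a per-process GOT can be modelled in C as follows. This is a simplified sketch: struct got, GOT_COUNTER, and increment_counter are invented names, and the explicit got parameter plays the role that %ebx plays in real code.

```c
/* Index of the single entry in this illustrative GOT. */
enum { GOT_COUNTER, GOT_ENTRIES };

/* Per-process table of absolute addresses, filled in by the "loader". */
struct got {
    void *entry[GOT_ENTRIES];
};

/* Shared code: identical for every process. It reaches the global
 * variable only through the GOT it is handed. */
int increment_counter(const struct got *got)
{
    int *counter = got->entry[GOT_COUNTER];
    return ++*counter;
}
```

Two tables that point at two different counters make the same increment_counter code update different variables. This mirrors two processes sharing one code image while each keeps a private GOT.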
3.3 The PLT
Each entry in the PLT (Procedure Linkage Table) is a small piece of code. Taking a call to a function fun as an example, its PLT entry looks like this:
.PLTfun: jmp *fun@GOT(%ebx)
         pushl $offset
         jmp .PLT0@PC
The GOT entry referenced here is initialized by the loader with the address of the next instruction (the pushl), so at first the jmp behaves like a no-op (NOP).
When the program calls fun directly, compiling and linking produce the instruction call fun@PLT. This is a relative jump (which satisfies the requirement of position-independent code) to .PLTfun. If this is the first time the module calls the function, the jmp there acts as a no-op and execution continues to the jump to .PLT0. That PLT entry is reserved for additional generated code that leads the control flow into the loader. The loader computes the actual entry address of fun and stores it in the fun@GOT entry. The process is illustrated below:
User program
--------------
call fun@PLT
      |
      v
DLL PLT table                       Loader
---------------------------------------------------
fun:  <-- jmp *fun@GOT      -->     change GOT entry from
               |                    $loader to $fun,
               v                    then jump to there
GOT table
--------------
fun@GOT: $loader
After the first call, the GOT entry points to the real entry of the function. Later calls still jump to the PLT entry, but they no longer enter the loader and go straight to the function. In terms of performance, the extra work in the loader is needed only on the first call, which is entirely tolerable. Note also that relative jumps need no relocation at load time, so the whole code segment can be shared between processes.
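The lazy binding just described can be imitated with a function pointer that initially leads to a resolver. This is a sketch of the idea only: fun_got_entry, fun_resolver, call_fun_via_plt, and resolver_call_count are hypothetical names, and the real PLT works with jumps and a pushed offset rather than C calls.

```c
static int resolver_calls;   /* how often the stand-in "loader" ran */

/* The real function that callers want to reach. */
static int fun(int x)
{
    return x * 2;
}

static int fun_resolver(int x);

/* Stand-in for the fun@GOT entry: at first it leads to the resolver. */
static int (*fun_got_entry)(int) = fun_resolver;

/* Stand-in for the loader: patch the GOT entry, then finish the call. */
static int fun_resolver(int x)
{
    resolver_calls++;
    fun_got_entry = fun;
    return fun(x);
}

/* Stand-in for "call fun@PLT": always an indirect jump through the GOT. */
int call_fun_via_plt(int x)
{
    return fun_got_entry(x);
}

/* Reports how many times the resolver ran, for illustration. */
int resolver_call_count(void)
{
    return resolver_calls;
}
```

The first call_fun_via_plt call passes through fun_resolver and patches the pointer. Every later call goes directly to fun, so resolver_call_count stays at 1.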
Programmers familiar with Windows will notice that the GOT and the PLT are similar to the Windows import table. Other correspondences include Linux version scripts and Windows .DEF files, and the Linux dynamic symbol section and the Windows export table. Further examples are omitted.
3.4 Code segment relocation
Note that position-independent code requires that the code segment contain no relocation entries. The term "code segment relocation" is merely borrowed here; the actual relocation entries still live in the GOT, in the data segment. Even so, the difference from the data segment relocation of section 3.5 is obvious.
a) Loading the GOT base address
To use the GOT, its base address must be known first, but this address varies with the address at which the module is loaded. Linux uses a trick to obtain the correct GOT base address at run time. The code fragment is shown below, followed by the relocation types in the corresponding object file (.o) and dynamic link library (.so):
        call L1
L1:     popl %ebx
        addl $GOT+[.-L1], %ebx
        .o:  R_386_GOTPC
        .so: NULL
As mentioned above, this fragment appears at the entry of every function. The first instruction pushes the current value of the program counter (IP) onto the stack, and the second pops it into %ebx. The effect is that of movl %eip, %ebx, which cannot be written directly because the x86 instruction set does not allow %eip as an operand. The third instruction then adds to %ebx the difference between the GOT address and the current IP. This difference is a constant that does not depend on the load address of the dynamic link library, so it can be computed at link time. In C-like notation the whole process is:
%ebx = %eip;
%ebx += ($GOT - %eip);
At this point %ebx equals the GOT base address.
This is the result of cooperation between the compiler and the linker. When the compiler generates the object file, the GOT does not exist yet (each module has one GOT and one PLT, both generated by the linker), so the compiler cannot compute the difference between the GOT and the current IP. It only attaches an R_386_GOTPC relocation to the third instruction. At link time the linker notices the GOTPC relocation, computes the difference between the GOT and the IP at that point, and stores it as the immediate operand of the addl instruction. No further relocation is needed at load time.
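The arithmetic behind GOTPC can be checked with a small C sketch. The function names and the example offsets are assumptions made for illustration, and the sketch ignores the small [.-L1] adjustment between the label and the addl operand.

```c
#include <stdint.h>

/* Link time: the constant stored in the addl instruction. Both arguments
 * are offsets from the start of the module, so the load address does not
 * appear in the result. */
uint32_t gotpc_addend(uint32_t got_offset, uint32_t pc_offset)
{
    return got_offset - pc_offset;
}

/* Run time: the value in %ebx after "popl %ebx" and "addl $addend, %ebx". */
uint32_t ebx_after_prologue(uint32_t load_base, uint32_t pc_offset,
                            uint32_t addend)
{
    uint32_t eip = load_base + pc_offset;   /* value popped into %ebx */
    return eip + addend;
}
```

With a hypothetical GOT at offset 0x2000 and the label at offset 0x450, the same addend gives load_base + 0x2000 for any load_base. That is why the linker can fix the operand once and the loader never has to touch the instruction.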
b) Referencing the address of a variable or function
References to static variables, static functions, or string constants use the R_386_GOTOFF relocation, which is very similar to GOTPC. Again the compiler first places a relocation in the object file, and the linker then computes the difference between the address of the referenced element and the GOT base address and uses it as the displacement in the base-plus-displacement operand of the leal instruction. The code fragment is:
        leal .LC1@GOTOFF(%ebx), %eax
        .o:  R_386_GOTOFF
        .so: NULL
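The GOTOFF computation follows the same pattern and can be sketched in C. The function names and offsets are hypothetical; the point is only that the displacement is the symbol's offset minus the GOT's offset, both measured from the start of the module.

```c
#include <stdint.h>

/* Link time: displacement stored in the leal instruction. Unsigned
 * wraparound keeps this correct when the symbol lies below the GOT. */
uint32_t gotoff_displacement(uint32_t symbol_offset, uint32_t got_offset)
{
    return symbol_offset - got_offset;
}

/* Run time: effect of "leal disp(%ebx), %eax" with %ebx = GOT address. */
uint32_t leal_result(uint32_t ebx, uint32_t displacement)
{
    return ebx + displacement;
}
```

For a string constant at offset 0x800 and a GOT at offset 0x2000, leal_result yields load_base + 0x800 whatever the load address is, so no entry is needed in the .so file.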
For a reference to a global variable or global function, the compiler places an R_386_GOT32 relocation in the object file. The linker reserves an entry in the GOT and marks it with an R_386_GLOB_DAT relocation so that the loader fills in the actual address of the referenced element. The linker also computes the offset of the reserved entry within the GOT and uses it as the displacement in the operand of the movl instruction. The code fragment is:
        movl x@GOT(%ebx), %eax
        .o:  R_386_GOT32
        .so: R_386_GLOB_DAT
Note that for a global function, the value read from the GOT is not the function's actual entry address but the address of its PLT entry, .PLTfun (see section 3.3). Thus, whether the function is called directly or its address is taken first and called later, control flow passes through the PLT and on to the loader, which uses this opportunity to perform dynamic linking.
c) Calling a function directly
As mentioned above, function calls in position-independent code are compiled into relative jumps. The compiler first places an R_386_PLT32 relocation in the object file; what the linker does next differs for static and global functions.
A call to a static function must come from the same module, so the offset from the call site to the function entry can be computed at link time. It becomes the operand of the call instruction, an offset relative to the current IP, and the call goes straight to the function entry. The code fragment is:
        call f@PLT
        .o:  R_386_PLT32
        .so: NULL
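The link-time computation for a call within one module can be sketched in C. The names are invented for the example. The sketch relies on the x86 rule that a near call is five bytes long (one opcode byte plus a 32-bit displacement) and that the displacement is relative to the address of the following instruction.

```c
#include <stdint.h>

enum { CALL_INSN_LENGTH = 5 };   /* opcode byte plus 32-bit displacement */

/* Link time: displacement stored in the call instruction. Both offsets
 * are measured from the start of the module. */
int32_t call_displacement(uint32_t call_offset, uint32_t target_offset)
{
    return (int32_t)(target_offset - (call_offset + CALL_INSN_LENGTH));
}

/* Run time: where the CPU lands, given the module's load address. */
uint32_t call_destination(uint32_t load_base, uint32_t call_offset,
                          int32_t displacement)
{
    uint32_t next_insn = load_base + call_offset + CALL_INSN_LENGTH;
    return next_insn + (uint32_t)displacement;
}
```

For a call at offset 0x100 to a function at offset 0x40, the displacement is the same for every load address and the destination is always load_base + 0x40. The instruction bytes therefore never change at load time.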
For a global function, the linker generates a relative jump to .PLTfun. As described in section 3.3, the first call of the function passes control to the loader, which computes the function's entry address and fills in the fun@GOT entry. This is the R_386_JMP_SLOT relocation. The code fragment is:
        call f@PLT
        .o:  R_386_PLT32
        .so: R_386_JMP_SLOT
As a result, a global function may have up to two relocation entries. One is the mandatory JMP_SLOT entry, which the loader points at the real function entry. The other is a GLOB_DAT entry, which the loader points at the code fragment in the PLT. Taking the address of the function always yields the value of the GLOB_DAT entry, that is, a pointer to .PLTfun rather than to the real function entry.
Consider a further problem: two dynamic link libraries each take the address of the same global function, and the two results are compared. From the discussion above, neither result points to the real entry of the function; each points into the PLT of its own library, that is, into two different PLTs. A naive comparison would conclude "not equal", which is obviously wrong, so this case needs special handling.
3.5 Data segment relocation
Relocation in the data segment concerns the initialization of static and global variables of pointer type. Compared with code segment relocation, there are at least the following clear differences: 1. It takes place before the user program gains control (before main starts executing). 2. It does not address indirectly through the GOT, because at that time %ebx does not yet hold the GOT base address. 3. It modifies the data segment directly, whereas code segment relocation must not modify the code segment.
If a static variable, static function, or string constant is referenced, the compiler places an R_386_32 relocation in the object file and computes the offset of the referenced variable or function from the start of its section. The linker changes it into an R_386_RELATIVE relocation and computes the offset from the base address of the dynamic link library (usually zero at link time). The loader adds the real load address of the module (non-zero) to this offset and uses the result to initialize the pointer variable. The code fragment is:
        .section .rodata
.LC0:   .string "OK\n"
        .data
p:      .long .LC0
        .o:  R_386_32 w/section
        .so: R_386_RELATIVE
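The loader's part of a RELATIVE relocation can be sketched as a loop over the words to patch. The record type relative_reloc and the function apply_relative_relocs are simplified, hypothetical stand-ins and not the ELF structures themselves. The sketch assumes each word already holds an offset from the start of the module.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified relocation record: only the offset of the
 * 32-bit word to patch, measured from the start of the module image. */
struct relative_reloc {
    uint32_t offset;
};

/* Loader step: each word holds an offset from the module start; adding
 * the load address turns it into a usable pointer value. */
void apply_relative_relocs(uint8_t *image, size_t image_size,
                           uint32_t load_base,
                           const struct relative_reloc *relocs, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        uint32_t word;

        /* Refuse to patch outside the image. */
        assert((size_t)relocs[i].offset + sizeof word <= image_size);
        memcpy(&word, image + relocs[i].offset, sizeof word);
        word += load_base;
        memcpy(image + relocs[i].offset, &word, sizeof word);
    }
}
```

If the pointer p sits at offset 8 and initially holds 4 (the assumed offset of .LC0), loading the module at 0x40000000 leaves 0x40000004 in p. The memcpy calls avoid unaligned access and aliasing problems when the image is a plain byte buffer.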
If a global variable or function is referenced, the compiler also places an R_386_32 relocation and records the name of the referenced symbol. The linker has nothing to do. Finally, the loader looks up the referenced symbol and uses the result to initialize the pointer variable. For a global function, the lookup result is again the function's code fragment in the PLT rather than its real entry, consistent with the earlier discussion of global function references. The code fragment is:
        .data
p:      .long printf
        .o:  R_386_32 w/symbol
        .so: R_386_32 w/symbol
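The symbol-based case can be sketched as a lookup by name followed by a store. The types and functions below are hypothetical and the addresses used with them are made up. A real loader searches hashed symbol tables across all loaded modules, whereas this sketch scans one small array.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical symbol table entry as a loader might see it. */
struct symbol {
    const char *name;
    uint32_t    address;   /* absolute address in this process */
};

/* Look the name up. Returns 0 and stores the address on success,
 * or returns -1 if the symbol is undefined. */
int resolve_symbol(const struct symbol *table, size_t count,
                   const char *name, uint32_t *address_out)
{
    for (size_t i = 0; i < count; i++) {
        if (strcmp(table[i].name, name) == 0) {
            *address_out = table[i].address;
            return 0;
        }
    }
    return -1;
}

/* Loader step for a symbol-based 32-bit relocation: add the symbol's
 * address to the addend already stored in the pointer variable. */
int apply_symbol_reloc(uint32_t *pointer_slot, const struct symbol *table,
                       size_t count, const char *name)
{
    uint32_t address;

    if (resolve_symbol(table, count, name, &address) != 0)
        return -1;
    *pointer_slot += address;
    return 0;
}
```

With a table that maps "printf" to an assumed address, a pointer slot that starts at zero ends up holding that address. A failed lookup leaves the slot unchanged.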
3.6 Summary
The following table summarizes all the results discussed above:
                                                           .o                  .so
---------------------------------------------------------------------------------------------
Code segment | Load GOT base address                       R_386_GOTPC         NULL
relocation   |-------------------------------------------------------------------------------
             | Reference variable/function address static  R_386_GOTOFF        NULL
             |                                     global  R_386_GOT32         R_386_GLOB_DAT
             |-------------------------------------------------------------------------------
             | Call function directly              static  R_386_PLT32         NULL
             |                                     global  R_386_PLT32         R_386_JMP_SLOT
-------------|-------------------------------------------------------------------------------
Data segment | Reference variable/function address static  R_386_32 w/section  R_386_RELATIVE
relocation   |                                     global  R_386_32 w/symbol   R_386_32 w/symbol
---------------------------------------------------------------------------------------------
4 Conclusion
Windows uses the PE file format while Linux uses the ELF file format, and this is the root cause of the differences between their dynamic link libraries. Starting from the ELF specification, this article has discussed the implementation of Linux dynamic link libraries in depth, with the aim of further promoting the study and application of Linux.
5 Appendix: Linux assembler syntax
The Linux assembler on the x86 architecture uses the syntax of the AT&T System V/386 assembler, which differs considerably from Intel syntax, as the following table shows:
                                 AT&T                            Intel
Constant                         prefix $: pushl $4              no prefix: push 4
Register                         prefix %: %ebx                  no prefix: ebx
Jump (absolute address)          prefix *: jmp *fun
Jump (relative offset)           no prefix: jmp fun
Operand order                    source first: movl $4, %eax     destination first: mov eax, 4
Operand size                     suffix b, w, l: movl            modifier byte ptr, etc.
Base-plus-displacement address   disp(base)                      [base + disp]
References
[1] Executable and Linking Format Spec v1.2, TIS Committee, 1995
http://x86.ddj.com/ftp/manuals/tools/elf.pdf
[2] GNU Project (GCC, libc, binutils), Free Software Foundation, Inc., 1999
http://www.gnu.org/software/
[3] Solaris 2.5 Linker and Libraries Guide, Sun Microsystems Inc., 1999
http://docs.sun.com/
ftp://192.18.99.138/802-1955/802-1955.pdf
[4] SVR4 ABI x86 Supplement, The Santa Cruz Operation, Inc., 1999
http://www.sco.com/developer/devspecs/abx86-4.pdf
[5] ELF: From the Programmer's Perspective, H. J. Lu, 1995
http://metalab.unc.edu/pub/Linux/GCC/elf.ps.gz
[6] Using ld: the GNU linker, S. Chamberlain, Cygnus Support, 1994
http://www.gnu.org/manual/ld-2.9.1/ps/ld.ps.gz