Dynamic Link Libraries in Linux and Their Implementation Mechanism

Source: Internet
Author: User
 
Abstract: This article introduces the advantages of dynamic link libraries and describes in detail how the compiler, linker, and loader on Linux for the x86 architecture implement them using several kinds of relocation.

Keywords: dynamic link library; Linux; relocation

The implementation mechanism of DLLs under Linux

[Abstract] In this paper, we discuss the advantages of dynamic linking. We also demonstrate in detail how the compiler, linker, and loader implement this feature using several kinds of relocations on current Linux systems, especially on x86 architectures.

[Keywords] dynamic link library; DLL; Linux; relocation

The concept of a dynamic link library is similar on Linux and Windows, but the implementation mechanisms differ. Linux introduces the GOT and PLT tables and uses several kinds of relocation entries to realize "floating code" (position-independent code) and so achieve better sharing. This article discusses these techniques one by one.

This article focuses on the x86 architecture because (1) of the architectures that run Linux, x86 is the most widespread; and (2) Windows on this architecture is widely known, which makes the corresponding Linux concepts easier to understand by analogy.

The following table lists corresponding Windows and Linux terms:

Windows                        Linux
Dynamic link library (DLL)     Shared object
Object file (.obj)             Object file (name usually ends in .o)
Executable file (.exe)         Executable (no specific file name suffix)
Linker (link.exe)              Link editor (ld)
Loader (exec/loader)           Dynamic linker (ld-linux.so)
Segment                        Section

Some terms have specific meanings in this article and need to be defined:

Compilation unit: a C source file; compiling it produces one object file.
Running module: a dynamic link library or an executable file; "module" for short.
Automatic variables and functions: objects declared with the C auto keyword.
Static variables and functions: objects declared with the C static keyword.
Global variables and functions: objects declared with the C extern keyword.

1 Advantages of dynamic linking

Programming generally involves several steps: editing, compiling, linking, loading, and running. Because some common code is used repeatedly, it is precompiled into object files and stored in a "library". When the library is linked with the object files of a user program, the linker selects the code the program needs from the library and copies it into the generated executable. This kind of library is called a static library; its defining feature is that the executable contains a complete copy of the library code it uses. Clearly, when a static library is used by several programs, redundant copies exist both on disk and in memory.

Dynamic link libraries overcome this defect. When such a library is linked with the object files of a user program, the linker only records a tag indicating that the program needs to link dynamically against the library, rather than copying the library code into the executable. Only when the executable runs does the loader use this tag to check whether the library has already been loaded into memory by another executable. If it has, it does not need to be loaded from disk again; the code already in memory is shared. In this way there is only ever one copy of the code on disk and in memory, which is the advantage over a static library.

2 An important feature of Linux dynamic link libraries: floating code

On Windows, a base address must be specified when a dynamic link library is built. When the application runs, the loader tries to load the library at this address. If the address is already occupied, the library has to be loaded elsewhere in the address space, and the code and data in the library must then be patched, which is called "relocation". As a result, instances of the same library that have been relocated to different addresses differ from each other in memory and can no longer be shared. To avoid this, the libraries that ship with Windows all specify base addresses that do not overlap. However, overlaps among the products of other software vendors are unavoidable, so the benefits of dynamic link libraries are partially lost.

Linux takes a different approach from Windows to achieve better sharing: position-independent code (PIC). Specifically, branch instructions use offsets relative to the current program counter (IP), and the addresses of variables and functions referenced in the code are expressed as offsets from a base address. In short, the code never references an absolute address. The dynamic link library therefore works correctly wherever it is loaded in the address space, and since there is only one version of the code, it is easy to share.

It is worth noting that "sharing" here means that multiple processes use a single in-memory image of the library's code segment and read-only data segment in order to save memory. Another common meaning of sharing, in which multiple processes read and write the same (possibly dynamically allocated) storage area to implement inter-process communication (IPC), is not relevant to this article.
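The "offset from a base address" idea can be illustrated with a small C sketch. It is only a simulation under stated assumptions: struct image and the image_* functions are hypothetical names invented here, and a plain memory copy stands in for loading a module at a different address.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A toy "module image": it never stores absolute pointers,
   only offsets relative to the image base. */
struct image {
    size_t msg_off;   /* offset of msg from the image base */
    char   msg[16];
};

/* Build the image once, recording the offset instead of an address. */
void image_init(struct image *img, const char *text)
{
    assert(strlen(text) < sizeof img->msg);
    img->msg_off = offsetof(struct image, msg);
    strcpy(img->msg, text);
}

/* "Load" the image at another address by plain copying; no fix-ups. */
void image_load(struct image *dst, const struct image *src)
{
    memcpy(dst, src, sizeof *dst);
}

/* Resolve the reference the way floating code does: base + offset. */
const char *image_msg(const struct image *base)
{
    return (const char *)base + base->msg_off;
}
```

Because only the offset is stored, a copy made with image_load resolves its message correctly at its new address without any patching, which is the property that lets one copy of the code serve every process.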

3 Implementation mechanism of Linux dynamic link libraries: relocation

3.1 Relocation overview

Floating code is implemented through relocation. Relocations can be classified by several criteria:

-- By location: relocation in the code segment (.text) and relocation in the data segment (.data).

-- By time: link-time relocation and load-time relocation (also called dynamic relocation). However, the two stages do not always both occur. For example, floating code must not be dynamically relocated in the code segment; the items that need dynamic relocation are therefore moved to the data segment and referenced from the code segment.

-- By the object referenced: data references and function references. If the referenced data or function is static, the linker optimizes the generated code and removes the dynamic relocation entry.

-- By name: Linux on x86 uses several relocation types, named with the prefix "R_386_" followed by 32, GOT32, PLT32, COPY, GLOB_DAT, JMP_SLOT, RELATIVE, GOTOFF, or GOTPC. Each type has a specific meaning.
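As a rough illustration of the "by name" and "by time" classifications, the sketch below lists the i386 relocation type numbers and decodes the packed r_info word of an ELF32 relocation entry. The R386_ enum names and the helper functions are hypothetical; the numeric values and the 24/8-bit split follow the ELF specification. The load-time grouping mirrors this article's discussion, with R_386_COPY (named above but not discussed further) treated as a load-time type.

```c
#include <assert.h>
#include <stdint.h>

/* i386 relocation type numbers from the ELF specification. */
enum r386_type {
    R386_32       = 1,
    R386_GOT32    = 3,
    R386_PLT32    = 4,
    R386_COPY     = 5,
    R386_GLOB_DAT = 6,
    R386_JMP_SLOT = 7,
    R386_RELATIVE = 8,
    R386_GOTOFF   = 9,
    R386_GOTPC    = 10
};

/* r_info packs the symbol index (upper 24 bits) and the type (low 8 bits). */
uint32_t make_r_info(uint32_t sym, uint32_t type)
{
    assert(sym < (1u << 24) && type <= 0xffu);
    return (sym << 8) | type;
}

uint32_t reloc_sym(uint32_t r_info)  { return r_info >> 8; }
uint32_t reloc_type(uint32_t r_info) { return r_info & 0xffu; }

/* 1 if the type may remain in a .so for the loader to process,
   0 if the linker resolves it completely. */
int is_load_time(uint32_t type)
{
    switch (type) {
    case R386_32:
    case R386_COPY:
    case R386_GLOB_DAT:
    case R386_JMP_SLOT:
    case R386_RELATIVE:
        return 1;
    default:
        return 0;
    }
}
```

For example, an entry built with make_r_info(5, R386_JMP_SLOT) decodes back to symbol index 5 and a load-time type, while R386_GOTPC is reported as link-time only.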
The most important of these classifications is by location, and the following sections use it as the main thread to introduce the relocation types one by one. First, two key concepts are introduced: the GOT and the PLT.

3.2 The GOT

Each entry in the GOT (global offset table) is the address of a global variable or function referenced by this running module. The GOT can be used to reference global variables and functions indirectly; its base address can also serve as a reference point, with static variables and functions addressed by their offset from it. Because the loader does not load a running module at a fixed address, the absolute addresses and relative positions of the running modules differ from one process's address space to another. This difference is captured in the GOT: each running module in each process has its own GOT, so the GOT cannot be shared between processes.

On x86, the base address of the running module's GOT is always kept in the %ebx register. The compiler generates a short code sequence at the entry of each function to initialize %ebx. This step is necessary: if the function is called from another running module, %ebx still holds the caller module's GOT address, and using it to reference global variables and functions without reinitializing it would of course go wrong.
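The split between shared code and a per-process GOT can be sketched in C as follows. This is a simplified model, not the real data layout: struct got, got_bind, and read_global_int are hypothetical names, and an explicit got_base parameter plays the role of %ebx.

```c
#include <assert.h>
#include <stddef.h>

enum { GOT_SLOTS = 4 };

/* Hypothetical per-process GOT: one address slot per global symbol. */
struct got {
    void *slot[GOT_SLOTS];
};

/* The "loader" fills a slot with this process's address for the symbol. */
void got_bind(struct got *got_base, size_t index, void *address)
{
    assert(index < GOT_SLOTS);
    got_base->slot[index] = address;
}

/* Shared "code": it knows only the slot index (a link-time constant),
   never the absolute address of the variable. */
int read_global_int(const struct got *got_base, size_t index)
{
    assert(index < GOT_SLOTS);
    return *(const int *)got_base->slot[index];
}
```

Two processes can run the same read_global_int code with different GOTs and reach different variables, which is why the code can be shared while the GOT cannot.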

3.3 The PLT

Each entry in the PLT (procedure linkage table) is a short code fragment corresponding to one global function referenced by this running module. Taking a call to the function fun as an example, the PLT code fragment is as follows:

.PLTfun: jmp *fun@GOT(%ebx)
         pushl $offset
         jmp .PLT0@PC

The referenced GOT entry is initialized by the loader to the address of the next instruction (pushl), so the jmp is initially equivalent to a no-op.

When the user program calls fun directly, the instruction call fun@PLT is generated. This is a relative jump (which meets the floating code requirement) to .PLTfun. If this is the first call to the function from this running module, the jmp there acts as a no-op, execution continues, and control then jumps to .PLT0. That PLT entry is reserved for extra code generated by the compiler, which directs the program flow into the loader. The loader computes the actual entry address of fun and writes it into the fun@GOT entry. The process is shown below:

User program
--------------
call fun@PLT
     |
     v
DLL PLT table                   Loader
---------------------------------------------------
fun: <-- jmp *fun@GOT  -->      change GOT entry from
     |                          $loader to $fun,
     v                          then jump to there
GOT table
--------------
fun@GOT: $loader

After the first call, the GOT entry points to the real entry of the function. Later calls to the function still jump to the PLT, but from there they go straight to the real function entry without entering the loader. In terms of performance, the loader does extra work only on the first call, which is entirely tolerable. It also follows that the code segment does not need to be patched at load time, so the whole code segment can be shared between processes.
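The first-call behaviour described above can be imitated in C with a function pointer that stands in for the fun@GOT entry. This is a sketch of the idea only: real_fun, lazy_stub, fun_plt, and resolver_calls are hypothetical names, and the real mechanism operates on machine code rather than on C function pointers.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*fun_t)(int);

int resolver_calls = 0;                 /* counts trips into the "loader" */

int real_fun(int x) { return x + 1; }   /* hypothetical target function */

int lazy_stub(int x);                   /* stands in for pushl/jmp .PLT0 */

/* GOT entry for fun: initially points back at the stub. */
fun_t fun_got = lazy_stub;

/* The first call lands here: "resolve" the symbol, patch the GOT entry,
   then continue into the real function. */
int lazy_stub(int x)
{
    resolver_calls++;
    fun_got = real_fun;
    return fun_got(x);
}

/* PLT entry: always an indirect jump through the GOT entry. */
int fun_plt(int x)
{
    assert(fun_got != NULL);
    return fun_got(x);
}
```

Calling fun_plt twice goes through lazy_stub only once; only the data (fun_got) is modified, while the code of fun_plt stays unchanged.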

Programmers familiar with Windows will readily notice that the GOT and PLT resemble the import table in Windows. Other correspondences include Linux version scripts and Windows .def files, and the Linux dynamic symbol section and the Windows export table. Further examples are omitted.

3.4 Code segment relocation

Note that, as floating code requires, the code segment should contain no relocation entries. The phrase "code segment relocation" is only a manner of speaking: the actual relocation entries are in the GOT, which lies in the data segment. Even so, the difference from section 3.5, "Data segment relocation", is clear.

a) Loading the GOT base address

To use the GOT, its base address must first be known, but this address varies with the address at which the running module is loaded. Linux uses a trick to obtain the correct GOT base address at run time. The code fragment is shown below, followed by the corresponding relocation types in the object file (.o) and the dynamic link library (.so):

call .L1
.L1: popl %ebx
addl $GOT+[.-.L1], %ebx
.o: R_386_GOTPC
.so: NULL

As mentioned above, this code fragment appears at the entry of every function. The first instruction pushes the current program counter (IP) onto the stack, and the second pops it into %ebx. The net effect is movl %eip, %ebx, except that the x86 instruction set does not allow %eip as an operand. The third instruction then adds to %ebx the difference between the GOT address and the IP value. This difference is a constant that does not depend on the load address of the dynamic link library, so it can be computed at link time. In C-like notation, the whole process is:

%ebx = %eip;
%ebx += ($GOT - %eip);

At this point, %ebx holds the base address of the GOT.
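The arithmetic can be checked with a short C sketch. The offsets below are made-up values for a hypothetical module, and the function models only the address calculation, not the instructions themselves.

```c
#include <assert.h>
#include <stdint.h>

/* Link-time layout of a hypothetical module (offsets from its base). */
#define L1_OFFSET    0x1005u   /* where label .L1 sits */
#define GOT_OFFSET   0x3000u   /* where the GOT sits */
#define GOT_MINUS_L1 (GOT_OFFSET - L1_OFFSET)   /* constant known to the linker */

/* Model of the function prologue for a module loaded at load_base. */
uint32_t got_base_at_runtime(uint32_t load_base)
{
    assert(load_base <= UINT32_MAX - GOT_OFFSET);
    uint32_t ebx = load_base + L1_OFFSET;   /* call .L1 ; popl %ebx */
    ebx += GOT_MINUS_L1;                    /* addl $GOT+[.-.L1], %ebx */
    return ebx;
}
```

Whatever load_base is chosen, the result is load_base + GOT_OFFSET, because the load address cancels out of the constant that the linker stores in the addl instruction.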

This result is produced by the compiler and linker working together. When the compiler generates the object file there is no GOT yet (each running module has one GOT and one PLT, both generated by the linker), so the difference between the GOT and the current IP cannot be computed; the compiler only places an R_386_GOTPC relocation marker on the third instruction. At link time, the linker sees the GOTPC relocation entry, computes the difference between the GOT and the IP, and stores it as the immediate operand of the addl instruction. No further relocation is needed.

b) Referencing variable and function addresses

References to static variables, static functions, and string constants use R_386_GOTOFF relocation, which works much like GOTPC. The compiler first places a relocation marker in the object file; the linker then computes the difference between the GOT base address and the address of the referenced element and uses it as the displacement of the leal instruction's indexed addressing mode. The code fragment is as follows:

leal .LC1@GOTOFF(%ebx), %eax
.o: R_386_GOTOFF
.so: NULL

When a global variable or global function is referenced, the compiler places an R_386_GOT32 relocation marker in the object file. The linker reserves an entry in the GOT and attaches an R_386_GLOB_DAT relocation marker to it, so that the loader fills in the actual address of the referenced element. The linker also computes the offset of the reserved entry within the GOT and uses it as the displacement of the movl instruction's indexed addressing mode. The code fragment is as follows:

movl x@GOT(%ebx), %eax
.o: R_386_GOT32
.so: R_386_GLOB_DAT

Note that when a global function is referenced, the value read from the GOT is not the real entry address of the function but the address of its code fragment in the PLT, .PLTfun (see section 3.3). Thus, whether the function is called directly or its address is taken first and then called indirectly, the program flow passes through the PLT and control reaches the loader, which uses the opportunity to perform dynamic linking.
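The two addressing styles in this subsection differ in one step: a GOTOFF reference adds a constant to the GOT base, while a GOT32 reference loads an address stored in the GOT. The C sketch below contrasts them on a toy data segment; struct data_segment, ref_static, and ref_global are hypothetical names, and the layout is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy data segment of one loaded module: GOT first, then a static variable. */
struct data_segment {
    void *got[2];       /* GOT slots, filled in by the "loader" */
    int   static_var;   /* static object at a fixed distance from the GOT */
};

/* GOTOFF style: GOT base plus a link-time constant, no memory load. */
int *ref_static(struct data_segment *seg)
{
    char *got_base = (char *)seg->got;
    ptrdiff_t gotoff = (ptrdiff_t)offsetof(struct data_segment, static_var)
                     - (ptrdiff_t)offsetof(struct data_segment, got);
    return (int *)(got_base + gotoff);
}

/* GOT32 style: load the address that the loader stored in a GOT slot. */
int *ref_global(struct data_segment *seg, size_t slot)
{
    assert(slot < 2);
    return (int *)seg->got[slot];
}
```

The static reference needs no help from the loader because the distance from the GOT is fixed at link time, whereas the global reference depends on a slot that only the loader can fill.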

c) Calling a function directly

As mentioned above, function calls in floating code are compiled into relative jump instructions. The compiler first places an R_386_PLT32 relocation marker in the object file; what the linker does next depends on whether the function is static or global.

If the function is static, the call must come from the same running module. The offset between the call site and the function entry can be computed at link time and used as the operand of the call instruction, which is an offset relative to the current IP. The call goes straight to the function entry without involving the loader. The relevant code fragment is as follows:

call f@PLT
.o: R_386_PLT32
.so: NULL
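To see why a relative call needs no load-time fix-up, the sketch below computes the operand of an x86 call with a 32-bit relative displacement (opcode 0xE8, five bytes long, displacement measured from the next instruction). The function names and addresses are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { CALL_REL32_SIZE = 5 };   /* opcode 0xE8 plus a 4-byte displacement */

/* Displacement stored in the instruction:
   target minus the address of the next instruction. */
int32_t call_displacement(uint32_t call_site, uint32_t target)
{
    return (int32_t)(target - (call_site + CALL_REL32_SIZE));
}

/* Encode "call target" placed at call_site as E8 + little-endian rel32. */
void encode_call(uint8_t buf[CALL_REL32_SIZE],
                 uint32_t call_site, uint32_t target)
{
    assert(buf != NULL);
    uint32_t disp = target - (call_site + CALL_REL32_SIZE);
    buf[0] = 0xE8;
    for (int i = 0; i < 4; i++)
        buf[1 + i] = (uint8_t)(disp >> (8 * i));
}
```

Encoding the same call for a module based at 0 and for one based at 0x40000000 gives identical bytes, since the load address is added to both the call site and the target and cancels in the subtraction.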

If the function is global, the linker generates the PLT code fragment described in section 3.3. The first call to the global function transfers the program flow to the loader, which computes the function's entry address and writes it into the fun@GOT entry. This is called R_386_JMP_SLOT relocation. The relevant code fragment is as follows:

call f@PLT
.o: R_386_PLT32
.so: R_386_JMP_SLOT

Consequently, a global function may have up to two relocation entries. One is the mandatory JMP_SLOT entry, which the loader points at the real function entry; the other is a GLOB_DAT entry, which the loader points at the code fragment in the PLT. Taking the function's address always yields the value of the GLOB_DAT entry, that is, a pointer to .PLTfun rather than to the real function entry.

Consider a further problem: two dynamic link libraries each take the address of the same global function, and the two results are compared. From the discussion above, neither result points to the real entry of the function; each points into a different PLT. A naive comparison would conclude "not equal", which is clearly wrong, so special handling is required.

3.5 Data segment relocation

Data segment relocation refers to the initialization of static and global variables of pointer type. It differs from code segment relocation in at least three ways: (1) it must be completed before the user program gains control (before main starts executing); (2) it does not use indirect addressing through the GOT, because %ebx does not yet hold the correct GOT base address; (3) it modifies the data segment directly, whereas code segment relocation must not modify the code segment.

If a static variable, static function, or string constant is referenced, the compiler places an R_386_32 relocation marker in the object file and computes the offset of the referenced variable or function from the start of its section. The linker changes this to an R_386_RELATIVE relocation marker and computes the offset from the base address of the dynamic link library (usually zero at link time). The loader adds the actual (nonzero) load address of the running module to this offset and uses the result to initialize the pointer variable. The code fragment is as follows:

.section .rodata
.LC0: .string "OK\n"
.data
p: .long .LC0
.o: R_386_32 w/section
.so: R_386_RELATIVE
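The loader's part of this scheme is a single addition per relocated word. The sketch below applies it to a toy image; struct rel_relative and apply_relative are hypothetical names, and the image is modelled as an array of 32-bit words rather than a real ELF file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical relocation record: which word of the image to patch. */
struct rel_relative {
    size_t word_index;
};

/* Each listed word holds an offset from the link-time base (zero);
   after relocation it holds load_base + offset. */
void apply_relative(uint32_t *image, size_t image_words,
                    const struct rel_relative *rel, size_t nrel,
                    uint32_t load_base)
{
    for (size_t i = 0; i < nrel; i++) {
        assert(rel[i].word_index < image_words);
        image[rel[i].word_index] += load_base;
    }
}
```

Only the words named by relocation records change; ordinary data in the image is left alone, and no symbol lookup is needed.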

If a global variable or function is referenced, the compiler likewise places an R_386_32 relocation marker and records the name of the referenced symbol. The linker has nothing to do. Finally, the loader looks up the referenced symbol and uses the result to initialize the pointer variable. For a global function, the lookup result is again the function's code fragment in the PLT rather than its real entry, just as for the global function references described earlier. The code fragment is as follows:

.data
p: .long printf
.o: R_386_32 w/symbol
.so: R_386_32 w/symbol
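A symbol-based relocation differs from the relative one in that the loader must first find the symbol by name. The sketch below models this with a linear search over a toy symbol table; struct symbol, lookup, and apply_symbolic are hypothetical names, and the addresses are invented. The value already stored at the patched location is used as an addend, as the ELF specification describes for R_386_32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct symbol {
    const char *name;
    uint32_t    address;
};

/* Linear search over a toy symbol table; NULL if the name is absent. */
const struct symbol *lookup(const struct symbol *tab, size_t n,
                            const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(tab[i].name, name) == 0)
            return &tab[i];
    return NULL;
}

/* Symbol-based relocation: add the symbol's address to the addend
   already stored at *where. */
void apply_symbolic(uint32_t *where, const struct symbol *tab, size_t n,
                    const char *name)
{
    const struct symbol *sym = lookup(tab, n, name);
    assert(sym != NULL);   /* an unresolved symbol is a load error */
    *where += sym->address;
}
```

With a table containing printf at a made-up address, relocating the pointer p (initially zero) leaves that address in p before the program starts.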

3.6 Summary

The following table summarizes the results discussed above:

                                                        .o                .so
--------------------------------------------------------------------------------------
             | Load GOT base address                    R_386_GOTPC       NULL
Code segment |------------------------------------------------------------------------
relocation   | Reference variable/          static      R_386_GOTOFF      NULL
             | function address             global      R_386_GOT32       R_386_GLOB_DAT
             |------------------------------------------------------------------------
             | Call function directly       static      R_386_PLT32       NULL
             |                              global      R_386_PLT32       R_386_JMP_SLOT
-------------|------------------------------------------------------------------------
Data segment | Reference variable/          static      R_386_32 w/sec    R_386_RELATIVE
relocation   | function address             global      R_386_32 w/sym    R_386_32 w/sym
--------------------------------------------------------------------------------------

4 Conclusion

Windows uses the PE file format while Linux uses the ELF file format, and this is the root cause of the differences between their dynamic link libraries. Starting from the ELF specification, this article has discussed the implementation of Linux dynamic link libraries in depth, with the aim of further promoting the study and application of Linux.

5 Appendix: Linux assembler syntax

The Linux assembler on x86 follows the syntax of the AT&T System V/386 assembler, which differs considerably from the more common Intel syntax, as the following table shows:

                            AT&T                              Intel
Constant                    prefix $: pushl $4                no prefix: push 4
Register                    prefix %: %ebx                    no prefix: ebx
Jump (absolute address)     prefix *: jmp *fun
Jump (relative offset)      no prefix: jmp fun
Operand order               source first: movl $4, %eax       destination first: mov eax, 4
Operand size                suffix b, w, l: movl              modifier byte ptr, etc.
Indexed addressing          disp(base)                        [base + disp]


Original article address: http://bbs.gd-linux.org/viewthread.php?tid=295

 
