The previous post touched briefly on the PE file format; this one briefly covers PE loading. In a PE file, the start address of every section is a multiple of the page size. If a section's length is not a multiple of the page size, it is rounded up to a multiple of the page size when mapped. When producing an executable, the linker usually merges sections as far as possible, so a PE file generally has only a few sections, such as the code section, data section, read-only data section, and BSS.
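The rounding rule for section lengths can be sketched in C. This is only an illustration, assuming a 4096-byte page; EXAMPLE_PAGE_SIZE and align_up are names made up for this example and are not part of any PE API.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size for illustration only. */
#define EXAMPLE_PAGE_SIZE 4096u

/* Round size up to the next multiple of alignment.
   alignment must be a non-zero power of two. */
static uint32_t align_up(uint32_t size, uint32_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (size + alignment - 1) & ~(alignment - 1);
}
```

For example, a section of 0x1234 bytes would occupy 0x2000 bytes once mapped, while a section that is already exactly one page long stays at one page.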
PE terminology includes the concept of a relative virtual address (RVA), which is the in-memory counterpart of an offset within the file: it is an offset relative to the base address at which the PE file is loaded. If a PE file is loaded at virtual address 0x00400000, then RVA 0x1000 corresponds to address 0x00401000. Every PE file has a preferred load address, the so-called image base. The process of loading a PE executable is as follows:
Read the first page of the file, which contains the DOS header, the PE file header, and the section table
Check whether the preferred address is available in the process address space
Use the information in the section table to map the sections of the PE file, one by one, to their locations in the address space
If the actual load address is not the preferred address, perform rebasing
Load all DLLs that the PE file requires
Resolve all imported symbols in the PE file
Initialize the stack and heap according to the parameters specified in the PE header
Create the main thread and start the process
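The RVA arithmetic and the rebasing decision in the steps above can be sketched as follows. The function names rva_to_va and rebase_delta are hypothetical helpers for this example, not Windows loader APIs, and the sketch assumes 32-bit addresses.

```c
#include <assert.h>
#include <stdint.h>

/* Translate an RVA into a virtual address for an image loaded at load_base. */
static uint32_t rva_to_va(uint32_t load_base, uint32_t rva)
{
    assert(rva <= UINT32_MAX - load_base);  /* must stay inside 32 bits */
    return load_base + rva;
}

/* Amount to add to every absolute address stored in the image when it is
   loaded somewhere other than its preferred base. 0 means no rebasing. */
static int32_t rebase_delta(uint32_t preferred_base, uint32_t actual_base)
{
    return (int32_t)(actual_base - preferred_base);
}
```

With a load base of 0x00400000, rva_to_va gives 0x00401000 for RVA 0x1000, matching the example in the text. A non-zero rebase_delta is the signal that the rebasing step is needed.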
In a PE file, the information related to loading is held in the PE optional header and the section table. The optional header is laid out as follows; only the 32-bit version is covered here.
typedef struct _IMAGE_DATA_DIRECTORY {
    DWORD VirtualAddress;
    DWORD Size;
} IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;

typedef struct _IMAGE_OPTIONAL_HEADER {
    WORD  Magic;
    BYTE  MajorLinkerVersion;
    BYTE  MinorLinkerVersion;
    DWORD SizeOfCode;
    DWORD SizeOfInitializedData;    // length of the initialized data section
    DWORD SizeOfUninitializedData;  // length of the uninitialized data section
    DWORD AddressOfEntryPoint;      // RVA of the first instruction the loader will run
    DWORD BaseOfCode;               // start RVA of the code section
    DWORD BaseOfData;               // start RVA of the data section
    DWORD ImageBase;                // preferred load address of the PE file
    DWORD SectionAlignment;         // alignment granularity in memory, typically 4096
    DWORD FileAlignment;            // alignment granularity in the file, typically 512 bytes
    WORD  MajorOperatingSystemVersion;
    WORD  MinorOperatingSystemVersion;
    WORD  MajorImageVersion;
    WORD  MinorImageVersion;
    WORD  MajorSubsystemVersion;    // subsystem version required to run the program
    WORD  MinorSubsystemVersion;
    DWORD Win32VersionValue;
    DWORD SizeOfImage;              // size of the whole PE image in memory
    DWORD SizeOfHeaders;            // size of all headers plus the section table, equal to the file size minus the size of all sections in the file
    DWORD CheckSum;
    WORD  Subsystem;                // on NT, identifies which subsystem the PE file belongs to, such as GUI or CUI
    WORD  DllCharacteristics;
    DWORD SizeOfStackReserve;
    DWORD SizeOfStackCommit;
    DWORD SizeOfHeapReserve;
    DWORD SizeOfHeapCommit;
    DWORD LoaderFlags;
    DWORD NumberOfRvaAndSizes;
    IMAGE_DATA_DIRECTORY DataDirectory[16];
} IMAGE_OPTIONAL_HEADER32, *PIMAGE_OPTIONAL_HEADER32;
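To show how a loader might use two of these fields, here is a rough sketch that computes the preferred entry-point address (ImageBase + AddressOfEntryPoint) from a raw byte buffer. It relies on the standard PE32 layout: the offset of the PE signature is stored at 0x3c in the DOS header, a 20-byte file header follows the 4-byte signature, and Magic is 0x10b for PE32. The helper names (read_le16, read_le32, pe32_entry_va) are invented for this example, and the checks are minimal compared with what a real loader does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read little-endian values from a byte buffer. */
static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns ImageBase + AddressOfEntryPoint for a PE32 image held in buf,
   or 0 if the buffer does not look like a PE32 file. */
static uint32_t pe32_entry_va(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z')
        return 0;
    uint32_t pe_off = read_le32(buf + 0x3c);   /* e_lfanew in the DOS header */
    /* signature (4) + file header (20) + optional header up to ImageBase (32) */
    if (pe_off > len || len - pe_off < 4 + 20 + 32)
        return 0;
    if (memcmp(buf + pe_off, "PE\0\0", 4) != 0)
        return 0;
    const uint8_t *opt = buf + pe_off + 4 + 20; /* start of the optional header */
    if (read_le16(opt) != 0x10b)                /* Magic: PE32 */
        return 0;
    /* AddressOfEntryPoint is at offset 16, ImageBase at offset 28. */
    return read_le32(opt + 28) + read_le32(opt + 16);
}
```

The result is only the preferred address. If the image ends up at a different base, the entry point moves by the same delta.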
About Dynamic Linking
The simple way to solve the problems of wasted space and difficult updates is to split the program's modules apart into independent files instead of linking them together statically. Put simply, the object files that make up the program are not linked until the program is about to run; linking is postponed to run time. This is dynamic linking.
The idea of dynamic linking is to split the program into relatively independent modules and to link them together into a complete program when the program runs, rather than linking all the modules into a single executable file as static linking does. In other words, dynamic linking defers the linking step from before the program is loaded until load time.
With static linking, the whole program ends up as a single executable file, an indivisible whole. With dynamic linking, a program is split into several files: the main part of the program, namely the executable, and the shared objects that the program depends on. Each of these is often called a module.
/* Program1.c */
#include "Lib.h"

int main()
{
    Show(1);
    return 0;
}

/* Program2.c */
#include "Lib.h"

int main()
{
    Show(2);
    return 0;
}

/* Lib.c */
#include <stdio.h>

void Show(int i)
{
    printf("Printing from Lib.so %d\n", i);
}

/* Lib.h */
#ifndef LIB_H
#define LIB_H
void Show(int i);
#endif
Here is the layout of the process's virtual address space at run time:
$ cat /proc/12985/maps
08048000-08049000 r-xp 00000000 08:01 1343422 ./program1
08049000-0804a000 rwxp 00000000 08:01 1343432 ./program1
b7e83000-b7e84000 rwxp b7e83000 00:00 0
b7e84000-b7fc8000 r-xp 00000000 08:01 1488993 /lib/tls/i686/cmov/libc-2.6.1.so
b7fc8000-b7fc9000 r-xp 00143000 08:01 1488993 /lib/tls/i686/cmov/libc-2.6.1.so
b7fc9000-b7fce000 r-xp 00144000 08:01 1488993 /lib/tls/i686/cmov/libc-2.6.1.so
b7fcb000-b7fce000 rwxp b7fcb000 00:00 0
b7fd8000-b7fd9000 rwxp b7fd8000 00:00 0
b7fd9000-b7fda000 r-xp 00000000 08:01 1343290 ./lib.so
b7fda000-b7fdb000 rwxp 00000000 08:01 1343290 ./lib.so
b7fdb000-b7fdd000 rwxp b7fdb000 00:00 0
b7fdd000-b7ff7000 r-xp 00000000 08:01 1455332 /lib/ld-2.6.1.so
b7ff7000-b7ff9000 rwxp 00019000 08:01 1455332 /lib/ld-2.6.1.so
bf965000-bf97b000 rw-p bf965000 00:00 0 [stack]
ffffe000-fffff000 r-xp 00000000 00:00 0 [vdso]
You can see that the process's virtual address space contains mappings for several files besides the executable itself. Lib.so, like Program1, is mapped into the process's virtual address space by the operating system in the same way. ld-2.6.1.so is in fact the dynamic linker on Linux. The dynamic linker is mapped into the process's address space just like an ordinary shared object. Before the system starts running Program1, it first hands control to the dynamic linker, which completes all the dynamic linking work and then hands control to Program1, which then starts executing.
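To make the columns of the maps listing concrete, here is a small sketch that parses one line into its fields (address range, permissions, file offset, device, inode, path). The struct map_entry and the function parse_maps_line are names invented for this example, and the parser assumes paths without spaces.

```c
#include <assert.h>
#include <stdio.h>

/* One mapping from /proc/<pid>/maps (hypothetical helper type). */
struct map_entry {
    unsigned long start, end, offset, inode;
    char perms[5];
    char path[256];   /* empty for anonymous mappings */
};

/* Parse one line of /proc/<pid>/maps. Returns 1 on success, 0 on failure. */
static int parse_maps_line(const char *line, struct map_entry *e)
{
    unsigned int dev_major, dev_minor;
    assert(line != NULL && e != NULL);
    e->path[0] = '\0';
    int n = sscanf(line, "%lx-%lx %4s %lx %x:%x %lu %255s",
                   &e->start, &e->end, e->perms, &e->offset,
                   &dev_major, &dev_minor, &e->inode, e->path);
    return n >= 7;    /* the path column is optional */
}
```

Subtracting start from end gives the size of each mapping; the first program1 line, for instance, covers exactly one 4096-byte page.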
The final load address of a shared object is not fixed at compile time. Instead, at load time the loader dynamically allocates a block of virtual address space to the shared object, depending on what is free in the current address space.
The instructions and data of a program module may contain absolute address references, so when the linker produces the output file it has to assume the address at which the module will be loaded. A shared object, however, cannot assume its position in the process's virtual address space at compile time. An executable, by contrast, can essentially fix its starting position in the process's virtual address space, because the executable is usually the first file to be loaded and can pick a fixed free address. Relocation performed at link time is called link-time relocation; when the module's addresses have to be relocated as it is loaded, we call it load-time relocation. On Windows this is also known as rebasing.
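Load-time relocation can be sketched as patching every stored absolute address by the difference between the assumed and the actual load address. The following is a simplified model, not the actual ELF or PE relocation format: the fixups array (a list of offsets where 32-bit absolute addresses are stored) and the function name relocate_image are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Apply load-time relocation to an in-memory image.
   fixups[i] is the offset, inside image, of a 32-bit absolute address
   that was computed assuming the module sits at preferred_base. */
static void relocate_image(uint8_t *image, size_t image_size,
                           const uint32_t *fixups, size_t n_fixups,
                           uint32_t preferred_base, uint32_t actual_base)
{
    uint32_t delta = actual_base - preferred_base;  /* wraps modulo 2^32 */
    for (size_t i = 0; i < n_fixups; i++) {
        uint32_t addr;
        assert(image_size >= sizeof addr &&
               fixups[i] <= image_size - sizeof addr);
        memcpy(&addr, image + fixups[i], sizeof addr);
        addr += delta;                              /* patch the stored address */
        memcpy(image + fixups[i], &addr, sizeof addr);
    }
}
```

Because the patched bytes depend on where the module was loaded, two processes that load the module at different addresses cannot share these pages.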
Load-time relocation solves the problem of absolute address references in a dynamic module, but it makes the instruction part impossible to share between processes. We want the shared instruction part of a module to stay the same no matter where it is loaded, so the parts of the instructions that need to be modified are separated out and placed with the data. The instruction part can then stay unchanged, while the data part has a copy in each process. This is known as position-independent code (PIC), also called address-independent code.
In fact, generating position-independent code is not troublesome. First, address references in a module can be divided into two categories according to whether they cross modules: module-internal references and module-external references. According to the kind of reference, they can also be divided into instruction references and data accesses. That gives four cases:
Function calls, jumps, and so on inside the module
Data access inside the module, such as global variables and static variables defined in the module
Function calls, jumps, and so on outside the module
Data access outside the module, such as global variables defined in other modules
static int a;
extern int b;
extern void ext();

void bar()
{
    a = 1;   // module-internal data access
    b = 2;   // module-external data access
}

void foo()
{
    bar();   // module-internal function call
    ext();   // module-external function call
}
When the compiler compiles this file, it cannot actually tell whether variable b and function ext() are external to the module or internal to it, because they may be defined in other object files of the same shared object. Since it cannot be sure, the compiler has to treat them all as module-external functions and variables. The MSVC compiler provides the __declspec(dllimport) extension to indicate whether a symbol is internal or external to the module.
In the first case, the called function and the caller are in the same module, so their relative position is fixed. Jumps and function calls inside the module can therefore use relative addressing, or register-based relative calls, and these instructions need no relocation.
08048344 <bar>:
 8048344:  55                 push %ebp
 8048345:  89 e5              mov  %esp,%ebp
 8048347:  5d                 pop  %ebp
 8048348:  c3                 ret
08048349 <foo>:
 ......
 8048357:  e8 e8 ff ff ff     call 8048344 <bar>
 804835c:  b8 00 00 00 00     mov  $0x0,%eax
 ......
The call to bar in foo is a relative-address call instruction. The last 4 bytes of the instruction are the offset of the target address from the instruction that follows the call, here 0xffffffe8. 0xffffffe8 is the two's complement representation of -24, so the address of bar is 0x804835c - 24 = 0x8048344. As long as the relative position of bar and foo does not change, this instruction is position-independent. The same relative addressing also works for jmp instructions.
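The target calculation for the relative call can be checked with a few lines of C. This sketch decodes a 5-byte x86 "call rel32" instruction (opcode 0xe8 followed by a little-endian 32-bit displacement); the function name call_rel32_target is made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Decode the target of an x86 "call rel32" instruction.
   insn points at the 5 instruction bytes, insn_addr is the address
   of the instruction itself. */
static uint32_t call_rel32_target(const uint8_t *insn, uint32_t insn_addr)
{
    assert(insn[0] == 0xe8);
    /* The operand is a little-endian signed 32-bit displacement. */
    uint32_t rel = (uint32_t)insn[1] | ((uint32_t)insn[2] << 8) |
                   ((uint32_t)insn[3] << 16) | ((uint32_t)insn[4] << 24);
    uint32_t next = insn_addr + 5;  /* address of the following instruction */
    return next + rel;              /* unsigned wraparound handles negative offsets */
}
```

Feeding it the bytes e8 e8 ff ff ff at address 0x8048357 yields 0x8048344. Moving both functions by the same amount changes insn_addr and the target equally, so the encoded bytes never need to change.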
The second case is data access inside the module. Clearly, the instruction must not contain the absolute address of the data, so the only option is relative addressing. A module usually consists of a number of pages of code followed by a number of pages of data, and the relative position of these pages is fixed. The relative position between any instruction and the module-internal data it accesses is therefore fixed as well, so the data can be reached by adding a fixed offset to the address of the current instruction.
0000044c <bar>:
 44c:  55                       push %ebp
 44d:  89 e5                    mov  %esp,%ebp
 44f:  e8 40 00 00 00           call 494 <__i686.get_pc_thunk.cx>
 454:  81 c1 8c 11 00 00        add  $0x118c,%ecx
 45a:  c7 81 28 00 00 00 01     movl $0x1,0x28(%ecx)
 461:  00 00 00
 464:  8b 81 f8 ff ff ff        mov  0xfffffff8(%ecx),%eax
 46a:  c7 00 02 00 00 00        movl $0x2,(%eax)
 470:  5d                       pop  %ebp
 471:  c3                       ret
00000494 <__i686.get_pc_thunk.cx>:
 494:  8b 0c 24                 mov  (%esp),%ecx
 497:  c3                       ret
When the processor executes a call instruction, it pushes the address of the next instruction onto the stack, and the ESP register always points to the top of the stack. So when __i686.get_pc_thunk.cx executes "mov (%esp),%ecx", the return address is loaded into the ECX register.
An add and a mov follow. The address of variable a is the address of the add instruction (held in ECX) plus the two offsets 0x118c and 0x28. That is, if the module is loaded at 0x10000000, the actual address of variable a is 0x10000000 + 0x454 + 0x118c + 0x28 = 0x10001608.
                 +--------------------------------------------------------+ 0x00000000
                 |                                                        |
                 |                                                        |
                 +--------------------------------------------------------+ 0x10000000
                 | 44f: e8 40 00 00 00          call 494 <__i686.get_pc_thunk.cx>
            +--> | 454: 81 c1 8c 11 00 00       add  $0x118c,%ecx
            |    | 45a: c7 81 28 00 00 00 01    movl $0x1,0x28(%ecx)
            |    | 461: 00 00 00
0x118c + 0x28    |                                                  .text
            |    +--------------------------------------------------------+
            |    |
            +--> | static int a;
                 |                                                  .data
                 +--------------------------------------------------------+
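The address arithmetic for variable a can be written out as a small C function. The three constants are taken from the disassembly of bar (the offset 0x454 of the add instruction, the immediate 0x118c, and the displacement 0x28); the function name address_of_a and the enum names are invented for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Constants read off the disassembly of bar(). */
enum {
    ADD_INSN_OFFSET  = 0x454,   /* return address of the thunk call */
    ADD_IMMEDIATE    = 0x118c,  /* add $0x118c,%ecx */
    MOV_DISPLACEMENT = 0x28     /* movl $0x1,0x28(%ecx) */
};

/* Address of variable a for a module loaded at load_base. */
static uint32_t address_of_a(uint32_t load_base)
{
    assert((load_base & 0xfffu) == 0);  /* modules load on page boundaries */
    uint32_t ecx = load_base + ADD_INSN_OFFSET; /* value left in ECX by the thunk */
    ecx += ADD_IMMEDIATE;
    return ecx + MOV_DISPLACEMENT;
}
```

Only load_base varies between processes. The constants are baked into the instructions, which is why the code pages stay identical wherever the module is loaded.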
Data access across modules is the fourth case. A variable such as b is defined in another module, and its address is not known until load time. To keep the code position-independent, the basic idea is to move the address-dependent part into the data segment. The addresses of global variables in other modules clearly depend on where those modules are loaded, so an array of pointers to these variables is set up in the data segment. This array is the global offset table (GOT). When the code needs to reference such a global variable, it does so indirectly through the corresponding GOT entry. When an instruction needs to access variable b, the program first locates the GOT and then uses the entry for that variable to find its target address; each variable has one 4-byte address entry. When the dynamic linker loads the module, it looks up the address of each variable and fills in each GOT entry so that every pointer points to the right address. Because the GOT itself lives in the data segment, it can be modified when the module is loaded, and each process can have its own copy.
Just as a module can determine at compile time the offset of its internal variables from the current instruction, it can also determine at compile time the offset of the GOT from the current instruction, which fixes the position of the GOT. The address of a variable is then obtained from the offset of that variable's entry within the GOT.
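The indirection through the GOT can be modeled in plain C with an array of pointers. This is a simulation of the idea, not what a compiler literally emits: the names got, GOT_B, fill_got, and bar_store_b are all made up for this sketch, and fill_got stands in for the work the dynamic linker does at load time.

```c
#include <assert.h>
#include <stddef.h>

/* Slot index of variable b in the simulated GOT (hypothetical layout). */
enum { GOT_B = 0, GOT_SIZE = 1 };

/* The GOT lives in the data segment, so it is writable and per-process. */
static int *got[GOT_SIZE];

/* Stand-in for the dynamic linker: fill the slot once b's address is known. */
static void fill_got(int *address_of_b)
{
    got[GOT_B] = address_of_b;
}

/* What "b = 2" amounts to in position-independent code:
   load the pointer from the GOT, then store through it. */
static void bar_store_b(void)
{
    assert(got[GOT_B] != NULL);
    *got[GOT_B] = 2;
}
```

The code of bar_store_b never contains the address of b. Only the table entry changes from process to process.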
But what about global variables defined in a shared module? For example, a shared object defines a global variable named global, and module.c references it like this:
extern int global;

int foo()
{
    global = 1;
    return 0;
}
When the compiler compiles module.c, it cannot tell whether global is defined in another object file of the same module or in a different shared object. That is, it cannot tell whether this is a cross-module access, and so cannot tell whether the variable should be reached through the GOT or through a copy in the executable's .bss. The solution is to make every instruction that uses the variable refer to the copy in the executable. When an ELF shared library is compiled, global variables defined inside the module are by default treated like global variables defined in other modules and are accessed through the GOT. When the shared module is loaded, if a global variable has a copy in the executable, the dynamic linker points the corresponding GOT entry at that copy, so the variable ends up with only one instance at run time. If the variable is initialized in the shared module, the dynamic linker also has to copy the initial value into the copy in the program's main module. If the global variable has no copy in the main module, the corresponding GOT entry points to the copy inside the shared module.
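The rule for choosing which instance a GOT entry points to can be summarized in a short sketch. This is a simplified model of the decision described above, not the dynamic linker's real code; bind_global and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Decide which instance of a global the GOT entry should point to.
   exe_copy: copy in the executable's .bss, or NULL if there is none.
   lib_copy: definition inside the shared module, holding the initial value. */
static int *bind_global(int *exe_copy, int *lib_copy)
{
    assert(lib_copy != NULL);
    if (exe_copy != NULL) {
        *exe_copy = *lib_copy;  /* carry the initial value over */
        return exe_copy;        /* everyone uses the executable's copy */
    }
    return lib_copy;            /* no copy in the main module */
}
```

After binding, every access goes through the returned pointer, so there is exactly one live instance of the variable in the process.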
Suppose a global variable G is defined in Lib.so, and process A and process B both use Lib.so. Is process B affected when process A changes G?
No. When Lib.so is loaded by two processes, its data segment has a separate copy in each process. In this respect a global variable in a shared object is no different from a global variable defined in the program itself: each process accesses only its own copy and does not affect other processes. If A and B are threads of the same process, however, they share the same copy and do affect each other. For giving each thread its own copy, Windows has a dedicated term: thread-private storage (Thread Local Storage).
At this point we can compare static and dynamic linking. The main reasons dynamic linking is slower than static linking are these. First, under dynamic linking every access to global and static data has to locate the GOT and then address indirectly, and calls between modules likewise have to go to the GOT first and then jump indirectly, so the speed of the program inevitably suffers. Second, the linking work of dynamic linking is done at run time: when the program starts executing, the dynamic linker still has to perform the link, load the required shared objects, and then carry out symbol lookup and address relocation. As for the second problem, many of the functions in a shared object are never used, so binding every function at the start is wasteful. Lazy binding addresses this: the basic idea is that a function is bound (symbol lookup, relocation) only when it is first used.
ELF implements lazy binding with the PLT (Procedure Linkage Table). Suppose liba.so needs to call the function bar in libc.so. When liba.so calls bar for the first time, the dynamic linker has to complete the address binding. Assume a function lookup() finds the address of bar; lookup needs to know which module the binding occurs in and which function is being bound, that is, lookup(module, function). A call to a function in an external module normally jumps indirectly through the corresponding GOT entry. To implement lazy binding, the PLT adds one more level of indirection. Each external function has a corresponding entry in the PLT; for example, the PLT entry for bar is called bar@plt. It is implemented as follows:
bar@plt:
    jmp *(bar@GOT)
    push n
    push moduleID
    jump _dl_runtime_resolve
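The first instruction is an indirect jump through the GOT entry for bar. To implement lazy binding, the linker does not fill in the address of bar at initialization; it stores the address of the second instruction, "push n", in that entry, so the first instruction simply jumps to the second. The second instruction pushes n onto the stack, where n is the index of the relocation entry for the bar symbol in the relocation table .rel.plt. The next instruction pushes the module ID onto the stack, and the last one jumps to _dl_runtime_resolve. Together this amounts to a call to lookup(module, function). Once _dl_runtime_resolve has found the real address of bar, it writes it into the GOT entry, so later calls jump straight to bar.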
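Lazy binding can be imitated in portable C with a function pointer that starts out pointing at a resolver. This is a toy model under several assumptions: bar_got plays the role of the GOT entry for bar, bar_plt the role of the PLT entry, and bar_resolver_stub the role of the "push n; push moduleID; jump _dl_runtime_resolve" path. All of these names, and real_bar, are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*func_t)(int);

static int resolve_count;                     /* how often the resolver ran */

static int real_bar(int x) { return x + 1; }  /* stand-in for bar in another module */

static int bar_resolver_stub(int x);

/* Simulated GOT slot for bar: initially points back at the resolver path. */
static func_t bar_got = bar_resolver_stub;

/* Simulated PLT entry: always an indirect jump through the GOT slot. */
static int bar_plt(int x)
{
    assert(bar_got != NULL);
    return bar_got(x);
}

/* First call lands here: bind the symbol, patch the slot, continue. */
static int bar_resolver_stub(int x)
{
    resolve_count++;
    bar_got = real_bar;   /* later calls skip the resolver */
    return bar_got(x);
}
```

Callers always go through bar_plt. The first call pays for the lookup, and every later call costs just one extra indirect jump.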
The actual PLT implementation is somewhat more complex. ELF splits the GOT into two tables, .got and .got.plt: .got holds the addresses of global variable references, and .got.plt holds the addresses of function references. In other words, all references to external functions are separated out into .got.plt. The first three entries of .got.plt are special:
The first entry holds the address of the .dynamic section, which describes this module's dynamic linking information
The second entry holds the ID of this module
The third entry holds the address of _dl_runtime_resolve()
Great loss's blind Kan (i.)