Great loss's blind Kan (i.)

Last Update:2016-06-12 Source: Internet

Author: User

The previous article touched on the PE file format; this one briefly covers PE loading. In a PE file, every section (segment) starts at an address that is an integer multiple of the page size. If a section's length is not a multiple of the page size, it is rounded up to the next multiple when it is mapped. When producing an executable, the linker usually merges sections as much as possible, so a PE file generally has only a few of them, such as the code, data, read-only data, and BSS sections.
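The rounding described above can be sketched in C. This is only an illustration: the helper name align_up is hypothetical, and it assumes the alignment is a power of two (4096 for a typical x86 page).

```c
#include <stdint.h>

/* Round size up to the next multiple of alignment.
   Assumes alignment is a power of two, e.g. 4096 for a typical x86 page. */
uint32_t align_up(uint32_t size, uint32_t alignment)
{
    return (size + alignment - 1) & ~(alignment - 1);
}
```

For example, a section of 0x1234 bytes would occupy 0x2000 bytes once mapped with 4096-byte pages, while a section that is already exactly one page long stays at 0x1000.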

PE terminology includes the concept of a relative virtual address (RVA), which is the in-memory counterpart of an offset within the file: it is an offset relative to the base address at which the PE file is loaded. If a PE file is loaded at virtual address 0x00400000, then RVA 0x1000 corresponds to the virtual address 0x00401000. Every PE file has a preferred load address, the so-called base address (image base). The process for loading a PE executable file is as follows:

Read the first page of the file, which contains the DOS header, the PE file header, and the section table.

Check whether the preferred load address is available in the process address space; if it is occupied, choose another load address.

Use the information in the section table to map every section of the PE file, one by one, to its location in the address space.

If the load address is not the preferred address, perform rebasing.

Load all the DLL files that the PE file requires.

Resolve all the import symbols in the PE file.

Initialize the stack and the heap according to the parameters specified in the PE header.

Create the main thread and start the process.
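The address arithmetic behind the RVA concept and the rebasing step can be sketched as follows. The function names rva_to_va and rebase_address are hypothetical and chosen only for this illustration; a real loader applies the same adjustment to every entry listed in the image's relocation data.

```c
#include <stdint.h>

/* Virtual address of an RVA once the image is mapped at load_base. */
uint32_t rva_to_va(uint32_t load_base, uint32_t rva)
{
    return load_base + rva;
}

/* Adjust an absolute address that the linker computed for preferred_base
   when the image actually ended up at actual_base (rebasing). */
uint32_t rebase_address(uint32_t addr, uint32_t preferred_base,
                        uint32_t actual_base)
{
    return addr - preferred_base + actual_base;
}
```

With the numbers used earlier, rva_to_va(0x00400000, 0x1000) gives 0x00401000. If the image were instead loaded at 0x10000000, an absolute reference to 0x00401000 would have to be patched to 0x10001000.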

In a PE file, the information related to loading is contained in the PE optional header and the section table. The optional header is structured as follows; only the 32-bit version is analyzed here.

typedef struct _IMAGE_DATA_DIRECTORY {
    DWORD VirtualAddress;
    DWORD Size;
} IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;

typedef struct _IMAGE_OPTIONAL_HEADER32 {
    WORD  Magic;
    BYTE  MajorLinkerVersion;
    BYTE  MinorLinkerVersion;
    DWORD SizeOfCode;
    DWORD SizeOfInitializedData;       // length of the initialized data section
    DWORD SizeOfUninitializedData;     // length of the uninitialized data section
    DWORD AddressOfEntryPoint;         // RVA of the first instruction the loader runs
    DWORD BaseOfCode;                  // starting RVA of the code section
    DWORD BaseOfData;                  // starting RVA of the data section
    DWORD ImageBase;                   // preferred load address of the PE file
    DWORD SectionAlignment;            // alignment granularity in memory, typically 4096
    DWORD FileAlignment;               // alignment granularity in the file, typically 512 bytes
    WORD  MajorOperatingSystemVersion;
    WORD  MinorOperatingSystemVersion;
    WORD  MajorImageVersion;
    WORD  MinorImageVersion;
    WORD  MajorSubsystemVersion;       // subsystem version required to run the program
    WORD  MinorSubsystemVersion;
    DWORD Win32VersionValue;
    DWORD SizeOfImage;                 // size of the entire PE image in memory
    DWORD SizeOfHeaders;               // size of all headers plus the section table, i.e. the
                                       // file size minus the size of all sections in the file
    DWORD CheckSum;
    WORD  Subsystem;                   // subsystem the PE file targets, e.g. GUI or CUI (console)
    WORD  DllCharacteristics;
    DWORD SizeOfStackReserve;
    DWORD SizeOfStackCommit;
    DWORD SizeOfHeapReserve;
    DWORD SizeOfHeapCommit;
    DWORD LoaderFlags;
    DWORD NumberOfRvaAndSizes;
    IMAGE_DATA_DIRECTORY DataDirectory[16];
} IMAGE_OPTIONAL_HEADER32, *PIMAGE_OPTIONAL_HEADER32;
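As a rough sketch of how a loader might reach these fields, the following C function pulls ImageBase out of a PE32 image held in a memory buffer. The function name pe32_image_base and the helper readers are hypothetical. The offsets are assumptions taken from the standard layout: the e_lfanew field at offset 0x3c of the DOS header, a 4-byte "PE\0\0" signature, a 20-byte COFF file header, a Magic value of 0x10b for PE32, and ImageBase 28 bytes into the optional header.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read little-endian integers without relying on host alignment or endianness. */
static uint16_t read_u16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_u32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 and stores ImageBase on success, -1 if buf is not a PE32 image. */
int pe32_image_base(const uint8_t *buf, size_t len, uint32_t *image_base)
{
    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z')
        return -1;                          /* no DOS header */

    uint32_t pe_off = read_u32(buf + 0x3c); /* e_lfanew: offset of the PE signature */

    /* Need the signature (4), the COFF file header (20), and the first
       32 bytes of the optional header (up to and including ImageBase). */
    if (pe_off > len || len - pe_off < 4 + 20 + 32)
        return -1;
    if (memcmp(buf + pe_off, "PE\0\0", 4) != 0)
        return -1;

    const uint8_t *opt = buf + pe_off + 4 + 20;
    if (read_u16(opt) != 0x10b)             /* Magic: 0x10b means PE32 */
        return -1;

    *image_base = read_u32(opt + 28);       /* ImageBase */
    return 0;
}
```

Reading the fields byte by byte avoids depending on struct packing or on the endianness of the host, which matters if the tool inspecting the file is not itself running on x86 Windows.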

About Dynamic Linking

The simple way to solve the problems of wasted space and difficult updates is to separate the program's modules from one another so that they form independent files, rather than linking them together statically.

Simply put, the object files that make up the program are not linked until the program is about to run; in other words, linking is postponed until run time. This is dynamic linking.

The idea of dynamic linking is to split the program into relatively independent modules and link them together into a complete program at run time, rather than linking all the modules into a single executable file as static linking does. In other words, dynamic linking postpones the linking step from before the program is loaded until load time.

With static linking, the whole program ends up as a single, indivisible executable file. With dynamic linking, a program is split into several files: the main part of the program, which is the executable file, and the shared objects it depends on. These are often all referred to as modules.

/* program1.c */
#include "lib.h"

int main ()
{
    Show (1);
    return 0;
}

/* program2.c */
#include "lib.h"

int main ()
{
    Show (2);
    return 0;
}

/* lib.c */
#include <stdio.h>

void Show (int i)
{
    printf ("Printing from lib.so %d\n", i);
}

/* lib.h */
#ifndef LIB_H
#define LIB_H

void Show (int i);

#endif

Here is the layout of the process's virtual address space at run time:

$ cat /proc/12985/maps
08048000-08049000 r-xp 00000000 08:01 1343422 ./program1
08049000-0804a000 rwxp 00000000 08:01 1343432 ./program1
b7e83000-b7e84000 rwxp b7e83000 00:00 0
b7e84000-b7fc8000 r-xp 00000000 08:01 1488993 /lib/tls/i686/cmov/libc-2.6.1.so
b7fc8000-b7fc9000 r-xp 00143000 08:01 1488993 /lib/tls/i686/cmov/libc-2.6.1.so
b7fc9000-b7fce000 r-xp 00144000 08:01 1488993 /lib/tls/i686/cmov/libc-2.6.1.so
b7fcb000-b7fce000 rwxp b7fcb000 00:00 0
b7fd8000-b7fd9000 rwxp b7fd8000 00:00 0
b7fd9000-b7fda000 r-xp 00000000 08:01 1343290 ./lib.so
b7fda000-b7fdb000 rwxp 00000000 08:01 1343290 ./lib.so
b7fdb000-b7fdd000 rwxp b7fdb000 00:00 0
b7fdd000-b7ff7000 r-xp 00000000 08:01 1455332 /lib/ld-2.6.1.so
b7ff7000-b7ff9000 rwxp 00019000 08:01 1455332 /lib/ld-2.6.1.so
bf965000-bf97b000 rw-p bf965000 00:00 0 [stack]
ffffe000-fffff000 r-xp 00000000 00:00 0 [vdso]

You can see that the process's virtual address space contains mappings for several more files. lib.so, like program1, is mapped into the process's virtual address space by the operating system in the same way.

ld-2.6.1.so is the dynamic linker on Linux. It is mapped into the process's address space just like an ordinary shared object. Before the system starts running program1, it first hands control to the dynamic linker, which completes all the dynamic linking work and then hands control to program1, which then begins executing.

The final load address of a shared object is not determined at compile time. Instead, at load time the loader dynamically allocates a sufficiently large region of virtual address space to the shared object, based on what is free in the address space at that moment.

The instructions and data of a program module may contain absolute address references, so when the linker produces the output file it has to assume a target address at which the module will be loaded. A shared object, however, cannot assume its position in the process's virtual address space at compile time. An executable, by contrast, can generally fix its starting position in the process's virtual address space, because it is usually the first file to be loaded and can pick a fixed free address. Relocation performed at link time is called link-time relocation; when the addresses in a module instead have to be fixed up as it is loaded, we call it load-time relocation. On Windows this is also known as rebasing.

Load-time relocation resolves the absolute address references in a dynamic module, but it prevents the instruction part from being shared among multiple processes. We want the shared instruction part of a program module to stay the same regardless of the load address. To achieve that, the parts of the instructions that would need to be modified are separated out and placed with the data. The instruction part can then remain unchanged, while the data part has its own copy in each process. This is known as position-independent code (PIC), also called address-independent code.

Generating position-independent code is not actually difficult. First, the address references in a shared object are divided into two categories according to whether they cross modules: references inside the module and references outside the module. They can also be divided by the kind of reference: instruction references and data accesses. That gives four cases:

Function calls, jumps, and so on inside the module.

Data access inside the module, such as global and static variables defined in the module.

Function calls, jumps, and so on outside the module.

Data access outside the module, such as global variables defined in other modules.

static int a;
extern int b;
extern void ext ();

void bar ()
{
    a = 1;  // data access inside the module
    b = 2;  // data access outside the module
}

void foo ()
{
    bar ();  // function call inside the module
    ext ();  // function call outside the module
}

When the compiler compiles this file, it actually cannot tell whether the variable b and the function ext() are external to the module or internal to it, because they might be defined in another object file of the same shared object. Since it cannot be sure, the compiler can only treat them as functions and variables external to the module. The MSVC compiler provides the __declspec(dllimport) extension to indicate whether a symbol is internal or external to the module.

In the first case, the called function and the caller are in the same module, so their relative position is fixed. Jumps and function calls inside the module can therefore use relative addressing or register-based relative calls, and these instructions need no relocation.

08048344 <bar>:
 8048344:  55                push   %ebp
 8048345:  89 e5             mov    %esp,%ebp
 8048347:  5d                pop    %ebp
 8048348:  c3                ret

08048349 <foo>:
 ......
 8048357:  e8 e8 ff ff ff    call   8048344 <bar>
 804835c:  b8 00 00 00 00    mov    $0x0,%eax
 ......

The call to bar in foo is a relative-address call instruction. The last 4 bytes of the instruction hold the offset of the target address relative to the instruction that follows the call, namely 0xffffffe8. 0xffffffe8 is the two's complement representation of -24, so bar's address is 0x804835c - 24 = 0x8048344. As long as the relative position of bar and foo does not change, this instruction is position-independent. The same relative addressing also works for jmp instructions.
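The target calculation for a relative call can be checked with a few lines of C. The helper name call_target is hypothetical. It relies on the fact that unsigned 32-bit addition wraps modulo 2^32, so adding 0xffffffe8 has the same effect as subtracting 24.

```c
#include <stdint.h>

/* Target of an x86 "call rel32": the address of the instruction after the
   call plus the 32-bit displacement stored in the call instruction.
   The addition wraps modulo 2^32, so a displacement such as 0xffffffe8
   behaves as -24. */
uint32_t call_target(uint32_t next_insn_addr, uint32_t rel32)
{
    return next_insn_addr + rel32;
}
```

Passing the address of the instruction after the call (0x804835c) and the displacement 0xffffffe8 yields 0x8048344, the address of bar. Moving both functions by the same amount changes neither the displacement nor the encoded instruction.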

Next is data access inside the module. Clearly, an instruction must not contain the absolute address of the data, so the only option is relative addressing. A module generally consists of several pages of code followed by several pages of data, and the relative position of these pages is fixed. The relative position between any instruction and the module-internal data it accesses is therefore fixed as well, so the data can be reached by adding a fixed offset to the address of the current instruction.

0000044c <bar>:
 44c:  55                       push   %ebp
 44d:  89 e5                    mov    %esp,%ebp
 44f:  e8 40 00 00 00           call   494 <__i686.get_pc_thunk.cx>
 454:  81 c1 8c 11 00 00        add    $0x118c,%ecx
 45a:  c7 81 28 00 00 00 01     movl   $0x1,0x28(%ecx)
 461:  00 00 00
 464:  8b 81 f8 ff ff ff        mov    0xfffffff8(%ecx),%eax
 46a:  c7 00 02 00 00 00        movl   $0x2,(%eax)
 470:  5d                       pop    %ebp
 471:  c3                       ret

00000494 <__i686.get_pc_thunk.cx>:
 494:  8b 0c 24                 mov    (%esp),%ecx
 497:  c3                       ret

When the processor executes a call instruction, it pushes the address of the next instruction onto the stack, and the esp register always points to the top of the stack. So when __i686.get_pc_thunk.cx executes mov (%esp),%ecx, the return address is loaded into the ecx register.

An add and a mov follow. You can see that the address of variable a is the address of the add instruction (held in ecx) plus the two offsets 0x118c and 0x28. That is, if the module is loaded at 0x10000000, the actual address of variable a is 0x10000000 + 0x454 + 0x118c + 0x28 = 0x10001608.

0x00000000  +-----------------------------------------------------------------
            |
0x10000000  +-----------------------------------------------------------------
            | .text
            |   44f: e8 40 00 00 00          call 494 <__i686.get_pc_thunk.cx>
       +--- |   454: 81 c1 8c 11 00 00       add  $0x118c,%ecx
       |    |   45a: c7 81 28 00 00 00 01    movl $0x1,0x28(%ecx)
       |    |   461: 00 00 00
0x118c |    |
+ 0x28 |    +-----------------------------------------------------------------
       |    | .data
       +--> |   static int a;
            +-----------------------------------------------------------------
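The address computation for the module-internal variable can be written out step by step in C. This is only a sketch that mirrors the three instructions involved; the function name pic_variable_address and its parameter names are hypothetical.

```c
#include <stdint.h>

/* Follow the three steps the PIC code takes to reach a module-internal
   variable, given where the module was loaded. */
uint32_t pic_variable_address(uint32_t load_base, uint32_t return_addr_offset,
                              uint32_t add_imm, uint32_t disp)
{
    /* get_pc_thunk loads the return address, i.e. the address of the
       instruction that follows the call, into ecx. */
    uint32_t ecx = load_base + return_addr_offset;

    ecx += add_imm;     /* add $0x118c,%ecx */
    return ecx + disp;  /* effective address of 0x28(%ecx) */
}
```

With a load base of 0x10000000, a return address offset of 0x454, and the constants 0x118c and 0x28, the result is 0x10001608. Only the load base changes from process to process, so the instructions themselves never need patching.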

Next is data access between modules. Variables such as b are defined in other modules, and their addresses are not known until load time. To keep the code position-independent, the basic idea is to move the address-dependent part into the data segment. The addresses of global variables in other modules obviously depend on where those modules are loaded, so ELF builds an array of pointers to these variables in the data segment, known as the global offset table (GOT). When code needs to reference such a global variable, it does so indirectly through the corresponding GOT entry. When an instruction needs to access variable b, the program first locates the GOT and then reads the variable's target address from the entry for that variable; each variable has a 4-byte address entry. When loading the module, the dynamic linker looks up the address of each variable and fills in each GOT entry so that every pointer points to the correct address. Because the GOT itself resides in the data segment, it can be modified when the module is loaded, and each process can have its own copy.

A module can determine at compile time the offset of its internal variables relative to the current instruction, so it can likewise determine at compile time the offset of the GOT relative to the current instruction, which locates the GOT. The address of the variable is then obtained from the variable's entry at a known offset within the GOT.
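The indirection through the GOT can be imitated in plain C. This is a conceptual model only: the array got, the function fill_got, and the one-entry table size are hypothetical stand-ins, and a real GOT is created by the linker and filled by the dynamic linker rather than by program code.

```c
/* Stand-in for a variable that is defined in another module. */
int b;

/* A one-entry "GOT" living in the data segment. The code never embeds the
   address of b; it only knows where this table is. */
static int *got[1];

/* Plays the role of the dynamic linker at load time: look up the address
   of each variable and store it in the matching entry. */
void fill_got(void)
{
    got[0] = &b;
}

/* What "b = 2" amounts to in position-independent code: load the address
   from the GOT entry, then store through it. */
void bar(void)
{
    int *addr = got[0];
    *addr = 2;
}
```

After fill_got() has run, calling bar() sets b to 2. Because only the table is written at load time, the code of bar stays identical in every process that maps the module.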

But what about global variables defined inside a module? For example, a shared object defines a global variable global, and module.c references it like this:

extern int global;

int foo ()
{
    global = 1;
    return 0;
}

When the compiler compiles module.c, it cannot tell whether global is defined in another object file of the same module or in a different shared object; that is, it cannot tell whether the access crosses modules, and so it cannot decide whether to go through the GOT or to refer to a copy of the variable in the executable's .bss section. The solution is to make every instruction that uses the variable refer to the copy in the executable file. When an ELF shared library is compiled, global variables defined inside the module are by default treated like global variables defined in other modules, and are accessed through the GOT. When the shared module is loaded, if a global variable has a copy in the executable, the dynamic linker makes the corresponding GOT entry point to that copy, so that the variable ends up with only one instance at run time. If the variable is initialized in the shared module, the dynamic linker also has to copy the initial value to the copy in the program's main module. If the global variable has no copy in the program's main module, the GOT entry points to the copy of the variable inside the module.

If a global variable g is defined in lib.so, and process A and process B both use lib.so, is process B affected when process A changes g?

No. When lib.so is loaded by two processes, its data segment has a separate copy in each process. In this respect a global variable in a shared object is no different from a global variable defined in the program itself: each process accesses only its own copy and does not affect other processes. If A and B are threads of the same process, however, they do affect each other, because they share one copy. For per-thread data there is a dedicated mechanism called thread-local storage (TLS), also known as thread-private storage.

At this point we should compare static and dynamic linking. The main reason dynamic linking is slower than static linking is that, under dynamic linking, every access to global and static data has to locate the GOT and then address the data indirectly, and every call between modules likewise has to locate the GOT first and then jump indirectly. The speed of the program inevitably suffers as a result. A second reason is that the linking work is done at run time: when the program starts, the dynamic linker has to load the required shared objects and then perform symbol lookup, address relocation, and so on. As for this second cost, many of the functions in a shared object are never used during a given run, so binding all of them up front is wasteful. Lazy binding addresses this: the basic idea is to bind a function (symbol lookup, relocation, and so on) only when it is first used.

ELF implements lazy binding with the PLT (Procedure Linkage Table). Suppose liba.so needs to call the function bar in libc.so. The first time liba.so calls bar, the dynamic linker has to perform the address binding. Assume a function lookup() finds bar's address; lookup needs to know which module the binding takes place in and which function is being bound, so its prototype is lookup(module, function). A call to a function in an external module normally jumps indirectly through the corresponding GOT entry. To implement lazy binding, the PLT adds one more layer of indirection. Each external function has a corresponding entry in the PLT; for example, the entry for bar is called bar@plt. It is implemented as follows:

bar@plt:
jmp *(bar@GOT)
push n
push moduleID
jump _dl_runtime_resolve

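The first instruction is an indirect jump through bar's GOT entry. To implement lazy binding, the linker does not fill in bar's address in that entry at initialization; instead it stores the address of the second instruction, push n. The effect of the first instruction is therefore to jump to the second instruction. The second instruction pushes n onto the stack, where n is the index of the bar symbol's reference in the relocation table .rel.plt. Next, the module ID is pushed onto the stack, and then control jumps to _dl_runtime_resolve. Once bar has been resolved, its real address is written into the GOT entry, so later calls jump straight to bar.

This amounts to a call to lookup(module, function).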
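Lazy binding can be modeled in C with a function pointer that initially points at a resolver stub. Everything here is a hypothetical stand-in: bar_slot plays the role of bar's GOT entry, bar_stub plays the role of the push/jump sequence, lookup stands for the dynamic linker's symbol search, and real_bar stands for the function in the other shared object.

```c
typedef int (*func_t)(int);

/* Stand-in for the external function that lives in another shared object. */
static int real_bar(int x) { return x + 1; }

static int resolve_count = 0;   /* how many times the resolver has run */
static int bar_stub(int x);     /* slow path taken on the first call */

/* Plays the role of bar's GOT entry: it starts out pointing at the stub,
   not at the real function. */
static func_t bar_slot = bar_stub;

/* Stand-in for lookup(module, function): find the real address. */
static func_t lookup(void)
{
    resolve_count++;
    return real_bar;
}

/* First call lands here: bind the symbol, patch the slot, then continue
   into the real function. */
static int bar_stub(int x)
{
    bar_slot = lookup();
    return bar_slot(x);
}

/* Callers always jump through the slot, like "jmp *(bar@GOT)". */
int call_bar(int x)
{
    return bar_slot(x);
}

/* Lets a caller observe that binding happens only once. */
int resolver_runs(void)
{
    return resolve_count;
}
```

The first call_bar invocation goes through the stub and runs the lookup; every later call jumps directly to real_bar because the slot has been overwritten. Functions that are never called never pay the lookup cost.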

The actual PLT implementation is somewhat more complex. ELF splits the GOT into .got and .got.plt: .got holds the addresses of global variable references, while .got.plt holds the addresses of function references; that is, all references to external functions are separated out into .got.plt. The first three entries of .got.plt have special meaning:

The first entry holds the address of the .dynamic section, which describes this module's dynamic linking information.

The second entry holds the ID of this module.

The third entry holds the address of _dl_runtime_resolve.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
