Managed PE files


By Xuan Hun

    Intermediate Language

In the .NET Framework, the Common Language Infrastructure uses the CLS (Common Language Specification) to bind different languages together. By requiring each language to implement at least the part of the CTS (Common Type System) contained in the CLS, the infrastructure lets different languages target the .NET Framework. As a result, every .NET language (C#, VB.NET, Eiffel.NET) is ultimately compiled to one common language: Microsoft Intermediate Language (MSIL, hereafter IL).

IL is an intermediate language that sits between high-level languages and Intel-style assembly language; it is effectively the assembly language of the .NET platform. When you compile a .NET program, the compiler translates the source code into a set of CPU-independent instructions that can be efficiently converted to native code. When these instructions are executed, a just-in-time (JIT) compiler converts them into CPU-specific code. Because the CLR supports multiple JIT compilers, the same IL code can be JIT-compiled and run on different architectures.

IL includes instructions for loading, storing, and initializing objects and for calling methods on them, as well as instructions for arithmetic and logical operations, control flow, direct memory access, exception handling, and other operations. Before the code can run, the IL must be converted to CPU-specific code, usually by a JIT compiler. Because the CLR provides one or more JIT compilers for each computer architecture it supports, the same set of IL can be JIT-compiled and run on any supported architecture.

When a compiler generates IL, it also generates metadata. Metadata describes the types in the code, including the definition of each type, the signatures of each type's members, the members that the code references, and other data that the runtime uses at execution time. The IL and metadata are contained in a portable executable (PE) file. The following sections describe managed PE files and metadata.

1.3.1 Managed PE files

PE (Portable Executable) is the program file format of the Microsoft Windows operating system; common examples are EXE, DLL, OCX, SYS, and COM files. Figure 1-3 shows the standard PE/COFF file header format.

 

Figure 1-3 Standard PE/COFF file header format

The MS-DOS header is a legacy of the DOS system; it allows the file to be recognized as an application in a DOS environment. The MS-DOS stub is a small piece of code: if the Windows program is run under DOS, it displays the message "This program cannot be run in DOS mode". At offset 0x3C, the MS-DOS header holds the file offset of the PE signature.

The PE signature identifies the file as a PE file. Its value is always 0x00004550, where 0x45 is the character E and 0x50 is the character P.
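The lookup described above can be sketched in C as follows. This is a minimal illustration, not a full PE parser: it assumes the whole file has already been read into a byte buffer, and the helper names `read_le32` and `pe_signature_offset` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit value from a byte buffer. */
uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Returns the file offset of the PE signature, or -1 if the buffer does
 * not look like a PE file. (Hypothetical helper for illustration.)
 */
long pe_signature_offset(const uint8_t *file, size_t size)
{
    /* The MS-DOS header starts with the magic 0x5A4D, i.e. "MZ". */
    if (size < 0x40 || file[0] != 'M' || file[1] != 'Z')
        return -1;

    /* The offset of the PE signature is stored at offset 0x3C. */
    uint32_t off = read_le32(file + 0x3C);
    if (off > size || size - off < 4)
        return -1;  /* the signature would lie outside the file */

    /* 0x00004550 stored little-endian is the byte sequence 'P','E',0,0. */
    if (read_le32(file + off) != 0x00004550u)
        return -1;

    return (long)off;
}
```

For a file whose MS-DOS header stores 0x80 at offset 0x3C and whose bytes at 0x80 are `P`, `E`, 0, 0, the function returns 0x80.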

The COFF header provides the most general information about a COFF object or executable file.

The PE header provides the information the operating system needs to load the file. It is the most important part of a PE file and includes the data directory table and the section table.

For more information about standard PE files, consult the relevant references. The CLR extends the traditional PE file. The format of a managed PE file is shown in Figure 1-4.

 

Figure 1-4 Format of a managed PE file

The PE32 or PE32+ header is the standard Windows PE file header, which is similar to the COFF (Common Object File Format) header. If the header uses the PE32 format, the file can run on a 32-bit or 64-bit operating system. If the header uses the PE32+ format, the file can run only on a 64-bit operating system. The PE32 or PE32+ header also records the file type: GUI, CUI, or DLL. If the module contains native CPU code, the PE32 or PE32+ header contains information about that native code.
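As a rough illustration of how these header fields are read, the sketch below classifies a file from three values: the Magic field of the optional header, the Characteristics field of the COFF header, and the Subsystem field of the optional header. The numeric values are those of the PE/COFF format; the short macro names, the enums, and the functions `pe_format` and `pe_kind` are assumptions made for this example.

```c
#include <stdint.h>

/* Values from the PE/COFF format; the short names are local to this sketch. */
#define OPT_MAGIC_PE32      0x010Bu  /* optional header Magic for PE32      */
#define OPT_MAGIC_PE32PLUS  0x020Bu  /* optional header Magic for PE32+     */
#define FILE_FLAG_DLL       0x2000u  /* COFF Characteristics: file is a DLL */
#define SUBSYSTEM_GUI       2u       /* Windows graphical application       */
#define SUBSYSTEM_CUI       3u       /* Windows console application         */

typedef enum { FORMAT_UNKNOWN, FORMAT_PE32, FORMAT_PE32PLUS } pe_format_t;
typedef enum { KIND_OTHER, KIND_GUI, KIND_CUI, KIND_DLL } pe_kind_t;

/* Map the optional header Magic value to the header format. */
pe_format_t pe_format(uint16_t optional_magic)
{
    switch (optional_magic) {
    case OPT_MAGIC_PE32:     return FORMAT_PE32;
    case OPT_MAGIC_PE32PLUS: return FORMAT_PE32PLUS;
    default:                 return FORMAT_UNKNOWN;
    }
}

/* Derive the file type from COFF Characteristics and the Subsystem field. */
pe_kind_t pe_kind(uint16_t characteristics, uint16_t subsystem)
{
    if (characteristics & FILE_FLAG_DLL) return KIND_DLL;
    if (subsystem == SUBSYSTEM_GUI)      return KIND_GUI;
    if (subsystem == SUBSYSTEM_CUI)      return KIND_CUI;
    return KIND_OTHER;
}
```

A Magic value of 0x010b, as in the HelloWorld.exe header listing in this article, maps to `FORMAT_PE32`.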

The CLR header contains the information that makes the module a managed module. This includes the version of the CLR required, a number of flags, the metadata token of the entry point method, the location and size of the module's metadata, resource information, the strong name, and other information.

Each managed module contains metadata tables. There are two main kinds of metadata table: tables that describe the types and members defined in the source code, and tables that describe the types and members referenced by the source code.

The IL code is the intermediate code generated by the compiler. At run time, the CLR compiles the IL into native code and executes it.

The CLR header is defined in CorHdr.h in the .NET Framework, as shown in Code List 1-4.

Code List 1-4 CLR header definition

typedef struct IMAGE_COR20_HEADER
{
    ULONG cb;
    USHORT MajorRuntimeVersion;
    USHORT MinorRuntimeVersion;

    // Symbol table and startup information
    IMAGE_DATA_DIRECTORY MetaData;
    ULONG Flags;
    union {
        DWORD EntryPointToken;
        DWORD EntryPointRVA;
    };

    // Binding information
    IMAGE_DATA_DIRECTORY Resources;
    IMAGE_DATA_DIRECTORY StrongNameSignature;

    // Regular fixup and binding information
    IMAGE_DATA_DIRECTORY CodeManagerTable;
    IMAGE_DATA_DIRECTORY VTableFixups;
    IMAGE_DATA_DIRECTORY ExportAddressTableJumps;
    IMAGE_DATA_DIRECTORY ManagedNativeHeader;
} IMAGE_COR20_HEADER;

 

The fields of the CLR header are described in Table 1-1. The section information in the PE file is briefly introduced later. For details about the PE file format, see the reference books listed in the appendix.

Table 1-1 CLR header field description

| Offset | Size | Field | Description |
|--------|------|-------|-------------|
| 0 | 4 | cb | Header length, in bytes |
| 4 | 2 | MajorRuntimeVersion | Major number of the CLR version required to run the program |
| 6 | 2 | MinorRuntimeVersion | Minor number of the CLR version required to run the program |
| 8 | 8 | MetaData | Relative virtual address (RVA) and size of the metadata |
| 16 | 4 | Flags | Combination of binary flags, including system-related and program call-related information |
| 20 | 4 | EntryPointToken/EntryPointRVA | Metadata token of the file's entry point. It can be set to 0 for DLL files. |
| 24 | 8 | Resources | RVA and size of the managed resources |
| 32 | 8 | StrongNameSignature | RVA and size of the hash data of the current PE file, used by the loader for binding and version verification |
| 40 | 8 | CodeManagerTable | RVA and size of the code manager table. Currently reserved and set to 0. |
| 48 | 8 | VTableFixups | RVA and size of the array of V-table fixups |
| 56 | 8 | ExportAddressTableJumps | RVA and size of the export address table jumps, used for C++. In most cases the value is 0. |
| 64 | 8 | ManagedNativeHeader | Reserved for native images; set to 0 |
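The offsets in Table 1-1 can be checked mechanically. The sketch below mirrors the CLR header with fixed-width types so that it compiles without the Windows headers; the type names `data_directory_t` and `cor20_header_t` and the helper `cor20_size_matches` are stand-ins invented for this example, and the compile-time assertions assume a C11 compiler.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for IMAGE_DATA_DIRECTORY: an RVA followed by a size. */
typedef struct {
    uint32_t VirtualAddress;
    uint32_t Size;
} data_directory_t;

/* Stand-in for IMAGE_COR20_HEADER, using fixed-width types. */
typedef struct {
    uint32_t cb;
    uint16_t MajorRuntimeVersion;
    uint16_t MinorRuntimeVersion;
    data_directory_t MetaData;
    uint32_t Flags;
    union {
        uint32_t EntryPointToken;
        uint32_t EntryPointRVA;
    };
    data_directory_t Resources;
    data_directory_t StrongNameSignature;
    data_directory_t CodeManagerTable;
    data_directory_t VTableFixups;
    data_directory_t ExportAddressTableJumps;
    data_directory_t ManagedNativeHeader;
} cor20_header_t;

/* Compile-time checks against the offsets listed in Table 1-1. */
_Static_assert(offsetof(cor20_header_t, MetaData) == 8, "MetaData");
_Static_assert(offsetof(cor20_header_t, Flags) == 16, "Flags");
_Static_assert(offsetof(cor20_header_t, EntryPointToken) == 20, "EntryPoint");
_Static_assert(offsetof(cor20_header_t, Resources) == 24, "Resources");
_Static_assert(offsetof(cor20_header_t, VTableFixups) == 48, "VTableFixups");
_Static_assert(offsetof(cor20_header_t, ManagedNativeHeader) == 64, "Native");
_Static_assert(sizeof(cor20_header_t) == 0x48, "total header size");

/* Basic sanity check: cb must equal the size of the structure. */
int cor20_size_matches(const cor20_header_t *h)
{
    return h->cb == sizeof(cor20_header_t);
}
```

The total of 0x48 bytes agrees with the "Header size: 0x00000048" value that ILDasm reports for HelloWorld.exe in this article.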

       

The following shows the header information of HelloWorld.exe as displayed by ILDasm. Open the file in ILDasm and choose View > Headers, as shown in Figure 1-5.

 

Figure 1-5 View File Header Information

The main header information is shown in Code List 1-5.

Code List 1-5 HelloWorld.exe header information

----- DOS Header:
Magic: 0x5a4d
Bytes on last page: 0x0090
...... (omitted)
File addr. of COFF header: 0x0080
----- COFF/PE Headers:
Signature: 0x00004550
----- COFF Header:
Machine: 0x014c
Number of sections: 0x0003
Time-date stamp: ......
Pointer to symbol table: 0x00000000
Number of symbols: 0x00000000
Size of optional header: 0x00e0
Characteristics: 0x0102
----- PE Optional Header (32 bit):
Magic: 0x010b
...... (omitted)
Directory: ...... (omitted) Table:
0x00000000 [0x00000000] address [size] of Delay Load IAT:
0x00002008 [0x00000048] address [size] of CLR Header:
...... (section information, omitted)
Base Relocation Table
0x00002000 Page RVA
0x0000000c Block Size
0x00000002 Number of Entries
Entry 1: Type 0x3 Offset 0x000007a0
Entry 2: Type 0x0 Offset 0x00000000
Import Address Table
DLL: mscoree.dll
...... (omitted)
Delay Load Import Address Table
// No data.
Entry point code:
FF 25 00 20 40 00
----- CLR Header:
Header size: 0x00000048
Major runtime version: 0x0002
Minor runtime version: 0x0005
...... (omitted)
Metadata Header
Storage Signature: ...... (omitted)
Storage Header:
0x00 Flags
0x0005 Number of Streams
Stream 1:
0x0000006c Offset
0x000001e8 Size
'#~' Name
...... (omitted)
Stream 5:
0x00000510 Offset
0x00000130 Size
'#Blob' Name
Metadata Stream Header:
0x00000000 Reserved
0x02 Major
0x00 Minor
0x00 Heaps
0x01 Rid
0x0000000900001547 MaskValid
0x000016003325fa00 Sorted
Code Manager Table:
default
Export Address Table Jumps:
// No data.

 

The listing above touches on a good deal of section information, which is briefly discussed below.

1. Relocation section

The .reloc section of the image file contains the fixup table, which holds all the fixups (relocation entries) in the image file. The RVA and size of the .reloc section are given by the Base Relocation Table directory entry of the PE header. The fixup table consists of fixup blocks, each of which holds the fixups for one 4 KB page. The blocks are 4-byte aligned.

Each fixup describes the location of a specific address in the image file and how the operating system loader should modify that address when it loads the image file into memory.

Each fixup block starts with two 4-byte unsigned integers: the RVA of the page that contains the addresses to be fixed up, and the size of the block. These are followed by the fixup entries for the page, each 16 bits wide. The four most significant bits hold the relocation type; the remaining 12 bits hold the offset, within the page, of the address to be relocated.

To relocate an address, the operating system loader calculates the difference (delta) between the preferred base address (the ImageBase field in the PE header) and the base address at which the image file is actually loaded. The delta is then applied to the address according to the relocation type. If the image file is loaded at its preferred base address, no fixups are needed.
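The block layout and the delta step can be sketched as below. This is an illustrative sketch under several assumptions: the image is already laid out in memory so that it can be indexed by RVA, only relocation type 3 (a full 32-bit adjustment, IMAGE_REL_BASED_HIGHLOW in the PE format) and type 0 (a padding entry that is skipped) are handled, and the function and macro names are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define REL_BASED_ABSOLUTE 0  /* padding entry, nothing to patch    */
#define REL_BASED_HIGHLOW  3  /* add the 32-bit delta to the target */

uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

void wr32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/*
 * Apply one fixup block to an image indexed by RVA.
 * delta = actual load base - preferred base (ImageBase).
 * Returns the number of addresses patched, or -1 on malformed input.
 */
int apply_fixup_block(uint8_t *image, size_t image_size,
                      const uint8_t *block, size_t block_avail,
                      uint32_t delta)
{
    if (block_avail < 8)
        return -1;
    uint32_t page_rva   = rd32(block);      /* first 4-byte integer  */
    uint32_t block_size = rd32(block + 4);  /* second 4-byte integer */
    if (block_size < 8 || block_size > block_avail)
        return -1;

    int patched = 0;
    size_t count = (block_size - 8) / 2;    /* 16-bit entries follow */
    for (size_t i = 0; i < count; i++) {
        uint16_t entry  = rd16(block + 8 + 2 * i);
        unsigned type   = entry >> 12;      /* top 4 bits            */
        unsigned offset = entry & 0x0FFFu;  /* low 12 bits           */

        if (type == REL_BASED_ABSOLUTE)
            continue;
        if (type != REL_BASED_HIGHLOW)
            return -1;                      /* not handled in this sketch */

        size_t at = (size_t)page_rva + offset;
        if (at > image_size || image_size - at < 4)
            return -1;
        wr32(image + at, rd32(image + at) + delta);
        patched++;
    }
    return patched;
}
```

With the values from the HelloWorld.exe listing (page RVA 0x2000, block size 0xc, one entry of type 0x3 at offset 0x7a0, one padding entry), the function patches the single 32-bit value at RVA 0x27a0.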

Note that CLR-aware operating systems, that is, Windows XP and later, need neither the CLR startup stub nor the IAT to invoke the CLR. Therefore, if the CLR header flags indicate that the image file is pure IL (COMIMAGE_FLAGS_ILONLY), the operating system ignores the .reloc section entirely.

2. Text section

The .text section of a PE file is read-only. In a managed PE file it contains the metadata tables, the IL code, the import tables, the CLR header, and the unmanaged CLR startup stub. In image files generated by the IL assembler, this section also contains the managed resources, the strong name signature hash, the debug data, and the unmanaged export stubs. The .text section is therefore where managed PE files differ most from traditional PE files.

Figure 1-6 summarizes the general structure of the .text section in image files generated by the IL assembler.

 

Figure 1-6 General structure of the .text section

3. Data section

The data section (.sdata) of an image file generated by the IL assembler is a read/write section. It contains data constants, V-tables, the unmanaged export table, and the TLS directory structure. Data declared as thread-specific resides in a different section, the .tls section.

4. Data constants

A data constant is the target of a static field mapping and usually contains the initialization data of the mapped field.

Field mapping is a way of initializing a static field with an ANSI string, a blob, or a structure. The other way to initialize static fields (the more orthodox one from the CLR's point of view) is to initialize them explicitly in the class constructor.

On the one hand, fields mapped to a data section are not covered by CLR control mechanisms such as type control and garbage collection; on the other hand, they are completely open and can be accessed and modified without restriction. For this reason the loader prevents certain field types from being mapped: the type of a mapped field cannot contain object references, vectors, arrays, or any non-public substructure. Initializing static fields in the class constructor avoids this problem.

5. V-tables

In pure-IL managed modules, V-tables are used to expose managed methods so that unmanaged code can call them. A V-table consists of entries, each of which is made up of one or more slots. The entries and slots of the V-tables are defined by the V-table fixups. Each fixup specifies the number and width (4 bytes or 8 bytes) of the slots in an entry. Each slot of a V-table contains the metadata token of the corresponding method. At run time these tokens are replaced with the address of the method, or of a thunk that provides an unmanaged entry point to the method. Because these fixups are applied at run time, the V-tables of a managed PE file must reside in a read/write section. The IL assembler places the V-tables in the .sdata section, unlike the VTFixup table, which resides in the .text section.

The V-tables of an unmanaged image file are fully defined at link time and need only the base relocation applied by the operating system loader. Because they do not need to be changed at execution time (unlike in a managed image, where method tokens are replaced with addresses), the V-tables of unmanaged image files can reside in a read-only section.
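A V-table fixup can be pictured as a small record. The sketch below assumes the layout given in the CLI specification (ECMA-335): a 4-byte RVA, a 2-byte slot count, and a 2-byte type whose flag 0x01 marks 32-bit slots and 0x02 marks 64-bit slots. The type name `vtable_fixup_t`, the macro names, and the two helper functions are invented for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Slot-width flags of a V-table fixup; names are local to this sketch. */
#define VTABLE_32BIT 0x01u  /* slots are 4 bytes wide */
#define VTABLE_64BIT 0x02u  /* slots are 8 bytes wide */

/* One V-table fixup: where the entry lives and how its slots are shaped. */
typedef struct {
    uint32_t rva;    /* RVA of the V-table entry (in a read/write section) */
    uint16_t count;  /* number of slots in the entry                       */
    uint16_t type;   /* flags, including the slot width                    */
} vtable_fixup_t;

/* Width in bytes of each slot, or 0 if no width flag is set. */
unsigned vtable_slot_width(uint16_t type)
{
    if (type & VTABLE_32BIT) return 4;
    if (type & VTABLE_64BIT) return 8;
    return 0;
}

/* Total size in bytes of the V-table entry described by a fixup. */
size_t vtable_entry_bytes(const vtable_fixup_t *f)
{
    return (size_t)f->count * vtable_slot_width(f->type);
}
```

A loader-like tool would walk the fixup array found through the VTableFixups field of the CLR header and, for each fixup, visit `count` slots of the computed width at `rva`, each initially holding a method's metadata token.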

6. Unmanaged export table

In an unmanaged image file, the unmanaged export table occupies a separate section, .edata. In image files generated by the IL assembler, the unmanaged export table and the V-tables it references both reside in the .sdata section.

7. Thread local storage (TLS)

ILAsm and VC++ allow users to define data constants that belong to TLS and to map static fields to these data constants. TLS is a special storage class: its data objects are not stack variables, yet they are local to each individual thread. Each thread can therefore maintain its own value for such a variable.

The TLS data is described by the TLS directory, which the IL assembler places in the .sdata section. The TLS directory structure for 32-bit image files is defined in WinNT.h, as shown in Code List 1-6.

Code List 1-6 TLS directory structure of 32-bit image files

typedef struct _IMAGE_TLS_DIRECTORY32 {
    ULONG StartAddressOfRawData;
    ULONG EndAddressOfRawData;
    ULONG AddressOfIndex;
    ULONG AddressOfCallBacks;
    ULONG SizeOfZeroFill;
    ULONG Characteristics;
} IMAGE_TLS_DIRECTORY32;

 

The TLS directory structure for 64-bit images (IMAGE_TLS_DIRECTORY64) is similar, except that the first four fields are 8-byte unsigned integers (ULONGLONG) rather than 4-byte unsigned integers (ULONG).

The RVA and size of the TLS directory structure are stored in the 10th data directory entry (TLS) of the PE header. The TLS data constants, which form the TLS template, reside in the .tls section.
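To make the size difference between the two variants concrete, the sketch below restates both TLS directory structures with fixed-width types so that it compiles without WinNT.h. The type names and the `DIRECTORY_ENTRY_TLS` macro are stand-ins for this example; index 9 is the zero-based position of the 10th data directory entry.

```c
#include <stddef.h>
#include <stdint.h>

/* Zero-based index of the 10th data directory entry (TLS). */
#define DIRECTORY_ENTRY_TLS 9

/* Stand-in for IMAGE_TLS_DIRECTORY32: six 4-byte fields. */
typedef struct {
    uint32_t StartAddressOfRawData;
    uint32_t EndAddressOfRawData;
    uint32_t AddressOfIndex;
    uint32_t AddressOfCallBacks;
    uint32_t SizeOfZeroFill;
    uint32_t Characteristics;
} tls_directory32_t;

/* Stand-in for IMAGE_TLS_DIRECTORY64: the first four fields widen to 8 bytes. */
typedef struct {
    uint64_t StartAddressOfRawData;
    uint64_t EndAddressOfRawData;
    uint64_t AddressOfIndex;
    uint64_t AddressOfCallBacks;
    uint32_t SizeOfZeroFill;
    uint32_t Characteristics;
} tls_directory64_t;

/* Size in bytes of the raw TLS template described by a 32-bit directory. */
uint32_t tls_template_size32(const tls_directory32_t *d)
{
    return d->EndAddressOfRawData - d->StartAddressOfRawData;
}
```

The 32-bit structure occupies 24 bytes and the 64-bit one 40 bytes (four 8-byte fields plus two 4-byte fields).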

8. Resources

Two different kinds of resources can be embedded in a managed PE file: platform-specific unmanaged resources and CLR-specific managed resources. They reside in different sections of the managed image file and are accessed through different APIs.

(1) Unmanaged resources

Unmanaged resources reside in the .rsrc section of the PE file. The starting RVA and size of the embedded unmanaged resources are given by the Resource data directory entry of the PE header.

Unmanaged resources are indexed by type, name, and language, and are binary-sorted on these three characteristics.

The IL assembler creates the .rsrc section and embeds in it the unmanaged resources from a .res file. The assembler can embed only one unmanaged resource file in each module.

When the IL disassembler analyzes a managed PE file and finds a .rsrc section, it reads the data and structure from this section and writes all the unmanaged resources contained in the PE file out to a .res file.

(2) Managed resources

The Resources field in the CLR header holds the RVA and size of the managed resources embedded in the PE file. It has nothing to do with the Resource directory entry of the PE header, which specifies the RVA and size of the platform-specific unmanaged resources.

In a PE file generated by the IL assembler, the unmanaged resources reside in the .rsrc section, whereas the managed resources, the metadata, and the IL code all reside in the .text section. The managed resources are stored contiguously in the .text section. The metadata carries ManifestResource records, one for each managed resource. Each record contains the name of the managed resource and the offset of the resource, measured from the start of the area specified by the Resources field in the CLR header. At this offset, a 4-byte unsigned integer gives the length of the resource, and the resource itself follows.
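The length-prefixed layout described above can be read with a few lines of C. This is a sketch under stated assumptions: the managed resources area (the bytes located by the Resources field of the CLR header) has already been copied into a buffer, the offset comes from a ManifestResource record, and the function name `managed_resource_at` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Locate one managed resource inside the resources area.
 *   resources      - start of the area given by the CLR header Resources field
 *   resources_size - size of that area
 *   offset         - offset taken from the ManifestResource record
 * On success, stores the resource length in *length_out and returns a
 * pointer to the first byte of the resource; returns NULL if the length
 * prefix or the data would fall outside the area.
 */
const uint8_t *managed_resource_at(const uint8_t *resources,
                                   size_t resources_size,
                                   uint32_t offset,
                                   uint32_t *length_out)
{
    if (offset > resources_size || resources_size - offset < 4)
        return NULL;                    /* no room for the length prefix */

    uint32_t len = rd32(resources + offset);
    if (len > resources_size - offset - 4)
        return NULL;                    /* data would run past the area */

    *length_out = len;
    return resources + offset + 4;      /* data follows the 4-byte length */
}
```

Bounds are checked before anything is dereferenced, so a corrupt offset or length yields NULL instead of an out-of-range read.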

When the IL disassembler processes a managed image file and finds embedded managed resources, it writes each resource to a separate file named after the resource.

When the IL assembler creates a PE file, it reads, by resource name, all the managed resources declared as embedded in the source code, writes them to the .text section, and places the length of each resource immediately before it.

Note: This article is adapted from section 1.3 of .NET Security Secrets.
