Analysis of the .NET Framework's Extension to the PE File Format
[ArticleSource: http://www.pcvc.net/category/content.asp?Sendid=140]
The Microsoft .NET Framework has been out for a while, and I first came into contact with it at Beta 1. This article uses a small PE file generated by .NET to examine how the .NET Framework extends the PE file format. The purpose of the extension is to let Windows recognize images that run under the Common Language Runtime (CLR).
PE files are the executable files of the Windows operating systems. This article assumes that you already have a good understanding of the format, and it does not discuss the executable formats of the earlier Win16 or the later Win64. Before the CLR appeared, a PE file consisted simply of the PE header and the native image ("native" in contrast with the CLR header and CLR data described below). The native image consists of various sections, such as .text, .data, and .rdata. Note that section names in a PE file do not have to start with a period. That is only Microsoft's convention; other compilers, such as Borland's, use names like CODE and DATA. The native image contains the compiled machine code for the target processor.
After the CLR appeared, the PE file gained another part: the portion that supports the .NET Framework, made up of the CLR header and the CLR data. The CLR header is defined by the IMAGE_COR20_HEADER structure in corhdr.h of the .NET Framework SDK. From the names corhdr.h and IMAGE_COR20_HEADER one can guess that COR stands for COM+ Runtime, which hints at the development history of the .NET Framework and its relationship with COM+. In fact, IMAGE_COR20_HEADER is also defined in winnt.h of the Platform SDK. I checked the winnt.h released with Windows XP DDK build 2505: there, Microsoft's comment on this definition is "COM+ 2.0 header structure", whereas in the .NET Framework SDK it has been changed to "CLR 2.0 header structure". The CLR data includes the .NET metadata, the IL method bodies, and so on. Metadata and IL method are key terms in .NET. IL is short for Microsoft Intermediate Language. Microsoft introduced it for the cross-platform and cross-language features of .NET, and it has its own instruction set. opcode.def in the .NET SDK lists the supported instructions. The encoding looks quite similar to Intel's x86 instruction set, in that it also has two-byte opcodes introduced by a prefix.
The following uses the C# console program listed below to describe how a PE file generated by the C# compiler is executed and how it is laid out on disk. The program simply prints hi:
    public class App {
        // Entry point of the program; prints "hi" to the console.
        static public void Main(System.String[] args) {
            System.Console.WriteLine("hi");
        }
    }
We compile it simply with csc /out:app.exe app.cs. Like the PE files produced by traditional compilers before .NET, the generated PE file still begins with IMAGE_DOS_HEADER. This part exists so that, if the file is run under old DOS, it reports that the executable cannot be run in DOS. IMAGE_DOS_HEADER and the structures discussed below are all defined in detail in winnt.h. The Windows OS loader uses the e_lfanew member of IMAGE_DOS_HEADER to locate IMAGE_NT_HEADERS, which is defined as follows:
    typedef struct _IMAGE_NT_HEADERS {
        DWORD Signature;
        IMAGE_FILE_HEADER FileHeader;
        IMAGE_OPTIONAL_HEADER32 OptionalHeader;
    } IMAGE_NT_HEADERS32, *PIMAGE_NT_HEADERS32;
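The lookup described above can be sketched in C. This is only a minimal illustration, not the loader's real code: the names read_u32_le and find_nt_headers are hypothetical, and the sketch works on a copy of the file already read into memory. It relies on three facts from winnt.h: the DOS header starts with "MZ", e_lfanew is the 32-bit value at offset 0x3C, and IMAGE_NT_HEADERS begins with the signature "PE\0\0".

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a little-endian 32-bit value from a byte buffer. */
uint32_t read_u32_le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical sketch: return the file offset of IMAGE_NT_HEADERS inside
 * an in-memory copy of a PE file, or -1 if the image does not look valid.
 */
long find_nt_headers(const uint8_t *image, size_t size)
{
    uint32_t e_lfanew;

    /* IMAGE_DOS_HEADER is 64 bytes long and starts with "MZ". */
    if (size < 0x40 || image[0] != 'M' || image[1] != 'Z')
        return -1;

    /* e_lfanew is the last member of IMAGE_DOS_HEADER, at offset 0x3C. */
    e_lfanew = read_u32_le(image + 0x3C);
    if (e_lfanew > size || size - e_lfanew < 4)
        return -1;

    /* IMAGE_NT_HEADERS.Signature must be the four bytes "PE\0\0". */
    if (image[e_lfanew] != 'P' || image[e_lfanew + 1] != 'E' ||
        image[e_lfanew + 2] != 0 || image[e_lfanew + 3] != 0)
        return -1;

    return (long)e_lfanew;
}
```

The optional header then starts 24 bytes after the returned offset: 4 bytes of Signature followed by the 20-byte IMAGE_FILE_HEADER.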
We know that the AddressOfEntryPoint member of IMAGE_OPTIONAL_HEADER32 is the entry point of a PE executable. It is still the execution entry point under .NET, which should be easy to understand. For a COMIMAGE_FLAGS_ILONLY image (indicated by the Flags member of IMAGE_COR20_HEADER), such as the app.exe we generated, this entry point leads indirectly, through the import table of app.exe, to the _CorExeMain function. _CorExeMain is the entry used for EXE files and is exported by mscoree.dll. mscoree.dll is located in %winnt%\system32 and is the Microsoft .NET Runtime Execution Engine. Note that it is itself a native image, and it is responsible for calling the method identified by the .NET token in the EntryPointToken member of IMAGE_COR20_HEADER. That method is the real entry point of the IL code.
How the parts of the native image are located is already covered in many documents and defined in detail in winnt.h, so I only outline it briefly here:
The data stored in sections such as .text and .data, for example the export and import tables, is located through the DataDirectory member of IMAGE_OPTIONAL_HEADER32. DataDirectory is an array of IMAGE_DATA_DIRECTORY with IMAGE_NUMBEROF_DIRECTORY_ENTRIES elements (currently 16). The purpose of each entry is given by the IMAGE_DIRECTORY_ENTRY_*** constants, such as EXPORT and IMPORT. Because IMAGE_DATA_DIRECTORY consists of VirtualAddress (an RVA) and Size, we can easily locate this data. The CLR header is located in the same way, through the entry IMAGE_DIRECTORY_ENTRY_COMHEADER (value 14; that is the name in corhdr.h of the .NET Framework SDK v1, while winnt.h in DDK 2505 calls it IMAGE_DIRECTORY_ENTRY_COM_DESCRIPTOR). The relevant fields of the generated app.exe are as follows (the value in parentheses is the offset of the field within IMAGE_OPTIONAL_HEADER32):
.
.
.
AddressOfEntryPoint: 0x000022ce (+0x10)
.
.
.
DataDirectory[0] - IMAGE_DIRECTORY_ENTRY_EXPORT
    VirtualAddress: 0x00000000 (+0x60)
    Size: 0x00000000 (+0x64)
DataDirectory[1] - IMAGE_DIRECTORY_ENTRY_IMPORT
    VirtualAddress: 0x0000227c (+0x68)
    Size: 0x0000004f (+0x6c)
.
.
.
DataDirectory[14] - IMAGE_DIRECTORY_ENTRY_COM_DESCRIPTOR
    VirtualAddress: 0x00002008 (+0xd0)
    Size: 0x00000048 (+0xd4)
.
.
.
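The offsets in this listing follow from the layout of IMAGE_OPTIONAL_HEADER32: DataDirectory starts at offset 0x60, and each IMAGE_DATA_DIRECTORY takes 8 bytes (a 4-byte VirtualAddress followed by a 4-byte Size). The arithmetic can be sketched in C as below. The names data_directory_offset, read_data_directory, and struct data_directory are hypothetical, and the sketch assumes a 32-bit optional header that has already been read into memory.

```c
#include <stddef.h>
#include <stdint.h>

#define DATA_DIRECTORY_BASE       0x60u /* offset of DataDirectory in IMAGE_OPTIONAL_HEADER32 */
#define DATA_DIRECTORY_ENTRY_SIZE 8u    /* VirtualAddress (4 bytes) + Size (4 bytes) */
#define DIRECTORY_ENTRY_COUNT     16u   /* IMAGE_NUMBEROF_DIRECTORY_ENTRIES */

/* Hypothetical stand-in for IMAGE_DATA_DIRECTORY. */
struct data_directory {
    uint32_t virtual_address; /* an RVA, relative to the image base */
    uint32_t size;
};

/* Hypothetical helper: read a little-endian 32-bit value from a byte buffer. */
uint32_t read_u32_le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Offset of DataDirectory[index] from the start of the optional header. */
uint32_t data_directory_offset(uint32_t index)
{
    return DATA_DIRECTORY_BASE + index * DATA_DIRECTORY_ENTRY_SIZE;
}

/*
 * Copy DataDirectory[index] out of an in-memory optional header.
 * Returns 1 on success, 0 if the index or the buffer size is invalid.
 */
int read_data_directory(const uint8_t *opt, size_t opt_size,
                        uint32_t index, struct data_directory *out)
{
    uint32_t off;

    if (index >= DIRECTORY_ENTRY_COUNT)
        return 0;
    off = data_directory_offset(index);
    if (opt_size < off + DATA_DIRECTORY_ENTRY_SIZE)
        return 0;

    out->virtual_address = read_u32_le(opt + off);
    out->size = read_u32_le(opt + off + 4);
    return 1;
}
```

For index 14 the offset works out to 0x60 + 14 * 8 = 0xd0, which matches the listing.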
OK. From DataDirectory[14] we can easily locate the CLR header. The CLR header can be merged into any other read-only section. As mentioned above, the CLR header is defined by the IMAGE_COR20_HEADER structure:
    // CLR 2.0 header structure.
    typedef struct IMAGE_COR20_HEADER
    {
        // Header versioning
        ULONG cb;
        USHORT MajorRuntimeVersion;
        USHORT MinorRuntimeVersion;
        // Symbol table and startup information
        IMAGE_DATA_DIRECTORY MetaData;
        ULONG Flags;
        ULONG EntryPointToken;
        // Binding information
        IMAGE_DATA_DIRECTORY Resources;
        IMAGE_DATA_DIRECTORY StrongNameSignature;
        // Regular fixup and binding information
        IMAGE_DATA_DIRECTORY CodeManagerTable;
        IMAGE_DATA_DIRECTORY VTableFixups;
        IMAGE_DATA_DIRECTORY ExportAddressTableJumps;
        // Precompiled image info (internal use only - set to zero)
        IMAGE_DATA_DIRECTORY ManagedNativeHeader;
    } IMAGE_COR20_HEADER;
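As a sanity check on this layout, the structure can be re-declared with fixed-width types and its size compared with the 0x48 bytes reported in DataDirectory[14]. The following is a stand-alone sketch rather than the SDK definition: the lower-case type and member names and the function is_il_only are hypothetical stand-ins, the compile-time checks need a C11 compiler, and it assumes the compiler inserts no padding, which should hold here because every member is naturally aligned.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for IMAGE_DATA_DIRECTORY (8 bytes). */
struct data_directory {
    uint32_t virtual_address;
    uint32_t size;
};

/* Hypothetical stand-in for IMAGE_COR20_HEADER, member for member. */
struct cor20_header {
    uint32_t cb;
    uint16_t major_runtime_version;
    uint16_t minor_runtime_version;
    struct data_directory metadata;
    uint32_t flags;
    uint32_t entry_point_token;
    struct data_directory resources;
    struct data_directory strong_name_signature;
    struct data_directory code_manager_table;
    struct data_directory vtable_fixups;
    struct data_directory export_address_table_jumps;
    struct data_directory managed_native_header;
};

/* 4 + 2 + 2 + 8 + 4 + 4 + 6 * 8 = 72 = 0x48 bytes. */
_Static_assert(sizeof(struct cor20_header) == 0x48,
               "CLR header should be 0x48 bytes");
_Static_assert(offsetof(struct cor20_header, flags) == 16,
               "Flags should sit at offset 16");
_Static_assert(offsetof(struct cor20_header, entry_point_token) == 20,
               "EntryPointToken should sit at offset 20");

/* COMIMAGE_FLAGS_ILONLY is bit 0 of the Flags member. */
#define COMIMAGE_FLAGS_ILONLY 0x00000001u

/* Returns 1 if the header describes an IL-only image, 0 otherwise. */
int is_il_only(const struct cor20_header *header)
{
    return (header->flags & COMIMAGE_FLAGS_ILONLY) != 0;
}
```

If the header were ever read directly from a file into such a structure, the same size check would catch a mismatched declaration at compile time instead of at run time.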
The Flags and EntryPointToken members of this structure have been mentioned above. Judging by its IMAGE_DATA_DIRECTORY members, this definition resembles IMAGE_OPTIONAL_HEADER32. The latter can be seen as the core of the PE file header: it is used to locate the contents of sections such as .text and is processed by the Windows OS loader. The former is used to locate the .NET CLR data, such as MetaData, Resources, and StrongNameSignature. The difference is that IMAGE_COR20_HEADER is consumed by _CorExeMain (for EXE files) in mscoree.dll, since MSIL can be executed only after it has been JIT-compiled into machine code.
Although EntryPointToken and the AddressOfEntryPoint discussed above are both execution entry points, they differ considerably. AddressOfEntryPoint is an RVA (relative to the image base) that points directly at the address to execute. It can only point to native machine code that loads the .NET runtime (such as the jump to _CorExeMain in mscoree.dll; for DLL files it can be set to 0). EntryPointToken, in contrast, is only a .NET token. A token is the unique identifier of a .NET metadata item and is a DWORD value. The high 8 bits give the type of the token, as defined by the CorTokenType enum in corhdr.h. For example, mdtMethodDef is 0x06000000 and mdtEvent is 0x14000000. The remaining 24 bits are the unique identifier within that type. EntryPointToken can only refer to a method, not to an event, for example. The EntryPointToken of app.exe is 0x06000001, which corresponds to the Main method. You can verify this with ildasm.exe (shipped with the .NET Framework SDK).
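Splitting a token into its two parts is plain bit masking. A minimal C sketch follows; the function names and the upper-case macro names are hypothetical, while the two type values are the ones quoted above from CorTokenType.

```c
#include <stdint.h>

#define TOKEN_TYPE_MASK 0xFF000000u /* high 8 bits: token type */
#define TOKEN_ID_MASK   0x00FFFFFFu /* low 24 bits: identifier within that type */

#define MDT_METHOD_DEF  0x06000000u /* mdtMethodDef in CorTokenType */
#define MDT_EVENT       0x14000000u /* mdtEvent in CorTokenType */

/* Type part of a token, still in its high-byte position. */
uint32_t token_type(uint32_t token)
{
    return token & TOKEN_TYPE_MASK;
}

/* Identifier part of a token (the low 24 bits). */
uint32_t token_id(uint32_t token)
{
    return token & TOKEN_ID_MASK;
}

/* Returns 1 if the token refers to a method definition, 0 otherwise. */
int is_method_def_token(uint32_t token)
{
    return token_type(token) == MDT_METHOD_DEF;
}
```

Applied to the EntryPointToken of app.exe, 0x06000001, the type part is 0x06000000 (mdtMethodDef) and the identifier part is 1.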
The CLR header of app.exe is as follows (only some of the non-empty fields are listed):
Size: 0x00000048
MajorRuntimeVersion: 0x0002
MinorRuntimeVersion: 0x0000
MetaData
    VirtualAddress: 0x0000207c
    Size: 0x00000200
Flags: 0x00000001
    COMIMAGE_FLAGS_ILONLY
EntryPointToken: 0x06000001
The .NET metadata is located by the MetaData member. Microsoft gives the on-disk structure of IL methods (IMAGE_COR_ILMETHOD) in corhdr.h. The .NET Framework SDK also ships a sample, metainfo, for analyzing metadata. The ASP.NET class browser sample that comes with the QuickStart samples is also good learning material for the .NET Framework. metainfo uses the conventional COM approach, while the class browser uses the System.Reflection namespace of the .NET Framework. For .NET SOAP, Web Services, Web Forms, XML, and so on, QuickStart really is a quick start. .NET looks like a direction worth studying for some time to come.
A final note: .NET still feels very new to me, and I have only just started working with it. I wrote this article in the spirit of learning; treat it as study notes that I am sharing for discussion. If you find any errors or have suggestions, please contact tsu00@263.net.