In the previous section we gained a general understanding of the role of each part of a PE file. Starting with this section we will examine each part in more detail. And of course, don't forget the two questions I raised in the previous section.
1. DOS MZ header and DOS stub:
All PE files (including 32-bit DLLs) must start with a simple DOS MZ header. We are usually not very interested in this structure. Thanks to it, if the program is run under DOS, DOS recognizes the file as a valid executable and runs the DOS stub that follows the MZ header. The DOS stub is in fact a valid EXE. On an operating system that does not support the PE file format, it simply displays an error message such as "This program requires Windows", although programmers can supply a complete DOS program of their own instead. We are usually not very interested in the DOS stub either, because in most cases it is generated automatically by the assembler/compiler. Typically it just calls interrupt 21h, function 9, to display the string "This program cannot be run in DOS mode".
When a 32-bit program is run under Windows 95, the stub is never executed. When the Win32 loader maps a PE file into memory, the first byte of the memory-mapped file corresponds to the first byte of the file, that is, the start of the DOS MZ header.
WINNT.H defines a structure for the DOS MZ header, IMAGE_DOS_HEADER. Its first field, e_magic, is the so-called magic number that identifies an MS-DOS-compatible file type, much like the Signature field in the PE header. This value is 0x5A4D for all MS-DOS-compatible executables, which corresponds to the ASCII characters "MZ". This is why the MS-DOS header is sometimes called the MZ header. The structure has many other fields that are useful to MS-DOS, but for Windows NT only one of them matters: the last field, e_lfanew, which locates the PE header. It holds the offset of the real PE header relative to the start of the file (in effect an RVA once the file is mapped). To obtain a pointer to the PE header, add this offset to the base address of the image:
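/* Cast to a byte pointer first so that e_lfanew is added as a byte offset. */
pNTHeader = (PIMAGE_NT_HEADERS)((BYTE *)dosHeader + dosHeader->e_lfanew);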
With this pointer to the PE header we can obtain a great deal of useful information. Since we are studying the PE file format, the PE header is the focus of our study. In short, the DOS MZ header is to the DOS stub what the PE header is to the EXE or DLL.
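The lookup described above can be sketched without the Windows headers, working on a raw byte buffer. This is only a minimal illustration: the names find_pe_header_offset, read_le32, DOS_MAGIC and E_LFANEW_OFFSET are hypothetical helpers, not winnt.h identifiers. It assumes the standard IMAGE_DOS_HEADER layout, in which e_lfanew is the 4-byte field at offset 3Ch.

```c
#include <stddef.h>
#include <stdint.h>

#define DOS_MAGIC        0x5A4Du  /* "MZ" read as a little-endian 16-bit value */
#define E_LFANEW_OFFSET  0x3Cu    /* position of e_lfanew in IMAGE_DOS_HEADER */

/* Read a little-endian 32-bit value without relying on host byte order. */
uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: returns the offset of the PE header inside the
 * buffer, or -1 if the buffer does not start with a usable MZ header. */
long find_pe_header_offset(const uint8_t *image, size_t size)
{
    uint32_t offset;

    if (size < E_LFANEW_OFFSET + 4)
        return -1;                        /* too small to hold e_lfanew */
    if (((unsigned)image[0] | ((unsigned)image[1] << 8)) != DOS_MAGIC)
        return -1;                        /* e_magic is not "MZ" */

    offset = read_le32(image + E_LFANEW_OFFSET);
    if (offset > size - 4)
        return -1;                        /* no room for the 4-byte signature */
    return (long)offset;
}
```

Adding the returned offset to the address of the buffer gives the same pointer as the one-line expression above, with the difference that the bounds are checked first.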
2. PE Header:
The term PE header is shorthand for the IMAGE_NT_HEADERS structure. It contains important fields used by the PE loader, and we will look at these fields as we study the PE file format in depth. When the executable is run on an operating system that supports the PE file structure, the PE loader reads the starting offset of the PE header from the DOS MZ header. It therefore skips the DOS stub and goes directly to the real file header, the PE header. The PE header is an IMAGE_NT_HEADERS structure defined in WINNT.H. In Windows 95 this structure essentially serves as the module database (the concept of a "module" was described in Section 1): the operating system uses it to recognize that a module exists and to obtain information about it. We will come back to this structure when we study modules later. Each loaded EXE or DLL is represented by an IMAGE_NT_HEADERS structure, which consists of a DWORD and two substructures:
DWORD Signature;
IMAGE_FILE_HEADER FileHeader;
IMAGE_OPTIONAL_HEADER OptionalHeader;
(1) DWORD Signature: for a file in PE format, this field must contain the ASCII characters "PE\0\0", that is, the bytes 50h 45h 00h 00h.
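As a small illustration of the Signature check, the sketch below compares the four bytes at a given offset with "PE\0\0". The function name has_pe_signature is a hypothetical helper introduced here, and pe_offset is assumed to be the value taken from e_lfanew.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the four bytes at pe_offset are
 * 'P', 'E', 0, 0 and 0 otherwise (including when the offset is out of range). */
int has_pe_signature(const uint8_t *image, size_t size, size_t pe_offset)
{
    static const uint8_t signature[4] = { 'P', 'E', 0, 0 };

    if (pe_offset > size || size - pe_offset < sizeof signature)
        return 0;
    return memcmp(image + pe_offset, signature, sizeof signature) == 0;
}
```

Comparing bytes rather than a 32-bit integer keeps the check independent of the byte order of the machine running the tool.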
(2) IMAGE_FILE_HEADER FileHeader:
Field name: Meaning
Machine: The CPU required to run the file. For the Intel platform the value is IMAGE_FILE_MACHINE_I386 (14Ch). We tried the values 14Dh and 14Eh mentioned in Luevelsmeyer's pe.txt, but Windows could not execute the file correctly. Apart from preventing a program from running, this field seems to be of little use to us.
NumberOfSections: The number of sections in the file. If we add or delete a section, we must update this value.
TimeDateStamp: The date and time the file was created. We are not interested in it.
PointerToSymbolTable: Used for debugging.
NumberOfSymbols: Used for debugging.
SizeOfOptionalHeader: The size of the OptionalHeader structure that follows this structure. It must be set to a valid value.
Characteristics: Flags describing the file, for example whether it is an EXE or a DLL.
IMAGE_FILE_HEADER is a simple structure and easy to understand, so we will not dwell on it. In short, only three fields are useful to us: Machine, NumberOfSections, and Characteristics. We normally leave Machine and Characteristics unchanged, but we need NumberOfSections to traverse the section table.
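To make the three useful fields concrete, here is a minimal sketch that extracts them from the 20 raw bytes of an IMAGE_FILE_HEADER. The type FileHeaderSummary and the functions summarize_file_header and is_dll are hypothetical names for this illustration. The byte offsets (0, 2 and 18) assume the standard field order listed above, and 2000h is the value of the IMAGE_FILE_DLL flag.

```c
#include <stdint.h>

#define FILE_MACHINE_I386  0x014Cu  /* IMAGE_FILE_MACHINE_I386 */
#define FILE_DLL           0x2000u  /* IMAGE_FILE_DLL bit in Characteristics */

/* Hypothetical summary type holding only the fields we care about. */
typedef struct {
    uint16_t machine;
    uint16_t number_of_sections;
    uint16_t characteristics;
} FileHeaderSummary;

/* Read a little-endian 16-bit value. */
uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* header must point to at least 20 bytes of IMAGE_FILE_HEADER data. */
FileHeaderSummary summarize_file_header(const uint8_t *header)
{
    FileHeaderSummary s;
    s.machine            = read_le16(header + 0);   /* Machine */
    s.number_of_sections = read_le16(header + 2);   /* NumberOfSections */
    s.characteristics    = read_le16(header + 18);  /* Characteristics */
    return s;
}

/* Returns 1 if the Characteristics flags mark the file as a DLL. */
int is_dll(const FileHeaderSummary *s)
{
    return (s->characteristics & FILE_DLL) != 0;
}
```

A tool that walks the section table would use number_of_sections as its loop bound.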
(3) IMAGE_OPTIONAL_HEADER OptionalHeader: The third member is more complex and more interesting. It is the last member of IMAGE_NT_HEADERS and describes the logical layout of the PE file. The structure has 31 fields; some are critical and others are rarely used. Here we introduce only the useful ones.
Field: Meaning
AddressOfEntryPoint: The RVA of the first instruction the PE loader will run. To redirect execution from the very start, change this value to a new RVA; the instructions at that RVA will then be executed first.
ImageBase: The preferred load address of the PE file. For example, if the value is 400000h, the PE loader will try to load the file at 400000h in the virtual address space. The word "preferred" means that if this address range is already occupied by another module, the PE loader chooses another free address.
SectionAlignment: The granularity of section alignment in memory. For example, if the value is 4096 (1000h), the starting address of every section must be a multiple of 4096. If the first section starts at 401000h and is 10 bytes long, the next section must start at 402000h, even though most of the space between 401000h and 402000h is unused.
FileAlignment: The granularity of section alignment in the file. For example, if the value is 512 (200h), the file offset of every section must be a multiple of 512. If the first section starts at file offset 200h (512) and is 10 bytes long, the next section must start at file offset 400h (1024), even though most of the space between offsets 512 and 1024 is unused/undefined.
MajorSubsystemVersion, MinorSubsystemVersion: The Win32 subsystem version. If the PE file is designed specifically for Win32, the subsystem version must be 4.0; otherwise dialog boxes will not have the 3-D look.
SizeOfImage: The size of the PE image in memory. It is the total size of all headers and sections after section alignment has been applied.
SizeOfHeaders: The combined size of all headers and the section table. It equals the file size minus the combined size of all sections in the file, and it can also be used as the file offset of the first section.
Subsystem: Used by NT to identify the subsystem the PE file targets. For most Win32 programs there are only two values: Windows GUI and Windows CUI (console).
SizeOfStackReserve: The amount of virtual memory reserved for the initial thread's stack.
SizeOfStackCommit: The amount of memory initially committed for the initial thread's stack.
SizeOfHeapReserve: The amount of virtual memory reserved for the initial process heap.
SizeOfHeapCommit: The amount of memory initially committed for the process heap.
DataDirectory: An array of IMAGE_DATA_DIRECTORY structures. Each structure gives the RVA of an important data structure, such as the import address table.
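The arithmetic behind AddressOfEntryPoint, ImageBase, SectionAlignment and FileAlignment in the table above can be sketched in a few lines. The helper names align_up and rva_to_va are hypothetical, and the rounding trick assumes that the alignment is a power of two, which holds for typical values such as 200h and 1000h.

```c
#include <stdint.h>

/* Round value up to the next multiple of alignment.
 * Assumption: alignment is a nonzero power of two. */
uint32_t align_up(uint32_t value, uint32_t alignment)
{
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Convert an RVA to a virtual address, assuming the image was
 * actually loaded at image_base (its preferred or relocated base). */
uint32_t rva_to_va(uint32_t image_base, uint32_t rva)
{
    return image_base + rva;
}
```

For example, a section at RVA 1000h that is 10h bytes long ends at 1010h, so with SectionAlignment = 1000h the next section starts at align_up(0x1010, 0x1000), which is 2000h. With ImageBase = 400000h, an AddressOfEntryPoint of 1000h corresponds to the virtual address 401000h.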
The hardest field to understand in the table above is also the last and most important one, DataDirectory. It is an array of 16 structures. Each structure corresponds to one functional block of data (a "section" in the functional sense used in Section 1, not a section that appears in the section table of the final PE file). The two fields of each structure give the RVA and size of that block, so the loader can quickly find a particular block in the image through this array. The import table and export table described later use their corresponding elements of this array, and we will explain them further at that point.
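A minimal sketch of how such an array might be consulted follows. The DataDirectory type mirrors the two fields of IMAGE_DATA_DIRECTORY (VirtualAddress and Size), and the index values 0 and 1 are those of the export and import entries in winnt.h. The function get_directory and the shortened macro names are hypothetical, introduced only for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define NUMBEROF_DIRECTORY_ENTRIES 16
#define DIRECTORY_ENTRY_EXPORT      0  /* export table */
#define DIRECTORY_ENTRY_IMPORT      1  /* import table */

/* Same two fields as IMAGE_DATA_DIRECTORY. */
typedef struct {
    uint32_t VirtualAddress;  /* RVA of the data block */
    uint32_t Size;            /* size of the data block in bytes */
} DataDirectory;

/* Hypothetical helper: returns the entry at index if the file actually
 * provides that block (nonzero RVA and size), otherwise NULL. */
const DataDirectory *get_directory(const DataDirectory *dirs, unsigned index)
{
    if (index >= NUMBEROF_DIRECTORY_ENTRIES)
        return NULL;
    if (dirs[index].VirtualAddress == 0 || dirs[index].Size == 0)
        return NULL;
    return &dirs[index];
}
```

An EXE that imports functions but exports none would yield a valid entry for DIRECTORY_ENTRY_IMPORT and NULL for DIRECTORY_ENTRY_EXPORT.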