PE (Portable Executable) File Analysis

Source: Internet
Author: User
PE File Format
Note: PE is short for "Portable Executable". It is the standard format for 32-bit Windows executable files. The earlier 16-bit Windows executable format (also used by early OS/2 programs) is called NE, short for "New Executable". Reference: NE File Format

I. Introduction

A PE file begins with a DOS executable (the stub), which makes every PE file a valid MS-DOS executable file.
The DOS stub is followed by the 32-bit PE signature 0x00004550 (IMAGE_NT_SIGNATURE).
Next comes the file header, which describes the platform the program runs on, the number of sections, the link time, and whether the file is an executable (EXE), a dynamic link library (DLL), or something else.
It is followed by the "optional" header. In an executable image this header is always present; the name comes from COFF, where object files may omit it. It contains more information about how the image is to be loaded: the start address, the amount of stack to reserve, the size of the data segment, and so on.
An important field of the optional header is an array called the "data directories". Each entry points to a piece of data inside one of the sections. For example, if a program has an export directory, the IMAGE_DIRECTORY_ENTRY_EXPORT entry of the data directory array points to it, and the data it points to lies inside some section.
After the optional header come the section headers, and then the sections themselves. The section contents are what is actually needed to execute the program; all the headers and directories described above only exist to help you find them.
Each section has flags describing its alignment, the kind of data it contains ("initialized data" and so on), whether it can be shared, and so on, followed by the data itself. Most sections (but not all) contain one or more directories that are located through the "data directory" entries of the optional header, such as the table of exported functions or the base relocation table. Contents that no directory points to include, for example, executable code and initialized data.
The entire file structure is as follows:
+-------------------+
| DOS stub          |
+-------------------+
| File header       |
+-------------------+
| Optional header   |
|- - - - - - - - - -|
|                   |
| Data directories  |
|                   |
+-------------------+
|                   |
| Section headers   |
|                   |
+-------------------+
|                   |
| Section 1         |
|                   |
+-------------------+
|                   |
| Section 2         |
|                   |
+-------------------+
|                   |
| ...               |
|                   |
+-------------------+
|                   |
| Section n         |
|                   |
+-------------------+

The following describes relative virtual addresses (RVAs).
The PE format makes heavy use of RVAs. An RVA ("relative virtual address") describes a memory address when the base address is not yet known: it is the value you add to the base address to obtain the linear address.
For example, assume an executable is loaded at address 0x400000 and execution starts at RVA 0x1560. The effective start address is then 0x401560. If the executable were loaded at 0x100000, the start address would be 0x101560.
The calculation becomes more complicated because the sections of a PE file are not necessarily aligned the same way in the file as in the loaded image. For example, the sections in a file are often aligned to 512-byte boundaries, while the loaded image may be aligned to 4096-byte boundaries. See "SectionAlignment" and "FileAlignment" below.
For example, suppose you know that execution starts at RVA 0x1560 and you want to disassemble the code from there. You find that the section alignment in memory is 4096 and that the .code section starts at RVA 0x1000 in memory and is 16384 bytes long, so RVA 0x1560 lies at offset 0x560 within this section. You also find that the file alignment is 512 and that .code begins at file offset 0x800. The code to execute therefore starts at file offset 0x800 + 0x560 = 0xd60.
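The two calculations above (base address plus RVA, and RVA to file offset) can be sketched in C as follows. This is only a minimal illustration: the section_info structure and both function names are made up for this example and are not part of any Windows header, and the lookup assumes that the whole in-memory range of a section is backed by data in the file.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified description of one section (not a Windows type). */
typedef struct {
    uint32_t virtual_address;  /* RVA at which the section starts in memory */
    uint32_t virtual_size;     /* length of the section in memory, in bytes */
    uint32_t raw_offset;       /* offset of the section's data in the file  */
} section_info;

/* Linear address = base address the image was loaded at + RVA. */
static uint32_t rva_to_va(uint32_t image_base, uint32_t rva)
{
    return image_base + rva;
}

/* Finds the section containing rva and converts the RVA to a file offset.
 * Returns 1 and stores the offset on success, 0 if no section contains it. */
static int rva_to_file_offset(const section_info *sections, size_t count,
                              uint32_t rva, uint32_t *file_offset)
{
    for (size_t i = 0; i < count; i++) {
        uint32_t start = sections[i].virtual_address;
        if (rva >= start && rva - start < sections[i].virtual_size) {
            /* offset inside the section is the same in memory and in the file */
            *file_offset = sections[i].raw_offset + (rva - start);
            return 1;
        }
    }
    return 0;
}
```

With a single section { 0x1000, 16384, 0x800 }, looking up RVA 0x1560 yields file offset 0xd60, matching the worked example.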

II. DOS Header (DOS stub)
The DOS stub concept is well known from 16-bit Windows executables (the NE format). It is also used in OS/2 executables, self-extracting archives, and other applications. In most PE files the stub is only about 100 bytes of code that prints a message such as "this program needs Windows NT".
You recognize a valid DOS stub by validating its header, a structure called IMAGE_DOS_HEADER. The first two bytes must be "MZ" (#define IMAGE_DOS_SIGNATURE 0x5A4D). How do you find the start of the PE part? Through the header member "e_lfanew" (32 bits, at byte offset 60), which holds the file offset of the signature. In OS/2 and 16-bit Windows programs this signature is a 16-bit word. In PE files it is a 32-bit doubleword with the value 0x00004550 (#define IMAGE_NT_SIGNATURE 0x00004550).

typedef struct _IMAGE_DOS_HEADER {  // DOS .EXE header
    WORD e_magic;       // magic number
    WORD e_cblp;        // bytes on last page of file
    WORD e_cp;          // pages in file
    WORD e_crlc;        // relocations
    WORD e_cparhdr;     // size of header in paragraphs
    WORD e_minalloc;    // minimum extra paragraphs needed
    WORD e_maxalloc;    // maximum extra paragraphs needed
    WORD e_ss;          // initial (relative) SS value
    WORD e_sp;          // initial SP value
    WORD e_csum;        // checksum
    WORD e_ip;          // initial IP value
    WORD e_cs;          // initial (relative) CS value
    WORD e_lfarlc;      // file address of relocation table
    WORD e_ovno;        // overlay number
    WORD e_res[4];      // reserved words
    WORD e_oemid;       // OEM identifier (for e_oeminfo)
    WORD e_oeminfo;     // OEM information; e_oemid specific
    WORD e_res2[10];    // reserved words
    LONG e_lfanew;      // file address of new EXE header
} IMAGE_DOS_HEADER, *PIMAGE_DOS_HEADER;
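As a rough sketch of the check described in this section, the following C function validates "MZ", reads e_lfanew at offset 60, and verifies the PE signature. The function and helper names are invented for this example; it works on a byte buffer that is assumed to already hold the file contents, and it reads values byte by byte so that it does not depend on the host byte order or on the Windows headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Little-endian readers, so the host byte order does not matter. */
static uint16_t read_u16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_u32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 and stores the file offset of the PE signature in *pe_offset
 * if buf looks like a PE file; returns 0 otherwise. */
static int find_pe_signature(const uint8_t *buf, size_t len, uint32_t *pe_offset)
{
    if (len < 64 || read_u16(buf) != 0x5A4D)      /* "MZ" */
        return 0;
    uint32_t e_lfanew = read_u32(buf + 60);       /* e_lfanew is at offset 60 */
    if (e_lfanew > len - 4)                       /* signature must fit in buf */
        return 0;
    if (read_u32(buf + e_lfanew) != 0x00004550)   /* "PE\0\0" */
        return 0;
    *pe_offset = e_lfanew;
    return 1;
}
```

A caller would typically read the whole file (or at least its first few kilobytes) into memory and pass that buffer and its length.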

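III. File Header
Through the DOS header you can find a structure called IMAGE_FILE_HEADER, shown below. It starts immediately after the 4-byte PE signature that e_lfanew points to; the offsets in the comments are counted from the start of that signature. Its members are described after the structure.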

typedef struct _IMAGE_FILE_HEADER {
    WORD  Machine;                // 0x04
    WORD  NumberOfSections;       // 0x06
    DWORD TimeDateStamp;          // 0x08
    DWORD PointerToSymbolTable;   // 0x0c
    DWORD NumberOfSymbols;        // 0x10
    WORD  SizeOfOptionalHeader;   // 0x14
    WORD  Characteristics;        // 0x16
} IMAGE_FILE_HEADER, *PIMAGE_FILE_HEADER;
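The offsets in the comments can be used directly to pull the fields out of a buffer. The sketch below assumes a pointer to the start of the PE signature and at least 0x18 bytes available from there; the file_header type and the function names are illustrative only and are not taken from the Windows headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Plain copy of the file header fields (hypothetical type for this example). */
typedef struct {
    uint16_t machine;
    uint16_t number_of_sections;
    uint32_t time_date_stamp;
    uint32_t pointer_to_symbol_table;
    uint32_t number_of_symbols;
    uint16_t size_of_optional_header;
    uint16_t characteristics;
} file_header;

static uint16_t le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* pe points at the "PE\0\0" signature; avail is the number of bytes
 * available from there. Returns 1 on success, 0 if the buffer is too short. */
static int parse_file_header(const uint8_t *pe, size_t avail, file_header *out)
{
    if (avail < 0x18)                 /* signature (4) + file header (20) */
        return 0;
    out->machine                 = le16(pe + 0x04);
    out->number_of_sections      = le16(pe + 0x06);
    out->time_date_stamp         = le32(pe + 0x08);
    out->pointer_to_symbol_table = le32(pe + 0x0c);
    out->number_of_symbols       = le32(pe + 0x10);
    out->size_of_optional_header = le16(pe + 0x14);
    out->characteristics         = le16(pe + 0x16);
    return 1;
}
```

Copying field by field, rather than casting the buffer to a struct pointer, avoids any assumptions about structure packing and alignment.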

Machine: the platform the program is intended to run on. The values are as follows:
IMAGE_FILE_MACHINE_I386 (0x14c)
    Intel 80386 processor or better
0x014d
    Intel 80486 processor or better
0x014e
    Intel Pentium processor or better
0x0160
    R3000 (MIPS) processor, big-endian
IMAGE_FILE_MACHINE_R3000 (0x162)
    R3000 (MIPS) processor, little-endian
IMAGE_FILE_MACHINE_R4000 (0x166)
    R4000 (MIPS) processor, little-endian
IMAGE_FILE_MACHINE_R10000 (0x168)
    R10000 (MIPS) processor, little-endian
IMAGE_FILE_MACHINE_ALPHA (0x184)
    DEC Alpha AXP processor
IMAGE_FILE_MACHINE_POWERPC (0x1f0)
    IBM PowerPC
NumberOfSections: the number of sections. Sections are described below.
TimeDateStamp: the time the file was created. You can use this value to tell different versions of the same file apart, even if their commercial version numbers are the same. The format of this value is not clearly documented, but most C compilers evidently set it to the number of seconds since 1970-01-01 00:00:00 (a time_t). This value is sometimes also used by the bound import directory, which will be described below.
Note: some compilers ignore this value.
PointerToSymbolTable and NumberOfSymbols: used for COFF debugging (symbol) information, but in executables they are usually found to be 0.
SizeOfOptionalHeader: the length of the optional header, sizeof(IMAGE_OPTIONAL_HEADER). You can use it to check that the PE file is well formed.
Characteristics: a collection of flags. Most of the bits apply to object files (OBJ) or library files (LIB):
Bit 0 (IMAGE_FILE_RELOCS_STRIPPED): set if the file contains no relocation information. This refers to the relocation information that each section carries for itself; the flag is not used in executable files, which describe their relocations in the base relocation directory described below.
Bit 1 (IMAGE_FILE_EXECUTABLE_IMAGE): set if the file is an executable file (that is, not an object file or a library).
Bit 2 (IMAGE_FILE_LINE_NUMS_STRIPPED): set if there is no line number information; not used in executable files.
Bit 3 (IMAGE_FILE_LOCAL_SYMS_STRIPPED): set if there is no local symbol information; not used in executable files.
Bit 4 (IMAGE_FILE_AGGRESIVE_WS_TRIM): set if the operating system should aggressively trim the working set of the running process.
Bit 7 (IMAGE_FILE_BYTES_REVERSED_LO) and
Bit 15 (IMAGE_FILE_BYTES_REVERSED_HI): set if the byte order of the file is not what the machine expects, so bytes must be swapped before reading. These flags are unreliable in executable files (the operating system expects executables to be in the correct byte order).
Bit 8 (IMAGE_FILE_32BIT_MACHINE): set if the target machine is expected to be a 32-bit machine. This bit is always set.
Bit 9 (IMAGE_FILE_DEBUG_STRIPPED): set if the file contains no debugging information.
Bit 10 (IMAGE_FILE_REMOVABLE_RUN_FROM_SWAP): set if the program must not be run from removable media (such as a floppy disk or CD-ROM). In this case the OS must copy the file to the swap file and execute it from there.
Bit 11 (IMAGE_FILE_NET_RUN_FROM_SWAP): set if the program must not be run from the network. In this case the OS must copy the file to the swap file and execute it from there.
Bit 12 (IMAGE_FILE_SYSTEM): set if the file is a system file, such as a driver. Not used in executable files.
Bit 13 (IMAGE_FILE_DLL): set if the file is a dynamic link library (DLL).
Bit 14 (IMAGE_FILE_UP_SYSTEM_ONLY): set if the file is not designed to run on multiprocessor systems.
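To make the Machine values and the Characteristics bits easier to inspect, a small decoder along the following lines can help. It is only a sketch: the function names and the FLAGS table are made up for this example, the flag names are the ones listed above without their IMAGE_FILE_ prefix, and the machine descriptions simply restate the list above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Maps a Machine value to a short description (values from the list above). */
static const char *machine_name(uint16_t machine)
{
    switch (machine) {
    case 0x014c: return "Intel 80386";
    case 0x014d: return "Intel 80486";
    case 0x014e: return "Intel Pentium";
    case 0x0160: return "MIPS R3000, big-endian";
    case 0x0162: return "MIPS R3000, little-endian";
    case 0x0166: return "MIPS R4000, little-endian";
    case 0x0168: return "MIPS R10000, little-endian";
    case 0x0184: return "DEC Alpha AXP";
    case 0x01f0: return "IBM PowerPC";
    default:     return "unknown";
    }
}

typedef struct { uint16_t mask; const char *name; } flag_name;

/* Bit n in the text corresponds to the mask 1 << n. */
static const flag_name FLAGS[] = {
    { 0x0001, "RELOCS_STRIPPED" },     { 0x0002, "EXECUTABLE_IMAGE" },
    { 0x0004, "LINE_NUMS_STRIPPED" },  { 0x0008, "LOCAL_SYMS_STRIPPED" },
    { 0x0010, "AGGRESIVE_WS_TRIM" },   { 0x0080, "BYTES_REVERSED_LO" },
    { 0x0100, "32BIT_MACHINE" },       { 0x0200, "DEBUG_STRIPPED" },
    { 0x0400, "REMOVABLE_RUN_FROM_SWAP" }, { 0x0800, "NET_RUN_FROM_SWAP" },
    { 0x1000, "SYSTEM" },              { 0x2000, "DLL" },
    { 0x4000, "UP_SYSTEM_ONLY" },      { 0x8000, "BYTES_REVERSED_HI" },
};

/* Writes the names of all set flags, separated by spaces, into out.
 * Returns the number of names written (stops early if out is too small). */
static int describe_characteristics(uint16_t value, char *out, size_t out_size)
{
    int count = 0;
    size_t used = 0;
    if (out_size == 0)
        return 0;
    out[0] = '\0';
    for (size_t i = 0; i < sizeof FLAGS / sizeof FLAGS[0]; i++) {
        if ((value & FLAGS[i].mask) == 0)
            continue;
        size_t sep = count ? 1 : 0;
        size_t len = strlen(FLAGS[i].name);
        if (used + sep + len + 1 > out_size)
            break;                                   /* no room left */
        if (sep)
            out[used++] = ' ';
        memcpy(out + used, FLAGS[i].name, len + 1);  /* copies the '\0' too */
        used += len;
        count++;
    }
    return count;
}
```

For a typical 32-bit DLL with Characteristics 0x2102, the decoder reports EXECUTABLE_IMAGE, 32BIT_MACHINE and DLL.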

IV. Optional Header
The file header is followed by the optional header, a structure called IMAGE_OPTIONAL_HEADER. It contains a lot of information about how the PE file is to be loaded. Its members are described after the structure:
typedef struct _IMAGE_OPTIONAL_HEADER {
    //
    // Standard fields.
    //
    WORD  Magic;                        // 0x18
    BYTE  MajorLinkerVersion;           // 0x1a
    BYTE  MinorLinkerVersion;           // 0x1b
    DWORD SizeOfCode;                   // 0x1c
    DWORD SizeOfInitializedData;        // 0x20
    DWORD SizeOfUninitializedData;      // 0x24
    DWORD AddressOfEntryPoint;          // 0x28
    DWORD BaseOfCode;                   // 0x2c
    DWORD BaseOfData;                   // 0x30
    //
    // NT additional fields.
    //
    DWORD ImageBase;                    // 0x34
    DWORD SectionAlignment;             // 0x38
    DWORD FileAlignment;                // 0x3c
    WORD  MajorOperatingSystemVersion;  // 0x40
    WORD  MinorOperatingSystemVersion;  // 0x42
    WORD  MajorImageVersion;            // 0x44
    WORD  MinorImageVersion;            // 0x46
    WORD  MajorSubsystemVersion;        // 0x48
    WORD  MinorSubsystemVersion;        // 0x4a
    DWORD Win32VersionValue;            // 0x4c
    DWORD SizeOfImage;                  // 0x50
    DWORD SizeOfHeaders;                // 0x54
    DWORD CheckSum;                     // 0x58
    WORD  Subsystem;                    // 0x5c
    WORD  DllCharacteristics;           // 0x5e
    DWORD SizeOfStackReserve;           // 0x60
    DWORD SizeOfStackCommit;            // 0x64
    DWORD SizeOfHeapReserve;            // 0x68
    DWORD SizeOfHeapCommit;             // 0x6c
    DWORD LoaderFlags;                  // 0x70
    DWORD NumberOfRvaAndSizes;          // 0x74
    IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES];  // 0x78
} IMAGE_OPTIONAL_HEADER, *PIMAGE_OPTIONAL_HEADER;
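As an illustration of how these offsets might be used, the sketch below reads a few of the fields from a buffer that starts at the PE signature, and computes where the section headers begin (right after the optional header, whose length is SizeOfOptionalHeader from the file header). The optional_summary type and the function names are hypothetical; the offsets are the 32-bit ones from the structure above, counted from the PE signature.

```c
#include <stddef.h>
#include <stdint.h>

/* A few optional header fields (hypothetical type for this example). */
typedef struct {
    uint16_t magic;
    uint32_t address_of_entry_point;
    uint32_t image_base;
    uint32_t section_alignment;
    uint32_t file_alignment;
    uint32_t size_of_image;
    uint32_t size_of_headers;
    uint16_t subsystem;
    uint32_t number_of_rva_and_sizes;
} optional_summary;

static uint16_t le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* pe points at the "PE\0\0" signature; avail is the number of bytes
 * available from there. Returns 1 on success, 0 if the buffer is too short. */
static int read_optional_summary(const uint8_t *pe, size_t avail,
                                 optional_summary *out)
{
    if (avail < 0x78)                 /* everything up to DataDirectory */
        return 0;
    out->magic                   = le16(pe + 0x18);
    out->address_of_entry_point  = le32(pe + 0x28);
    out->image_base              = le32(pe + 0x34);
    out->section_alignment       = le32(pe + 0x38);
    out->file_alignment          = le32(pe + 0x3c);
    out->size_of_image           = le32(pe + 0x50);
    out->size_of_headers         = le32(pe + 0x54);
    out->subsystem               = le16(pe + 0x5c);
    out->number_of_rva_and_sizes = le32(pe + 0x74);
    return 1;
}

/* File offset of the first section header: PE signature (4 bytes) plus
 * file header (20 bytes) plus the optional header. */
static uint32_t section_table_offset(uint32_t pe_offset,
                                     uint16_t size_of_optional_header)
{
    return pe_offset + 0x18 + size_of_optional_header;
}
```

For instance, with the PE signature at file offset 0x80 and SizeOfOptionalHeader equal to 0xe0, the section headers start at file offset 0x178.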

Magic: always 0x010b for a 32-bit PE image.
MajorLinkerVersion and MinorLinkerVersion: the version number of the linker. This value is unreliable.
SizeOfCode: the size of the executable code.
SizeOfInitializedData: the size of the initialized data (the data segment).
SizeOfUninitializedData: the size of the uninitialized data (the BSS segment).
AddressOfEntryPoint: the RVA of the code entry point, where the program starts to run.
BaseOfCode: the RVA where the executable code starts; of little significance.
BaseOfData: the RVA where the initialized data starts; of little significance.
ImageBase: the preferred address at which to load the program (a linear address, not an RVA). The loader may choose a different address.
SectionAlignment: the alignment of sections in memory after loading.
FileAlignment: the alignment of sections in the file.
MajorOperatingSystemVersion and MinorOperatingSystemVersion: the operating system version; the loader does not use it.
MajorImageVersion and MinorImageVersion: the version of the program.
MajorSubsystemVersion and MinorSubsystemVersion: the subsystem version. The system does use this field; for example, if the program runs on NT and the subsystem version is not 4.0, dialog boxes are not displayed in 3D style.
Win32VersionValue: always 0.
SizeOfImage: the amount of memory (in bytes) the program occupies once loaded. It equals the sum of the lengths of all headers and sections, each rounded up to SectionAlignment.
SizeOfHeaders: the total length of all headers, including the section headers. It is equal to the offset from the beginning of the file to the raw data of the first section.
CheckSum: a checksum. It is only used for drivers and can be 0 in ordinary executables. Microsoft has not published how it is calculated, but the CheckSumMappedFile() function in imagehlp.dll can compute it.
Subsystem: the NT subsystem, which may be one of the following values:
IMAGE_SUBSYSTEM_NATIVE (1)
    No subsystem is required. Used for drivers.
IMAGE_SUBSYSTEM_WINDOWS_GUI (2)
    Win32 graphical program (it can open a console with AllocConsole(), but does not get one automatically at startup).
IMAGE_SUBSYSTEM_WINDOWS_CUI (3)
    Win32 console program (a console is created automatically at startup).
IMAGE_SUBSYSTEM_OS2_CUI (5)
    OS/2 console program (OS/2 programs are normally in OS/2 format, so this value is rarely seen in a PE file).
IMAGE_SUBSYSTEM_POSIX_CUI (7)
    POSIX console program.
Windows 95 programs always use the Win32 subsystem, so only 2 and 3 are valid values there.
DllCharacteristics: flags describing the DLL.
SizeOfStackReserve: the amount of stack to reserve.
SizeOfStackCommit: the amount of stack actually committed at startup; it can grow as needed.
SizeOfHeapReserve: the amount of heap to reserve.
SizeOfHeapCommit: the amount of heap actually committed.
LoaderFlags: appears to be unused.
NumberOfRvaAndSizes: the number of entries in the directory table that follows. This value is also unreliable; you can use the constant IMAGE_NUMBEROF_DIRECTORY_ENTRIES instead, which is 16.
DataDirectory: an array of IMAGE_DATA_DIRECTORY structures with IMAGE_NUMBEROF_DIRECTORY_ENTRIES elements. The structure is as follows:
typedef struct _IMAGE_DATA_DIRECTORY {
    DWORD VirtualAddress;
    DWORD Size;
} IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;
VirtualAddress: the RVA at which the directory starts.
Size: the length of the directory.
The index of each directory within the array is given by the following constants:
IMAGE_DIRECTORY_ENTRY_EXPORT (0)
IMAGE_DIRECTORY_ENTRY_IMPORT (1)
IMAGE_DIRECTORY_ENTRY_RESOURCE (2)
IMAGE_DIRECTORY_ENTRY_EXCEPTION (3)
IMAGE_DIRECTORY_ENTRY_SECURITY (4)
IMAGE_DIRECTORY_ENTRY_BASERELOC (5)
IMAGE_DIRECTORY_ENTRY_DEBUG (6)
IMAGE_DIRECTORY_ENTRY_COPYRIGHT (7)
IMAGE_DIRECTORY_ENTRY_GLOBALPTR (8)
IMAGE_DIRECTORY_ENTRY_TLS (9)
IMAGE_DIRECTORY_ENTRY_LOAD_CONFIG (10)
IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT (11)
IMAGE_DIRECTORY_ENTRY_IAT (12)
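Looking up one of these directories can be sketched as follows. The example assumes the 32-bit layout described above, where the DataDirectory array starts at offset 0x78 from the PE signature and each entry is 8 bytes long. Following the remark about NumberOfRvaAndSizes, it uses the fixed count of 16 rather than the value stored in the header. The type and function names are invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

enum {
    DIRECTORY_TABLE_OFFSET = 0x78,  /* DataDirectory, counted from "PE\0\0" */
    DIRECTORY_ENTRY_SIZE   = 8,     /* two DWORDs per entry */
    DIRECTORY_ENTRY_COUNT  = 16     /* IMAGE_NUMBEROF_DIRECTORY_ENTRIES */
};

/* Plain copy of one directory entry (hypothetical type for this example). */
typedef struct {
    uint32_t virtual_address;   /* RVA at which the directory starts */
    uint32_t size;              /* length of the directory in bytes  */
} data_directory;

static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* pe points at the "PE\0\0" signature; avail is the number of bytes
 * available from there. Returns 1 and fills *out if entry `index` could be
 * read, 0 if the index is out of range or the buffer is too short.
 * An entry whose fields are both 0 means the file has no such directory. */
static int get_data_directory(const uint8_t *pe, size_t avail,
                              unsigned index, data_directory *out)
{
    if (index >= DIRECTORY_ENTRY_COUNT)
        return 0;
    size_t pos = DIRECTORY_TABLE_OFFSET + (size_t)index * DIRECTORY_ENTRY_SIZE;
    if (avail < pos + DIRECTORY_ENTRY_SIZE)
        return 0;
    out->virtual_address = le32(pe + pos);
    out->size            = le32(pe + pos + 4);
    return 1;
}
```

The VirtualAddress obtained this way is an RVA, so it still has to be translated to a file offset through the section headers before the directory can be read from the file.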
