PE File Learning

Source: Internet
Author: User

This article link: http://blog.csdn.net/u012763794/article/details/51469477

1. Introduction
What is a PE file?

PE (Portable Executable) is the executable file format used on the Windows operating system. The 32-bit form is called PE or PE32, and the 64-bit form is called PE+ or PE32+. Note that it is not called PE64.


Learning the PE format really means learning its structures. They record how the file is loaded into memory, where execution starts, which DLLs are needed at run time, how much stack and heap memory is required, and so on.


First, here is a rough overview of the structures a PE file contains.

2. PE Header
The following examples use notepad.exe.

DOS header
In the SDK's WinNT.h, the DOS header is a structure (IMAGE_DOS_HEADER).

First, we can compute the size of the structure: 64 bytes, i.e. 0x40, which matches the output of the program.

Open Windows Notepad (notepad.exe) in WinHex, a hex editor. The selected region is the DOS header, exactly 0x40 bytes.


We focus on the first member and the last one.

e_magic: every PE file starts with the DOS signature "MZ", the initials of Mark Zbikowski, who designed the DOS executable format. These are the first two bytes in the figure.

e_lfanew: the file offset of the NT headers. It is a LONG and occupies 4 bytes. Here it is 0x000000E0 (note that it is stored in little-endian order). The value differs from program to program.

The arrow points to the NT headers.
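The two members discussed above can be read with a few lines of Python. This is a minimal sketch rather than a full parser: the function name `parse_dos_header` and the in-memory sample are made up for illustration, and the layout (30 WORDs followed by one LONG) is taken from the IMAGE_DOS_HEADER definition in WinNT.h.

```python
import struct

# IMAGE_DOS_HEADER: 30 WORDs (e_magic through e_res2) followed by a LONG (e_lfanew).
DOS_HEADER_FORMAT = "<30Hl"
DOS_HEADER_SIZE = struct.calcsize(DOS_HEADER_FORMAT)  # 64 bytes, i.e. 0x40


def parse_dos_header(data):
    """Return (e_magic bytes, e_lfanew) from the first 0x40 bytes of a PE file."""
    if len(data) < DOS_HEADER_SIZE:
        raise ValueError("file too small to hold a DOS header")
    fields = struct.unpack_from(DOS_HEADER_FORMAT, data, 0)
    e_magic = data[:2]       # first member, expected to be b"MZ"
    e_lfanew = fields[-1]    # last member, file offset of the NT headers
    if e_magic != b"MZ":
        raise ValueError("missing MZ signature")
    return e_magic, e_lfanew


# Hypothetical 64-byte header built in memory: "MZ", zero padding, e_lfanew = 0xE0.
sample = b"MZ" + bytes(58) + struct.pack("<l", 0xE0)
magic, lfanew = parse_dos_header(sample)
print(magic, hex(lfanew))
```

To try it on a real file, read the first 64 bytes with `open(path, "rb").read(64)` and pass them to the function.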



DOS stub
The DOS stub is optional and has no fixed size.
Following the last member of the DOS header (e_lfanew), the region between the DOS header and the NT headers is the DOS stub, which is framed in the figure.

It contains 16-bit assembly code, which you can view with debug (u: unassemble). 64-bit systems do not include debug.

If we run it in a DOS environment (the g command), it prints a message saying that the program cannot be run in DOS mode.


NT headers
In the SDK the 64-bit definition comes first. The structure has 3 members, and the 64-bit version differs only in that IMAGE_OPTIONAL_HEADER32 is replaced by IMAGE_OPTIONAL_HEADER64.


The 32-bit structure is 0xF8 bytes in size.

Signature
The first member, Signature, consists of the bytes 50 45 00 00 ("PE" followed by two zero bytes). Read as a little-endian DWORD, its value is 0x00004550.
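Checking the signature means following e_lfanew from offset 0x3C of the DOS header and comparing four bytes. The sketch below assumes the whole file is already in memory as `bytes`. The helper name `has_pe_signature` and the tiny sample image are hypothetical.

```python
import struct

PE_SIGNATURE = b"PE\x00\x00"   # bytes 50 45 00 00 in the file


def has_pe_signature(data):
    """Follow e_lfanew (at offset 0x3C) and check for the 4-byte PE signature."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    (e_lfanew,) = struct.unpack_from("<l", data, 0x3C)
    if e_lfanew < 0 or e_lfanew + 4 > len(data):
        return False
    return data[e_lfanew:e_lfanew + 4] == PE_SIGNATURE


# Hypothetical minimal image: a DOS header whose e_lfanew is 0x40, then the signature.
image = b"MZ" + bytes(58) + struct.pack("<l", 0x40) + PE_SIGNATURE
print(has_pe_signature(image))
# The same four bytes interpreted as a little-endian DWORD.
print(hex(struct.unpack("<I", PE_SIGNATURE)[0]))
```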

NT headers: file header (IMAGE_FILE_HEADER)
It has 7 members and occupies 0x14 bytes, i.e. 20 bytes.
So it is the next selection.

WORD Machine; Each CPU type has a unique machine code. For 32-bit Intel x86 compatible processors it is 0x014C. The full list follows.

#define IMAGE_FILE_MACHINE_UNKNOWN           0
#define IMAGE_FILE_MACHINE_I386              0x014c  // Intel 386.
#define IMAGE_FILE_MACHINE_R3000             0x0162  // MIPS little-endian, 0x160 big-endian
#define IMAGE_FILE_MACHINE_R4000             0x0166  // MIPS little-endian
#define IMAGE_FILE_MACHINE_R10000            0x0168  // MIPS little-endian
#define IMAGE_FILE_MACHINE_WCEMIPSV2         0x0169  // MIPS little-endian WCE v2
#define IMAGE_FILE_MACHINE_ALPHA             0x0184  // Alpha_AXP
#define IMAGE_FILE_MACHINE_SH3               0x01a2  // SH3 little-endian
#define IMAGE_FILE_MACHINE_SH3DSP            0x01a3
#define IMAGE_FILE_MACHINE_SH3E              0x01a4  // SH3E little-endian
#define IMAGE_FILE_MACHINE_SH4               0x01a6  // SH4 little-endian
#define IMAGE_FILE_MACHINE_SH5               0x01a8  // SH5
#define IMAGE_FILE_MACHINE_ARM               0x01c0  // ARM little-endian
#define IMAGE_FILE_MACHINE_THUMB             0x01c2  // ARM Thumb/Thumb-2 little-endian
#define IMAGE_FILE_MACHINE_ARMNT             0x01c4  // ARM Thumb-2 little-endian
#define IMAGE_FILE_MACHINE_AM33              0x01d3
#define IMAGE_FILE_MACHINE_POWERPC           0x01f0  // IBM PowerPC little-endian
#define IMAGE_FILE_MACHINE_POWERPCFP         0x01f1
#define IMAGE_FILE_MACHINE_IA64              0x0200  // Intel 64
#define IMAGE_FILE_MACHINE_MIPS16            0x0266  // MIPS
#define IMAGE_FILE_MACHINE_ALPHA64           0x0284  // ALPHA64
#define IMAGE_FILE_MACHINE_MIPSFPU           0x0366  // MIPS
#define IMAGE_FILE_MACHINE_MIPSFPU16         0x0466  // MIPS
#define IMAGE_FILE_MACHINE_AXP64             IMAGE_FILE_MACHINE_ALPHA64
#define IMAGE_FILE_MACHINE_TRICORE           0x0520  // Infineon
#define IMAGE_FILE_MACHINE_CEF               0x0cef
#define IMAGE_FILE_MACHINE_EBC               0x0ebc  // EFI Byte Code
#define IMAGE_FILE_MACHINE_AMD64             0x8664  // AMD64 (K8)
#define IMAGE_FILE_MACHINE_M32R              0x9041  // M32R little-endian
#define IMAGE_FILE_MACHINE_CEE               0xc0ee

WORD NumberOfSections;

The number of sections. It must be greater than 0. Typical sections are the code section (.text), the data section (.data), and the resource section (.rsrc). An error occurs if the value does not match the actual number of sections.
DWORD TimeDateStamp;

When the file was created.
DWORD PointerToSymbolTable;

The file offset of the COFF symbol table. It is rarely used and seldom documented online.
DWORD NumberOfSymbols;

The number of entries in the table above.
WORD SizeOfOptionalHeader;

The size of IMAGE_OPTIONAL_HEADER32 (the optional header).
WORD Characteristics;

Flags that identify the attributes of the file. Two values are worth remembering: 0x0002 marks an executable file and 0x2000 marks a DLL. The full list follows.

#define IMAGE_FILE_RELOCS_STRIPPED           0x0001  // Relocation info stripped from file.
#define IMAGE_FILE_EXECUTABLE_IMAGE          0x0002  // File is executable (i.e. no unresolved external references).
#define IMAGE_FILE_LINE_NUMS_STRIPPED        0x0004  // Line numbers stripped from file.
#define IMAGE_FILE_LOCAL_SYMS_STRIPPED       0x0008  // Local symbols stripped from file.
#define IMAGE_FILE_AGGRESIVE_WS_TRIM         0x0010  // Aggressively trim working set
#define IMAGE_FILE_LARGE_ADDRESS_AWARE       0x0020  // App can handle >2GB addresses
#define IMAGE_FILE_BYTES_REVERSED_LO         0x0080  // Bytes of machine word are reversed.
#define IMAGE_FILE_32BIT_MACHINE             0x0100  // 32 bit word machine.
#define IMAGE_FILE_DEBUG_STRIPPED            0x0200  // Debugging info stripped from file in .DBG file
#define IMAGE_FILE_REMOVABLE_RUN_FROM_SWAP   0x0400  // If image is on removable media, copy and run from the swap file.
#define IMAGE_FILE_NET_RUN_FROM_SWAP         0x0800  // If image is on the net, copy and run from the swap file.
#define IMAGE_FILE_SYSTEM                    0x1000  // System file.
#define IMAGE_FILE_DLL                       0x2000  // File is a DLL.
#define IMAGE_FILE_UP_SYSTEM_ONLY            0x4000  // File should only be run on a UP machine
#define IMAGE_FILE_BYTES_REVERSED_HI         0x8000  // Bytes of machine word are reversed.
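As a rough illustration of the file header, the sketch below unpacks its 7 members and names a few of the Characteristics bits from the list above. The function name, the dictionary keys, and the sample values are hypothetical, and only a handful of flags are included.

```python
import struct

FILE_HEADER_FORMAT = "<HHIIIHH"   # the 7 members of IMAGE_FILE_HEADER
FILE_HEADER_SIZE = struct.calcsize(FILE_HEADER_FORMAT)   # 20 bytes, i.e. 0x14

# A handful of Characteristics bits (not exhaustive).
CHARACTERISTIC_NAMES = {
    0x0001: "RELOCS_STRIPPED",
    0x0002: "EXECUTABLE_IMAGE",
    0x0020: "LARGE_ADDRESS_AWARE",
    0x0100: "32BIT_MACHINE",
    0x2000: "DLL",
}


def parse_file_header(data, offset=0):
    """Unpack IMAGE_FILE_HEADER at `offset` and name the known flags that are set."""
    (machine, number_of_sections, time_date_stamp, pointer_to_symbol_table,
     number_of_symbols, size_of_optional_header,
     characteristics) = struct.unpack_from(FILE_HEADER_FORMAT, data, offset)
    return {
        "Machine": machine,
        "NumberOfSections": number_of_sections,
        "TimeDateStamp": time_date_stamp,
        "PointerToSymbolTable": pointer_to_symbol_table,
        "NumberOfSymbols": number_of_symbols,
        "SizeOfOptionalHeader": size_of_optional_header,
        "Characteristics": characteristics,
        "Flags": [name for bit, name in CHARACTERISTIC_NAMES.items()
                  if characteristics & bit],
    }


# Hypothetical header: i386, 3 sections, 0xE0-byte optional header, flags 0x0102.
sample = struct.pack(FILE_HEADER_FORMAT, 0x014C, 3, 0, 0, 0, 0xE0, 0x0102)
header = parse_file_header(sample)
print(hex(header["Machine"]), header["NumberOfSections"], header["Flags"])
```

In a real file the header starts 4 bytes after the offset stored in e_lfanew, right behind the signature.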

NT headers: optional header (IMAGE_OPTIONAL_HEADER32)
This structure is large. First, look at the definitions it depends on.

typedef struct _IMAGE_DATA_DIRECTORY {
    DWORD   VirtualAddress;
    DWORD   Size;
} IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;

#define IMAGE_NUMBEROF_DIRECTORY_ENTRIES    16


The actual definition:

typedef struct _IMAGE_OPTIONAL_HEADER {
    //
    // Standard fields.
    //

    WORD    Magic;
    BYTE    MajorLinkerVersion;
    BYTE    MinorLinkerVersion;
    DWORD   SizeOfCode;
    DWORD   SizeOfInitializedData;
    DWORD   SizeOfUninitializedData;
    DWORD   AddressOfEntryPoint;
    DWORD   BaseOfCode;
    DWORD   BaseOfData;

    //
    // NT additional fields.
    //

    DWORD   ImageBase;
    DWORD   SectionAlignment;
    DWORD   FileAlignment;
    WORD    MajorOperatingSystemVersion;
    WORD    MinorOperatingSystemVersion;
    WORD    MajorImageVersion;
    WORD    MinorImageVersion;
    WORD    MajorSubsystemVersion;
    WORD    MinorSubsystemVersion;
    DWORD   Win32VersionValue;
    DWORD   SizeOfImage;
    DWORD   SizeOfHeaders;
    DWORD   CheckSum;
    WORD    Subsystem;
    WORD    DllCharacteristics;
    DWORD   SizeOfStackReserve;
    DWORD   SizeOfStackCommit;
    DWORD   SizeOfHeapReserve;
    DWORD   SizeOfHeapCommit;
    DWORD   LoaderFlags;
    DWORD   NumberOfRvaAndSizes;
    IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES];
} IMAGE_OPTIONAL_HEADER32, *PIMAGE_OPTIONAL_HEADER32;


//
Standard fields.
//


WORD Magic; 0x10B for 32-bit (PE32), 0x20B for 64-bit (PE32+)
BYTE MajorLinkerVersion; Linker major version
BYTE MinorLinkerVersion; Linker minor version
DWORD SizeOfCode; Size of the code section
DWORD SizeOfInitializedData; Size of the initialized data
DWORD SizeOfUninitializedData; Size of the uninitialized data
DWORD AddressOfEntryPoint; RVA of the entry point. This one is important: it indicates where execution begins
DWORD BaseOfCode; RVA (relative virtual address) of the start of the code section
DWORD BaseOfData; RVA of the start of the data section


//
NT additional fields.
//
DWORD ImageBase; Preferred load address of the file. At run time, EIP is set to ImageBase + AddressOfEntryPoint
DWORD SectionAlignment; Alignment of sections in memory, i.e. the smallest unit a section occupies in memory
DWORD FileAlignment; Alignment of sections in the file, i.e. the smallest unit a section occupies on disk; unused space is padded with zeros
WORD MajorOperatingSystemVersion; Minimum required operating system major version
WORD MinorOperatingSystemVersion; Minimum required operating system minor version
WORD MajorImageVersion; Major version of the executable file
WORD MinorImageVersion; Minor version of the executable file
WORD MajorSubsystemVersion; Minimum required subsystem major version
WORD MinorSubsystemVersion; Minimum required subsystem minor version
DWORD Win32VersionValue; Reserved
DWORD SizeOfImage; Total size of the image once loaded into memory

DWORD SizeOfHeaders; Size of all headers (from the DOS header through the section headers)
DWORD CheckSum; Checksum
WORD Subsystem; Subsystem type of the executable file (for example, a system driver or a normal executable file)
WORD DllCharacteristics; DLL characteristic flags
DWORD SizeOfStackReserve; Stack size reserved for a thread
DWORD SizeOfStackCommit; Stack size committed initially
DWORD SizeOfHeapReserve; Reserved heap size
DWORD SizeOfHeapCommit; Committed heap size
DWORD LoaderFlags; Obsolete
DWORD NumberOfRvaAndSizes; Number of data directory entries, i.e. the number of DataDirectory elements below
IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES]; The data directory table, covering the import and export tables, resources, and so on
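To make the field list concrete, here is a small sketch that unpacks the fixed part of a 32-bit optional header and derives the entry point address as ImageBase + AddressOfEntryPoint. The function name and all sample values are hypothetical. The computed address only holds when the image is actually loaded at its preferred base.

```python
import struct

# IMAGE_OPTIONAL_HEADER32 up to NumberOfRvaAndSizes (30 fields, 96 bytes);
# the 16-entry DataDirectory array (128 bytes) follows it.
OPTIONAL32_FORMAT = "<HBB6I3I6H4I2H4I2I"
OPTIONAL32_FIXED_SIZE = struct.calcsize(OPTIONAL32_FORMAT)   # 96
OPTIONAL32_TOTAL_SIZE = OPTIONAL32_FIXED_SIZE + 16 * 8       # 224, i.e. 0xE0


def summarize_optional_header32(data, offset=0):
    """Pick out a few key fields of a PE32 optional header."""
    fields = struct.unpack_from(OPTIONAL32_FORMAT, data, offset)
    magic = fields[0]
    if magic != 0x10B:
        raise ValueError("not a PE32 optional header (Magic is %#x)" % magic)
    address_of_entry_point = fields[6]
    image_base = fields[9]
    return {
        "Magic": magic,
        "AddressOfEntryPoint": address_of_entry_point,
        "ImageBase": image_base,
        "EntryPointVA": image_base + address_of_entry_point,
        "SectionAlignment": fields[10],
        "FileAlignment": fields[11],
        "NumberOfRvaAndSizes": fields[29],
    }


# Hypothetical values, chosen only to exercise the parser.
values = [0] * 30
values[0] = 0x10B          # Magic: PE32
values[6] = 0x1000         # AddressOfEntryPoint (RVA)
values[9] = 0x00400000     # ImageBase
values[10] = 0x1000        # SectionAlignment
values[11] = 0x200         # FileAlignment
values[29] = 16            # NumberOfRvaAndSizes
sample = struct.pack(OPTIONAL32_FORMAT, *values)
summary = summarize_optional_header32(sample)
print(hex(summary["EntryPointVA"]))
```

The sketch rejects Magic 0x20B on purpose, because the PE32+ layout drops BaseOfData and widens several fields, so the same format string would not apply.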


Let's look at where it actually is in the file: right before the .text section header.


It is 0xE0 bytes, i.e. 224 bytes = 16 * 14, so it spans 14 rows of 16 bytes.

(The addresses in the left column are in decimal.)



The focus is on the last member, IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES];

Although the constant is defined as 16, the actual number of entries is determined by the second-to-last member, NumberOfRvaAndSizes.

15 entries are defined in WinNT.h, and the last one is reserved.

#define IMAGE_DIRECTORY_ENTRY_EXPORT          0   // Export Directory
#define IMAGE_DIRECTORY_ENTRY_IMPORT          1   // Import Directory
#define IMAGE_DIRECTORY_ENTRY_RESOURCE        2   // Resource Directory
#define IMAGE_DIRECTORY_ENTRY_EXCEPTION       3   // Exception Directory
#define IMAGE_DIRECTORY_ENTRY_SECURITY        4   // Security Directory
#define IMAGE_DIRECTORY_ENTRY_BASERELOC       5   // Base Relocation Table
#define IMAGE_DIRECTORY_ENTRY_DEBUG           6   // Debug Directory
//      IMAGE_DIRECTORY_ENTRY_COPYRIGHT       7   // (X86 usage)
#define IMAGE_DIRECTORY_ENTRY_ARCHITECTURE    7   // Architecture Specific Data
#define IMAGE_DIRECTORY_ENTRY_GLOBALPTR       8   // RVA of GP
#define IMAGE_DIRECTORY_ENTRY_TLS             9   // TLS Directory
#define IMAGE_DIRECTORY_ENTRY_LOAD_CONFIG    10   // Load Configuration Directory
#define IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT   11   // Bound Import Directory in headers
#define IMAGE_DIRECTORY_ENTRY_IAT            12   // Import Address Table
#define IMAGE_DIRECTORY_ENTRY_DELAY_IMPORT   13   // Delay Load Import Descriptors
#define IMAGE_DIRECTORY_ENTRY_COM_DESCRIPTOR 14   // COM Runtime descriptor

Index 7 is interpreted differently depending on the CPU architecture.


This member matters because the DataDirectory array holds the RVA and size of the import table (which DLLs are used), the export table, the TLS (Thread Local Storage) directory, and so on.
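Reading the array is a matter of unpacking consecutive (RVA, size) pairs of 8 bytes each. The following sketch assumes the caller already knows the file offset of DataDirectory and the value of NumberOfRvaAndSizes. The function name, the short directory names, and the sample table (including the import size) are made up for illustration.

```python
import struct

# Short names for the 15 defined indexes; index 15 is reserved.
DIRECTORY_NAMES = [
    "EXPORT", "IMPORT", "RESOURCE", "EXCEPTION", "SECURITY", "BASERELOC",
    "DEBUG", "ARCHITECTURE", "GLOBALPTR", "TLS", "LOAD_CONFIG",
    "BOUND_IMPORT", "IAT", "DELAY_IMPORT", "COM_DESCRIPTOR",
]


def parse_data_directories(data, offset, count):
    """Read `count` IMAGE_DATA_DIRECTORY entries (RVA, size) starting at `offset`."""
    count = min(count, 16)   # the array never holds more than 16 entries
    entries = {}
    for index in range(count):
        rva, size = struct.unpack_from("<II", data, offset + index * 8)
        name = DIRECTORY_NAMES[index] if index < len(DIRECTORY_NAMES) else "RESERVED"
        entries[name] = (rva, size)
    return entries


# Hypothetical table: empty export entry, import table at RVA 0x7604
# (the size 0xC8 is an assumed value), and 14 zeroed entries.
table = struct.pack("<II", 0, 0) + struct.pack("<II", 0x7604, 0xC8) + bytes(14 * 8)
dirs = parse_data_directories(table, 0, 16)
print(dirs["EXPORT"], tuple(hex(v) for v in dirs["IMPORT"]))
```

An entry whose RVA and size are both zero means the corresponding table is absent from the file.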


About the export table:

Here is the original text from a blog.

English: When the PE loader runs a program, it loads the associated DLLs into the process address space. It then extracts information about the import functions from the main program. It uses the information to search the DLLs for the addresses of the functions to be patched into the main program. The place in the DLLs where the PE loader looks for the addresses of the functions is the export table.

My translation may not be entirely accurate.

Translation: When the PE loader runs a program, it loads the DLLs listed in the import table into the address space of the process. It then extracts the information about the imported functions from the main program and searches the DLLs for the addresses of those functions, which are patched into the places where the main program calls them. The export table is where a function address is looked up inside a DLL.

In short, the export table exists for others: a module that exports functions has to tell other modules where those functions are.


Where is the last member in the file? Each array element takes 8 bytes and there are 16 of them, so count back 128 (0x80) bytes from the end of the optional header.

(The addresses in the left column are in decimal.)


You can see that the export table entry at index 0 is all zeros. This is Notepad, which has no export table. DLLs generally have one.

Index 1 is the import table. Its RVA is 0x7604, but to find it in the file the RVA has to be converted to a file offset, because the value stored in the file is an address relative to the image base in memory.

This is where the section headers come in (see the section header part below). The fourth 4-byte field of each section header is VirtualAddress, the RVA of the section. If this is unclear, check the position of VirtualAddress in the section header structure.

You can see that the RVA of .text is 0x1000, the RVA of .data is 0x9000, and the RVA of .rsrc is 0xB000.

0x7604 obviously falls inside the .text section, so its distance from the start of .text is 0x7604 - 0x1000 = 0x6604.

The distance from the start of .text is the same in the file and in memory, so 0x6604 is also the distance from the start of .text in the file.

Finally, add the file offset of the start of .text (PointerToRawData). PointerToRawData is also in the section header, two 4-byte fields after VirtualAddress, as the figure shows.

(The addresses in the left column are in decimal.)


So the file offset of RVA 0x7604 is 0x0400 + 0x6604 = 0x6A04.
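The conversion walked through above can be sketched as a small function: find the section whose RVA range contains the address, subtract the section's VirtualAddress, and add its PointerToRawData. The `Section` tuple and the function name are hypothetical. The .text RVA and file offset come from the walkthrough, while the virtual size is an assumed value that simply runs up to the start of .data.

```python
from collections import namedtuple

# Only the section header fields needed for the conversion.
Section = namedtuple("Section", "name virtual_address virtual_size pointer_to_raw_data")


def rva_to_file_offset(rva, sections):
    """Map an RVA to a file offset using the section that contains it."""
    for section in sections:
        start = section.virtual_address
        if start <= rva < start + section.virtual_size:
            return rva - start + section.pointer_to_raw_data
    raise ValueError("RVA %#x is not inside any section" % rva)


# .text: RVA 0x1000 and file offset 0x0400 as in the walkthrough;
# the size 0x8000 is an assumption (up to .data at 0x9000).
sections = [Section(".text", 0x1000, 0x8000, 0x0400)]
print(hex(rva_to_file_offset(0x7604, sections)))  # 0x6a04
```

A more careful version would also compare the result against SizeOfRawData, since the part of a section beyond its raw data exists only in memory.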

Let's check whether offset 0x6A04 really holds import table data, that is, an import descriptor structure.


You can see exactly 5 DWORDs, i.e. 20 bytes.

typedef struct _IMAGE_IMPORT_DESCRIPTOR {
    union {
        DWORD   Characteristics;            // 0 for terminating null import descriptor
        DWORD   OriginalFirstThunk;         // RVA to original unbound IAT (PIMAGE_THUNK_DATA)
    } DUMMYUNIONNAME;
    DWORD   TimeDateStamp;                  // 0 if not bound,
                                            // -1 if bound, and real date\time stamp
                                            //     in IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT (new BIND)
                                            // O.W. date/time stamp of DLL bound to (old BIND)
    DWORD   ForwarderChain;                 // -1 if no forwarders
    DWORD   Name;
    DWORD   FirstThunk;                     // RVA to IAT (if bound this IAT has actual addresses)
} IMAGE_IMPORT_DESCRIPTOR;
typedef IMAGE_IMPORT_DESCRIPTOR UNALIGNED *PIMAGE_IMPORT_DESCRIPTOR;
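The import table is an array of these 20-byte descriptors that ends with an all-zero entry. As a hedged sketch, the generator below walks such an array in a byte buffer. The function name and the sample RVAs are invented, and resolving the Name RVA to the actual DLL name string is left out.

```python
import struct

IMPORT_DESCRIPTOR_FORMAT = "<5I"   # 5 DWORDs
IMPORT_DESCRIPTOR_SIZE = struct.calcsize(IMPORT_DESCRIPTOR_FORMAT)   # 20 bytes


def iter_import_descriptors(data, offset):
    """Yield import descriptors from `offset` until the all-zero terminator."""
    while offset + IMPORT_DESCRIPTOR_SIZE <= len(data):
        fields = struct.unpack_from(IMPORT_DESCRIPTOR_FORMAT, data, offset)
        if fields == (0, 0, 0, 0, 0):
            return   # terminating null descriptor
        original_first_thunk, time_date_stamp, forwarder_chain, name, first_thunk = fields
        yield {
            "OriginalFirstThunk": original_first_thunk,
            "TimeDateStamp": time_date_stamp,
            "ForwarderChain": forwarder_chain,
            "Name": name,               # RVA of the DLL name string
            "FirstThunk": first_thunk,  # RVA of the IAT for this DLL
        }
        offset += IMPORT_DESCRIPTOR_SIZE


# Hypothetical table: two descriptors with made-up RVAs, then the terminator.
table = (struct.pack(IMPORT_DESCRIPTOR_FORMAT, 0x7990, 0, 0xFFFFFFFF, 0x7AAC, 0x12C4)
         + struct.pack(IMPORT_DESCRIPTOR_FORMAT, 0x7840, 0, 0xFFFFFFFF, 0x7AFA, 0x1174)
         + bytes(IMPORT_DESCRIPTOR_SIZE))
descriptors = list(iter_import_descriptors(table, 0))
print(len(descriptors), hex(descriptors[0]["Name"]))
```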

Section header

#define IMAGE_SIZEOF_SHORT_NAME              8

typedef struct _IMAGE_SECTION_HEADER {
    BYTE    Name[IMAGE_SIZEOF_SHORT_NAME];
    union {
            DWORD   PhysicalAddress;
            DWORD   VirtualSize;
    } Misc;
    DWORD   VirtualAddress;
    DWORD   SizeOfRawData;
    DWORD   PointerToRawData;
    DWORD   PointerToRelocations;
    DWORD   PointerToLinenumbers;
    WORD    NumberOfRelocations;
    WORD    NumberOfLinenumbers;
    DWORD   Characteristics;
} IMAGE_SECTION_HEADER, *PIMAGE_SECTION_HEADER;


BYTE Name[IMAGE_SIZEOF_SHORT_NAME]; The name of the section, usually starting with a dot
union {
DWORD PhysicalAddress;
DWORD VirtualSize; The size of the section in memory, not necessarily rounded up to the memory alignment
} Misc;
DWORD VirtualAddress; The starting RVA of the section in memory
DWORD SizeOfRawData; The size of the section on disk
DWORD PointerToRawData; The file offset at which the section starts on disk
DWORD PointerToRelocations;
DWORD PointerToLinenumbers;
WORD NumberOfRelocations;
WORD NumberOfLinenumbers;
DWORD Characteristics; Section attributes, a bitwise OR of the following flags


#define IMAGE_SCN_CNT_CODE               0x00000020  // Section contains code.
#define IMAGE_SCN_CNT_INITIALIZED_DATA   0x00000040  // Section contains initialized data.
#define IMAGE_SCN_CNT_UNINITIALIZED_DATA 0x00000080  // Section contains uninitialized data.


#define IMAGE_SCN_MEM_SHARED             0x10000000  // Section is shareable.
#define IMAGE_SCN_MEM_EXECUTE            0x20000000  // Section is executable.
#define IMAGE_SCN_MEM_READ               0x40000000  // Section is readable.
#define IMAGE_SCN_MEM_WRITE              0x80000000  // Section is writeable.

For example:

Code section: executable and readable

Data section: not executable, readable and writable

Resource section: not executable, readable
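The three combinations above can be expressed by OR-ing the flag values and decoded again by testing each bit. This is only an illustrative sketch: the helper name is made up, and the content flags chosen for each example section are an assumption about a typical layout.

```python
# Section flag values as defined in WinNT.h.
SECTION_FLAGS = {
    0x00000020: "CNT_CODE",
    0x00000040: "CNT_INITIALIZED_DATA",
    0x00000080: "CNT_UNINITIALIZED_DATA",
    0x10000000: "MEM_SHARED",
    0x20000000: "MEM_EXECUTE",
    0x40000000: "MEM_READ",
    0x80000000: "MEM_WRITE",
}


def describe_section_flags(characteristics):
    """Return the names of the known flags set in a Characteristics value."""
    return [name for bit, name in SECTION_FLAGS.items() if characteristics & bit]


CODE_SECTION = 0x00000020 | 0x20000000 | 0x40000000       # code, execute, read
DATA_SECTION = 0x00000040 | 0x40000000 | 0x80000000       # data, read, write
RESOURCE_SECTION = 0x00000040 | 0x40000000                # data, read only

for value in (CODE_SECTION, DATA_SECTION, RESOURCE_SECTION):
    print(hex(value), describe_section_flags(value))
```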


Now let's see where these are in Notepad. First, each section header is 0x28 = 40 bytes in size.


From the names shown we can see there are 3 section headers. The first field, the section name, matches as well.

.text section header


.data section header


.rsrc section header
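For completeness, here is a minimal sketch that reads consecutive 40-byte section headers from a buffer. The function name and dictionary keys are hypothetical. In the sample, the three RVAs and the .text file offset follow the walkthrough, while every size, the other file offsets, and the flag values are assumptions.

```python
import struct

SECTION_HEADER_FORMAT = "<8sIIIIIIHHI"
SECTION_HEADER_SIZE = struct.calcsize(SECTION_HEADER_FORMAT)   # 40 bytes, i.e. 0x28


def parse_section_headers(data, offset, count):
    """Read `count` IMAGE_SECTION_HEADER structures starting at `offset`."""
    sections = []
    for index in range(count):
        (name, virtual_size, virtual_address, size_of_raw_data, pointer_to_raw_data,
         _relocs, _linenums, _nrelocs, _nlinenums, characteristics) = struct.unpack_from(
            SECTION_HEADER_FORMAT, data, offset + index * SECTION_HEADER_SIZE)
        sections.append({
            # Names shorter than 8 bytes are padded with zero bytes.
            "Name": name.rstrip(b"\x00").decode("ascii", errors="replace"),
            "VirtualSize": virtual_size,
            "VirtualAddress": virtual_address,
            "SizeOfRawData": size_of_raw_data,
            "PointerToRawData": pointer_to_raw_data,
            "Characteristics": characteristics,
        })
    return sections


def make_header(name, vsize, va, raw_size, raw_ptr, flags):
    """Build one hypothetical section header for the demonstration."""
    return struct.pack(SECTION_HEADER_FORMAT, name, vsize, va, raw_size, raw_ptr,
                       0, 0, 0, 0, flags)


sample = (make_header(b".text", 0x7800, 0x1000, 0x7800, 0x0400, 0x60000020)
          + make_header(b".data", 0x1C00, 0x9000, 0x0800, 0x7C00, 0xC0000040)
          + make_header(b".rsrc", 0x8000, 0xB000, 0x8000, 0x8400, 0x40000040))
for section in parse_section_headers(sample, 0, 3):
    print(section["Name"], hex(section["VirtualAddress"]), hex(section["PointerToRawData"]))
```

In a real file the first section header starts right after the optional header, and the count comes from NumberOfSections in the file header.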



All right, that's it for now.
