Study Notes 3: Introduction to PE files

Source: Internet
Author: User
Previous index:
(1) "Inside Microsoft .NET IL Assembler" Study Notes 1: A preliminary understanding of IL code
(2) "Inside Microsoft .NET IL Assembler" Study Notes 2: Making the IL code shorter

The managed module file format is the standard Windows PE format, so a little understanding of PE files is needed before studying managed modules in more depth. The PE format was created more than a decade ago and has been described many times (the classic account is "Peering Inside the PE: A Tour of the Win32 Portable Executable File Format"), so I will not repeat all of that here. I only list briefly the points I consider most important (my main reference is Iczelion's PE tutorial), mainly to keep these notes complete. For more information, see other materials, especially if basic concepts such as the difference between an RVA and a VA are not yet clear to you.

1. Two states.
When a PE file is used, it is loaded directly into virtual memory, and the image in memory has essentially the same layout as the file stored on disk. So although a PE is a static physical file, it is designed so that the OS can run it easily, and in this sense its static and dynamic structures are the same. (One caveat: a section may sit at a different offset in memory than on disk, because the file and the in-memory image can use different alignment.) Some further operations are performed afterwards to connect the PE with its external environment (such as the caller of the PE). In short, when thinking about the file structure, the consistency of the static and dynamic structures is worth keeping in mind.

2. Calculating actual virtual addresses from the PE load point.
A preferred load address (a virtual memory address) is recorded in the PE file, but the loader is not required to use it (the address may already be occupied by another module in the process). The PE file can therefore be loaded anywhere in the process address space, and the location where it actually ends up is called the base address. Because of this, there has to be a way to specify addresses that does not depend on where the PE is loaded. To avoid hard-coding memory addresses into the PE file, the RVA (relative virtual address) was introduced. An RVA is simply a memory offset relative to the PE load point. For example, if the PE is loaded at 0x400000 and the code section starts at 0x401000, the RVA of the code section is
(target address) 0x401000 - (load address) 0x400000 = (RVA) 0x1000.
Given an RVA and the actual load address of the PE, the actual address can be computed. Internal addressing of a PE file (that is, how every element in the PE is located once it has been loaded into memory) therefore needs only one extra piece of information: the virtual address (base address) at which the PE file was loaded. Every element in the PE file is then addressable from the base address plus the relative offset (RVA) stored in the PE. As a formula, VA - base address = RVA, so once we have the RVA and the base address we can calculate the VA (the virtual address actually used at run time) as VA = base address + RVA.
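The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the formula VA - base address = RVA; the function names va_to_rva and rva_to_va and the second base address 0x10000000 are made up for the example and do not come from any PE library.

```python
def va_to_rva(va: int, image_base: int) -> int:
    """Return the RVA of a virtual address, given the actual base address."""
    if va < image_base:
        raise ValueError("VA lies below the base address")
    return va - image_base


def rva_to_va(rva: int, image_base: int) -> int:
    """Return the run-time virtual address that an RVA refers to."""
    return image_base + rva


# The example from the text: image loaded at 0x400000, code section at 0x401000.
code_rva = va_to_rva(0x401000, 0x400000)
print(hex(code_rva))

# The same RVA resolved against a different (hypothetical) base address,
# as would happen if the preferred address were already taken.
print(hex(rva_to_va(code_rva, 0x10000000)))
```

The point of the second call is that the RVA stored in the file never changes; only the base address supplied at run time does.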

3. Main structure in a PE: the section
Each PE file is divided into several sections. Some sections store the code and data that the program declares and uses directly; others store information that the OS needs to know. Sections are located through the section table, which sits between the PE header and the code and data contained in the PE, and which has one entry pointing to each of the sections that follow. Note that this "pointer" is not only the position of the section in the static PE file: each entry also records where the section lives after the PE has been loaded into the virtual address space of the process. That location is stored as an RVA, from which the actual memory address follows once the base address is known.
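As a rough sketch of what one section table entry holds, the Python below unpacks a single 40-byte entry laid out like the IMAGE_SECTION_HEADER structure from winnt.h. The header bytes are synthetic and every value in them is invented for illustration; the helper names parse_section_header and rva_to_file_offset are hypothetical, not part of any library.

```python
import struct

# Layout of IMAGE_SECTION_HEADER: Name[8], VirtualSize, VirtualAddress,
# SizeOfRawData, PointerToRawData, PointerToRelocations, PointerToLinenumbers,
# NumberOfRelocations, NumberOfLinenumbers, Characteristics (40 bytes total).
SECTION_HEADER = struct.Struct("<8sIIIIIIHHI")


def parse_section_header(raw: bytes) -> dict:
    """Unpack one section table entry into a dictionary."""
    (name, virtual_size, virtual_address, raw_size, raw_pointer,
     _reloc_ptr, _line_ptr, _n_reloc, _n_line, characteristics) = SECTION_HEADER.unpack(raw)
    return {
        "name": name.rstrip(b"\x00").decode("ascii", errors="replace"),
        "virtual_size": virtual_size,
        "virtual_address": virtual_address,   # RVA of the section in memory
        "raw_size": raw_size,
        "raw_pointer": raw_pointer,           # offset of the section in the file
        "characteristics": characteristics,
    }


def rva_to_file_offset(rva: int, sections: list):
    """Map an RVA to a file offset, or None if no section's file data covers it."""
    for s in sections:
        start = s["virtual_address"]
        if start <= rva < start + s["raw_size"]:
            return rva - start + s["raw_pointer"]
    return None


# Synthetic ".text" entry; all numbers are made up for the example.
raw_entry = SECTION_HEADER.pack(b".text", 0x1234, 0x1000, 0x1400, 0x400,
                                0, 0, 0, 0, 0x60000020)
text = parse_section_header(raw_entry)
print(text["name"], hex(text["virtual_address"]), hex(text["raw_pointer"]))
print(hex(rva_to_file_offset(0x1010, [text])))
```

Because each entry carries both an RVA and a file offset, the same table serves the loader (which needs memory locations) and tools that read the file on disk.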

4. Import tables and export tables
If a PE file is completely self-contained (it neither needs functions from other modules nor provides functions to them), its internal addressing is very simple: once you understand the concepts of virtual address, base address, and RVA, the addressing process is easy to follow. Unfortunately, in practice almost every PE file needs imported functions. An imported function is one that a module calls but that does not live in the calling module, hence the name "import". The imported function actually resides in one or more DLLs. The calling module keeps only some information about the function, including the function name and the name of the DLL it lives in. The second entry in the data directory array of the PE file gives the address of the import table. The import table is an array of IMAGE_IMPORT_DESCRIPTOR structures. Each structure holds information about one DLL from which the PE file imports functions. For example, if the PE file imports functions from 10 different DLLs, the array has 10 members. The array is terminated by a member whose fields are all 0.
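The zero-terminated array described above can be walked as in the following Python sketch. It assumes the five-DWORD layout of IMAGE_IMPORT_DESCRIPTOR from winnt.h (OriginalFirstThunk, TimeDateStamp, ForwarderChain, Name, FirstThunk). The table bytes are synthetic, the RVA values are invented, and iter_import_descriptors is a hypothetical helper name.

```python
import struct

# IMAGE_IMPORT_DESCRIPTOR: five little-endian DWORDs, 20 bytes in total.
IMPORT_DESCRIPTOR = struct.Struct("<5I")


def iter_import_descriptors(data: bytes, offset: int = 0):
    """Yield one dict per imported DLL until the all-zero terminator is reached."""
    while offset + IMPORT_DESCRIPTOR.size <= len(data):
        fields = IMPORT_DESCRIPTOR.unpack_from(data, offset)
        if not any(fields):
            return  # the all-zero member marks the end of the array
        original_first_thunk, _timestamp, _forwarder, name_rva, first_thunk = fields
        yield {
            "original_first_thunk": original_first_thunk,
            "name_rva": name_rva,        # RVA of the DLL name string
            "first_thunk": first_thunk,
        }
        offset += IMPORT_DESCRIPTOR.size


# Synthetic table: two descriptors followed by the terminator (values made up).
table = (IMPORT_DESCRIPTOR.pack(0x2050, 0, 0, 0x2100, 0x2000)
         + IMPORT_DESCRIPTOR.pack(0x2070, 0, 0, 0x2110, 0x2020)
         + IMPORT_DESCRIPTOR.pack(0, 0, 0, 0, 0))

descriptors = list(iter_import_descriptors(table))
print(len(descriptors))
for d in descriptors:
    print(hex(d["name_rva"]))
```

In a real file the DLL name would still have to be read from the location that name_rva refers to; that step is left out here to keep the sketch focused on the array termination rule.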
