Notes on "Programmer Self-Cultivation": loading executable files

Source: Internet
Author: User


The instructions and data a program needs must be in memory for it to run. The simplest approach is to load all of the program's instructions and data into memory before it starts, so that it can run without interruption. This is static loading, the simplest form of loading. However, a program may need more memory than is physically available, so static loading is often not realistic.

 


Later research found that programs obey the principle of locality, so the most frequently used parts of a program can be kept in memory while less frequently used data stays on disk. This is dynamic loading.

 



Overlay loading and page mapping are two typical dynamic loading methods. With overlays, two modules that never run at the same time can share one region of memory, and each module is loaded when it is needed. Overlay loading is now essentially obsolete.

 


Page mapping is part of the virtual memory mechanism and was born with the invention of virtual memory. Under page mapping, programs are loaded and operated on in units of pages. The most common Intel IA-32 processors generally use 4 KB pages.

 


Assume that all the instructions and data of a program add up to 32 KB, so with 4 KB pages the program is divided into 8 pages, numbered P0-P7, and assume that physical memory is only 16 KB, that is, 4 page frames numbered F0-F3. If the program's entry address lies in P0, the load manager finds at the start of execution that P0 is not in memory, so it allocates F0 to P0 and loads the contents of P0 into F0. After running for a while, the program needs P5, so the load manager loads P5 into F1. In the same way, when the program uses P3 and P6, they are loaded into F2 and F3 respectively.

 


Obviously, if the program only needs pages P0, P3, P5, and P6, it can keep running as it is. However, if it now needs to access P4, the load manager has to make a choice: it must give up one of the four physical pages currently in use in order to load P4. Several algorithms can be used to make this choice, such as first-in-first-out (FIFO) and least-recently-used (LRU).
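
The example above can be simulated in a few lines. The following is only a minimal sketch: the function name simulate and its parameters are made up for illustration, and a real memory manager is far more involved. It replays a sequence of page accesses against a fixed number of frames and evicts a page with either FIFO or LRU when no frame is free.

```python
from collections import OrderedDict


def simulate(accesses, num_frames, policy="FIFO"):
    """Replay page accesses; return (page fault count, frame -> page map)."""
    frames = OrderedDict()          # page -> frame index, in eviction order
    free = list(range(num_frames))  # unused frame indices, lowest first
    faults = 0
    for page in accesses:
        if page in frames:
            if policy == "LRU":
                frames.move_to_end(page)  # mark as most recently used
            continue
        faults += 1
        if free:
            frame = free.pop(0)
        else:
            # FIFO: first entry is the oldest loaded page.
            # LRU: first entry is the least recently used page.
            _, frame = frames.popitem(last=False)
        frames[page] = frame
    return faults, {f"F{fr}": f"P{pg}" for pg, fr in frames.items()}


# The sequence from the text: P0, P5, P3, P6 fill F0-F3.
print(simulate([0, 5, 3, 6], 4))
# Touch P0 again, then access P4: the two policies evict different pages.
print(simulate([0, 5, 3, 6, 0, 4], 4, "FIFO"))
print(simulate([0, 5, 3, 6, 0, 4], 4, "LRU"))
```

With FIFO, P4 replaces P0 in F0 because P0 was loaded first. With LRU, P0 has just been used again, so P4 replaces P5 in F1 instead.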

 


In fact, the load manager described above is the memory manager of the operating system.

 

Loading executable files from the operating system's perspective


 

From the operating system's perspective, the most important feature of a process is that it has its own independent virtual address space, which distinguishes it from other processes. The article at http://blog.csdn.net/vividonly/archive/2011/05/04/6393516.aspx explains this in detail. To run an executable program, the system first creates a process, then loads the corresponding executable file and executes it. With virtual memory, this requires only three things at the start:


1. Create an independent virtual address space. At this point no mappings between virtual pages and physical pages are set up; they are established later, when page faults occur as the program runs. On a page fault, the operating system allocates a physical page, reads the missing page from disk into it, and then sets up the mapping between the virtual page and the physical page. The address translation itself is performed by the CPU's MMU.

 


2. Read the executable file header and establish the mapping between the virtual address space and the executable file (discussed later). The ELF file header gives the location of the section header table, which in turn gives the location and attributes of each section. This mapping is only a data structure kept inside the operating system. In Linux, a region of the process's virtual address space is called a virtual memory area (VMA). Take the .text segment as an example: once the process has been created, a .text VMA is set up in the process's data structures, recording the segment's address in the virtual space and its offset in the ELF file. When a page fault occurs during execution, the operating system consults this data structure, finds the VMA containing the faulting page, calculates the offset of that page in the executable file, allocates a physical page, reads the page contents into it, maps the virtual page to the newly allocated physical page, and returns control to the process, which resumes at the instruction that faulted.

 

 


3. Set the CPU's instruction pointer register to the entry address of the executable file and start running. The operating system hands control to the process by setting this register. The entry address is the one recorded in the ELF file header.
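
The lookup performed on a page fault in step 2 can be sketched as follows. This is only an illustration under assumptions: the VMA class, the handle_page_fault function, and all addresses are hypothetical, and the kernel's real data structures are considerably richer.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096


@dataclass
class VMA:
    name: str
    start: int        # virtual start address (page aligned)
    end: int          # virtual end address, exclusive
    file_offset: int  # offset in the ELF file of the byte mapped at `start`


def handle_page_fault(vmas, fault_addr):
    """Return (VMA name, start of faulting virtual page, file offset of that page)."""
    page_start = fault_addr & ~(PAGE_SIZE - 1)  # round down to a page boundary
    for vma in vmas:
        if vma.start <= fault_addr < vma.end:
            return vma.name, page_start, vma.file_offset + (page_start - vma.start)
    raise MemoryError(f"no VMA covers address {fault_addr:#x}")


# Hypothetical layout: two VMAs recorded when the process was created.
vmas = [
    VMA(".text", 0x08048000, 0x0804A000, 0x0000),
    VMA(".data", 0x0804A000, 0x0804B000, 0x2000),
]
name, page, offset = handle_page_fault(vmas, 0x08049123)
print(name, hex(page), hex(offset))
```

The returned file offset tells the operating system which page of the executable to read into the newly allocated physical page before the mapping is set up.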

 


The operating system uses VMAs to map the different segments of the executable file, and also uses them to manage the process's address space in general. In fact, the stack and the heap in the process address space also exist as VMAs.

 


Next, let's look at how the mapping between the virtual address space and the executable file is actually established, that is, how the second step above is carried out. First, note that an ELF file is mapped in units of the system page size, so each mapped section must occupy a whole number of pages; any remainder still takes up a full page. An ELF file usually has more than a dozen sections, so the wasted memory is easy to imagine. In practice, section permissions in an ELF file usually fall into one of a few combinations:

  • Readable and executable sections, typified by the code section.
  • Readable and writable sections, typified by the data section and the BSS section.
  • Read-only sections, typified by the read-only data section.


The solution on Linux is to merge sections that have the same permissions and map them as a single segment. For example, take two sections named ".text" and ".init", which hold the program's executable code and initialization code. Their permissions are the same (both readable and executable), so they can be merged into one segment to save space.
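
The saving from merging can be checked with a little arithmetic. In the sketch below, the section sizes are invented for illustration, and alignment padding between sections is ignored; only the grouping by permission follows the text.

```python
PAGE_SIZE = 4096


def pages(size):
    """Number of whole pages needed to hold `size` bytes."""
    return -(-size // PAGE_SIZE)  # ceiling division


def merge_sections(sections):
    """Group (name, size, perms) sections by permission, keeping first-seen order."""
    segments = {}
    for name, size, perms in sections:
        seg = segments.setdefault(perms, {"sections": [], "size": 0})
        seg["sections"].append(name)
        seg["size"] += size
    return segments


# Hypothetical section sizes in bytes.
sections = [
    (".init", 23, "r-x"),
    (".text", 5000, "r-x"),
    (".rodata", 300, "r--"),
    (".data", 100, "rw-"),
    (".bss", 200, "rw-"),
]

segments = merge_sections(sections)
separate_pages = sum(pages(size) for _, size, _ in sections)
merged_pages = sum(pages(seg["size"]) for seg in segments.values())
print(separate_pages, merged_pages)
```

With these made-up sizes, mapping every section on its own takes 6 pages, while the three merged segments take 4.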

 


An ELF executable has a dedicated data structure called the program header table, which stores the information about each of these segments, that is, about each VMA to be created. It is in fact the data structure that records the mapping between the virtual address space and the executable file discussed above. The main work of step 2 can therefore be summarized as: read the executable file header and create the mappings according to the executable's program header table.
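
As a rough illustration of reading the file header and walking the program header table, the sketch below parses a tiny synthetic image built in memory. It assumes the 64-bit little-endian ELF layout (the text talks about IA-32, whose 32-bit layout has different field sizes), the function name load_segments is made up, and every value in the demo image is invented.

```python
import struct

EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # Elf64_Ehdr, little-endian, 64 bytes
PHDR = struct.Struct("<IIQQQQQQ")          # Elf64_Phdr, little-endian, 56 bytes
PT_LOAD = 1                                # segment type: loadable


def load_segments(image):
    """Return (entry address, [(vaddr, memsz, file offset, perms), ...])."""
    (ident, _type, _machine, _version, entry, phoff, _shoff, _flags,
     _ehsize, phentsize, phnum, _shentsize, _shnum, _shstrndx) = EHDR.unpack_from(image, 0)
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    mappings = []
    for i in range(phnum):
        (p_type, p_flags, p_offset, p_vaddr, _paddr,
         _filesz, p_memsz, _align) = PHDR.unpack_from(image, phoff + i * phentsize)
        if p_type == PT_LOAD:
            # Flag bits: PF_R = 4, PF_W = 2, PF_X = 1
            perms = "".join(c if p_flags & bit else "-"
                            for c, bit in (("r", 4), ("w", 2), ("x", 1)))
            mappings.append((p_vaddr, p_memsz, p_offset, perms))
    return entry, mappings


# Synthetic image: a header plus two program headers (all values made up).
ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)  # 64-bit, little-endian
ehdr = EHDR.pack(ident, 2, 62, 1, 0x401000, 64, 0, 0, 64, 56, 2, 0, 0, 0)
text = PHDR.pack(PT_LOAD, 5, 0x0000, 0x400000, 0x400000, 0x2000, 0x2000, 0x1000)
data = PHDR.pack(PT_LOAD, 6, 0x2000, 0x403000, 0x403000, 0x100, 0x300, 0x1000)

entry, mappings = load_segments(ehdr + text + data)
print(hex(entry), mappings)
```

Each returned tuple holds what a VMA needs to record: where the segment lives in the virtual address space, how large it is, where it starts in the file, and its permissions.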

 


Executable files are loaded through the page mapping mechanism of virtual memory, in which the page is the smallest unit of mapping. To map a region of physical memory into a process's virtual address space, the length of the region must be an integer multiple of 4096 bytes, and its starting address in both physical memory and the process's virtual address space must also be an integer multiple of 4096. If every segment is allocated a whole number of pages (a segment only 127 bytes long still needs 4096 bytes), a great deal of memory is inevitably wasted. For example, suppose the ELF file contains three segments, seg0, seg1, and seg2, whose lengths round up to one, three, and one pages respectively. With this approach, five physical pages must be allocated in total, corresponding to five virtual pages.

 

To solve this problem, some UNIX systems use a clever trick: the adjoining parts of neighbouring segments share one physical page, and that physical page is mapped twice. In the same example, seg0, seg1, and seg2 together need only three physical pages: page1, page2, and page3. page1 holds seg0 together with the beginning of seg1, and page3 holds the end of seg1 together with seg2. page1 and page3 are therefore each mapped into the virtual address space twice, while page2 is mapped once. Under this scheme, several virtual pages map to one physical page, and a single physical page may contain data from two or more segments.
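
The difference between the two schemes can be worked out numerically. The sketch below is only an illustration: the three lengths are assumed values chosen to reproduce the one, three, and one page counts from the example, and the segments are assumed to lie back to back starting at offset 0.

```python
PAGE_SIZE = 4096


def pages_needed(length):
    """Number of whole pages needed to hold `length` bytes."""
    return -(-length // PAGE_SIZE)  # ceiling division


# Assumed lengths in bytes, not taken from a real file.
segments = [("seg0", 127), ("seg1", 9899), ("seg2", 1988)]

# Scheme 1: every segment gets its own whole pages.
separate = sum(pages_needed(n) for _, n in segments)

# Scheme 2: segments are packed back to back and share boundary pages.
layout, offset = {}, 0
for name, n in segments:
    first = offset // PAGE_SIZE + 1           # physical pages numbered from 1
    last = (offset + n - 1) // PAGE_SIZE + 1
    layout[name] = list(range(first, last + 1))
    offset += n
packed = pages_needed(offset)

print(separate, packed, layout)
```

Under these assumptions the packed layout uses 3 physical pages instead of 5, with page 1 shared by seg0 and seg1 and page 3 shared by seg1 and seg2. The five virtual pages remain, but they now map onto three physical pages.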

 

Through the page mapping mechanism described above, both the mapping between the executable file and the virtual address space and the mapping between the virtual address space and physical memory are determined.

 

One question remains. As mentioned in the earlier article on static linking, the virtual address of each segment in the executable file is already assigned at link time. In other words, the program header table used to establish the mapping in the second step is already created during linking. I am not sure about this, but much of what is discussed above seems to be the linker's work rather than the loader's.

 

Finally, let's summarize how the Linux kernel loads an ELF file:

  • Check the validity of the ELF executable, such as the magic number and the number of segments in the program header table.
  • Look for the ".interp" section used for dynamic linking and set the path of the dynamic linker (related to dynamic linking).
  • Map the ELF file (code, data, read-only data, and so on) according to the program header table of the ELF executable.
  • Initialize the ELF process environment. For example, when the process starts, the EDX register should hold the address of DT_FINI (related to dynamic linking).
  • Change the return address of the system call to the entry point of the ELF executable. The entry point depends on how the program was linked. For a statically linked executable, it is the address given by e_entry in the ELF file header. For a dynamically linked ELF executable, it is the entry point of the dynamic linker. The new program then starts to run, and loading of the ELF executable is complete.
  • Once the new program is running, each page fault (that is, an access to a virtual page that has no physical page associated with it) causes a new physical page to be allocated, the missing page contents (the corresponding part of the ELF file) to be read from disk into memory, and the mapping between the virtual page and the physical page to be set up; execution then resumes at the virtual address that faulted.
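
The first and fifth items above (the validity check and the choice of entry point) can be sketched together. This is a simplified illustration, not kernel code: the function start_address and its parameters are hypothetical, and the addresses in the example are made up.

```python
ELF_MAGIC = b"\x7fELF"


def start_address(e_ident, e_entry, interp_entry=None):
    """Pick the address at which execution starts once the system call returns.

    e_ident      -- first bytes of the file (the ELF identification)
    e_entry      -- entry address from the ELF file header
    interp_entry -- entry address of the dynamic linker, or None for a
                    statically linked executable (hypothetical parameter)
    """
    if e_ident[:4] != ELF_MAGIC:
        raise ValueError("bad ELF magic number")
    # Dynamically linked: control goes to the dynamic linker first.
    # Statically linked: control goes straight to e_entry.
    return interp_entry if interp_entry is not None else e_entry


print(hex(start_address(b"\x7fELF\x01\x01\x01", 0x08048100)))
print(hex(start_address(b"\x7fELF\x01\x01\x01", 0x08048100, 0x40000800)))
```

In the dynamically linked case, the dynamic linker is responsible for eventually jumping to the program's own entry point.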

 

 


