Understanding processes in depth by executing an executable file

Source: Internet
Author: User

Have you ever thought about this? We store an executable file (an application) on disk, so how does it actually get executed? (I used to think that the Windows C: drive was primary storage and that D:, E:, and F: were the disk. Haha, hopefully nobody else was as ignorant as I was.) And why, when writing programs, do the addresses of some variables look so similar from one program to another? What exactly does such an address correspond to? Is it the disk address where the executable file is located? Below I explain in detail how an executable file is executed on Linux (the same holds for Windows, except that the command line is replaced by a graphical interface).

In Linux, opening a shell creates a new process that runs the shell application. When you type the name of an executable object file in the shell, the shell uses the fork() function to create a new process and then calls the execve() function in that new process to load and execute the file.

I am going to explain in detail how the execve() function works. For example, how does the object file on disk get copied into primary storage for the CPU to run? What are the addresses we see in a program? We will analyze these questions step by step.

First of all, execve() runs in a child process of the shell process, and the child has copied the contents of its parent. (Strictly speaking it is not a real copy, since that would make process creation far too heavyweight; it is a mechanism called copy-on-write.) So the first thing execve() must do is discard much of the content inherited from the parent process.

Then it starts mapping the contents of our executable file. (Does "mapping" remind you of the mapping of a function in mathematics? It is essentially the same thing.) A mapping always has the form X ------> Y, and here Y is our executable file. What about X? We need a little background on processes before we can talk about X. Each process has a page table, and the page table contains many items, each of which is called a page table entry. (To keep things simple, let's assume that Linux uses a single-level page table.) In the operating system, the size of a page or physical block is usually 4 KB (corresponding to a 12-bit in-page offset), so on a 32-bit operating system only 2^20 page table entries are needed to cover the address range 0x00000000 to 0xffffffff. The low 12 bits of such an address are the offset within the page. The addresses we see in a program are of this kind; they are not the physical addresses that our program occupies. Remember: such an address is not an actual disk or memory address but a virtual address. If this is still not quite clear, it should be by the end.

It is worth getting this straight now, so as not to be confused later. We just mentioned the page table entries of a process. The entries are numbered from 0x00000 to 0xfffff (2^20 entries in total; drawing a picture may help). Each page table entry has two main parts. The first part points to a physical block on disk or in memory; the second part indicates whether that block is on disk or in memory, or whether the entry is unused.

Now we can say what X is: a virtual address! Having covered X and Y, we still need the mapping rules. The text segment, data segment, stack, and heap of a program each correspond to a different range of virtual addresses in Linux, and this layout is the same for all programs. This also explains why variables in different programs sometimes have similar addresses: every program's virtual addresses run from 0x00000000 to 0xffffffff, so when the variables are stored on the stack, their virtual addresses are also very close.

Once the mapping is done, execve() transfers control to the startup code, which calls the main() function. You may be wondering: the executable object file is still on disk, so how does it get copied into memory and executed by the CPU? When the startup code hands the virtual address of main() to the CPU, the CPU translates the virtual address and finds that no page (physical block) containing main() is in memory. This triggers a page fault: the kernel uses the process's page table entry to find the location of the executable file on disk and copies that block from disk into memory, so that the CPU can carry on executing our program.
