Linux: from program to process

Last Update:2015-05-16 Source: Internet

Author: User

Author: Vamei. Source: http://www.cnblogs.com/vamei Reprints are welcome, but please keep this statement. Thank you!

How does a computer execute a process? This is a core question of how computers operate. Even after a program has been written, the program itself is dead; only a living process can get work done. We have already met processes in Linux Process Basics. Now let's follow the long journey from program to process.

A program

Below is a simple C program. Assume it has already been compiled into an executable file, Vamei.exe.

#include <stdio.h>

int glob = 0;                  /* global variable */

int inner(int inner1);         /* prototype, so that main() can call inner() */

int main(void) {
  int main1 = 5;               /* local variable of main() */
  int main2;                   /* local variable of main() */

  main2 = inner(main1);        /* call inner() */
  printf("from main: glob: %d\n", glob);
  printf("from main: main2: %d\n", main2);
  return 0;
}

int inner(int inner1) {        /* inner1 is an argument, also local to inner() */
  int inner2 = 10;             /* local variable of inner() */
  printf("from inner: glob: %d\n", glob);
  return (inner1 + inner2);
}

(The choice of language or specific syntax is not the point; most languages can express a program like the one above. Readers of the Python tutorial can write a similar program using Python functions and print. It could equally be C++, Java, Objective-C, and so on. C is used here because it is the language that was born with Unix.)

The main() function calls the inner() function. inner() calls printf() once to produce output. Finally, main() calls printf() twice.

Note the scope of the variables. Simply put, variables are either global or local. A variable declared outside all functions, such as glob, is a global variable and can be used anywhere in the program. A variable defined inside a function is a local variable and can be used only within that function's scope. For example, while inner() is running we cannot use main1, which is declared in main(), and in main() we cannot use inner2, which is declared in inner().

Don't worry too much about what this program does. The point is how it runs. The figure below shows the program's execution and the scope of each variable:

Figure: Running the program

Process space

To understand how the program above runs, we also need to know how a process uses memory. When a program file is run as a process, the process is given a space in memory. This space is the process's own little room.

Each process space is divided into regions, as shown below:

Figure: Memory space

The text region stores the instructions, which specify the operation performed at each step. The global data region holds global variables, the stack holds local variables, and the heap holds dynamic variables. The program calls malloc (a C library function, not itself a system call) to obtain space on the heap for dynamic variables. Text and global data are fixed when the process starts and keep the same size for the life of the process.

The stack is organized in frames (stack frames). When the program calls a function, for example when main() calls inner(), the stack grows downward by one frame. The frame stores the function's parameters and local variables, together with the function's return address. At this point the computer transfers control from main() to inner(), and inner() becomes the active function. The frame at the bottom of the stack (the most recently added one), together with the global variables, forms the current environment (context). The active function can look up the variables it needs in this environment. A typical programming language lets you use only the frame at the bottom of the stack and not the other frames, which matches the "first in, last out" nature of a stack. Some languages, such as Pascal with its nested procedures, do allow access to other frames, which is like being able to use local variables declared in main() while inner() is running.

When a function calls yet another function, a new frame is added below the current one and control passes to the new function. When the active function returns, its frame is popped from the stack (read and removed), and control passes to the instruction that the return address points to. For example, when inner() returns, execution continues in main() with the assignment to main2, as directed by the return address recorded in the frame.

The figure below shows how the stack changes while the program runs. The arrow indicates the direction in which the stack grows, and each block represents a frame. At first there is only a frame for main(). When main() calls inner(), a frame for inner() is added. When inner() returns, only the frame for main() remains. Finally main() returns; its return address is empty, so the process ends.

Figure: Stack changes

As a process runs, control passes back and forth between functions through calls and returns. On each call, the frame of the calling function is preserved in the state in which we left it, and a new frame is opened for the called function. When the called function returns, its frame is popped and the space it occupied is released. The process goes back to the state saved in the caller's frame and continues from the instruction that the return address points to. This repeats, with the stack growing and shrinking, until main() returns, the stack is completely emptied, and the process ends.

When a program calls malloc, the heap grows upward, and the newly added portion is the space that malloc hands to the program. Space obtained from malloc persists until we release it with free or until the process ends. A classic error is the memory leak: heap space that is no longer used is never released, so the heap keeps growing and the available memory keeps shrinking.

The stack and the heap grow and shrink as the process runs. When they grow until they meet, that is, when the unused blue region in the memory space diagram has disappeared completely, no memory is left. The process then fails with a stack overflow error and is terminated. On modern computers the kernel usually reserves a large enough blue region for each process, so as long as memory is cleaned up promptly, stack overflow is easy to avoid. Even so, a stack overflow can still occur when the memory load is too heavy, and in that case more physical memory may be needed.

Stack overflow is arguably the most famous computer error, which is why an IT website (stackoverflow.com) is named after it.

In high-level languages these memory management details are hidden from the user. When programming, we only need to remember the variable scopes described earlier. But to write complex programs or to debug, we need this knowledge.

Additional process information

Besides the memory described above, each process also carries some additional information, including the PID, PPID, and PGID (see Linux Process Basics and Linux Process Relationships), which describes the identity of the process, its relationships to other processes, and other statistics. This information is not stored in the process's memory space. Instead, the kernel allocates a variable (a task_struct structure) for each process in the kernel's own space to hold it. By consulting this additional information in its own space, the kernel can learn the status of a process without entering the process's own space (just as we can find out who owns a room without opening its door). Each process's additional information also includes a place dedicated to holding received signals (the "mailbox" mentioned in Linux Signal Basics).

Fork & Exec

We can now understand the mechanics of fork and exec more deeply (see Linux Process Basics). When a program calls fork, it actually copies the memory space described above, including text, global data, heap, and stack, to form a new process, and the kernel creates new additional information for that process (for example a new PID, with the PPID set to the PID of the original process). From then on the two processes run separately. The new process and the original process have the same running state (the same variable values, the same instructions, and so on). We can tell the two apart only through their additional process information.

When a program calls exec, the process empties the text, global data, heap, and stack of its memory space, rebuilds them from the new program file (with the heap and the stack both starting at size 0), and starts running the new program.

(Modern operating systems have refined the concrete mechanisms behind fork and exec for efficiency, but the logic is the same. See books on the Linux kernel for the details.)

This article pulls together a lot of material, so it is rather long. It is mainly conceptual, and many details vary with the language, the platform, and the compiler. In general, though, the concepts above apply to all computer processes, whether on Windows or UNIX. More advanced topics, including threads and interprocess communication (IPC), build on what is described here.

Summary

Functions, variable scope, global/local/dynamic variables

Global data, text

Stack, stack frame, return address, stack overflow

Heap, malloc, free, memory leak

Additional process information, task_struct

Fork & Exec
