Linux: From Program to Process


Author: Vamei. Source: http://www.cnblogs.com/vamei. Reprints are welcome; please keep this statement. Thank you!

 

How does a computer execute a process? This is arguably the core question of how a computer operates. Even after we have written and compiled a program, the program itself is only dead text; it has to become a living process before it can produce anything. We gave an overview of processes in Linux Process Basics. Now let's follow the long journey from program to process.

 

1. A program

The following is a simple C program. We assume that it has already been compiled into an executable file, vamei.exe.

    #include <stdio.h>

    int glob = 0;                      /* global variable */

    int inner(int inner1);             /* prototype, so main() can call inner() */

    int main(void)
    {
        int main1 = 5;                 /* local variable of main() */
        int main2;                     /* local variable of main() */

        main2 = inner(main1);          /* call the inner() function */
        printf("From main: glob: %d\n", glob);
        printf("From main: main2: %d\n", main2);
        return 0;
    }

    int inner(int inner1)              /* inner1 is a parameter, also local to inner() */
    {
        int inner2 = 10;               /* local variable of inner() */

        printf("From inner: glob: %d\n", glob);
        return inner1 + inner2;
    }

(The choice of language and syntax is not critical; most languages can express a similar program. Readers following the Python tutorial can write an equivalent program with Python functions and print. C++, Java, Objective-C, and others would work as well. C is used here because it is the language of UNIX.)

 

In main(), we call inner(), which makes one printf() call within its own scope. After inner() returns, we call printf() twice within the scope of main().

Pay attention to the scope of each variable. Broadly, variables are either global variables or local variables. Variables declared outside all functions, such as glob, are global variables and can be used anywhere. Variables defined inside a function are local variables and can be used only within the scope of that function. For example, while executing inner() we cannot use main1, which is declared in main(), and in main() we cannot use inner2, which is declared in inner().

We do not care what this program computes. We care about how it runs. The figure below shows the running process of the program and the scope of each variable:

(Figure: running process and variable scope)

2. Process space

To understand how the program above runs, we also need to know how a process uses memory. When a program file runs as a process, the process is given its own space in memory (the process's own little room). Each process space is divided into regions, as shown below:

(Figure: memory space of a process)

The text region stores instructions, which tell the program what to do at each step. Global data stores global variables, the stack stores local variables, and the heap stores dynamic variables (the program calls the malloc library function to claim memory for a dynamic variable directly). Text and global data are determined when the process starts and keep a fixed size for the whole life of the process.

 

The stack is organized in units of stack frames. When a program calls a function, for example when main() calls inner(), the stack grows by one stack frame. A stack frame stores the function's parameters and local variables, together with the return address. At this point the computer transfers control from main() to inner(), and inner() becomes the active function. The frame at the very bottom of the stack, together with global data, forms the current context. The active function can use any variable in this context. In a typical programming language, a function can use only the bottom frame and cannot reach into other frames (this matches the "first in, last out" nature of a stack. Some languages, such as Pascal, do allow access to other parts of the stack, which amounts to letting inner() use local variables declared in main() while inner() is running). When a function calls another function, a new frame is added at the bottom of the stack and control passes to the new function. When the active function returns, its frame is popped from the stack (read and removed), and control passes to the instruction at the return address recorded in that frame (for example, returning from inner() and continuing with the assignment to main2 in main()).

The figure below shows how the stack changes while the program runs. The arrow indicates the direction of stack growth, and each block is one stack frame. At the start there is a single frame, serving main(). When inner() is called, a frame is added for inner(). When inner() returns, only the frame of main() remains, until main() itself finally returns. In this simplified picture the return address of main() is empty, so the process ends.

(Figure: stack changes)

As a process runs, functions are called and return, and control shifts constantly from one function to another. When a function is called, the calling function's stack frame preserves the state at the moment we left it, and a new stack frame is opened for the called function. When the called function returns, its frame is popped and the space it occupied is released. The process goes back to the state saved in the caller's frame and continues with the instruction at the return address. This repeats, with the stack growing and shrinking, until main() returns, the stack is emptied, and the process ends.

 

When the program uses malloc, the heap grows, and the added part becomes the space that malloc has allocated from memory. Space obtained with malloc stays in place until we release it by calling free, or until the process ends. A classic error is the memory leak: we fail to release heap space that is no longer in use, so the heap keeps growing and the available memory keeps shrinking.

The sizes of the stack and the heap grow and shrink as the process runs. When the stack and the heap grow until they meet, that is, when the blue region in the memory-space figure (the unused area) is used up completely, no more space is available. The result is a stack overflow, which causes the process to terminate. (In practice the stack usually also has its own size limit, so deep recursion or very large local arrays can overflow it before it ever reaches the heap.) On modern computers the kernel usually gives a process a large enough unused area, and if we release memory promptly, stack overflow can be avoided. Even so, some computations, such as large matrix operations, need so much memory that a stack overflow can still occur. One remedy is to raise the memory limits that the kernel applies to each process. If the problem persists, we need more physical memory.

Stack overflow is arguably the most famous computer error, which is how the IT website stackoverflow.com got its name.

 

In high-level languages, these memory-management details are hidden from the user. When programming, we only need to remember the variable scopes described in the previous section. But when we want to write complex programs or debug a program, this knowledge becomes necessary.

 

3. Additional process information

Besides the regions above, each process also carries some additional process information, including the PID, PPID, and PGID (see Linux Process Basics and Linux Process Relationships), which describes the identity of the process, its relationships, and other statistics. This information is not stored in the process's own memory space. Instead, the kernel allocates a variable (a task_struct structure) for each process to hold it. The kernel can get an overview of a process by looking at this additional information in its own space, without entering the space of the process itself (just as we can learn who owns a house from the nameplate on the door, without opening the door). The additional information of each process also has a place reserved for recording received signals (the "mailbox" mentioned in Linux Signal Basics).

 

4. Fork & Exec

Now we can understand fork and exec more deeply (see Linux Process Basics). When a program calls fork, the memory space described above, including text, global data, heap, and stack, is copied to form a new process, and the kernel creates additional information for the new process (for example, a new PID, with the PPID set to the PID of the original process). After that, the two processes continue to run separately. The new process and the original process have the same running state (the same variable values, the same instructions...). We can tell the two apart only through the additional process information.

When a program calls exec, the process clears the text, global data, heap, and stack in its memory space, rebuilds them from the new program (with heap and stack both starting at size 0), and starts running the new program.

(Modern operating systems implement fork and exec with more efficient mechanisms, but logically there is no difference. For the details, see books on the Linux kernel.)

 

This article pulls together many topics, so it is somewhat long. It is mainly conceptual, and many details vary with the language, platform, and compiler, but in general the concepts above apply to processes on any computer (whether Windows or UNIX). More advanced topics, including threads and inter-process communication (IPC), all build on the material described here.

 

Summary:

Functions, variable scope, global/local/dynamic variables

Global data, text

Stack, stack frame, return address, stack overflow

Heap, malloc, free, memory leak

Additional process information, task_struct

fork & exec
