Process-Initial impressions

Last Update:2016-04-18 Source: Internet

Author: User


Before the concept of a virtualized process address space existed, every program had to know exactly where it was placed in physical memory (programmers assigned the addresses by hand!). That is manageable when there are only a few small programs, but once many general-purpose programs run together, memory management has to be separated from program writing, even though doing so reduces efficiency slightly [1].

Using Physical Address:

Using Virtual Address:

1. Address virtualization and the private address space of a process

Each process has its own address space, called its private address space. "Private" here is a consequence of virtualization. The operating system separates the concept of a memory address (the virtual address) from its implementation (the physical address), so programmers need not think about where their code, variables, and other data will actually sit in physical memory at run time. When writing a program, we do not have to allocate physical memory ourselves, nor worry that different processes operating on the same address value [2] will conflict and leave data inconsistent.

Figure 1: Page table and address translation

Figure 2: Address space mapping for different processes
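The idea behind the two figures can be sketched as a toy model. This is not how any real kernel or MMU is implemented: the `struct pte` layout, the 4 KiB page size, the 16-page address space, and the `translate` function are all assumptions made up for illustration. The sketch only shows how the same virtual address can land on different physical addresses depending on which process's page table is consulted.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12u                  /* assumption: 4 KiB pages */
#define PAGE_SIZE  (1u << PAGE_SHIFT)
#define NUM_PAGES  16u                  /* assumption: tiny toy address space */

/* Hypothetical page table entry: a valid bit plus a physical frame number. */
struct pte {
    bool     valid;
    uint32_t frame;
};

/* Translate vaddr through one process's page table.
 * Returns true and stores the physical address in *paddr on success;
 * returns false if the page is out of range or not mapped. */
bool translate(const struct pte table[NUM_PAGES], uint32_t vaddr, uint32_t *paddr)
{
    uint32_t vpn    = vaddr >> PAGE_SHIFT;       /* virtual page number */
    uint32_t offset = vaddr & (PAGE_SIZE - 1u);  /* offset inside the page */

    if (vpn >= NUM_PAGES || !table[vpn].valid)
        return false;

    *paddr = (table[vpn].frame << PAGE_SHIFT) | offset;
    return true;
}
```

If one process maps virtual page 1 to frame 5 and another maps it to frame 9, the virtual address 0x1234 translates to 0x5234 in the first process and to 0x9234 in the second.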

1.1 Virtual Memory
1.1.1 Why

How Does everything Fit?

Memory Management:

How to Protect:

How to share:

1.1.2 Solution

1.1.3 Benefit

Simplifying Linking and loading:

Memory Protection:
- Extend PTEs with permission bits
- Page fault handler checks these before remapping
  - If violated, send the process SIGSEGV (segmentation fault)

2. UNIX system calls for process control

2.1 Get Process ID

The getpid and getppid functions: getpid returns the PID [3] of the calling process, and getppid returns the PID of its parent (the process that created the calling process).

#include <sys/types.h>
#include <unistd.h>
pid_t getpid(void);
pid_t getppid(void);

pid_t is defined in sys/types.h; on Linux it is an int.
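A minimal sketch of calling both functions follows. The helper name `describe_process` and its output format are assumptions made for illustration; only getpid and getppid come from the text above.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Formats the caller's PID and its parent's PID into buf.
 * Returns whatever snprintf returns (the length of the formatted text). */
int describe_process(char *buf, size_t size)
{
    /* pid_t is cast to long because its exact width is implementation-defined. */
    return snprintf(buf, size, "pid=%ld ppid=%ld",
                    (long)getpid(), (long)getppid());
}
```

Calling it from a shell-launched program typically shows the shell's PID as the ppid value.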

2.2 Creating and terminating processes

1) Termination. A process terminates for one of three reasons:

It receives a signal whose default action is to terminate the process.

It returns from the main function.

It calls the exit function.

#include <stdlib.h>
void exit(int status);

The exit function terminates the process with the exit status given by status.
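To see where the exit status ends up, the sketch below has a child call exit and the parent read the status back. It relies on fork (introduced just below) and on waitpid with the WIFEXITED and WEXITSTATUS macros, which this article does not cover; the name `exit_status_of_child` is hypothetical.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that immediately calls exit(code).
 * Returns the exit status the parent observes, or -1 on error. */
int exit_status_of_child(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        exit(code);                      /* child: terminate with this status */

    int status;
    if (waitpid(pid, &status, 0) != pid) /* parent: wait for the child */
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Only the low 8 bits of the status reach the parent, so values from 0 to 255 round-trip unchanged.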

2) The fork function. A parent process creates a new running child process by calling the fork function.

#include <sys/types.h>
#include <unistd.h>
pid_t fork(void);

The newly created child process is almost, but not exactly, the same as the parent. The child gets an identical (but separate) copy [4] of the parent's user-level virtual address space, including the text, data, and BSS segments, the heap, and the user stack. The child also gets identical copies of all of the parent's open file descriptors, which means the child can read and write any file that was open in the parent when it called fork. The most significant difference between the parent and the newly created child is that they have different PIDs.

If we could pause both processes immediately after fork returns in the parent and in the child, we would see that their address spaces are identical: each process has the same user stack, the same local variable values, the same heap, the same global variable values, and the same code.
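The "identical but separate" property can be checked with a small experiment. In this sketch the child changes a local variable and reports its value through its exit status; the parent then looks at its own copy. The `struct copies` type and the function `fork_copies` are made up for illustration, and waitpid is used even though the article does not describe it.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical result type: the value of x seen by each process. */
struct copies {
    int parent_value;
    int child_value;
};

/* Returns 0 and fills *out on success, -1 on error. */
int fork_copies(struct copies *out)
{
    int x = 1;                       /* both processes start with x == 1 */

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        x += 41;                     /* modifies only the child's copy */
        _exit(x);                    /* report the child's x (42) as exit status */
    }

    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;

    out->parent_value = x;           /* the parent's copy is untouched */
    out->child_value  = WEXITSTATUS(status);
    return 0;
}
```

After the call, parent_value is 1 and child_value is 42: the child started from the same value but wrote to its own copy.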

The fork function is called once by the parent but returns twice: once in the parent, where the return value is the PID of the child, and once in the newly created child, where the return value is 0. The return value therefore gives an unambiguous way to tell whether the program is executing in the parent or in the child. (If fork fails, it returns -1 in the parent and no child is created.) After fork returns, the parent and child run concurrently.
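The two return values can be compared directly. In the sketch below the child branch (return value 0) sends its own getpid() result to the parent through a pipe, and the parent checks it against what fork returned. The pipe, read, write, and waitpid calls are not part of this article, and `fork_returns_child_pid` is a hypothetical name.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if fork's return value in the parent equals the PID the child
 * reports for itself, 0 if they differ, and -1 on error. */
int fork_returns_child_pid(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t ret = fork();
    if (ret < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (ret == 0) {
        /* Child: fork returned 0, so ask the kernel for our own PID. */
        pid_t self = getpid();
        close(fds[0]);
        ssize_t n = write(fds[1], &self, sizeof self);
        _exit(n == (ssize_t)sizeof self ? 0 : 1);
    }

    /* Parent: fork returned the child's PID. */
    pid_t reported = -1;
    close(fds[1]);
    ssize_t n = read(fds[0], &reported, sizeof reported);
    close(fds[0]);
    waitpid(ret, NULL, 0);           /* reap the child */

    if (n != (ssize_t)sizeof reported)
        return -1;
    return reported == ret;
}
```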

2.3 Loading and running the program

The execve function loads and runs a new program in the context of the current process.

#include <unistd.h>
int execve(const char *filename, char *const argv[], char *const envp[]);
// Does not return on success; returns -1 on error.

Parameter list

Environment List

The execve function loads and runs the executable object file filename with the argument list argv and the environment variable list envp. It locates the executable by the given file name and replaces the contents of the calling process with it; in other words, it runs an executable file inside the calling process. The executable may be a binary file or any script file that can be executed on Linux.
Unlike most functions, execve does not return on success, because the body of the calling process, including its code segment, data segment, and stack, has been replaced with new content. Only surface information such as the process ID stays the same, rather like the stratagem "the golden cicada sheds its shell" (Jinchantuoqiao) from the Thirty-Six Stratagems: it looks like the old shell, but a new soul has been put inside. Only if the call fails does execve return -1, after which the original program continues from the point of the call [5].
After execve loads filename, it calls the startup code. The startup code sets up the stack and passes control to the main function of the new program, which has a prototype of the form
int main(int argc, char **argv, char **envp);
or equivalently,
int main(int argc, char *argv[], char *envp[]);
When main starts executing in a 32-bit Linux process, the user stack is organized as shown.

Let's walk from the bottom of the stack (the highest address) to the top of the stack (the lowest address). First come the argument and environment strings, stored contiguously on the stack, one after another with no gaps. Next comes a null-terminated array of pointers, each of which points to an environment variable string on the stack. The global variable environ points to the first of these pointers, envp[0]. The environment array is followed immediately by the null-terminated argv[] array, each element of which points to an argument string on the stack.
At the top of the stack are the three arguments to the main function:
1) envp, which points to the envp[] array
2) argv, which points to the argv[] array
3) argc, which gives the number of non-null pointers in the argv[] array
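Because both arrays are null-terminated, they can be walked without knowing their length in advance. The sketch below is a minimal illustration; `count_strings` and `count_environment` are hypothetical helper names, while `environ` is the global variable mentioned above, which a program is expected to declare itself.

```c
#include <stddef.h>

extern char **environ;   /* points to envp[0], as described above */

/* Counts the pointers in a null-terminated array such as argv[] or envp[]. */
size_t count_strings(char *const vec[])
{
    size_t n = 0;
    if (vec == NULL)
        return 0;
    while (vec[n] != NULL)   /* stop at the terminating null pointer */
        n++;
    return n;
}

/* Counts the environment variables of the current process. */
size_t count_environment(void)
{
    return count_strings(environ);
}
```

Inside main, count_strings(argv) gives the same number as argc, since argv[argc] is the terminating null pointer.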

On Linux, a family of library functions [6] wraps the execve system call.
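Putting fork and execve together gives the usual way to run another program. The following is a sketch under stated assumptions rather than a complete launcher: `run_and_wait` is a hypothetical helper, waitpid is not covered by this article, and the exit status 127 for a failed execve is only a shell convention.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs the program at path in a child process and waits for it.
 * Returns the child's exit status, 127 if execve failed in the child,
 * or -1 if fork/waitpid failed or the child was killed by a signal. */
int run_and_wait(const char *path, char *const argv[], char *const envp[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        execve(path, argv, envp);    /* on success this never returns */
        _exit(127);                  /* reached only if execve failed */
    }

    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

For example, assuming /bin/sh exists, passing it the argument list {"sh", "-c", "exit 7", NULL} with an empty environment makes run_and_wait return 7. The parent's own code and data are untouched; only the child is replaced by the new program.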

2.4 Other functions

See Computer Systems: A Programmer's Perspective, Chapter 8, Section 8.4, "Process Control".

[1] In fact, address-mapping optimizations in current operating systems make virtual address translation cost almost no time.
[2] It is normal for data in different processes to have the same address value, because every process has the same virtual address space layout. We need not worry about memory conflicts: although these processes use the same address values, the operating system decides how they are backed by physical memory. Identical virtual addresses are either mapped by the operating system to different physical addresses, or mapped to the same physical address, in which case access to the shared physical memory is controlled through the permission settings in each process's own page table entries.
[3] Each process has a unique positive process ID (PID).
[4] This is where copy-on-write comes in. Right after the child is created, its page tables and related data are copied wholesale from the parent, so the child and the parent share the same physical memory contents. When either process tries to write to that memory, the shared data being written is first copied to another region of physical memory, and the corresponding page table entry is changed to map to the new region.
[5] We should now understand how Linux executes a new program. Whenever a process decides it has nothing more to contribute to the system or the user, it can spend its last breath calling one of the exec functions and be reborn with a new face. More commonly, a process that wants to run another program forks a new process. After fork returns, both parent and child execute the next conditional statement, which tells parent from child by the value fork returned. The child then calls one of the exec functions, while the parent skips the exec statement and carries on with its code and other contents unchanged. Because the child loads a different program through exec, the two processes execute different code concurrently from that point on. The effect is that running an application produces a new process, and that new process can run any program.
This second pattern is so common that Linux optimizes for it. As we know, fork copies the entire contents of the calling process into the new child, which takes time; if exec is called right after fork, that laboriously copied content is discarded immediately, which is very wasteful. Copy-on-write was designed for this case: fork does not copy the parent's contents right away, and a page is copied only when it is actually written. If the next statement is exec, no copying effort is wasted and efficiency improves.
[6] The exec family of library functions, such as execl, execlp, execle, execv, and execvp.


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
