Linux Process Practice (1) -- Linux Process Programming Overview

Source: Internet
Author: User

Process vs. Program

What is a program?

A program is a collection of instructions that accomplish a particular task.

What is a process?

[1] From the user's point of view: a process is one execution of a program.

[2] From the operating system kernel's point of view: a process is the basic unit to which the operating system allocates memory, CPU time slices, and other resources.

[3] A process is the smallest unit of resource allocation.

[4] Each process has its own independent address space and execution state.

[5] Multitasking operating systems such as UNIX allow many programs to run at the same time, and each running program forms a process.

Process Data Structures

A process consists of three parts: a PCB, a program segment, and a data segment.

Process control block (PCB): records the state of the process and all the information needed to control its execution.

Program segment: the program code of the process, which the process scheduler can dispatch to run on the CPU.

Data segment: the data of the process, which can be the raw data processed by the corresponding program, or the intermediate or final data produced while the program executes.

Differences between processes and programs

A process is dynamic (the PCB is the only sign that a process exists, and the operating system controls the process through the PCB), whereas a program is static.

The life cycle of a process is relatively short, whereas a program is permanent.

At any moment a process corresponds to exactly one program, while one program may correspond to multiple processes.

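The Three-State Process Model

A newly created process enters the ready state. It moves to the running state when the scheduler dispatches it, and returns to the ready state when its time slice is exhausted;

A running process becomes blocked when it issues an I/O request;

When the I/O completes, the blocked process becomes ready again.

Note: a blocked process cannot move directly to the running state; it must first enter the ready state.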

Process states in the Linux kernel


Running state (TASK_RUNNING): the process is running on a CPU or is ready to run

Interruptible sleep state (TASK_INTERRUPTIBLE)

Uninterruptible sleep state (TASK_UNINTERRUPTIBLE)

Stopped state (TASK_STOPPED)

Zombie state (TASK_ZOMBIE)

Process scheduling

Tasks of process scheduling

Save the processor context of the current process

Select a process according to some algorithm

Assign the processor to the selected process

Process scheduling algorithms

1. First-come, first-served (FCFS)

2. Shortest process first

3. Round-robin (time-slice rotation)

4. Priority scheduling

5. Multilevel feedback queue scheduling

Terminology related to process programming

Process identifier:

Each process is assigned a unique number, called the process identifier, or simply the PID.

A PID is a positive integer. Ordinary processes receive values starting from 2 and below the limit in /proc/sys/kernel/pid_max, which is traditionally 32768.

When a process is started, the system picks the next unused number as its PID.

Process 1 is a special process, init.

Process 0 is the idle process.

Explanation of processes 0 and 1:

Process 0: the first process created during Linux boot. After the system is loaded, it becomes the process responsible for scheduling, swapping, and memory management;

Process 1: the init process, created by process 0, which completes system initialization. It is the ancestor of all other user processes in the system.

Process creation

The names and formats of process-creation primitives vary between operating systems, but once the primitive is invoked, the operating system does much the same work:

(1) Assign an internal identifier to the new process and set up its process structure in the kernel;

(2) Copy the environment of the parent process;

(3) Allocate resources to the process, including everything required by the process image (program, data, user stack, etc.);

(4) Copy the contents of the parent's address space into the new process's address space;

(5) Set the state of the process to ready and insert it into the ready queue.

Process termination

When a process terminates, the operating system does the following:

(1) Disable soft interrupts: signals are no longer handled, because the process is about to terminate;

(2) Reclaim resources: release all resources allocated to the process, for example by closing all open files and freeing the data structures that belong to the process;

(3) Write accounting information: the accounting data generated while the process ran, including various runtime statistics, is recorded in a global accounting file;

(4) Set the process to the zombie state: send the child-death signal (SIGCHLD) to the parent process and store the termination status where the parent can retrieve it;

(5) Switch to process scheduling: because the CPU has been released, the scheduler must select another process to run.

The fork System Call

Copying a process image

A child process created with fork inherits a copy of the parent's entire address space, including: process context, process stack, memory information, open file descriptors, signal handling settings, process priority, process group ID, current working directory, root directory, resource limits, controlling terminal, and so on.

Differences between the child process and the parent process:

1. File locks set by the parent process are not inherited by the child;

2. Each has its own process ID: the PIDs of parent and child differ;

3. Pending alarms of the child process are cleared;

4. The set of pending signals of the child process is initialized to empty.

fork prototype and return value

#include <unistd.h>
pid_t fork(void);

Creates a child process.

Return value:

If the child process is created successfully, the parent process receives the child's process ID.

If the child process is created successfully, the child process receives 0.

A return value of -1 indicates that creation failed.

How should we understand that fork is called once but returns twice?

The essence is that the two returns happen in two different process spaces, one in the parent and one in the child.

The child process and the parent process each have their own memory space (fork copies the code segment, the data segment, the stack segment, and the PCB).


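Deep understanding: why is "Hello World" printed 8 times? Each fork() doubles the number of processes, so after three calls there are 2^3 = 8 processes, and each of them executes the output statement.

#include <csignal>
#include <cstdlib>
#include <iostream>
#include <unistd.h>
using namespace std;

int main(int argc, char *argv[])
{
    signal(SIGCHLD, SIG_IGN);   // ignore SIGCHLD so that terminated children do not remain zombies
    fork();
    fork();
    fork();
    cout << "Hello World" << endl;
    exit(0);
}

Example: creating n child processes

#include <csignal>
#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <unistd.h>
using namespace std;

// err_exit is not defined in the original listing; this minimal version is an assumption.
static void err_exit(const char *msg)
{
    perror(msg);
    exit(EXIT_FAILURE);
}

int main(int argc, char *argv[])
{
    signal(SIGCHLD, SIG_IGN);
    int processCount;
    cin >> processCount;
    for (int i = 0; i < processCount; ++i)
    {
        pid_t pid = fork();
        if (pid < 0)
            err_exit("fork error");
        else if (pid == 0)
        {
            cout << "Child ..." << endl;
            exit(0);    // the child exits here; otherwise it would continue the loop and fork as well
        }
    }
    exit(0);
}
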
Copy-on-write (COW)

COW at a glance:

In Linux, fork() produces a child process that is identical to the parent, but the child often goes on to call exec right away. For efficiency, Linux therefore uses the "copy-on-write" technique: the contents of a segment of the process space are copied for the child only when that segment is about to be modified.

If the physical space of the child process contains no code, how does it fetch instructions to execute the exec system call?

Between fork and exec, the two processes use the same physical memory: the child's code segment, data segment, and stack all point to the parent's physical space. In other words, the two virtual address spaces are distinct, but they map to the same physical space. Physical space is allocated for a segment of the child only when the parent or the child modifies that segment. If the modification is not caused by exec, the kernel allocates physical space for the child's data segment and stack segment (so that the two processes have their own space and do not affect each other), while the code segment continues to share the parent's physical space (the code is identical). If it is caused by exec, the child's code segment also gets its own physical space, because the two processes now execute different code.

Another detail mentioned online is that after fork the kernel places the child at the front of the run queue so that it runs first. This prevents the parent from writing first and triggering copies that become pointless once the child calls exec, which would reduce efficiency.

COW in detail:

Suppose there is a parent process P1. It is an entity, so it has both a soul and a body. Its virtual address space (represented by the corresponding data structures) contains four parts: the text segment, the data segment, the heap, and the stack. The kernel allocates physical blocks for each of them: a text segment block, a data segment block, a heap block, and a stack block.

1. P1 now calls fork() to create a child process P2.

The kernel:

(1) Copies the text segment, data segment, heap, and stack of P1; note that the contents are identical.

(2) Allocates physical blocks for these four parts of P2. The text segment of P2 points to the text segment block of P1; no separate text block is allocated for P2. The data segment, heap, and stack of P2 each get their own newly allocated blocks.

In the accompanying figure, the left-to-right arrows represent the copied content.

2. Copy-on-write: for the newly created child, the kernel only creates the virtual address space structures, which are copied from the parent's, but does not allocate physical memory for the segments. The child shares the parent's physical space, and physical space is allocated for a segment of the child only when the parent or the child modifies that segment.

3. vfork(): this approach is even more aggressive. The kernel does not even create a virtual address space structure for the child; the child directly shares the parent's virtual address space, and therefore also shares the parent's physical space.

Summary: a process is an entity with a soul and a body, so to implement it the system must create both a soul entity (its virtual address space) and a body entity (its physical memory). Both have corresponding data structures in the system, and the physical entity is what gives the process its physical meaning.

The traditional fork() system call directly copies all resources to the newly created process. This implementation is simple but inefficient, because much of the copied data could have been shared; worse, if the new process immediately executes a new image, all of the copying is wasted.

Linux implements fork() with copy-on-write pages. Copy-on-write is a technique that postpones or even avoids copying data. The kernel does not duplicate the whole process address space at fork time; instead, the parent and the child share a single copy. Data is copied only when it is written, so that each process then has its own copy. In other words, a resource is duplicated only when it needs to be written; until then it is shared read-only. This defers the copying of each page in the address space until the page is actually written. Pages that are never written (for example, when exec() is called immediately after fork()) never need to be copied.

The actual cost of fork() is therefore only copying the parent's page tables and creating a unique process descriptor for the child. In the common case, a process runs an executable immediately after it is created, so this optimization avoids copying large amounts of data that would never be used (an address space often contains tens of megabytes of data). The optimization matters because UNIX emphasizes fast process execution. One more point: COW in Linux is not tied to exec; it applies to any write after fork().

PS: COW is not used only for Linux processes. Some C++ standard library implementations of string (mainly older ones) also use COW. For example:

string str1 = "Hello World";
string str2 = str1;

Then execute:

str1[1] = 'q';
str2[1] = 'w';

After the first two statements, str1 and str2 store their data at the same address. After the modifications, the address of str1's data has changed, while str2 still uses the original address. This is COW applied in C++.


[Appendix] View the upper limit on PIDs, which bounds the number of processes the system can support

cat /proc/sys/kernel/pid_max
