A process is a flow of instruction execution together with its execution environment. The execution environment is a collection of system resources, which Linux abstracts into several data objects: the process control block, the virtual memory space, the file system, file I/O, and the signal handlers. Creating a process is therefore a matter of creating these data objects.
When the fork system call is used to create a process, the child simply receives a complete copy of the parent's resources. The child is therefore independent of the parent and the two run with good concurrency, but communication between them has to go through a dedicated mechanism such as a pipe, a FIFO, or System V IPC.
In addition, creating a child process with fork is costly, because each of the resources described above has to be copied. Seen this way, fork is a system call with very high overhead, and that overhead is not always necessary. For example, when a process forks a child that only needs to call exec to run another executable, the copy of the virtual memory space made during fork is redundant (because Linux uses copy-on-write, the work in this step is limited to copying part of the virtual memory management structures and building page tables, and does not include copying physical pages). Furthermore, a process sometimes contains several independent computing units that could run in the same address space with essentially no conflict. To dispatch these units to different processors, however, several child processes have to be created, and after each child finishes its computation the results have to be combined through some inter-process communication and synchronization mechanism. This often carries a lot of extra overhead, sometimes enough to cancel out the benefit of parallel computation.
This shows that abstracting a computing unit as a process is inadequate, which is why many systems introduce threads.
The vfork system call is described first. Unlike fork, a child created by vfork shares the parent's address space. In other words, the child runs entirely within the parent's address space, and any change the child makes to data in the virtual address space is visible to the parent. However, after a child is created with vfork, the parent is blocked until the child calls exec or exit. The advantage is that when a child is created only to call exec and run another program, it never references the parent's address space, so copying the address space would be redundant. vfork avoids this unnecessary overhead.
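
The contrast between fork and vfork described above can be sketched in user space roughly as follows. This is a minimal illustration, not a definitive implementation: the function names fork_demo and vfork_exec are made up for this example, and the value 42 is arbitrary. The first function shows that a forked child works on its own copy of the data and has to use a pipe to report back; the second shows the intended use of vfork, where the child does nothing but exec.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that changes its own copy of `value`
 * and reports the new value through a pipe. Returns the parent's copy
 * (which stays unchanged) and stores what the child sent in *from_child.
 * Returns -1 on error. */
int fork_demo(int *from_child)
{
    int value = 1;
    int fds[2];

    assert(from_child != NULL);
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                     /* child: private copy of value */
        close(fds[0]);
        value = 42;
        ssize_t n = write(fds[1], &value, sizeof value);
        _exit(n == (ssize_t)sizeof value ? 0 : 1);
    }

    close(fds[1]);                      /* parent: read the child's report */
    ssize_t n = read(fds[0], from_child, sizeof *from_child);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return n == (ssize_t)sizeof *from_child ? value : -1;
}

/* Hypothetical helper: create a child with vfork that immediately execs
 * `path`. The parent stays suspended until the exec (or _exit) happens.
 * Returns the program's exit status, or -1 on error. */
int vfork_exec(const char *path, char *const argv[])
{
    pid_t pid = vfork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        execv(path, argv);
        _exit(127);                     /* reached only if exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

With this sketch, fork_demo should return 1 while delivering 42 through the pipe. Calling vfork_exec("/bin/sh", argv) with argv = {"sh", "-c", "exit 7", NULL} should return 7 on a system where /bin/sh exists. The vfork child deliberately touches nothing except exec and _exit, since it is running in the parent's address space.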
In Linux, both fork and vfork call the same kernel function:
do_fork(unsigned long clone_flag, unsigned long usp, struct pt_regs)
clone_flag is built from bits such as CLONE_VM, CLONE_FS, CLONE_FILES, CLONE_SIGHAND, CLONE_PID, and CLONE_VFORK. If a bit is set to 1, the child and the parent share the resource corresponding to that bit. In the implementation of vfork, clone_flag = CLONE_VFORK | CLONE_VM | SIGCHLD, which means that the child and the parent share the address space. do_fork also checks CLONE_VFORK: if this bit is set, the child keeps the parent's address space locked, and the lock is released only when the child exits or calls exec.
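
The rule that a set flag bit means "share" and a clear bit means "copy" can be illustrated with a small user-space toy model. Everything here is hypothetical and chosen only for illustration: the TOY_ flag values, the toy_res and toy_task structures, and the function names are not the kernel's, and real kernel code is far more involved.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical flag bits for this toy model (not the kernel's values). */
enum { TOY_CLONE_VM = 1 << 0, TOY_CLONE_FS = 1 << 1, TOY_CLONE_FILES = 1 << 2 };

/* A reference-counted resource standing in for an address space,
 * a file-system context, or an open-file table. */
struct toy_res { int refs; int data; };

/* A toy task: just three pointers to resources. */
struct toy_task { struct toy_res *mm, *fs, *files; };

static struct toy_res *res_new(int data)
{
    struct toy_res *r = malloc(sizeof *r);
    if (r) { r->refs = 1; r->data = data; }
    return r;
}

static void res_put(struct toy_res *r)
{
    if (r && --r->refs == 0)
        free(r);
}

/* Share the parent's resource if the flag bit is set, otherwise copy it. */
static struct toy_res *share_or_copy(struct toy_res *parent, int shared)
{
    if (shared) { parent->refs++; return parent; }
    return res_new(parent->data);
}

void toy_task_free(struct toy_task *t)
{
    if (!t) return;
    res_put(t->mm); res_put(t->fs); res_put(t->files);
    free(t);
}

struct toy_task *toy_task_new(void)
{
    struct toy_task *t = malloc(sizeof *t);
    if (!t) return NULL;
    t->mm = res_new(0); t->fs = res_new(0); t->files = res_new(0);
    if (!t->mm || !t->fs || !t->files) { toy_task_free(t); return NULL; }
    return t;
}

/* Toy counterpart of do_fork: each resource is shared or copied
 * depending on the corresponding bit in flags. */
struct toy_task *toy_do_fork(struct toy_task *parent, int flags)
{
    assert(parent != NULL);
    struct toy_task *child = malloc(sizeof *child);
    if (!child) return NULL;
    child->mm    = share_or_copy(parent->mm,    flags & TOY_CLONE_VM);
    child->fs    = share_or_copy(parent->fs,    flags & TOY_CLONE_FS);
    child->files = share_or_copy(parent->files, flags & TOY_CLONE_FILES);
    if (!child->mm || !child->fs || !child->files) {
        toy_task_free(child);
        return NULL;
    }
    return child;
}
```

In this model, toy_do_fork(parent, 0) behaves like fork (every resource is copied), while toy_do_fork(parent, TOY_CLONE_VM) behaves like the address-space part of vfork: the child's mm pointer is the parent's, so a write through one is seen through the other.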
Before introducing the clone system call, a few thread concepts are briefly reviewed.
A thread is a further abstraction built on top of the process: a process is split into two parts, a set of threads and a set of resources. A thread is a dynamic object within a process. It is an independent instruction stream, and all threads in a process share the resources of that process. Each thread nevertheless has its own private objects, such as a program counter, a stack, and a register context.
There are three types of threads: kernel threads, lightweight processes, and user threads.
Kernel thread:
A kernel thread is created and destroyed according to the kernel's own needs and is used to execute a specified function. It does not have to be associated with a user process. It shares the kernel's text segment and global data and has its own kernel stack. It can be scheduled independently, uses the standard kernel synchronization mechanisms, and can be assigned to a processor on its own. Scheduling a kernel thread does not require switching address spaces, so a context switch between kernel threads is much faster than a context switch between processes.
Lightweight process:
A lightweight process (LWP) is a kernel-supported user thread that provides multiple threads of control within a single process. Lightweight processes can run independently on multiple processors. Each one is bound to a kernel thread, and the binding holds for its entire lifetime. Lightweight processes are scheduled independently and share the address space and other resources of the process, but each LWP has its own program counter, register set, kernel stack, and user stack.
User thread:
User threads are implemented by a thread library. They can be created, released, and managed without kernel involvement, and the thread library provides the synchronization and scheduling methods. A process can therefore use a large number of threads without consuming kernel resources, which saves a great deal of system overhead. User threads are feasible because the context of a user thread can be saved and restored without kernel intervention. Each user thread can have its own user stack, a memory area for saving its user-level register context, and state such as its signal mask. The library implements scheduling and context switching between user threads by saving the current thread's stack and register contents and loading those of the newly scheduled thread.
The kernel remains responsible for process switching, because only the kernel has the privilege to modify the memory-management registers. User threads are not true scheduling entities: the kernel knows nothing about them and schedules only the processes or lightweight processes underneath them, which in turn schedule their threads through thread library functions. When a process is preempted, all of its user threads are preempted. When a user thread blocks, it blocks the underlying lightweight process, and if the process has only one lightweight process, all of its user threads are blocked.
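
The way a thread library switches between user threads without entering the kernel scheduler can be sketched with the ucontext functions that glibc provides (getcontext, makecontext, swapcontext). This is only a minimal sketch under stated assumptions: the names thread_a, thread_b, record, and run_user_threads are invented for the example, the stack size is arbitrary, and a real thread library would add a run queue, synchronization, and signal handling.

```c
#include <assert.h>
#include <stddef.h>
#include <ucontext.h>

#define UT_STACK_SIZE (64 * 1024)   /* arbitrary size for this sketch */
#define TRACE_MAX 16

static ucontext_t main_ctx, ctx_a, ctx_b;
static char stack_a[UT_STACK_SIZE], stack_b[UT_STACK_SIZE];
static char trace[TRACE_MAX];
static int trace_len;

/* Append one character to the execution trace. */
static void record(char c)
{
    assert(trace_len < TRACE_MAX - 1);
    trace[trace_len++] = c;
}

static void thread_a(void)
{
    record('A');
    swapcontext(&ctx_a, &ctx_b);    /* save A's registers, resume B */
    record('a');
    /* returning continues at ctx_a.uc_link, which is thread B */
}

static void thread_b(void)
{
    record('B');
    swapcontext(&ctx_b, &ctx_a);    /* save B's registers, resume A */
    record('b');
    /* returning continues at ctx_b.uc_link, which is the caller */
}

/* Run the two user threads and return the order in which they ran,
 * or NULL if a context could not be initialised. */
const char *run_user_threads(void)
{
    trace_len = 0;

    if (getcontext(&ctx_a) == -1)
        return NULL;
    ctx_a.uc_stack.ss_sp = stack_a;         /* private user stack for A */
    ctx_a.uc_stack.ss_size = sizeof stack_a;
    ctx_a.uc_link = &ctx_b;
    makecontext(&ctx_a, thread_a, 0);

    if (getcontext(&ctx_b) == -1)
        return NULL;
    ctx_b.uc_stack.ss_sp = stack_b;         /* private user stack for B */
    ctx_b.uc_stack.ss_size = sizeof stack_b;
    ctx_b.uc_link = &main_ctx;
    makecontext(&ctx_b, thread_b, 0);

    if (swapcontext(&main_ctx, &ctx_a) == -1)
        return NULL;
    trace[trace_len] = '\0';
    return trace;
}
```

Each user thread here has its own stack and saved register context, and the interleaving is driven by swapcontext calls rather than by the kernel scheduler. From the kernel's point of view only one schedulable entity is running throughout.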
With these concepts clarified, we turn to Linux threads and the clone system call.
In many operating systems that implement multithreading (such as Solaris and Digital UNIX), threads and processes are represented by two separate data structures: the process table entry and the thread table entry. A process table entry can point to several thread table entries, and the threads are then scheduled within the process. In Linux, however, there is only task_struct, which is used to manage all processes and threads alike. Threads simply share resources with one another, and the shared resources can be any of those mentioned earlier: virtual memory, file system, file I/O, signal handlers, and even the PID.
In other words, every thread in Linux has its own task_struct, so threads and processes can be scheduled by the same scheduler. In fact, the Linux kernel makes no qualitative distinction between lightweight processes and processes, because the Linux notion of a process has been abstracted into a computing state plus a set of resources, and those resources can be shared between processes. If a task has exclusive use of all its resources, it is a heavyweight process (HWP). If a task shares some resources with other tasks, it is an LWP.
The clone system call creates a lightweight process:
int clone(int (*fn)(void *arg), void *stack, int flags, void *arg);
fn is the function that the lightweight process executes, and stack is the stack that it uses. flags can be a combination of the bits mentioned above: CLONE_VM, CLONE_FS, CLONE_FILES, CLONE_SIGHAND, and CLONE_PID. Like fork and vfork, clone calls the kernel function do_fork:
do_fork(unsigned long clone_flag, unsigned long usp, struct pt_regs);
The difference lies in where clone_flag comes from. For fork it is fixed:
clone_flag = SIGCHLD;
For vfork it is also fixed:
clone_flag = CLONE_VM | CLONE_VFORK | SIGCHLD;
For clone, clone_flag is supplied by the user.
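
The effect of the flag sets listed above can be checked from user space with the glibc clone wrapper. The following is a rough sketch, assuming Linux with glibc. The names shared_value, child_fn, and value_after_clone are hypothetical, and the stack size is an arbitrary choice. The helper starts a child with the given flags, lets it write to a global variable, and returns what the parent sees afterwards.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* needed for clone() and the CLONE_* flags */
#endif
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (64 * 1024)    /* arbitrary size for this sketch */

static volatile int shared_value;

/* Runs in the child: write the requested value to the global. */
static int child_fn(void *arg)
{
    shared_value = *(int *)arg;
    return 0;
}

/* Hypothetical helper: start a child with the given clone flags, wait for
 * it, and return shared_value as the parent sees it. Returns -1 on error. */
int value_after_clone(int flags, int new_value)
{
    /* waitpid() without __WCLONE only reaps children whose exit signal
     * is SIGCHLD, so this sketch insists on it. */
    assert((flags & 0xff) == SIGCHLD);

    char *stack = malloc(CHILD_STACK_SIZE);
    if (!stack)
        return -1;

    shared_value = 0;
    /* The stack grows downward, so pass the top of the allocation. */
    pid_t pid = clone(child_fn, stack + CHILD_STACK_SIZE, flags, &new_value);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    pid_t reaped = waitpid(pid, NULL, 0);
    free(stack);
    return reaped == -1 ? -1 : shared_value;
}
```

With flags = SIGCHLD the child behaves like one created by fork, so value_after_clone(SIGCHLD, 42) is expected to return 0: the write lands in the child's private copy. With CLONE_VM | SIGCHLD, or with the vfork combination CLONE_VM | CLONE_VFORK | SIGCHLD, the address space is shared and the call is expected to return 42.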
The following is an example of using clone.
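#define _GNU_SOURCE              /* needed for clone() and the CLONE_* flags */
#include <sched.h>
#include <stdlib.h>

#define STACK_FRAME (64 * 1024)  /* assumed stack size; not given in the original */

int func(void *arg)
{
    /* ...... */
    return 0;
}

int main()
{
    int clone_flag, arg, retval;
    char *stack;
    /* ...... */
    clone_flag = CLONE_VM | CLONE_SIGHAND | CLONE_FS | CLONE_FILES;
    stack = (char *)malloc(STACK_FRAME);
    stack += STACK_FRAME;        /* the stack grows downward, so pass its top */
    retval = clone(func, stack, clone_flag, &arg);
    /* ...... */
}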
The usage of clone looks similar to that of pthread_create. The fundamental difference between the two is that clone creates an LWP, which is visible to the kernel and scheduled by the kernel, whereas pthread_create is usually used to create a user thread, which is invisible to the kernel and scheduled by the thread library.
On Linux, however, pthread_create ultimately calls clone, passing a stack that it has allocated as a parameter.
The thread library is responsible for thread creation, synchronization, and destruction,