Reprinted: http://www.cnitblog.com/tarius.wu/articles/2277.html
Reprinted: http://www.fansoo.com/blog/2011/kernel-threads-lightweight-processes-threads-and-linuxthreads-library-users/
Kernel thread
A kernel thread runs only in kernel mode and is never burdened by a user-space context.
- Processor competition: it competes for processor time across the whole system.
- Resource usage: the only resources it needs are a kernel stack and the space to save registers during a context switch.
- Scheduling: scheduling overhead can be roughly as expensive as for an ordinary process.
- Synchronization efficiency: synchronizing resources and sharing data between kernel threads is cheaper than doing so between whole processes.
Lightweight Process
A lightweight process (LWP) is a kernel-supported user thread built on top of the kernel: a higher-level abstraction of a kernel thread. Each LWP is bound to a specific kernel thread; kernel threads are managed only by the kernel and are scheduled like ordinary processes.
A lightweight process is created with the clone() system call using the CLONE_VM flag, so it shares the address space and other system resources with its parent process (see the sketch after the list below).
Difference from an ordinary process: an LWP carries only a minimal execution context plus the statistics the scheduler needs.
- Processor competition: because it is bound to a specific kernel thread, it competes for the processor across the whole system.
- Resource usage: it shares the address space of its parent process.
- Scheduling: the same as an ordinary process.
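To make this concrete, here is a minimal, hypothetical sketch (not code from any real thread library) of creating a lightweight process with the glibc clone() wrapper. CLONE_VM is the flag mentioned above; the other flags share filesystem state, open files, and signal handlers with the parent:

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

#define STACK_SIZE (1024 * 1024)

/* Entry point of the LWP; it runs inside the parent's address space. */
static int lwp_fn(void *arg)
{
    printf("lwp sees shared value: %d\n", *(int *)arg);
    return 0;
}

int main(void)
{
    int shared = 42;
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL) {
        perror("malloc");
        return 1;
    }

    /* CLONE_VM shares the address space; CLONE_FS/CLONE_FILES/CLONE_SIGHAND
       share filesystem state, open files and signal handlers.  SIGCHLD lets
       the parent wait for the child as usual. */
    int flags = CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND | SIGCHLD;
    pid_t pid = clone(lwp_fn, stack + STACK_SIZE, flags, &shared);
    if (pid == -1) {
        perror("clone");
        return 1;
    }

    waitpid(pid, NULL, 0);   /* reap the lightweight process */
    free(stack);
    return 0;
}
```

On x86 the stack grows downward, which is why the top of the allocated block is passed as the child's stack pointer.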
User thread
A user thread lives in a thread library implemented entirely in user space. Creation, scheduling, synchronization, and destruction of user threads all happen in user space without any help from the kernel, so these operations are extremely cheap and fast (a sketch of pure user-space switching follows the list below).
- Processor competition: pure user threads exist only in user space and are invisible to the kernel, so it is the owning process as a whole that competes for the processor; the threads of that process then compete among themselves for the processor time the process receives.
- Resource usage: they share the address space and system resources of the process.
- Scheduling: the thread library implemented in user space schedules them within the process.
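As an illustration only (assuming a POSIX system with the <ucontext.h> API; this is not the mechanism any particular library uses), the following sketch switches between the main context and a cooperative "user thread" entirely in user space, with no kernel scheduling involved:

```c
#include <stdio.h>
#include <ucontext.h>

#define STACK_SIZE (64 * 1024)

static ucontext_t main_ctx, worker_ctx;
static char worker_stack[STACK_SIZE];

/* A "user thread": it runs until it voluntarily yields back. */
static void worker(void)
{
    printf("worker: first slice\n");
    swapcontext(&worker_ctx, &main_ctx);   /* yield: a pure user-space switch */
    printf("worker: second slice\n");
}

int main(void)
{
    getcontext(&worker_ctx);
    worker_ctx.uc_stack.ss_sp = worker_stack;
    worker_ctx.uc_stack.ss_size = sizeof(worker_stack);
    worker_ctx.uc_link = &main_ctx;        /* return here when worker finishes */
    makecontext(&worker_ctx, worker, 0);

    printf("main: scheduling worker\n");
    swapcontext(&main_ctx, &worker_ctx);   /* "schedule" the user thread */
    printf("main: worker yielded, resuming it\n");
    swapcontext(&main_ctx, &worker_ctx);
    printf("main: worker finished\n");
    return 0;
}
```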
LinuxThreads Library
LinuxThreads is a user-space thread library that adopts a one-to-one thread-process model (each user thread corresponds to a lightweight process, and each lightweight process corresponds to a specific kernel thread). Thread scheduling is therefore equivalent to process scheduling and is done by the kernel, while thread creation, synchronization, and destruction are handled by the user-space library (LinuxThreads has long been bundled with glibc).
In LinuxThreads, a dedicated manager thread handles all thread-management work. The first time a process calls pthread_create(), the library first creates (via clone()) and starts the manager thread. On subsequent pthread_create() calls, the manager thread acts on behalf of the caller: it creates the user thread with clone() and records the mapping between lightweight-process IDs and thread IDs. User threads are therefore really children of the manager thread (a minimal pthread_create() sketch follows).
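The following small demo (an illustrative sketch, not LinuxThreads internals) shows a visible consequence of the one-to-one mapping: built against LinuxThreads, each thread is a separate lightweight process and getpid() reports a different value in each, whereas under the newer NPTL all threads report the main thread's PID. Compile with -lpthread.

```c
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* Each thread prints its process ID.  Under LinuxThreads every thread is
   backed by its own lightweight process, so the PIDs differ; under NPTL
   they all equal the main thread's PID. */
static void *worker(void *arg)
{
    printf("thread %ld: getpid() = %d\n", (long)arg, (int)getpid());
    return NULL;
}

int main(void)
{
    pthread_t tid[2];

    printf("main: getpid() = %d\n", (int)getpid());
    for (long i = 0; i < 2; i++)
        pthread_create(&tid[i], NULL, worker, (void *)i);
    for (int i = 0; i < 2; i++)
        pthread_join(tid[i], NULL);
    return 0;
}
```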
LinuxThreads supports only the PTHREAD_SCOPE_SYSTEM contention scope. The default scheduling policy is SCHED_OTHER.
A user thread's scheduling policy can also be changed to SCHED_FIFO or SCHED_RR, whose priorities on Linux range from 1 to 99, whereas SCHED_OTHER supports only priority 0.
- SCHED_OTHER: the ordinary time-sharing scheduling policy
- SCHED_FIFO: a real-time policy, first in, first out
- SCHED_RR: a real-time policy, round-robin time slices
SCHED_OTHER tasks are ordinary processes, while the latter two are real-time processes (most processes in a system are ordinary; real-time processes are rare). SCHED_FIFO and SCHED_RR have higher priority than every SCHED_OTHER process, so as long as a real-time process is runnable, no SCHED_OTHER process gets to run until the real-time ones have finished. A sketch of requesting a real-time policy through the pthread attribute interface follows.
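This hedged sketch assumes sufficient privileges (root or CAP_SYS_NICE); PTHREAD_EXPLICIT_SCHED is needed because otherwise the new thread silently inherits its creator's policy instead of using the attribute:

```c
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

static void *rt_worker(void *arg)
{
    (void)arg;
    printf("real-time worker running\n");
    return NULL;
}

int main(void)
{
    pthread_attr_t attr;
    struct sched_param param;
    pthread_t tid;
    int err;

    pthread_attr_init(&attr);
    /* Without this, the attribute's policy is ignored and inherited instead. */
    pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
    pthread_attr_setschedpolicy(&attr, SCHED_FIFO);
    param.sched_priority = 10;             /* SCHED_FIFO priorities: 1-99 */
    pthread_attr_setschedparam(&attr, &param);

    /* Creating a SCHED_FIFO thread needs root or CAP_SYS_NICE. */
    err = pthread_create(&tid, &attr, rt_worker, NULL);
    if (err != 0)
        fprintf(stderr, "pthread_create: %s\n", strerror(err));
    else
        pthread_join(tid, NULL);

    pthread_attr_destroy(&attr);
    return 0;
}
```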
Appendix: Basic knowledge (threads and processes)
As textbooks define them, the process is the smallest unit of resource management and the thread is the smallest unit of program execution. In operating-system design, the main motivations for evolving from processes to threads were better SMP support and lower (process/thread) context-switch overhead.
However it is implemented, a process needs at least one thread as its unit of instruction execution: the process manages resources (CPU, memory, files, and so on), while its threads are what actually get assigned to a CPU to run. A process can of course have several threads; on an SMP machine they can then run on several CPUs at once, achieving maximal parallelism and better efficiency. Even on a single-CPU machine, designing a program with the multi-threaded model, much as the multi-process model once replaced the single-process model, makes the design simpler, the functionality more complete, and the execution more efficient. Using multiple threads to respond to multiple inputs, for example, achieves what a multi-process design would, but with far lower context-switch overhead.
Semantically, "responding to multiple inputs at the same time" means that the executing threads share every resource except the CPU.
These two motivations behind the thread model led to two kinds of threads, kernel-level threads and user-level threads, classified mainly by whether the thread scheduler sits inside or outside the kernel. The former is better at exploiting multiple processors concurrently; the latter is more concerned with keeping context switches cheap. Current commercial systems usually combine the two: kernel threads satisfy the needs of SMP systems, while a thread library implements a second thread mechanism in user space, so that one kernel thread becomes the dispatcher for several user-space threads at once. As with many techniques, such a "hybrid" usually buys higher efficiency at the cost of a harder implementation. In line with its "keep it simple" design philosophy, Linux never planned to implement a hybrid model from the start, although its implementation does borrow a "hybrid" approach.
In concrete terms, threads can be implemented inside the operating-system kernel or outside it. The latter obviously requires at least kernel support for processes, while the former generally requires that processes also be supported inside the kernel. The kernel-level thread model clearly needs the former, but the user-level thread model is not necessarily built on the latter; this mismatch simply comes from the two classifications using different criteria.
When the kernel supports both processes and threads, a "many-to-many" thread-process model becomes possible: one of a process's threads is scheduled by the kernel and, at the same time, acts as the dispatcher for a pool of user-level threads, choosing which of them runs in its context. This is the "hybrid" thread model mentioned above; it satisfies multiprocessor systems while keeping scheduling overhead low. Most commercial operating systems (such as Digital UNIX, Solaris, and IRIX) use a thread model of this kind and fully implement the POSIX 1003.1c standard. Threads implemented outside the kernel fall into "one-to-one" and "many-to-one" models. The former maps each thread onto a kernel process (possibly a lightweight process), so thread scheduling is equivalent to process scheduling and is left to the kernel; the latter implements threads entirely outside the kernel, with scheduling also done in user space.
The latter is exactly the user-level thread model described above. Such a user-space scheduler only has to switch the thread's running stack, so scheduling overhead is very small; but because the kernel delivers signals (synchronous or asynchronous) to processes, it cannot target an individual thread, and this approach cannot exploit multiprocessor systems. Since that capability matters more and more in practice, pure user-level thread implementations have almost disappeared except as vehicles for algorithm research.
The Linux kernel supports only lightweight processes, which limits how efficient a thread model can be built on top of it; however, Linux keeps process-scheduling overhead low, which compensates for this to some extent. The most widely used thread mechanism today, LinuxThreads, adopts the one-to-one thread-process model, hands scheduling to the kernel, and implements thread management, including signal handling, in a user-level library. How Linux and LinuxThreads work together is the focus of this article.
In modern operating systems, processes support multiple threads. The process is the smallest unit of resource management, while the thread is the smallest unit of program execution. A process consists of two parts: a collection of threads and a set of resources. The threads in a process are dynamic objects that represent the execution of the process's instructions; the resources, including the address space, open files, and user credentials, are shared by all threads of the process.
Each thread has its own private data: a program counter, stack space, and registers. A small sketch of shared versus private data follows.
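This illustrative sketch (an assumption-free demo, not tied to any particular library beyond plain pthreads) shows that a global variable is one shared copy visible at the same address in every thread, while each thread's local variable lives on its own private stack at a different address. Writes to shared data would additionally need synchronization, which a mutex sketch further below illustrates.

```c
#include <pthread.h>
#include <stdio.h>

int shared_config = 7;            /* global: one copy, visible to every thread */

static void *worker(void *arg)
{
    int local = (int)(long)arg;   /* private: lives on this thread's own stack */
    printf("thread %d: local=%d at %p, shared_config=%d at %p\n",
           local, local, (void *)&local, shared_config, (void *)&shared_config);
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, (void *)1L);
    pthread_create(&t2, NULL, worker, (void *)2L);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}
```

The two threads print different addresses for local (separate stacks) but the same address for shared_config (one shared address space).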
Why threads? (Disadvantages of the traditional single-threaded process)
1. In practice there are many tasks that need concurrent handling, such as database servers, network servers, and large-scale computation.
2. The traditional UNIX process is single-threaded. A single thread means the program must execute sequentially and cannot run concurrently; it can run on only one processor at a time, so machines with multiple processors cannot be fully utilized.
3. Using multiple processes instead raises the following problems:
A. Forking a child process is expensive; fork is a costly system call even with modern copy-on-write techniques.
B. Each process has its own address space, so cooperation between processes requires complex IPC mechanisms such as message passing and shared memory.
Advantages and disadvantages of Multithreading
The advantages and disadvantages of multithreading are really two sides of the same coin.
A multithreaded program (process) can achieve real parallelism, and communication between threads is easy because they share the process's code and global data. The drawback is that, because threads share the process's address space, they can race with one another, so data touched by more than one thread must be protected with synchronization mechanisms (see the sketch below).
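As a minimal sketch of such synchronization (illustrative only, using a plain pthread mutex), several threads increment one shared counter; without the lock the final value would usually fall short of the expected total:

```c
#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4
#define NITERS   100000

static long counter = 0;                              /* shared data */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *increment(void *arg)
{
    (void)arg;
    for (int i = 0; i < NITERS; i++) {
        pthread_mutex_lock(&lock);    /* serialize access to the shared data */
        counter++;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int main(void)
{
    pthread_t tid[NTHREADS];

    for (int i = 0; i < NTHREADS; i++)
        pthread_create(&tid[i], NULL, increment, NULL);
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(tid[i], NULL);

    /* Without the mutex the result would typically be less than expected. */
    printf("counter = %ld (expected %d)\n", counter, NTHREADS * NITERS);
    return 0;
}
```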
Three kinds of threads: kernel threads, lightweight processes, and user threads
Kernel thread
A kernel thread is, in effect, a piece of the kernel split off to handle one specific task; this is particularly useful for handling asynchronous events such as asynchronous I/O. Kernel threads are cheap to use: the only resources they need are a kernel stack and the space to save registers during a context switch. A kernel that supports kernel threads is called a multi-threaded kernel. (A hypothetical kernel-module sketch follows.)
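To make "a piece of the kernel dedicated to one task" concrete, here is a hypothetical kernel-module sketch using the in-kernel kthread API (names such as demo_fn and demo_kthread are made up; this is not how any particular subsystem does it):

```c
#include <linux/module.h>
#include <linux/kthread.h>
#include <linux/delay.h>
#include <linux/err.h>

static struct task_struct *demo_task;

/* The kernel thread's body: loop until someone asks it to stop. */
static int demo_fn(void *data)
{
    while (!kthread_should_stop()) {
        pr_info("demo kernel thread: still running\n");
        ssleep(5);   /* stand-in for servicing some asynchronous work */
    }
    return 0;
}

static int __init demo_init(void)
{
    demo_task = kthread_run(demo_fn, NULL, "demo_kthread");
    return IS_ERR(demo_task) ? PTR_ERR(demo_task) : 0;
}

static void __exit demo_exit(void)
{
    kthread_stop(demo_task);   /* signal the thread to stop and wait for it */
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");
```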
Lightweight Process [*]
A lightweight process (LWP) is a kernel-supported user thread, a higher-level abstraction built on top of kernel threads; LWPs are therefore possible only if kernel threads are supported first. Each process owns one or more LWPs, and each LWP is backed by one kernel thread. This is the one-to-one thread model described in the dinosaur book, and in such an operating system the LWP is the user thread.
Because each LWP is bound to a specific kernel thread, each LWP is an independent scheduling unit: even if one LWP blocks in a system call, the rest of the process keeps running.
Lightweight processes have their limits. First, most LWP operations, such as creation, destruction, and synchronization, require system calls, and system calls are relatively expensive because they switch between user mode and kernel mode. Second, every LWP needs a backing kernel thread, so LWPs consume kernel resources (the kernel thread's stack space). A system therefore cannot support a very large number of LWPs.
Note:
1. The term LWP comes from SVR4/MP and Solaris 2.x.
2. Some systems call an LWP a virtual processor.
3. It is called a lightweight process because, with a kernel thread behind it, an LWP is an independent scheduling unit just like an ordinary process. The defining feature of an LWP is precisely that every one of them is backed by a kernel thread.
User thread
Although an LWP is essentially a user thread, an LWP-based thread library is built on top of the kernel, and many of its operations require system calls, so it is not very efficient. "User thread" here means a thread library implemented entirely in user space: creation, synchronization, destruction, and scheduling of user threads are all done in user space with no help from the kernel, so these operations are extremely fast and cheap.
This was the original user-thread model: the threads are contained inside the process and implemented purely in user space. The kernel does not schedule user threads at all; its scheduling object is still the process, exactly as with traditional processes, and it is unaware that user threads exist. Switching between user threads is handled by the thread library implemented in user space.
This corresponds to the many-to-one thread model in the dinosaur book. Its drawback is that if one user thread blocks in a system call, the whole process blocks.
Enhanced user threads: user threads + LWPs
This model corresponds to the many-to-many model in the dinosaur book. The user thread library is still built entirely in user space, so user-thread operations remain cheap and you can create as many user threads as you need. The operating system provides LWPs as the bridge between user threads and kernel threads: as before, each LWP is backed by a kernel thread and is the kernel's unit of scheduling, and a user thread's system calls go through an LWP, so one user thread blocking does not stall the whole process. The thread library binds the user threads it creates to LWPs; the number of LWPs need not equal the number of user threads, and when the kernel schedules an LWP, the user thread currently bound to it runs. (A small sketch of how pthreads exposes this choice as "contention scope" follows.)
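The pthreads API exposes this design choice as the contention-scope attribute: PTHREAD_SCOPE_SYSTEM means the thread is backed by an LWP and competes system-wide, while PTHREAD_SCOPE_PROCESS means user-level, process-local competition. The sketch below (illustrative only) requests process scope and falls back to system scope; Linux thread libraries implement only system scope, so the first call is expected to fail with ENOTSUP there:

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

static void *worker(void *arg)
{
    (void)arg;
    printf("worker running\n");
    return NULL;
}

int main(void)
{
    pthread_attr_t attr;
    pthread_t tid;
    int err;

    pthread_attr_init(&attr);

    /* Ask for process-local contention (user-thread style scheduling). */
    err = pthread_attr_setscope(&attr, PTHREAD_SCOPE_PROCESS);
    if (err != 0) {
        fprintf(stderr, "PTHREAD_SCOPE_PROCESS not supported: %s\n", strerror(err));
        pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);  /* 1:1, LWP-backed */
    }

    pthread_create(&tid, &attr, worker, NULL);
    pthread_join(tid, NULL);
    pthread_attr_destroy(&attr);
    return 0;
}
```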
Summary:
Many documents simply equate lightweight processes with threads. That is not entirely accurate: as the discussion above shows, only when the user threads are implemented entirely by lightweight processes (one LWP per user thread) can a lightweight process properly be called a thread.