Processes and Threads
Two characteristics of the process concept:
Resource ownership: a process has a virtual address space that holds the process image (program, data, stack, and process control block). A process owns and controls resources such as main memory, I/O channels, I/O devices, and files;
Scheduling/execution: a process follows an execution path through one or more programs, and its execution may be interleaved with that of other processes. A process therefore has an execution state and a priority, and it is the entity that the operating system schedules and dispatches;
To separate these two characteristics, modern operating systems call the unit of dispatching a thread, or lightweight process (LWP), while the unit that owns resources remains the process.
Multithreading
Multithreading is the ability of an operating system to support multiple threads of execution within a single process. Traditional operating systems are single-threaded: UNIX supports multiple user processes, but each process contains only one thread. Windows supports multiple processes, each of which may contain multiple threads.
In a multithreaded environment, a process is defined as the unit of resource allocation and protection. Associated with a process are:
The virtual address space that holds the process image;
Protected access to processors;
Other processes (for interprocess communication);
Files;
I/O resources;
Within a process there may be one or more threads, each of which has:
A thread execution state;
A saved thread context when it is not running;
An execution stack;
Per-thread static storage for local variables;
Access to the memory and resources of its process, shared with all the other threads of that process;
In a single-threaded model, a process consists of its process control block, a user address space, and the user and kernel stacks that manage call/return behavior during execution. In a multithreaded model, there is still one process control block and one user address space per process, but each thread has its own stacks and its own thread control block containing register values, priority, and other thread-related state information.
All threads of a process share that process's state and resources; they reside in the same address space and can access the same data. For example, if one thread opens a file, the other threads in the same process can use the resulting file handle.
The performance benefits of threads:
1. Creating a new thread in an existing process takes far less time than creating a new process;
2. Terminating a thread takes less time than terminating a process;
3. Switching between two threads within the same process is faster than switching between processes;
4. Communication between threads of the same process is more efficient. In most operating systems, communication between processes requires kernel intervention to provide the communication and protection mechanisms; but threads of the same process share memory and files, so they can communicate without involving the kernel;
Even on a single processor, multithreading is useful for structuring a program that logically performs several different functions. Typical uses of threads: foreground and background work; asynchronous processing; speeding up execution; modular program structure.
In an operating system that supports threads, scheduling and dispatching are done on a thread basis, so most execution-related state can be kept in thread-level data structures. Some activities, however, affect every thread of a process, and the operating system must manage them at the process level. For example, suspension swaps a process's address space out of main memory; because all threads of the process share that address space, they are all suspended together. Similarly, terminating a process terminates all of its threads.
Thread state
As with processes, the key states of a thread are the Running, Ready, and Blocked states. A Suspend state makes little sense for a thread, since suspension is a process-level concept. Four basic operations are associated with a change of thread state:
Spawn: when a new process is created, a thread for that process is also created; a thread within a process may in turn spawn another thread. The new thread is given an instruction pointer and arguments, receives its own register context and stack space, and is placed on the ready queue;
Block: when a thread must wait for an event, it blocks (saving its user registers, program counter, and stack pointers), and the processor turns to another ready thread in the same or a different process;
Unblock: when the event a thread is blocked on occurs, the thread is moved to the ready queue;
Finish: when a thread completes, its register context and stacks are deallocated;
On a single processor, threads belonging to multiple processes execute by turns: another thread runs whenever the currently running thread blocks or its time slice is exhausted.
Thread synchronization
All threads of a process share the same address space and other resources, so a modification made by one thread affects the resources seen by the other threads. The activities of the threads must therefore be synchronized so that they do not interfere with one another or corrupt shared data.
User-level threads (ULTs) and kernel-level threads (KLTs)
User-level threads: all thread management is done by the application, and the kernel is unaware that threads exist. An application becomes multithreaded by using a threads library, a package of routines for user-level thread management that includes code for creating and destroying threads, scheduling thread execution, passing messages and data between threads, and saving and restoring thread contexts.
By default an application begins as a single thread, running in a single kernel-managed process, and begins execution in that thread. At any point the application may create a new thread by invoking the spawn routine in the threads library. Control passes to the spawn routine via a procedure call; it builds a data structure for the new thread and then, using some scheduling algorithm, passes control to a thread of the process that is in the Ready state. When control passes to the library, the current thread's context is saved; when control later passes back to a thread, that thread's context is restored. The context here consists of the user registers, the program counter, and the stack pointer. All of this activity takes place within one user process, and the kernel is unaware of it: the kernel continues to schedule the process as a unit and assigns it a single execution state (Running, Ready, Blocked, and so on).
An example of the relationship between thread scheduling and process scheduling:
Suppose process B has two threads, thread 1 and thread 2, and is currently executing in thread 2. Any of the following may occur:
1. Thread 2 makes a system call that blocks process B, for example an I/O call. Control transfers to the kernel, which starts the I/O operation, places process B in the Blocked state, and switches to another process. According to the data structures maintained by the threads library, thread 2 of process B is still Running, but thread 2 is not actually executing on the processor;
2. A clock interrupt transfers control to the kernel, which determines that process B has exhausted its time slice, places B in the Ready state, and switches to another process. Again, the threads library still records thread 2 of process B as Running, although it is not actually executing;
3. Thread 2 reaches a point where it needs some action performed by thread 1 of process B. Thread 2 enters the Blocked state and thread 1 moves from Ready to Running; the process itself remains in the Running state;
In cases 1 and 2, execution of thread 2 resumes when the kernel switches control back to process B. Note also that a process may be interrupted while it is executing code in the threads library, either because its time slice expires or because it is preempted by a higher-priority process. At the moment of the interrupt, the process may therefore be in the middle of a switch from one thread to another. When the process resumes, it continues in the threads library, completes the thread switch, and transfers control to the other thread in the process.
Advantages of user-level threads over kernel-level threads:
1. All thread-management data structures are kept within the process, so switching threads does not require kernel mode; the overhead of the two mode switches (user to kernel, then kernel back to user) is saved;
2. Scheduling is under application control. The application can use a simple round-robin algorithm or a priority-based one, tailored to its own needs, without disturbing the operating system scheduler;
3. User-level threads are portable: because the threads library is ordinary application code, they can run on any operating system without changes to the underlying kernel;
Disadvantages of user-level threads compared to kernel-level threads:
1. Many system calls are blocking; when one user-level thread makes such a call, the entire process blocks, and with it every thread in the process;
2. A purely user-level threading strategy cannot exploit multiprocessing. The kernel assigns a process to only one processor at a time, so only one thread of the process can execute at any moment;
There are two ways to work around these weaknesses:
1. Rewrite the application as multiple processes rather than multiple threads; but then each thread switch becomes a process switch, with much greater overhead;
2. Overcome the blocking problem with a technique called jacketing, whose purpose is to convert a blocking system call into a non-blocking one. For example, instead of calling a system I/O routine directly, a thread calls an application-level I/O jacket routine that first checks whether the I/O device is available. If it is not, the thread blocks (at the library level) and another thread of the process runs; the availability of the device is checked again when the blocked thread later regains control.
Kernel-level threads: in a pure kernel-level threading facility, all thread management is done by the kernel; the application contains no thread-management code and simply uses the kernel's thread programming interface. The kernel maintains context information for every thread of every process, and scheduling is done by the kernel on a thread basis. This approach removes both drawbacks of user-level threads: the kernel can schedule multiple threads of one process on multiple processors simultaneously, and if one thread of a process blocks, the kernel can schedule another thread of the same process without blocking the whole process.
Another advantage of kernel-level threading is that the kernel routines themselves can also use multithreading.
The disadvantage of kernel-level threads compared to user-level threads: switching between threads of the same process requires a mode switch to the kernel.
Measurements show that thread operations on kernel-level threads are significantly faster than the corresponding operations on single-threaded processes, and that user-level threads give a further improvement over kernel-level threads. However, if most of an application's thread operations require a switch to kernel mode anyway, a user-level scheme is not much better than a kernel-level one.
Combined approaches
Some operating systems provide facilities that combine user-level and kernel-level threads. Thread creation is done entirely in user space, as are most thread scheduling and synchronization. Some of the application's threads are mapped onto kernel-level threads, and the programmer can adjust the number of kernel-level threads to obtain the best result for the situation at hand. In a well-designed combined system, multiple threads of one application can run in parallel on multiple processors, and a blocking system call need not block the entire process. This approach retains the advantages of both pure schemes while avoiding their respective drawbacks.
The relationship between threads and processes
1:1 relationship: the traditional single-threaded approach, in which each process has exactly one thread;
M:1 relationship: the approach taken by most modern operating systems: a process may contain one or more threads, and the process owns the address space and the resources;
M:N relationship: the experimental operating system TRIX explores a many-to-many relationship between threads and processes. TRIX has the concept of a domain, a static entity consisting of an address space and ports through which messages may be sent and received. A thread is a single execution path, with an execution stack, processor state, and scheduling information. Multiple threads may execute within a single domain, and a thread may also execute in multiple domains, moving from one domain to another;
1:M relationship: from the user's point of view, a thread is the unit of activity, while a process is a virtual address space with an associated process control block. Once created, a thread starts executing in a process by invoking an entry point in that process; it may later move from one address space to another;