The difference between a thread and a process


The basic concept of a thread

If processes were introduced into the operating system to let multiple programs execute concurrently, improving resource utilization and system throughput, then threads were introduced to reduce the time and space overhead that programs incur when executing concurrently, giving the OS better concurrency. To illustrate this, let us first review the two basic attributes of a process:
① A process is an independent unit that can own resources;
② A process is also a basic unit that can be independently scheduled and dispatched. It is because a process has these two basic attributes that it can run independently, which forms the basis for the concurrent execution of processes.
  
However, for programs to execute concurrently, the system must also perform the following series of operations.
1) Creating a process
When creating a process, the system must allocate to it all the resources it needs other than the processor, such as memory space and I/O devices, along with the corresponding PCB.
2) Destroying a process
When the system destroys a process, it must first reclaim the resources the process occupies before discarding its PCB.
3) Process switching
When switching between processes, considerable processor time is spent saving the CPU environment of the current process and setting up the CPU environment of the newly selected process.
  
In other words, because a process is an owner of resources, the system must pay a substantial time and space overhead to create, destroy, and switch processes. For this reason, the number of processes in a system should not be too large, and the frequency of process switching should not be too high, which limits how far concurrency can be improved.
  
How to let multiple programs execute concurrently while minimizing system overhead has become an important goal of operating system design in recent years. Many operating system researchers realized that the two attributes of a process could be separated: one entity would serve as the basic unit of scheduling and dispatch, kept "light" by not simultaneously acting as the unit of resource ownership, while the other entity would remain the basic unit of resource ownership and would not be switched frequently. It is under the guidance of this idea that the concept of the thread was formed.
  
With the development of VLSI technology and computer architecture, symmetric multiprocessor (SMP) computer systems have emerged. They provide a good hardware foundation for improving a computer's running speed and system throughput. However, to make multiple CPUs coordinate well and give full play to their parallel processing capability, a well-performing multiprocessor OS must also be provided. With the traditional process concept and design methods, it is difficult to design an OS suited to SMP systems, because the process is "too heavy": process scheduling, dispatch, and switching in a multiprocessor environment incur significant time and space overhead. If threads are introduced into the OS as the basic unit of scheduling and dispatch, the performance of multiprocessor systems can be improved effectively. As a result, major OS vendors (UNIX, OS/2, Windows) have further developed threading technology for use in SMP computer systems.

Thread-to-process comparison

Threads have many of the characteristics of traditional processes, so they are also called lightweight processes (light-weight processes) or process elements; correspondingly, traditional processes are called heavyweight processes (heavy-weight processes), each equivalent to a task with only one thread. In an operating system that supports threads, a process usually has several threads, and has at least one. Below we compare threads and processes in terms of scheduling, concurrency, system overhead, and resource ownership.

    • Scheduling
      In a traditional operating system, the process is both the basic unit of resource ownership and the basic unit of independent scheduling and dispatch. In an operating system with threads, the thread is the basic unit of scheduling and dispatch while the process remains the basic unit of resource ownership; the two attributes of the traditional process are thus separated. Because a thread essentially owns no resources, it can move "lightly", which lets the system significantly raise its level of concurrency. Switching between threads of the same process does not cause a process switch, but switching from a thread in one process to a thread in another process does.

    • Concurrency
      In an operating system with threads, not only can processes execute concurrently, but multiple threads within one process can also execute concurrently, so the operating system has better concurrency and can more effectively improve the utilization of system resources and the system throughput. For example, in a single-CPU operating system without threads, if only one file-service process is set up, then when that process blocks for some reason there is no other file-service process to provide service. In an operating system with threads, multiple service threads can be set up within one file-service process: when the first thread waits, a second thread in the process can continue to provide file service; when the second thread blocks, a third can take over, and so on. Clearly, this approach can significantly improve the quality of the file service and the throughput of the system.

    • Resource ownership
      Whether in a traditional operating system or one with threads, a process can own resources and is a basic unit of resource ownership in the system. In general, a thread does not own system resources (apart from a few essential ones), but it can access the resources of the process it belongs to: the code segment, data segment, and system resources owned by the process, such as open files and I/O devices, can be shared by all threads of that process.

    • System overhead
      When creating or destroying a process, the system must create or reclaim its process control block and allocate or reclaim resources such as memory space and I/O devices; the overhead the operating system pays for this is significantly greater than the cost of creating or destroying a thread. Similarly, process switching involves saving the CPU environment of the current process and setting up the CPU environment of the newly scheduled process, while thread switching only requires saving and restoring a small number of register contents and involves no memory-management operations; so in terms of switching cost, the process is far more expensive than the thread. In addition, because the threads of a process share the same address space, synchronization and communication between threads are easier to implement than between processes. In some operating systems, thread switching, synchronization, and communication require no intervention from the operating system kernel.
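
The file-service scenario above can be made concrete with a minimal sketch (not from the original text), using Python's threading module as a stand-in for the OS facilities discussed: several service threads live in one process and share its address space (a request queue and a result list), so a request that blocks one thread does not stall the others. The names `file_service_worker`, `requests`, and `results` are invented for this example.

```python
# Sketch: multiple service threads in one process sharing the process's
# resources. A "slow" request blocks one worker; the others keep serving.
import queue
import threading
import time

requests = queue.Queue()          # shared request queue (process data segment)
results = []                      # shared result list, also in the process
results_lock = threading.Lock()   # mutual exclusion on the shared list

def file_service_worker(worker_id: int) -> None:
    while True:
        req = requests.get()
        if req is None:           # sentinel: shut this worker down
            break
        if req == "slow":         # simulate a blocking I/O wait
            time.sleep(0.2)
        with results_lock:
            results.append((worker_id, req))

workers = [threading.Thread(target=file_service_worker, args=(i,))
           for i in range(3)]
for w in workers:
    w.start()

for req in ["slow", "fast-1", "fast-2"]:
    requests.put(req)
for _ in workers:                 # one shutdown sentinel per worker
    requests.put(None)
for w in workers:
    w.join()

# All three requests complete even though one worker was blocked.
print(sorted(r for _, r in results))
```

Note that the queue and list are visible to every thread without any copying, which is exactly the shared-address-space property the comparison above describes.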

Processes in multi-threaded OS

In a multithreaded OS, a process is the basic unit of system resource allocation; it typically contains multiple threads and provides resources for them, but the process itself is no longer an executable entity. A process in a multithreaded OS has the following attributes:

    • It is the unit of system resource allocation. In a multithreaded OS, the process is still the basic unit of allocation of system resources. The resources of any process include its separately protected user address space, the mechanisms used to implement synchronization and communication between processes and between threads, its open files and requested I/O devices, and an address mapping table maintained by the kernel, which maps the user program's logical addresses to physical memory addresses.

    • It can contain multiple threads. A process usually contains multiple relatively independent threads; there may be many or few, but there is at least one. The process provides the resources and running environment that allow these threads to execute concurrently. In the OS, every thread belongs to exactly one process.

    • It is not an executable entity. In a multithreaded OS, the thread is the basic unit of independent execution, so the process is no longer an executable entity. Even so, a process still has states related to execution: saying that a process is in the "executing" state actually means that some thread of that process is executing. In addition, state operations imposed on a process also act on its threads: when a process is suspended, all threads in the process are suspended with it, and when a process is activated, all threads belonging to it are activated as well.

Synchronization and communication between threads

Mutexes (mutex)

The mutex is a relatively simple mechanism for implementing mutually exclusive access to resources between threads. Because the time and space overhead of operating on a mutex is low, it is well suited to critical shared data and program segments that are used at high frequency. A mutex has two states: unlocked (unlock) and locked (lock). Correspondingly, a mutex can be manipulated by two commands (functions): the lock operation closes the mutex, and the unlock operation opens it.

When a thread needs to read or write a shared data segment, it should first perform a lock operation on the mutex set for that data segment. The command first tests the state of the mutex: if it is already locked, the thread attempting to access the data segment is blocked; if the mutex is unlocked, the thread locks it and proceeds to read/write the data segment. After the thread finishes reading/writing the data, it must issue the unlock command to open the mutex again and wake up one of the threads blocked on that mutex; the other threads remain blocked in the queue, waiting for the mutex to open.

In addition, to reduce the chance of a thread being blocked, some systems also provide a Trylock command on the mutex. When a thread accesses a mutex with Trylock, then if the mutex is unlocked, Trylock locks it and returns a status code indicating success; if the mutex is already locked, Trylock does not block the thread but simply returns a status code indicating failure.
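
A minimal sketch of the lock, unlock, and Trylock operations described above, using Python's threading.Lock as a stand-in: its acquire() and release() play the roles of lock and unlock, and acquire(blocking=False) behaves like Trylock.

```python
# Sketch of lock / unlock / trylock semantics with threading.Lock.
import threading

mutex = threading.Lock()

mutex.acquire()                           # "lock": close the mutex
# ... read/write the shared data segment here ...
got_it = mutex.acquire(blocking=False)    # Trylock on an already-locked mutex
print(got_it)                             # False: no blocking, just a failure code
mutex.release()                           # "unlock": open the mutex

print(mutex.acquire(blocking=False))      # True: mutex was open, now locked
mutex.release()
```

In a real program the second, failed Trylock would typically be followed by doing other useful work and retrying later, which is the whole point of the non-blocking form.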

Condition variable

In many cases, using only mutexes to achieve mutually exclusive access can cause deadlock; we illustrate this with an example. After a thread successfully performs a lock operation on mutex 1, it enters a critical section C. Suppose that inside C the thread must also access a critical resource R, for which another mutex, mutex 2, has been set. If R is busy at that moment, the thread blocks after performing the lock operation on mutex 2, which leaves mutex 1 in the locked state. If the thread currently holding resource R also needs to enter critical section C, it cannot do so because mutex 1 remains locked, and a deadlock results. The condition variable was introduced to solve this problem.

Each condition variable is usually used together with a mutex, that is, a condition variable is associated with a mutex when it is created. The mutex alone is used for short-term locking, mainly to guarantee mutual exclusion in critical sections; the condition variable is used for a thread's long-term wait, until the resource being waited for becomes available.

Now let us look at how a mutex and a condition variable are used together to access resource R. The thread first performs a lock operation on the mutex and, if it succeeds, enters the critical section, then examines the data structure describing the state of the resource. If the resource is found to be busy, the thread waits on the condition variable, which unlocks the mutex, until the resource is released; if the resource is free, the thread can use it, so it marks the resource as busy and unlocks the mutex. A description of applying for the resource (left half) and releasing it (right half) is given below.

   Apply (left half):                  Release (right half):
     lock mutex;                         lock mutex;
     check data structures;              mark resource as free;
     while (resource busy)               unlock mutex;
         wait(condition variable);       wakeup(condition variable);
     mark resource as busy;
     unlock mutex;

The thread that originally held resource R releases it, after use, as described in the right half, where wakeup(condition variable) means waking up one or more threads waiting on the specified condition variable. In most cases, since a single critical resource has been released, only one of the threads waiting on the condition variable is awakened, and the others continue to wait in that queue. However, suppose a thread releases a data file that allows multiple threads to read it simultaneously: when a writing thread finishes writing and releases the file, and more than one reading thread is waiting on the condition variable, the releasing thread can wake up all of the waiting threads.
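
The apply/release protocol above can be sketched with Python's threading.Condition, which bundles a condition variable with its mutex. Here `resource_busy` stands in for the "data structure describing the state of the resource", and the variable and thread names are invented for the example.

```python
# Sketch: acquiring and releasing resource R with a condition variable.
import threading

cond = threading.Condition()     # a mutex and condition variable together
resource_busy = False            # state of resource R
order = []                       # records which threads got the resource

def use_resource(name: str) -> None:
    global resource_busy
    with cond:                   # lock mutex
        while resource_busy:     # check data structure
            cond.wait()          # unlock mutex and wait; relock on wakeup
        resource_busy = True     # mark resource as busy
                                 # unlock mutex (leaving the with-block)
    order.append(name)           # use resource R outside the critical section
    with cond:                   # lock mutex
        resource_busy = False    # mark resource as free
        cond.notify()            # wakeup one thread waiting on the variable

threads = [threading.Thread(target=use_resource, args=(n,))
           for n in ("a", "b", "c")]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(order))             # every thread eventually got the resource
```

The while-loop recheck after wait() mirrors the pseudocode: a woken thread must re-test the resource state, since another thread may have claimed R first.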

Semaphore mechanism

The semaphore mechanism described earlier, the most common tool for implementing process synchronization, can also be used in a multithreaded OS to achieve synchronization between threads or between processes. To increase efficiency, separate kinds of semaphores can be provided for threads and for processes.
  
1) Private semaphore
When a thread needs to use a semaphore to synchronize with other threads of the same process, it can call a semaphore-creation command to create a private semaphore, whose data structure resides in the application's address space. A private semaphore belongs to a particular process, and the OS is unaware of its existence; therefore, if the holder of a private semaphore terminates abnormally, or terminates normally without releasing the space the semaphore occupies, the system cannot restore it to 0 (empty) or pass it to the next thread that requests it.
  
2) Public semaphore
A public semaphore is set up to achieve synchronization between threads of different processes, or between processes. Because it has an open name that all processes can use, it is called a public semaphore. Its data structure is stored in a protected system storage area, allocated and managed by the OS, so it is also called a system semaphore. If the holder of a public semaphore terminates without releasing it, the OS automatically reclaims the semaphore's space and notifies the next process. Clearly, the public semaphore is the safer synchronization mechanism.
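
The private-semaphore case can be illustrated with a minimal sketch using Python's threading.Semaphore: a counting semaphore created in the process's own address space and used only by that process's threads. The capacity of 2 and the `worker` function are invented for this example.

```python
# Sketch of a private semaphore: at most two of four threads may hold
# the semaphore (i.e., be inside the guarded section) at any moment.
import threading

sem = threading.Semaphore(2)      # counting semaphore, initial value 2
active = 0                        # threads currently inside the section
peak = 0                          # highest concurrency observed
state_lock = threading.Lock()     # protects the two counters above

def worker() -> None:
    global active, peak
    with sem:                     # wait (P) on entry, signal (V) on exit
        with state_lock:
            active += 1
            peak = max(peak, active)
        with state_lock:
            active -= 1

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(peak <= 2)                  # the semaphore capped concurrency at 2
```

Because the semaphore object lives in this process's address space, the OS kernel plays no part in its bookkeeping, which is exactly why an unreleased private semaphore cannot be recovered by the system.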

