Process: A process is an instance of a program in execution; it is the collection of data structures that describes how far the execution of the program has progressed. From the kernel's point of view, the purpose of a process is to act as the basic unit for allocating system resources (CPU time, memory, etc.).
Thread: A thread is an execution flow within a process. It is the basic unit of CPU scheduling and dispatch, a unit smaller than a process that can run independently. A process consists of one or more threads (a user program may have many relatively independent execution flows that share most of the application's data structures), and each thread shares all the resources owned by the process with the other threads of the same process.
"Process: the smallest unit of resource allocation; thread: the smallest unit of program execution."
A process has its own address space, so in protected mode a crash in one process does not affect other processes. A thread is just a different execution path within a process. Each thread has its own stack and local variables but no separate address space, so when one thread dies the whole process dies with it. A multi-process program is therefore more robust than a multithreaded one, but switching between processes consumes more resources and is less efficient. For concurrent operations that must run at the same time and share variables, however, threads are the practical choice; separate processes do not share variables unless shared memory or another IPC mechanism is set up explicitly.
In short, a process has its own address space and a thread does not: the threads of a process share that process's address space.
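The difference in address spaces can be observed directly. The following is a minimal sketch, assuming a Unix-like system where os.fork() is available; the names state, bump, run_in_thread and run_in_child_process are made up for this illustration. A write made by a thread is visible to the rest of the process, while a write made by a forked child only changes the child's own copy.

```python
import os
import threading

# Shared state that lives in this process's address space.
state = {"counter": 0}


def bump():
    """Increment the counter in whichever address space we run in."""
    state["counter"] += 1


def run_in_thread():
    """Run bump() in a second thread of the same process."""
    t = threading.Thread(target=bump)
    t.start()
    t.join()


def run_in_child_process():
    """Run bump() in a forked child process (Unix only)."""
    pid = os.fork()
    if pid == 0:
        # Child: this changes only the child's copy of `state`.
        bump()
        os._exit(0)
    os.waitpid(pid, 0)  # Parent: wait for the child to finish.


if __name__ == "__main__":
    run_in_thread()
    print("after thread:", state["counter"])   # the thread's write is visible
    run_in_child_process()
    print("after child: ", state["counter"])   # the child's write is not
```

Run as a script, the second number printed is the same as the first, because the child incremented its own copy of the counter and then exited.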
=======================================================================================
(The following is excerpted from "Multithreaded Programming under Linux".)
The first reason for using multithreading is that, compared with processes, it is a very "frugal" way of multitasking. Under Linux, a new process must be given its own address space, and numerous data tables must be created to maintain its code, stack, and data segments, which makes it an "expensive" way of multitasking. Threads running in the same process use the same address space and share most of their data, so starting a thread takes much less space than starting a process, and switching between threads takes much less time than switching between processes. According to statistics, the overhead of a process is in general about 30 times that of a thread, although on a particular system the figure may differ considerably.
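One rough way to see the cost difference on a given machine is to time the creation of threads and of processes. This is only a sketch, assuming a Unix-like system with os.fork(); the function names and the count of 200 are arbitrary choices for illustration. The ratio it reports depends heavily on the system and on interpreter overhead, so it should not be expected to reproduce the figure quoted above.

```python
import os
import threading
import time


def noop():
    """Do nothing; we only want to measure creation overhead."""
    pass


def time_threads(n):
    """Seconds taken to create, start and join n threads, one at a time."""
    start = time.perf_counter()
    for _ in range(n):
        t = threading.Thread(target=noop)
        t.start()
        t.join()
    return time.perf_counter() - start


def time_processes(n):
    """Seconds taken to fork and reap n child processes, one at a time."""
    start = time.perf_counter()
    for _ in range(n):
        pid = os.fork()
        if pid == 0:
            os._exit(0)  # child exits immediately
        os.waitpid(pid, 0)
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 200
    print(f"{n} threads:   {time_threads(n):.4f} s")
    print(f"{n} processes: {time_processes(n):.4f} s")
```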
The second reason for using multithreading is the convenient communication mechanism between threads. Different processes have independent data spaces, so data can be passed between them only through inter-process communication, which is both time-consuming and inconvenient. Threads are different: the threads of one process share its data space, so data produced by one thread can be used directly by the others, which is both fast and convenient. Sharing data brings its own problems, of course. Some variables must not be modified by two threads at the same time, and data declared static inside a subroutine is especially likely to deal a catastrophic blow to a multithreaded program. These are the points that need the most attention when writing multithreaded programs.
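The usual way to keep two threads from modifying the same variable at once is to guard it with a lock. The sketch below is one possible illustration using Python's threading.Lock; the function name count_with_lock and its parameters are invented for this example.

```python
import threading


def count_with_lock(num_threads=4, per_thread=10_000):
    """Have several threads increment one shared counter under a lock."""
    total = 0
    lock = threading.Lock()

    def worker():
        nonlocal total
        for _ in range(per_thread):
            # Only one thread at a time may execute the read-modify-write.
            with lock:
                total += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


if __name__ == "__main__":
    print(count_with_lock())  # 4 threads x 10,000 increments = 40000
```

Because every increment happens while the lock is held, no update is lost, however the threads are interleaved.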
Apart from the advantages above, and independently of any comparison with processes, multithreading as a multitasking, concurrent way of working also has the following advantages:
- Improve application responsiveness. This matters most for graphical programs: when an operation takes a long time, the whole program waits for it and does not respond to keyboard, mouse, or menu input. Moving the time-consuming operation into a new thread avoids this embarrassing situation.
- Make multi-CPU systems more efficient. When the number of threads does not exceed the number of CPUs, the operating system can run different threads on different CPUs.
- Improve program structure. A long, complex process can be divided into multiple threads that run as independent or semi-independent parts, which makes the program easier to understand and modify.
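The responsiveness point can be sketched as follows. This is a toy model rather than a real GUI: time.sleep stands in for the time-consuming operation, the loop in run_responsive stands in for an event loop, and all names are hypothetical.

```python
import threading
import time


def slow_operation(result):
    """Stand-in for a time-consuming task (here it just sleeps)."""
    time.sleep(0.2)
    result["value"] = "done"


def run_responsive():
    """Run the slow task in a worker thread while the main thread keeps going."""
    result = {}
    worker = threading.Thread(target=slow_operation, args=(result,))
    worker.start()

    ticks = 0
    while worker.is_alive():
        # A real program would handle keyboard, mouse or menu events here.
        ticks += 1
        time.sleep(0.01)

    worker.join()
    return ticks, result["value"]


if __name__ == "__main__":
    ticks, value = run_responsive()
    print(f"main loop ran {ticks} times while waiting; result: {value}")
```

The main loop keeps iterating while the worker is busy, which is what keeps an interface responsive during a long operation.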
=======================================================================================
In terms of function calls, a process is created with fork() and a thread is created with clone(). W. Richard Stevens wrote:
fork is expensive. Memory is copied from the parent to the child, all descriptors are duplicated in the child, and so on. Current implementations use a technique called copy-on-write, which avoids a copy of the parent's data space to the child until the child needs its own copy. But, regardless of the optimization, fork is expensive.
IPC is required to pass information between the parent and child after the fork. Passing information from the parent to the child before the fork is easy, since the child starts with a copy of the parent's data space and with a copy of all the parent's descriptors. But, returning information from the child to the parent takes more work.
Threads help with both problems. Threads are sometimes called lightweight processes since a thread is "lighter weight" than a process. That is, thread creation can be 10–100 times faster than process creation.
All threads within a process share the same global memory. This makes the sharing of information easy between the threads, but along with this simplicity comes the problem
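The point in the quotation about returning information from the child to the parent can be sketched with a pipe. This is a minimal illustration, assuming a Unix-like system with os.fork(); ask_child is a hypothetical helper, and the upper-casing is just a placeholder for real work done in the child.

```python
import os


def ask_child(payload: bytes) -> bytes:
    """Fork a child, let it transform `payload`, and read the result back."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()

    if pid == 0:
        # Child: `payload` came for free, copied from the parent at fork time.
        os.close(read_fd)
        # A single write is enough for the small payloads used here.
        os.write(write_fd, payload.upper())
        os.close(write_fd)
        os._exit(0)

    # Parent: the result has to come back through explicit IPC.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:  # empty read: the child closed its end of the pipe
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return b"".join(chunks)


if __name__ == "__main__":
    print(ask_child(b"hello from the parent"))
```

The data flowing into the child needs no code at all, while the data flowing back needs a pipe, a read loop, and a wait, which is the asymmetry the quotation describes.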
(note) The difference between Linux processes and threads