The differences between Python threads and processes, and their pros and cons

Source: Internet
Author: User
This article looks at Python threads and processes: what they are, how they differ, and the strengths and weaknesses of each.

We have already introduced multi-process and multi-threaded programming, the two most common ways to achieve multitasking. Now let's discuss the pros and cons of each approach.

First, to achieve multitasking we typically use the master-worker pattern: the master is responsible for assigning tasks and the workers are responsible for executing them. A multitasking environment therefore usually has one master and multiple workers.

If master-worker is implemented with multiple processes, the main process is the master and the other processes are the workers.

If master-worker is implemented with multiple threads, the main thread is the master and the other threads are the workers.
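The master-worker pattern can be sketched in a few lines. This is a minimal illustration, assuming Python's standard `concurrent.futures` module; the names `work` and `run_master` and the squaring task are hypothetical stand-ins, not part of the original article.

```python
from concurrent.futures import ThreadPoolExecutor


def work(n):
    """Hypothetical task: a worker squares the number it is given."""
    return n * n


def run_master(tasks, executor_cls=ThreadPoolExecutor, workers=4):
    """The caller acts as the master: it assigns tasks and gathers results."""
    with executor_cls(max_workers=workers) as pool:
        # map() hands the tasks to the workers and yields results in input order.
        return list(pool.map(work, tasks))


if __name__ == "__main__":
    # Main thread = master, pool threads = workers.
    print(run_master(range(5)))
```

Passing `executor_cls=ProcessPoolExecutor` (also from `concurrent.futures`) should give the multi-process variant, with the main process as the master. In that case keep the call under the `if __name__ == "__main__":` guard, since some platforms re-import the main module in each worker process.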

The greatest advantage of the multi-process model is stability: a crash in one child process does not affect the main process or the other child processes. (Of course, if the main process dies, all the processes die with it, but the master process only assigns tasks, so the probability of it crashing is low.) The well-known Apache web server originally used the multi-process model.
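The isolation between processes can be illustrated with a small sketch. It assumes the standard `subprocess` module and uses two hypothetical one-line workers, one of which exits abruptly to stand in for a crash; the master only observes the exit codes and keeps running.

```python
import subprocess
import sys

# Hypothetical workers: the first dies abruptly, the second finishes normally.
CRASHING_WORKER = "import os; os._exit(3)"
HEALTHY_WORKER = "print('done')"


def run_worker(code):
    """Run one worker in its own Python process and return its exit code."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
    )
    return result.returncode


def master():
    """Start both workers; the master survives whatever happens to them."""
    return [run_worker(CRASHING_WORKER), run_worker(HEALTHY_WORKER)]


if __name__ == "__main__":
    print(master())
```

The failed worker shows up only as a non-zero exit code; the second worker and the master are unaffected.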

The disadvantage of the multi-process model is that creating a process is expensive. On Unix/Linux systems a fork call keeps the cost acceptable, but on Windows the overhead of creating a process is very large. In addition, the number of processes the operating system can run simultaneously is limited by memory and CPU; with thousands of processes running at once, even scheduling them becomes a problem for the operating system.

The multithreaded model is usually somewhat faster than the multi-process model, but not by much. Its fatal disadvantage is that a crash in any one thread can bring down the entire process, because all threads share the process's memory. On Windows, when code running in a thread goes wrong, you often see a message such as "the program has performed an illegal operation and will be closed"; usually the problem is in a single thread, but the operating system forcibly terminates the whole process.
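The shared-memory point can be seen in a short sketch using the standard `threading` module. The names `shared`, `worker`, and `run` are hypothetical. Note that this only demonstrates that threads see the same memory: an uncaught Python exception normally ends just the thread that raised it, and the process-wide crash described above concerns lower-level faults that this sketch does not try to reproduce.

```python
import threading

shared = {"count": 0}      # one object in the process's memory, visible to all threads
lock = threading.Lock()    # protects the read-modify-write on the counter


def worker(times):
    """Each thread updates the same dictionary; nothing is copied."""
    for _ in range(times):
        with lock:
            shared["count"] += 1


def run(n_threads=4, times=1000):
    """Start the threads, wait for them, and return the shared counter."""
    shared["count"] = 0
    threads = [
        threading.Thread(target=worker, args=(times,)) for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["count"]


if __name__ == "__main__":
    print(run())
```

Because every thread writes to the same data, a thread that corrupts that data affects all the others, which is the root of the stability concern.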

On Windows, multithreading is more efficient than multiple processes, so Microsoft's IIS server uses the multithreaded model by default. Because of the stability problems of multithreading, IIS is less stable than Apache. To mitigate this, both IIS and Apache now offer a mixed multi-process + multi-threaded mode, which makes matters even more complicated.

Thread switching

Whether you use multiple processes or multiple threads, once there are too many of them, efficiency inevitably drops. Why?

Consider an analogy. Suppose you are, unfortunately, preparing for an exam, and every night you have homework in 5 subjects: Chinese, mathematics, English, physics, and chemistry, each taking 1 hour.

If you spend 1 hour on Chinese homework, finish it, then spend 1 hour on math homework, and so on until everything is done, the total is 5 hours. This approach is called the single-task model, or batch-task model.

Now suppose you switch to a multitasking model: you do 1 minute of Chinese, switch to math for 1 minute, then switch to English, and so on. As long as you switch fast enough, this is the same as a single-core CPU performing multitasking; in the eyes of a kindergarten child, you are doing all 5 homework assignments at the same time.

However, switching has a cost. To switch from Chinese to math, you first have to clear the Chinese books and pens off the desk (this is called saving the context), then open the math textbook and find the compass and ruler (this is called preparing the new environment) before you can start on the math homework. The operating system does the same when it switches processes or threads: it must save the current execution context (CPU register state, memory pages, and so on) and then prepare the execution environment of the new task (restore its previous register state, switch memory pages, and so on) before execution can begin. This switch is fast, but it still takes time. With thousands of tasks running at once, the operating system may spend most of its time switching tasks and have little time left to actually run them. The most common symptoms are a hard drive that thrashes constantly, windows that do not respond to clicks, and a system that appears frozen.

Therefore, once multitasking passes a certain limit, it consumes all of the system's resources, efficiency drops sharply, and none of the tasks gets done well.
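The homework analogy can be turned into a rough cost model. This is only a sketch: the function name `total_minutes` and the half-minute switch cost are assumptions made for illustration, and real context-switch costs are far smaller and not constant.

```python
def total_minutes(subjects=5, minutes_each=60, slice_minutes=60, switch_cost=0.0):
    """Total time to finish all homework when working in fixed-length slices.

    switch_cost is a hypothetical overhead, in minutes, paid on every switch
    from one subject to another.
    """
    slices_per_subject = -(-minutes_each // slice_minutes)  # ceiling division
    total_slices = subjects * slices_per_subject
    switches = total_slices - 1  # no switch is needed before the first slice
    return subjects * minutes_each + switches * switch_cost


if __name__ == "__main__":
    # Batch model: one 60-minute slice per subject, so only 4 switches.
    print(total_minutes(slice_minutes=60, switch_cost=0.5))
    # Multitasking model: 1-minute slices, so 299 switches.
    print(total_minutes(slice_minutes=1, switch_cost=0.5))
```

Under these assumed numbers the useful work stays at 300 minutes in both cases; only the switching overhead grows as the slices get shorter.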

Compute-intensive vs. IO-intensive

The second consideration in multitasking is the type of task. Tasks can be divided into compute-intensive and IO-intensive.

Compute-intensive tasks are characterized by heavy computation that consumes CPU resources, such as calculating pi or decoding HD video; they depend entirely on the computing power of the CPU. Such tasks can be run with multitasking, but the more tasks there are, the more time is spent switching between them and the less efficiently the CPU executes them. To use the CPU most efficiently, the number of simultaneous compute-intensive tasks should equal the number of CPU cores.
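As a sketch of matching the number of compute-intensive tasks to the number of cores, the following estimates pi with the Leibniz series, split across one worker process per core. The helper names `partial_pi` and `chunks` and the term count are hypothetical choices for illustration; `os.cpu_count()` and `ProcessPoolExecutor` are from the standard library.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def partial_pi(bounds):
    """Sum the Leibniz series 4 * (-1)^k / (2k + 1) for k in [start, stop)."""
    start, stop = bounds
    return sum(4.0 * (-1) ** k / (2 * k + 1) for k in range(start, stop))


def chunks(total_terms, parts):
    """Split range(total_terms) into contiguous (start, stop) pairs."""
    step = -(-total_terms // parts)  # ceiling division
    return [(i, min(i + step, total_terms)) for i in range(0, total_terms, step)]


if __name__ == "__main__":
    cores = os.cpu_count() or 1  # cpu_count() may return None
    # One worker process per core, one chunk of the series per worker.
    with ProcessPoolExecutor(max_workers=cores) as pool:
        pi_estimate = sum(pool.map(partial_pi, chunks(400_000, cores)))
    print(cores, pi_estimate)
```

Processes are used here rather than threads because, in the standard CPython interpreter, threads generally do not run Python bytecode in parallel.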

Because compute-intensive tasks mainly consume CPU resources, the efficiency of the code is critical. Scripting languages like Python run slowly and are a poor fit for compute-intensive tasks. For such tasks, it is best to write the code in C.

The second type of task is IO-intensive. Tasks that involve network or disk IO are IO-intensive: they consume little CPU and spend most of their time waiting for IO operations to complete (because IO is far slower than the CPU and memory). For IO-intensive tasks, the more tasks there are, the higher the CPU utilization, up to a limit. Most common tasks, such as web applications, are IO-intensive.
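A small sketch of why IO-intensive tasks benefit from running many at once: here `time.sleep` stands in for a network or disk wait, and the names `fake_io` and `run_concurrently` are hypothetical. Ten simulated calls run on ten threads, so the waits overlap instead of adding up.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(delay):
    """Stand-in for a network or disk call: the thread simply waits."""
    time.sleep(delay)
    return delay


def run_concurrently(n_tasks=10, delay=0.05):
    """Run n_tasks simulated IO calls on n_tasks threads; return elapsed seconds."""
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_tasks) as pool:
        list(pool.map(fake_io, [delay] * n_tasks))
    return time.perf_counter() - started


if __name__ == "__main__":
    # Run one after another, the ten waits would take about 0.5 seconds in total.
    print(f"{run_concurrently():.2f} seconds")
```

The elapsed time should be close to a single delay rather than ten of them, because the CPU is idle while each task waits.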

An IO-intensive task spends 99% of its time on IO and very little on the CPU, so replacing a slow scripting language such as Python with fast-running C yields almost no improvement in running efficiency. For IO-intensive tasks, the most suitable language is the one with the highest development efficiency (the least code): scripting languages are the first choice and C is the worst.

Asynchronous IO

Given the huge speed gap between CPU and IO, a task spends most of its execution time waiting for IO. In a single-process, single-threaded model, that waiting prevents other tasks from running concurrently, so we need a multi-process or multithreaded model to support concurrent execution of multiple tasks.

Modern operating systems have greatly improved IO operations, most notably by supporting asynchronous IO. By taking full advantage of the operating system's asynchronous IO support, you can run multiple tasks with a single-process, single-threaded model. This model is called the event-driven model. Nginx is a web server that supports asynchronous IO; on a single-core CPU it can handle multiple tasks efficiently with a single-process model. On a multi-core CPU, you can run multiple processes (as many as there are CPU cores) to make full use of the cores. Because the total number of processes in the system stays very small, operating system scheduling is very efficient. Implementing multitasking with the asynchronous IO programming model is a major trend.

In Python, the single-threaded asynchronous programming model is called coroutines. With coroutine support, efficient multitasking programs can be written on an event-driven basis.
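A minimal coroutine sketch, assuming the standard `asyncio` module. The names `fake_request`, `main`, and `run` are hypothetical, and `asyncio.sleep` stands in for a real asynchronous IO call. All three coroutines run in a single thread; each `await` hands control back to the event loop so the others can proceed.

```python
import asyncio
import time


async def fake_request(name, delay):
    """Hypothetical IO call; awaiting lets the event loop run other tasks."""
    await asyncio.sleep(delay)
    return name


async def main():
    # Three coroutines wait concurrently inside one thread.
    return await asyncio.gather(
        fake_request("a", 0.1),
        fake_request("b", 0.1),
        fake_request("c", 0.1),
    )


def run():
    """Run the event loop and return the results with the elapsed seconds."""
    started = time.perf_counter()
    results = asyncio.run(main())
    return results, time.perf_counter() - started


if __name__ == "__main__":
    print(run())
```

`asyncio.gather` returns the results in the order the coroutines were passed in, and the total time should be close to one delay rather than the sum of all three.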
