Python: threads, processes, and coroutines (1)--Concepts

Source: Internet
Author: User
Tags: mutex

Lately I have spent most of my spare time learning about threads, processes, and coroutines in Python. I first used Python multithreading and multiprocessing two months ago. Back then I skimmed a few blog posts and started using them without studying them carefully, and my first impression was that they were quite simple. After more reading, including the official Python documentation, I found that they are not as easy as I had thought, and that many points need careful study. Threads, processes, and coroutines count as advanced Python usage. Python has many advanced features; reading the official documentation helps, and when time allows it is also worth looking at how these modules are implemented. Choosing programming as a profession means continually learning, thinking, and summing up experience. As the old line goes, "the road ahead is long and far; I shall search high and low", and I hope to share what I find with you. Over the next few posts I will write up what I have learned about threads, processes, and coroutines. If anything is wrong, please point it out. This post is mainly about concepts.

(i) Threading and multithreading

Threads

(1) A thread, sometimes called a lightweight process (LWP), is the smallest unit of program execution flow.

(2) A standard thread consists of a thread ID, the current instruction pointer (PC), a set of registers, and a stack. Anything that carries this context, and can therefore record where execution has reached, can be called a thread.

(3) A running thread may be preempted (interrupted), or temporarily suspended (also called sleeping) so that other threads can run; the latter is called yielding.

(4) A thread has three basic states: ready, blocked, and running. A ready thread has everything it needs to run and is logically runnable, but is waiting for a processor. A running thread holds a processor and is executing. A blocked thread is waiting for an event (for example, a semaphore) and is logically not runnable.

(5) A thread is an entity within a process and is the basic unit that the system schedules and dispatches independently. A thread does not own system resources of its own, but it shares all the resources owned by its process with the other threads of that process.
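
As a minimal sketch of point (5), the following uses the standard threading module to start three threads that all append to one list owned by the process. The names shared_log and worker are made up for this illustration.

```python
import threading

shared_log = []  # lives in the process; visible to every thread


def worker(label):
    # threading.get_ident() returns the ID of the thread running this call.
    shared_log.append((label, threading.get_ident()))


threads = [threading.Thread(target=worker, args=(f"t{i}",)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every thread appended to the same list object.
print(sorted(label for label, _ in shared_log))  # ['t0', 't1', 't2']
```

No copying or message passing is needed: each thread sees the same shared_log because all of them run inside one process.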

Multithreading

(1) Every application has at least one process and one thread. A thread is a single sequential flow of control within a program. Running several threads at the same time within one program, each handling a different part of the work, is called multithreading.

An analogy helps: in software development, each person or team is responsible for one module; once everyone has finished their module, the code is merged, tested, and debugged.

(2) On a single-processor machine, multithreading does not speed up code execution and even adds some thread-management overhead, unless the code waits on external resources (for example, I/O).

In fact, on a single-processor system each thread is scheduled to run for only a short time slice, then gives up the CPU so that other threads can run. Thread switching and similar bookkeeping cost time and resources, so on a single-CPU machine multithreaded code can sometimes run slower rather than faster.

(3) Multithreading can benefit from multiprocessor or multicore machines, which can execute threads in parallel on separate processors and so increase execution speed.

(4) Threads can share their results with one another. This carries a risk: if two threads update the same data at the same time and would produce different results, the outcome depends on timing. This is called a race condition, and it can produce unpredictable results. A locking mechanism is therefore used to protect the data.
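
A minimal sketch of the locking idea in point (4), using threading.Lock from the standard library. The names counter, counter_lock, and add_many are hypothetical and exist only for this example.

```python
import threading

counter = 0
counter_lock = threading.Lock()


def add_many(times):
    global counter
    for _ in range(times):
        # "counter += 1" is a read-modify-write sequence; the lock makes
        # sure only one thread performs it at a time.
        with counter_lock:
            counter += 1


threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000
```

Without the lock, two threads could read the same old value and one increment could be lost; whether that actually shows up in a given run depends on the interpreter version and on timing.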

Multithreading in Python

Python's multithreading is less ideal than the picture above, because it is limited by the GIL. What is the GIL? GIL stands for global interpreter lock, a mutex-like mechanism in the Python virtual machine (in CPython, the reference implementation) which ensures that only one thread runs in the virtual machine at any moment, while the other threads wait for the GIL to be released. In that sense it is "pseudo-multithreading": for Python bytecode it behaves like multithreading on a single-processor machine, which does not speed up code execution and even adds some thread-management overhead.
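
One rough way to observe this is to time the same CPU-bound function run twice in sequence and then in two threads. This is only a sketch: count_down and N are names chosen for the example, and the timings depend on the machine and the interpreter (on a standard CPython build with the GIL the two numbers are usually close; other builds may differ), so no expected output is given.

```python
import threading
import time


def count_down(n):
    # Pure-Python loop: CPU-bound, no I/O.
    while n > 0:
        n -= 1


N = 2_000_000

# Run the work twice, one call after the other.
start = time.perf_counter()
count_down(N)
count_down(N)
sequential = time.perf_counter() - start

# Run the same two calls in two threads.
start = time.perf_counter()
threads = [threading.Thread(target=count_down, args=(N,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

print(f"sequential: {sequential:.3f}s  two threads: {threaded:.3f}s")
```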

Multithreaded execution on the Python virtual machine proceeds as follows:

a. Acquire the GIL;

b. Switch to a thread and run it;

c. Run a specified number of bytecode instructions, or until the thread voluntarily gives up control (for example, by calling time.sleep(0));

d. Put the thread to sleep;

e. Release the GIL;

f. Repeat all of the steps above.


When external code is called, such as a C/C++ extension function, the GIL stays locked until the function returns, unless the extension explicitly releases it (no Python bytecode runs during this time, so no thread switch takes place). I/O is different. An I/O (input/output) operation calls the operating system's built-in C code, and a thread releases the GIL before it starts the I/O operation. See other references for more on I/O.
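
The effect of releasing the GIL around blocking calls can be sketched with time.sleep(), used here as a stand-in for real I/O (an assumption of this example; fake_io is a made-up name). Four threads each wait 0.2 seconds, and the waits overlap.

```python
import threading
import time


def fake_io(seconds):
    # time.sleep() blocks in the operating system and releases the GIL,
    # much like a blocking read or network call would.
    time.sleep(seconds)


start = time.perf_counter()
threads = [threading.Thread(target=fake_io, args=(0.2,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# The four waits overlap, so the total is well below 4 * 0.2 = 0.8 seconds.
print(f"elapsed: {elapsed:.2f}s")
```

This is why threads remain useful in Python for I/O-bound work even though they do not speed up pure computation.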

For a purely computational program with no I/O operations, the interpreter switches between threads automatically according to the value set with sys.setcheckinterval(): by default, in Python 2, it releases the GIL every 100 bytecode instructions and gives other threads a turn to run.
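
Note that sys.setcheckinterval() was deprecated in Python 3.2 and removed in Python 3.9. Since Python 3.2 the interpreter uses a time-based switch interval instead (documented default: 5 milliseconds), read and set through sys.getswitchinterval() and sys.setswitchinterval(). A small sketch, assuming Python 3.2 or later:

```python
import sys

# Python 3.2+: the interpreter asks the running thread to release the GIL
# after a time interval, not after a bytecode count.
default_interval = sys.getswitchinterval()
print(default_interval)

# Request more frequent switching (1 ms), then restore the previous value.
sys.setswitchinterval(0.001)
changed_interval = sys.getswitchinterval()
sys.setswitchinterval(default_interval)
```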


So why does Python introduce the GIL for multithreading? To guarantee mutually exclusive access to shared resources inside the virtual machine. Python's object management is closely tied to reference counting: when an object's counter reaches 0, the object is reclaimed (if this is unfamiliar, look it up online or read the book "Python Source Code Analysis"). When a reference to an object is removed, the Python interpreter performs the following two steps on the object and its counter:

a. Decrement the reference counter by 1;

b. Check whether the counter is 0; if it is, destroy the object.


Suppose two threads A and B both reference the same object obj, so obj's reference counter is 2. Thread A starts to drop its reference to obj. Just after it finishes step a (decrementing the counter by 1), the thread scheduler happens to suspend A at this critical point and switches to thread B. If B also drops its reference to obj and completes both steps a and b, obj's reference counter reaches 0, obj is destroyed, and its memory is freed. Now there is trouble: when thread A is woken up again it continues with step b, but obj and its counter no longer exist, so the outcome is completely undefined. The GIL was introduced to guarantee mutually exclusive access to shared resources inside the virtual machine and so prevent exactly this.
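
The counter itself can be observed from Python code. The sketch below assumes CPython, where sys.getrefcount() reports an object's reference count (including the temporary reference held by the call itself) and an object is reclaimed as soon as its count reaches 0. Node, alias, and watcher are names made up for the example.

```python
import sys
import weakref


class Node:
    pass


obj = Node()
before = sys.getrefcount(obj)

alias = obj                      # a second reference to the same object
after = sys.getrefcount(obj)     # one higher than before

destroyed = []
# The callback runs when the object is about to be reclaimed.
watcher = weakref.ref(obj, lambda ref: destroyed.append(True))

del alias                        # counter drops by one; object survives
still_alive = watcher() is not None

del obj                          # last reference gone; CPython frees it now
print(after - before, still_alive, destroyed)  # on CPython: 1 True [True]
```

Each del here performs the decrement and the zero check described above; the GIL keeps those two steps from interleaving with another thread's.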


The GIL prevents multithreading from taking advantage of multicore systems, but it also brings a benefit: it greatly simplifies the management of shared resources among Python threads. Python also offers ways around the GIL's limitation to make full use of multiple cores, such as the multiprocessing module, C extensions, and the ctypes library.
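
As a minimal sketch of the multiprocessing route, the following spreads a CPU-bound function over two worker processes with multiprocessing.Pool. The function sum_of_squares and the input values are made up for the example.

```python
import multiprocessing


def sum_of_squares(n):
    # CPU-bound work; each call runs inside one of the worker processes,
    # and each worker process has its own interpreter and its own GIL.
    return sum(i * i for i in range(n))


def main():
    inputs = [10, 100, 1000]
    with multiprocessing.Pool(processes=2) as pool:
        results = pool.map(sum_of_squares, inputs)
    print(results)  # [285, 328350, 332833500]


if __name__ == "__main__":
    main()
```

The `if __name__ == "__main__"` guard matters: on platforms that start workers by importing the main module again, it keeps the workers from creating pools of their own.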



(ii) Processes

A process (sometimes called a heavyweight process) is one execution of a program. Each process has its own address space, memory, data stack, and other auxiliary data that tracks its execution. The operating system manages all the processes running on it and divides time fairly among them. A process can also hand off other tasks to new processes through fork and spawn operations. Because each process has its own memory space, data stack, and so on, processes cannot share information directly and must use interprocess communication (IPC) instead.


(iii) Coroutines

(1) A coroutine is a user-level lightweight thread. Unlike a thread, a coroutine is not switched by the operating system but by the program's own code; that is, the programmer controls when switching happens, so the thread-safety problems caused by preemption at arbitrary points do not arise.

(2) A coroutine has its own register context and stack. When the scheduler switches away from it, the register context and stack are saved elsewhere; when it switches back, the previously saved register context and stack are restored.
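
Plain Python generators give a rough feel for this kind of programmer-controlled switching. This is an analogy rather than a description of how any particular coroutine library works: a generator keeps its local variables in a frame object rather than literally saving CPU registers. The names task and round_robin are made up for the sketch.

```python
def task(name, steps):
    for i in range(steps):
        # yield hands control back to the caller; the generator's local
        # variables (name, i) are kept and restored on the next resume.
        yield f"{name}:{i}"


def round_robin(tasks):
    # A tiny hand-written scheduler: resume each task in turn until all finish.
    trace = []
    pending = list(tasks)
    while pending:
        current = pending.pop(0)
        try:
            trace.append(next(current))
            pending.append(current)
        except StopIteration:
            pass  # this task is done; do not reschedule it
    return trace


trace = round_robin([task("A", 2), task("B", 3)])
print(trace)  # ['A:0', 'B:0', 'A:1', 'B:1', 'B:2']
```

Switching happens only at yield, so the order of execution is fully determined by the code, not by the operating system.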

How do you use coroutines in Python? The answer here is the gevent module. Coroutines are not subject to thread overhead. The most recommended approach is therefore multiple processes plus coroutines: each process runs a single thread, and that thread runs coroutines. This avoids the overhead of switching threads on the CPU while still making full use of multiple CPUs.
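
gevent is a third-party package with its own API. To keep the following sketch free of extra dependencies, it uses the standard library's asyncio module instead, which is a different mechanism but shows the same idea: several coroutines taking turns on one thread. fetch and its delays are made up for the example.

```python
import asyncio


async def fetch(name, delay):
    # await is an explicit switch point: this coroutine pauses here and the
    # event loop runs other coroutines on the same thread.
    await asyncio.sleep(delay)
    return f"{name} done"


async def main():
    # Run three coroutines concurrently in one thread.
    return await asyncio.gather(
        fetch("a", 0.03),
        fetch("b", 0.01),
        fetch("c", 0.02),
    )


results = asyncio.run(main())
print(results)  # ['a done', 'b done', 'c done']
```

asyncio.gather() returns the results in the order the coroutines were passed in, regardless of which one finished first.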


That is my understanding of threads, processes, and coroutines. I hope it helps!

This article is from the "11016142" blog, please be sure to keep this source http://11026142.blog.51cto.com/11016142/1864799
