The concept of threads was proposed before multicore CPUs appeared. In the single-core era, multithreading was meant to keep the CPU from sitting idle as far as possible, so that its computing power is always being used. In essence, though, only one thread is executing at any given moment.
Even though only one thread executes at any moment, there are still problems to solve, and the most important one is thread safety. The source of the problem is simple. As I said, the CPU executes instructions one at a time, but note that a single line of code in a high-level language may correspond to several instructions at the assembly level. As a simple example, take an increment of a variable in C:
int a = 0;

int add() {
    a += 1;
    return a;
}
Once compiled to assembly, however, it is no longer that simple. The increment is mainly implemented by these two instructions (most of the other instructions are omitted, including the one that writes the result back to a):
movl a(%rip), %eax
addl $1, %eax
This example uses a global variable because thread safety and data synchronization mainly concern resources that can be shared, such as global variables. Above, a(%rip) refers to the value of the global variable a, and %eax is a register. The first instruction loads the value of a into the EAX register, and the second instruction adds 1 to the value in EAX.
Let's consider a situation with two threads, A and B, each calling add() once, so the expected final value of a is 2. Suppose a context switch occurs right after thread A executes the first instruction. At that point thread A's EAX register holds the initial value of a, which is 0, and the CPU switches to thread B. Thread B runs to completion and writes 1 back to a(%rip). Thread A is then resumed, but it does not execute the first instruction again, so the EAX value restored with its context is not updated and is still 0. Thread A finishes and also writes 1 back to a(%rip). In the end, the value of a(%rip) is 1 instead of 2.
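To make the interleaving above concrete, here is a minimal sketch in Python that simulates it step by step. It does not use real threads: the memory and registers dictionaries and the load, add_one and store helpers are names I made up for illustration, standing in for the shared variable, each thread's private copy of EAX, and the assembly instructions.

```python
# Simulated machine state: one shared memory cell and a private
# register (%eax) per thread. All names here are illustrative.
memory = {"a": 0}
registers = {"A": 0, "B": 0}


def load(thread):
    """movl a(%rip), %eax: copy the shared value into the thread's register."""
    registers[thread] = memory["a"]


def add_one(thread):
    """addl $1, %eax: increment the thread's private register."""
    registers[thread] += 1


def store(thread):
    """Write the thread's register back to the shared variable a."""
    memory["a"] = registers[thread]


# Thread A is switched out right after its load; thread B then runs
# to completion before thread A is resumed.
schedule = [
    (load, "A"),
    (load, "B"), (add_one, "B"), (store, "B"),
    (add_one, "A"), (store, "A"),
]
for step, thread in schedule:
    step(thread)

print(memory["a"])  # 1, not the expected 2
```

Thread A's add and store work on the stale 0 it loaded before the switch, so thread B's update is overwritten.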
This is what we usually call data being out of sync, or code that is not thread-safe. Many methods have been proposed to deal with it, such as atomic operations and mutual exclusion locks (mutexes). They start from one of the following two ideas:
- Ensure that the operation cannot be interrupted by the thread scheduler: it either completes entirely or does not happen at all;
- Even if the operation is interrupted by the thread scheduler, other threads cannot access the associated resource.
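The second idea is the one behind a mutex. Below is a minimal sketch using threading.Lock from the Python standard library; the counter and counter_lock names, the four threads and the 10,000 iterations are my own choices for the example.

```python
import threading

counter = 0
counter_lock = threading.Lock()  # illustrative name for this sketch


def add(times):
    global counter
    for _ in range(times):
        # Only one thread at a time can hold the lock, so the
        # read-modify-write below cannot interleave with another thread's.
        with counter_lock:
            counter += 1


threads = [threading.Thread(target=add, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000
```

A thread may still be switched out in the middle of the increment, but any other thread that wants to touch counter has to wait for the lock, so no update is lost.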
Now back to Python. The Dutch programmer Guido van Rossum started writing the scripting language Python in 1989 to pass the time. Like Java, Python source code is first compiled into bytecode (the .pyc files in a Python project), which is then executed by the interpreter. Bytecode is an intermediate language that sits between high-level languages and assembly language.
As a result, a Python expression may consist of multiple interpreter instructions, and one interpreter instruction may in turn be broken down into multiple assembly instructions. This means an interpreter instruction can be interrupted in the middle of execution, and indeed the Python interpreter CPython is not thread-safe by itself. So, to ensure thread safety, the first requirement is that an interpreter instruction can run to completion without being affected by thread scheduling. For this, the developers of the interpreter introduced the Python global interpreter lock, GIL for short. The GIL allows only one thread to run at any one time, and once a thread's execution time reaches a threshold the GIL is released, which makes thread scheduling much simpler.
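Both points can be checked from Python itself. The sketch below uses the standard dis module to list the bytecode instructions behind the same increment, and sys.getswitchinterval() to read the time slice after which CPython asks the running thread to release the GIL. The exact opcode names vary between CPython versions, and the time-based threshold applies to Python 3.2 and later, so treat the printed output as version-dependent.

```python
import dis
import sys

a = 0


def add():
    global a
    a += 1
    return a


# List the bytecode instructions the interpreter executes for add().
# Exact opcode names differ between CPython versions.
opnames = [ins.opname for ins in dis.get_instructions(add)]
print(opnames)

# Ideal length, in seconds, of the time slice a thread gets before
# CPython asks it to release the GIL.
print(sys.getswitchinterval())
```

On the versions I know of, the list contains a LOAD_GLOBAL before the addition and a STORE_GLOBAL after it, so a += 1 is several interpreter instructions rather than one.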
So does the GIL solve the problem of thread safety? No. This is a deep pitfall in Python. In the next post I will write about some of my own experience learning about the GIL.
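As a small taste of why the answer is no, here is a sketch that forces the bad interleaving with real threads. The threading.Barrier is only there to make the thread switch happen at a fixed point so the result is reproducible; the counter and both_have_read names are made up for the example.

```python
import threading

counter = 0
both_have_read = threading.Barrier(2)  # illustrative name for this sketch


def add():
    global counter
    value = counter          # read the shared variable
    both_have_read.wait()    # wait here until both threads have read
    counter = value + 1      # write back a result based on the stale read


threads = [threading.Thread(target=add) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 1, not 2: one increment was lost
```

The GIL only keeps two threads from running bytecode at the same moment. It does nothing to stop a switch between the read and the write of a compound operation.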
Play Python (2) Multithreading history 2