Global Interpreter Lock (GIL)
Reference blogs: https://www.cnblogs.com/mindsbook/archive/2009/10/15/thread-safety-and-GIL.html
https://www.cnblogs.com/MnCu8261/p/6357633.html
http://python.jobbole.com/87743/
I. Preface
On multi-core CPUs, multi-threaded programming has become the standard way to make full use of the hardware, since it lets several tasks run at the same time. In CPython, however, the GIL means that only one thread executes Python code at any given moment. GIL stands for Global Interpreter Lock. It is a true process-wide mutual-exclusion lock in CPython, the mainstream Python implementation, and a thread must hold it whenever the interpreter executes Python code. Although CPython's threading library wraps native operating-system threads directly, only one thread in a CPython process can hold the GIL at a time; all other threads wait for it to be released. As a result, CPython cannot use multiple physical cores to speed up computation within a single process.
The operating system may schedule different threads on different cores, but only one of them is running Python code at any moment.
II. Why does the GIL exist?
2.1 Thread safety
To take advantage of multiple cores we can use either multiple processes or multiple threads. The difference between the two is whether resources are shared: processes are independent, while threads share memory. Compared with processes, the biggest problems in a multi-threaded environment are resource contention, deadlocks, and inconsistent modification of data. This is where thread safety comes in.
Thread safety is a property of code in a multi-threaded environment. Thread-safe code runs correctly when multiple threads execute at the same time: shared data can be used by multiple threads, but only one thread accesses it at a time.
Since resource contention is unavoidable in a multi-threaded environment, how can we ensure that only one thread accesses a shared resource at a time?
Locks make access exclusive, so that only one thread can access the shared data at a time.
Locks usually come in two granularities:
- Fine-grained: the programmer acquires and releases locks manually to ensure thread safety.
- Coarse-grained: the language runtime maintains a global lock to ensure thread safety.
The former approach is typical of Java and Jython; the latter is CPython's.
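The fine-grained approach can be illustrated with a minimal Python sketch, in which every object carries its own lock so that only threads touching the same object contend with each other. The names `Counter`, `hits`, `misses` and `work` are hypothetical and chosen only for this example.

```python
import threading

class Counter(object):
    """Hypothetical counter that carries its own lock (fine-grained)."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only threads updating this particular counter contend here
        with self._lock:
            self.value += 1

hits = Counter()
misses = Counter()

def work():
    for _ in range(1000):
        hits.increment()
        misses.increment()

threads = [threading.Thread(target=work) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(hits.value, misses.value)
```

With a coarse-grained design, a single lock would protect both counters, which is simpler to reason about but makes unrelated updates wait for each other.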
2.2 Characteristics of Python
Simplicity is a very important principle in Python's philosophy, so the GIL is easy to understand in that light. In the 1990s, multi-core CPUs were still close to science fiction. When Guido van Rossum created Python, he could hardly have foreseen that the language would one day run, more often than not, on multi-core CPUs. A single global lock to handle thread safety was probably the simplest and most economical design of that era. A design that is simple and meets the need is an appropriate design (a design is only appropriate or inappropriate, not good or bad).
III. Thread switching
Whenever a thread goes to sleep or waits for network I/O, other threads get a chance to acquire the GIL and execute Python code. This is cooperative multitasking. CPython also supports preemptive multitasking: if a thread runs uninterrupted for 100 bytecode instructions in Python 2, or for 5 milliseconds in Python 3 (the default switch interval), it gives up the GIL so that other threads can run.
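As a quick check, and assuming Python 3.2 or later, the preemption interval can be inspected and changed through the `sys` module. The sketch below reads the current value, changes it temporarily, and restores it; the value 0.001 is arbitrary and only serves as an illustration.

```python
import sys

# Interval (in seconds) after which the interpreter asks the running
# thread to release the GIL. Available in Python 3.2 and later.
original = sys.getswitchinterval()
print("switch interval: %.3f s" % original)

# The interval can be tuned; 0.001 is an arbitrary value for illustration.
sys.setswitchinterval(0.001)
changed = sys.getswitchinterval()

# Restore the previous setting so the rest of the program is unaffected.
sys.setswitchinterval(original)
```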
3.1 Cooperative multitasking
When a thread starts an operation such as network I/O that may take a long or unpredictable time and does not need to run any Python code, it gives up the GIL so that another thread can acquire it and run Python. This polite behaviour is called cooperative multitasking. It allows concurrency: multiple threads can wait for different events at the same time.
    import socket
    import threading

    def do_connect():
        s = socket.socket()
        s.connect(('python.org', 80))  # drop the GIL

    for i in range(2):
        t = threading.Thread(target=do_connect)
        t.start()
Only one of the two threads can execute Python at a time, but once a thread starts connecting, it gives up the GIL so that the other thread can run. This means the two threads can wait for their socket connections concurrently, which is a good thing: they get more work done in the same amount of time.
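The same effect can be observed without a network connection. The following sketch assumes Python 3 and uses `time.sleep()` as a stand-in for a blocking I/O call, since it also releases the GIL while waiting; the function name `wait_for_io` and the durations are made up for the example.

```python
import threading
import time

def wait_for_io(seconds):
    # time.sleep() releases the GIL while it blocks, much like a socket call
    time.sleep(seconds)

start = time.perf_counter()
threads = [threading.Thread(target=wait_for_io, args=(0.2,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# The four 0.2 s waits overlap, so the total stays well below 4 * 0.2 s
print("elapsed: %.2f s" % elapsed)
```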
3.2 Preemptive multitasking
If a thread never blocks on I/O because it is CPU-intensive, the interpreter makes it give up the GIL after it has run for a certain period, without the consent of the thread that is executing, so that other threads can run. In Python 3 this interval is 5 ms by default.
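A rough way to see the consequence for CPU-bound code is to time the same pure-Python loop run twice in sequence and then in two threads. This is only a sketch, assuming Python 3 on a standard CPython build with the GIL; `count_down` and `N` are hypothetical names, and the timings depend on the machine and interpreter version. The two figures are typically of similar size, because the threads take turns holding the GIL instead of running in parallel.

```python
import threading
import time

def count_down(n):
    # Pure Python loop: it never blocks, so it only gives up the GIL
    # when the interpreter forces a switch
    while n > 0:
        n -= 1

N = 2000000

# Run the loop twice, one after the other
start = time.perf_counter()
count_down(N)
count_down(N)
sequential = time.perf_counter() - start

# Run the same two loops in two threads
start = time.perf_counter()
threads = [threading.Thread(target=count_down, args=(N,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

print("sequential: %.2f s, two threads: %.2f s" % (sequential, threaded))
```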
IV. Thread safety in Python
If a thread can lose the GIL at any moment, you must make your code thread-safe. However, Python programmers think about thread safety differently from C or Java programmers, because many Python operations are atomic.
An example of an atomic operation is calling sort() on a list. The thread cannot be interrupted during sorting, so other threads never see a partially sorted list, nor stale data from before the sort. Atomic operations simplify our lives, but there are surprises. For example, += looks simpler than sort(), yet += is not an atomic operation.
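One way to see why += is not atomic is to look at the bytecode it compiles to. The sketch below assumes Python 3.4 or later (for `dis.get_instructions`); the exact instruction names vary between versions, so it only checks that loading n and storing n are separate instructions with other work in between.

```python
import dis

n = 0

def add_num():
    global n
    n += 1

# Collect the names of the bytecode instructions that make up add_num()
opnames = [ins.opname for ins in dis.get_instructions(add_num)]
print(opnames)

# Reading n and writing n back are distinct instructions, which is why
# the statement as a whole is not a single atomic step
load_index = opnames.index('LOAD_GLOBAL')
store_index = opnames.index('STORE_GLOBAL')
print("instructions between load and store:", store_index - load_index - 1)
```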
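In Python 2 (in Python 3 the result is usually correct, because a switch is far less likely to land in the middle of n += 1, but the code is still not thread-safe):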
    # -*- coding: UTF-8 -*-
    import time
    import threading

    n = 0

    def add_num():
        global n
        time.sleep(1)
        n += 1

    if __name__ == '__main__':
        thread_list = []
        for i in range(100):
            t = threading.Thread(target=add_num)
            t.start()
            thread_list.append(t)
        for t in thread_list:
            t.join()
        print 'final num:', n
Output:
    [root@MySQL ~]# python mutex.py
    final num: 98
    [root@MySQL ~]# python mutex.py
    final num: 100
    [root@MySQL ~]# python mutex.py
    final num: 96
    [root@MySQL ~]# python mutex.py
    final num: 99
    [root@MySQL ~]# python mutex.py
    final num: 100
The expected result is 100, but the actual result is not always 100.
The reason is that a thread switch happens while n += 1 is running, and the running thread loses the GIL. Suppose thread A reads n = 43 and loses the GIL before n += 1 has completed. Thread B then acquires the GIL and also reads n = 43; when B finishes, n = 44. Thread A then gets the GIL back, resumes, and completes its own operation with the stale value, again writing n = 44. One increment is lost, so the final result is too low.
In other words, the thread loses the GIL halfway through n += 1 and only later gets it back.
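Because the real race depends on timing, it is hard to reproduce on demand. The sketch below forces the interleaving described above with two `threading.Event` objects, so the lost update happens on every run. The events and the function names `thread_a` and `thread_b` are illustrative; they stand in for the interpreter's thread switch and are not how CPython itself schedules threads.

```python
import threading

n = 43
a_has_read = threading.Event()
b_has_written = threading.Event()

def thread_a():
    global n
    value = n               # A reads n == 43
    a_has_read.set()        # A is "interrupted" here, B may run
    b_has_written.wait()    # A resumes only after B has finished
    n = value + 1           # A writes 44, overwriting B's update

def thread_b():
    global n
    a_has_read.wait()       # B runs after A has read n
    n = n + 1               # B reads 43 and writes 44
    b_has_written.set()

a = threading.Thread(target=thread_a)
b = threading.Thread(target=thread_b)
a.start()
b.start()
a.join()
b.join()

print("n =", n)  # two increments were performed, but n is 44, not 45
```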
V. Mutex lock
How can we eliminate this deviation and guarantee a correct result? What we need is for each update to run to completion: once a thread starts working on the shared data, no other thread may touch that data until the thread has finished. How can we achieve this? By putting a lock around that part of the program, so that it always runs in full.
    # -*- coding: UTF-8 -*-
    import time
    import threading

    n = 0
    lock = threading.Lock()  # create a lock instance

    def add_num():
        global n
        with lock:  # acquire the lock; it is released on leaving the block
            n += 1

    if __name__ == '__main__':
        thread_list = []
        for i in range(100):
            t = threading.Thread(target=add_num)
            t.start()
            thread_list.append(t)
        for t in thread_list:
            t.join()  # the main thread waits for all threads to finish
        print 'final num:', n
Note: code that runs while holding the lock is executed serially. Therefore the locked section should not sleep and should not process large amounts of data; otherwise efficiency suffers.
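One way to follow this advice is to keep slow work outside the lock and protect only the update of the shared variable. The following is a minimal sketch under that assumption, written for Python 3; the thread count and the 0.05 s sleep are arbitrary values standing in for real work.

```python
import threading
import time

n = 0
lock = threading.Lock()

def add_num():
    global n
    time.sleep(0.05)   # slow work stays outside the lock and can overlap
    with lock:         # only the shared update is serialized
        n += 1

start = time.perf_counter()
threads = [threading.Thread(target=add_num) for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

print("final num: %d, elapsed: %.2f s" % (n, elapsed))
```

If the sleep were moved inside the `with lock:` block, the twenty threads would wait one after another and the run would take roughly twenty times as long.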