For more than ten years, nothing has frustrated or intrigued Python novices and experts alike more than the Global Interpreter Lock (GIL).
An open problem
Open problems are everywhere. Some are so difficult and time-consuming that merely attempting them is remarkable. They were once tackled by the whole community, but are now pursued only at the fringes of developer effort. Novices are drawn to such problems mainly because they are big enough that solving one would bring considerable honor. The unresolved P versus NP question in computer science is one example: a polynomial-time solution to an NP-complete problem would change the world. Python's hardest problem is easier than proving P = NP, but it still has no satisfactory solution, and a practical solution would be just as transformative for the language. That is why so many people in the Python community keep coming back to the same question: "What can be done about the Global Interpreter Lock?"
Python under the hood
To understand what the GIL means, we need to start with the basics of Python. A language such as C++ is a compiled language: the program is fed to a compiler, which parses it according to the language's grammar, translates it into a language-independent intermediate representation, and finally links it into a highly optimized machine-code executable. The compiler can optimize the code deeply because it sees the entire program (or a large self-contained chunk of it). That lets it reason about how different language constructs interact and choose more effective optimizations.
In contrast, Python is an interpreted language. The program is fed to an interpreter to be run. The interpreter knows nothing about the program before it executes; it knows only the rules of Python and how to apply them dynamically during execution. It does perform some optimizations, but they are of an entirely different kind. Because the interpreter cannot reason much about the program, most of Python's optimizations are really optimizations of the interpreter itself. A faster interpreter means that programs get faster "for free". In other words, when the interpreter is optimized, Python programs enjoy the benefit without any changes.
This point is important enough to repeat. All else being equal, the execution speed of a Python program is directly tied to the "speed" of the interpreter. No matter how much you optimize your program, how fast it runs still depends on how efficiently the interpreter executes it. This is why so much work goes into optimizing the Python interpreter. For Python programmers, it is probably the closest thing there is to a free lunch.
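To make "the interpreter executes your program" a little more concrete, here is a minimal sketch using the standard-library `dis` module, which lists the bytecode instructions CPython runs for a function. The function `add_one` and the helper `opnames` are hypothetical names chosen for illustration, and the exact instruction names vary between Python versions.

```python
import dis


def add_one(x):
    """A tiny function whose bytecode we want to inspect."""
    return x + 1


def opnames(func):
    """Return the names of the bytecode instructions compiled for func."""
    return [instruction.opname for instruction in dis.get_instructions(func)]


if __name__ == "__main__":
    # Print a human-readable listing of the instructions the interpreter
    # steps through when add_one is called.
    dis.dis(add_one)
    print(opnames(add_one))
```

Each listed instruction is dispatched by the interpreter loop at run time, which is why speeding up that loop speeds up every program.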
The free lunch is over.
Or is it? Moore's law promised steady growth in hardware speed on a predictable schedule, and an entire generation of programmers learned to code in that era. If someone wrote slow code, the simplest fix was usually to wait for a faster processor. Moore's law still holds and will likely continue to for some time, but the way it delivers has changed fundamentally. Clock frequencies are no longer climbing to ever higher speeds; instead, the growing transistor density is being spent on multiple cores. Programs must be rewritten to run concurrently if they are to take full advantage of the new processors.
When most developers hear "concurrency", they immediately think of multithreaded programs. For now, multithreaded execution is the most common way to take advantage of multicore systems. Although multithreaded programming is a step beyond "sequential" programming, even a careful programmer rarely gets concurrent code exactly right. Programming languages should do better in this regard, and most widely used modern languages support multithreaded programming.
A fact discovered by accident
Now we come to the crux of the problem. To take advantage of multicore systems, Python must support multithreading. As an interpreted language, Python needs an interpreter that is both safe and efficient. We all know the problems multithreaded programming can run into. The interpreter must take care with data that is shared between threads. It must also ensure that, while managing user threads, as much of the available computing power as possible is always in use.
So how is data protected when different threads access it at the same time? The answer is the Global Interpreter Lock. The name says a lot: it is a global lock (from the interpreter's point of view), in the sense of a mutex or something similar, placed on the interpreter. This approach is certainly safe, but it has an implication that every Python beginner needs to know: in any Python program, no matter how many processors are available, only one thread is executing at any given moment.
Many people discover this fact by accident. Discussion groups and message boards across the web are full of questions like this one, from Python beginners and experts alike: "Why does my new multithreaded Python program run slower than it did with a single thread?" Many people are genuinely puzzled when they ask it, because a program with two threads should obviously be faster than the same program with one (assuming the work really can be done in parallel). In fact, the question is asked so often that Python experts have crafted a standard answer: "Don't use multiple threads, use multiple processes." But that answer is even more confusing than the question. Can't I use multithreading in Python? How bad can multithreading be in a language as popular as Python, if even the experts recommend against it? Am I missing something?
Unfortunately, nothing has been missed. Because of the way the Python interpreter is designed, using multithreading to improve performance is genuinely difficult. In the worst case, it will slow your program down, sometimes noticeably. Any first-year computer science student can tell you what happens when multiple threads compete for a shared resource, and the result is usually not pretty. That said, multithreading does work well in many cases, so the complaints about Python's multithreaded performance may seem overstated to interpreter implementers and core developers.
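The situation behind that question can be reproduced with a short experiment. The sketch below, assuming a standard CPython build with the GIL, times the same CPU-bound work run sequentially and then split across two threads; the function names and loop size are illustrative choices. No timings are shown here because they depend on the machine and the interpreter version.

```python
import threading
import time


def count_down(n):
    """Pure-Python CPU-bound loop: no I/O, no sleeping."""
    while n > 0:
        n -= 1


def run_sequential(n, repeats=2):
    """Run count_down `repeats` times, one after another; return seconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        count_down(n)
    return time.perf_counter() - start


def run_threaded(n, repeats=2):
    """Run count_down in `repeats` threads at once; return seconds."""
    threads = [
        threading.Thread(target=count_down, args=(n,)) for _ in range(repeats)
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 2_000_000
    print(f"sequential: {run_sequential(n):.3f}s")
    print(f"threaded:   {run_threaded(n):.3f}s")
```

On an interpreter with a GIL, the threaded run is typically no faster than the sequential one, because only one thread can execute Python bytecode at a time.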
So what now? Panic?
So what can be done? Can the problem be solved? Must we, as Python developers, give up on using multithreading to exploit parallelism? Why does the GIL need to guarantee that only one thread runs at a time anyway? Couldn't fine-grained locks be added to guard simultaneous access to individual objects instead? And why has nobody tried something like that before?
These practical questions have very interesting answers. The GIL protects access to things such as the current thread state and the heap-allocated objects tracked for garbage collection. Nothing about the Python language itself requires a GIL, however. It is an artifact of this particular implementation. There are other Python interpreters (and compilers) that do not use a GIL. CPython, however, has had one almost since its inception.
So why not ditch the GIL? Many people may not know that this was tried in 1999, for Python 1.5, in the often mentioned but little understood "free threading" patch by Greg Stein. The patch removed the GIL completely and replaced it with fine-grained locks. Removing the GIL, however, came at a cost to single-threaded execution: with a single thread, programs ran approximately 40% slower. Two threads did show a speedup, but beyond that the gain did not scale linearly with the number of cores. Because of the slowdown, the patch was rejected and almost forgotten.
Removing the GIL is hard, let's go shopping!
(Translator's note: "XXX is hard. Let's go shopping!" is an English internet catchphrase, similar to the Chinese "roaring style" meme. It implies that when something is too hard to accomplish yourself, you may as well settle for an off-the-shelf substitute.)
The "free threading" patch is instructive, however, because it proves a basic point about the Python interpreter: removing the GIL is very difficult. Since the patch was released, the interpreter has come to depend on even more global state, which makes removing today's GIL harder still. It is worth noting that this is exactly why many people are so interested in trying to remove it. Difficult problems tend to be interesting.
But this may be a little misguided. Consider what would happen if we had a magic patch that removed the GIL with no performance penalty for single-threaded Python code. We would get what we always wanted: a threading API that could use all of the processors at the same time. So now that we have what we want, is it really a good thing?
Thread-based programming is undoubtedly difficult. Whenever someone feels they know everything about how threads work, some new problem quietly turns up. Because correctness is so hard to achieve in this area, a number of well-known language designers and researchers have weighed in on threading models. As anyone who has written a multithreaded application can tell you, both developing and debugging one is several times harder than for a single-threaded application. The sequential way programmers usually think simply does not match the parallel way the program executes. The GIL has, inadvertently, helped keep developers out of trouble. Although synchronization primitives are still needed when using multiple threads, the GIL does in fact help maintain data consistency between threads.
So it now seems that Python's hardest problem is a bit of the wrong question. Python experts have very good reasons for recommending multiple processes over multiple threads; they are not trying to cover up the shortcomings of Python's thread implementation. Rather, they are encouraging developers to use safer and more straightforward concurrency models, and to reserve multithreading for cases where it is truly necessary. For most people, it may not be clear which parallel programming model is best. But for now we do know that multithreading may not be it.
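As a small illustration of why synchronization primitives are still needed even with a GIL, the sketch below guards a shared counter with `threading.Lock`. The `Counter` class and `run` helper are hypothetical names for this example. A statement such as `value += 1` compiles to several bytecode instructions, so the interpreter may switch threads in the middle of it; the explicit lock makes the update safe regardless.

```python
import threading


class Counter:
    """A shared counter whose updates are protected by an explicit lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Read-modify-write is not guaranteed to be atomic, so hold the
        # lock for the whole update.
        with self._lock:
            self.value += 1


def run(num_threads=4, increments=10_000):
    """Increment one shared counter from several threads; return the total."""
    counter = Counter()

    def worker():
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value


if __name__ == "__main__":
    print(run())
```

With the lock in place, the final value always equals `num_threads * increments`.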
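The "use multiple processes" advice can be sketched with the standard-library `concurrent.futures.ProcessPoolExecutor`. Each worker process has its own interpreter and its own GIL, so CPU-bound work can run on several cores at once. The prime-counting function is only a stand-in workload chosen for illustration, and the function names are assumptions, not anything from the original article.

```python
from concurrent.futures import ProcessPoolExecutor


def count_primes(limit):
    """Count the primes below `limit` by trial division (CPU-bound)."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def count_primes_in_processes(limits, workers=2):
    """Evaluate count_primes for each limit in a pool of worker processes."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(count_primes, limits))


if __name__ == "__main__":
    # The __main__ guard matters: platforms that spawn a fresh interpreter
    # for each worker re-import this module on start-up.
    print(count_primes_in_processes([20_000, 20_000]))
```

The trade-off is that arguments and results must be pickled and sent between processes, so this model suits coarse-grained tasks better than fine-grained sharing.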
As for the GIL, don't assume it just sits there, static and unexamined. Antoine Pitrou implemented a new GIL in Python 3.2, with some positive results. It was one of the most important changes to the GIL since 1992. The change is too large to explain fully here, but at a high level, the old GIL decided when to give up the lock by counting Python instructions. The trouble is that a single Python instruction can involve a great deal of work, since instructions do not translate 1:1 into machine instructions. The new GIL instead uses a fixed timeout to tell the current thread to give up the lock. The current thread holds the lock, and when a second thread requests it, the current thread is forced to release it after 5 ms (that is, the current thread checks every 5 ms whether it needs to release the lock). This makes switching between threads more predictable.
It is not a perfect change, however. David Beazley is probably the most active researcher on how well the GIL performs across different kinds of workloads. Besides producing the most thorough study of the GIL before Python 3.2, he has also examined the new GIL implementation and found a number of interesting scenarios in which even the new GIL performs quite badly. He continues to lead and advance the GIL discussion through practical research and published experimental results.
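In Python 3.2 and later, the timeout described above is exposed as the "switch interval" through `sys.getswitchinterval()` and `sys.setswitchinterval()`, both measured in seconds. The sketch below temporarily changes it around a call; the helper name `with_switch_interval` is a hypothetical one chosen for illustration.

```python
import sys


def with_switch_interval(seconds, func, *args):
    """Call func(*args) with a temporary thread switch interval.

    The previous interval is restored afterwards, even if func raises.
    """
    previous = sys.getswitchinterval()
    sys.setswitchinterval(seconds)
    try:
        return func(*args)
    finally:
        sys.setswitchinterval(previous)


if __name__ == "__main__":
    print("current interval:", sys.getswitchinterval())
    # Read the interval back while a 10 ms setting is in effect.
    print("inside helper:", with_switch_interval(0.01, sys.getswitchinterval))
    print("restored:", sys.getswitchinterval())
```

A shorter interval makes threads hand over the lock more often, at the price of more switching overhead; the default is rarely worth changing without measurements.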
However one feels about Python's GIL, it remains the most difficult technical challenge in the Python language. Understanding its implementation requires a thorough grasp of operating system design, multithreaded programming, C, interpreter design, and the implementation of the CPython interpreter. Those prerequisites alone keep many developers from studying the GIL in depth. Nonetheless, there is no sign that the GIL will be leaving us any time soon. For now, it will continue to confuse and surprise those who are new to Python, and to attract those who are interested in solving very hard technical problems.
The above is based on my research into the Python interpreter so far. While I would like to write about other aspects of the interpreter, none is better known than the Global Interpreter Lock (GIL). Some of the content here may be inaccurate, since these technical details were gathered from many different sources on CPython. If you find something inaccurate, please let me know promptly so that I can correct it as soon as possible.
The most difficult question for Python