What Is Python's Global Interpreter Lock (GIL)?

Source: Internet
Author: User
Tags: garbage collection, mutex

What problem did the GIL solve for Python?

Why was the GIL chosen as the solution?

Impact on multi-threaded Python programs

Why hasn't the GIL been removed yet?

Why wasn't the GIL removed in Python 3?

How do I deal with the GIL in Python?

The Python Global Interpreter Lock (GIL) is, in simple terms, a mutex (or lock) that allows only one thread to hold control of the Python interpreter.

This means that only one thread can be executing at any point in time. The GIL has no noticeable impact on single-threaded programs, but it can become a performance bottleneck in CPU-bound, multi-threaded code.

Because the GIL allows only one thread to execute at a time, even in a multi-threaded architecture with more than one CPU core, it has earned a reputation as a "notorious" feature of Python.

In this article, you will learn how the GIL affects the performance of your Python programs and how you can mitigate its impact on your code.

What problem did the GIL solve for Python?

Python uses reference counting for memory management. This means that every object created in Python has a reference count variable that tracks the number of references pointing to that object. When this count reaches zero, the memory occupied by the object is freed.

Let's look at how reference counting works with a brief code example:
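A minimal sketch of such an example, assuming CPython (the names a and b are chosen to match the discussion that follows). The exact number returned by sys.getrefcount() can differ between CPython versions, so treat the printed value as indicative rather than guaranteed:

```python
import sys

a = []   # create an empty list and bind it to the name a
b = a    # bind a second name to the same list object

# getrefcount() temporarily holds one extra reference to its argument,
# so the reported count is usually one higher than you might expect.
count = sys.getrefcount(a)
print(count)
```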

In the above example, the reference count for the empty list object [] was 3. The list object was referenced by a, b, and the argument passed to sys.getrefcount().

Back to the GIL:

The problem is that this reference count variable needs protection from race conditions in which two threads increase or decrease its value at the same time. If that happens, it can cause either leaked memory that is never released or, even worse, memory that is released while a reference to the object still exists. This can cause crashes or other "weird" bugs in your Python programs.

The reference count variable can be kept safe by adding locks to all data structures that are shared across threads, so that they are not modified inconsistently.
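The idea of guarding a shared counter with a lock can be illustrated at the Python level. This is only an analogy for what an interpreter would have to do around reference counts, not CPython's actual implementation; SharedCounter and run_workers are hypothetical names introduced for this sketch:

```python
import threading

class SharedCounter:
    """A counter whose updates are protected by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The read-modify-write below happens while holding the lock,
        # so two threads can never interleave inside it.
        with self._lock:
            self.value += 1

def run_workers(n_threads, increments_each):
    counter = SharedCounter()

    def work():
        for _ in range(increments_each):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value

print(run_workers(4, 10_000))  # 40000: no increment is lost
```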

However, adding a lock to each object or group of objects means that multiple locks exist, which can cause another problem: deadlocks (which can only happen when there is more than one lock). Another side effect is the performance loss caused by repeatedly acquiring and releasing locks.
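A small sketch of how two locks taken in opposite order get in each other's way. To keep the demonstration from hanging forever, each thread gives up after a timeout instead of blocking indefinitely; the function and variable names are hypothetical, chosen only for this illustration:

```python
import threading

def demonstrate_lock_ordering_conflict(timeout=0.2):
    lock_a = threading.Lock()
    lock_b = threading.Lock()
    both_hold_first = threading.Barrier(2)  # both threads hold their first lock
    both_tried = threading.Barrier(2)       # both threads finished their attempt
    results = {}

    def worker(name, first, second):
        with first:
            both_hold_first.wait()
            # The other thread holds `second` and is waiting for `first`.
            got_second = second.acquire(timeout=timeout)
            results[name] = got_second
            both_tried.wait()
            if got_second:
                second.release()

    t1 = threading.Thread(target=worker, args=("t1", lock_a, lock_b))
    t2 = threading.Thread(target=worker, args=("t2", lock_b, lock_a))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return results

# Neither thread obtains its second lock. Without the timeout,
# both would wait on each other forever, which is a deadlock.
print(demonstrate_lock_ordering_conflict())
```

With a single lock, this circular wait cannot arise.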

The GIL is a single lock on the interpreter itself, which adds a rule that execution of any Python bytecode requires acquiring the interpreter lock. This prevents deadlocks (as there is only one lock) and does not introduce much performance overhead. But it effectively makes any CPU-bound Python program single-threaded.

The GIL is also used by interpreters for other languages (such as Ruby), but it is not the only solution to this problem. Some languages avoid the need for a GIL for thread-safe memory management by using approaches other than reference counting, such as garbage collection.

On the other hand, this means that those languages often have to compensate for the loss of the GIL's single-threaded performance benefits by adding other performance-boosting features, such as JIT compilers.

Why was the GIL chosen as the solution?

So why was an approach that seems so obstructive used in Python? Was it a bad decision by the developers of Python?

In the words of Larry Hastings, the design decision of the GIL is one of the things that made Python as popular as it is today.

Python has been around since the days when operating systems did not have a concept of threads. Python was designed to be easy to use in order to make development quicker, and more and more developers started using it.

Many extensions were written for existing C libraries whose features were needed in Python. To prevent inconsistent changes, these C extensions required thread-safe memory management, which the GIL provided.

The GIL is simple to implement and was easily added to Python. It provides a performance increase for single-threaded programs, as only one lock needs to be managed.

C libraries that were not thread-safe became easier to integrate, and these C extensions became one of the reasons why Python was readily adopted by different communities.

As you can see, the GIL was a pragmatic solution to a difficult problem that the CPython developers faced early in Python's life.

Impact on multi-threaded Python programs

When you look at a typical Python program, or any computer program for that matter, there is a difference between those that are CPU-bound in their performance and those that are I/O-bound.

CPU-bound programs are those that push the CPU to its limit. This includes programs that do mathematical computations such as matrix multiplication, searching, image processing, and so on.

I/O-bound programs are those that spend time waiting for input and output, which can come from a user, a file, a database, a network, and so on. I/O-bound programs sometimes have to wait a significant amount of time until they get what they need from the source, because the source may need to do its own processing before the input or output is ready. Examples are a user thinking about what to enter at an input prompt, or a database query running in its own process.

Let's take a look at a simple CPU-bound program that performs a countdown:
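A sketch of what such a program could look like. COUNT is an arbitrary workload size chosen here for illustration, and timed_countdown is a hypothetical helper; the measured time depends entirely on your machine:

```python
import time

COUNT = 10_000_000  # illustrative workload size (an assumption, tune as needed)

def countdown(n):
    """Keep the CPU busy by counting down to zero."""
    while n > 0:
        n -= 1
    return n

def timed_countdown(total):
    """Run the whole countdown in the calling thread and return the elapsed time."""
    start = time.perf_counter()
    countdown(total)
    return time.perf_counter() - start

if __name__ == "__main__":
    print("Time taken in seconds -", timed_countdown(COUNT))
```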

Running this code on my 4-core system gave the following output:

Next, I tweaked the code slightly so that two threads share the same countdown:
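A sketch of the two-thread variant, under the same assumptions as before: COUNT is an illustrative value and timed_two_threads is a hypothetical helper. Each thread is given half of the total work:

```python
import time
from threading import Thread

COUNT = 10_000_000  # illustrative workload size (an assumption, tune as needed)

def countdown(n):
    """Keep the CPU busy by counting down to zero."""
    while n > 0:
        n -= 1

def timed_two_threads(total):
    """Split the countdown across two threads and return the elapsed time."""
    t1 = Thread(target=countdown, args=(total // 2,))
    t2 = Thread(target=countdown, args=(total // 2,))
    start = time.perf_counter()
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return time.perf_counter() - start

if __name__ == "__main__":
    print("Time taken in seconds -", timed_two_threads(COUNT))
```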

And when I ran it again:

As you can see, both versions take almost the same amount of time to finish. In the multi-threaded version, the GIL prevented the CPU-bound threads from executing in parallel.

The GIL does not have much impact on the performance of I/O-bound multi-threaded programs, because the lock is shared between threads while they are waiting for I/O.
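This can be illustrated with a small sketch in which time.sleep() stands in for a blocking I/O call (sleeping releases the GIL, much like waiting on a socket or a file would). The names fake_io and timed_io_threads are hypothetical:

```python
import time
from threading import Thread

def fake_io(seconds):
    # Stand-in for a blocking read or network call; the GIL is released
    # while the thread sleeps, so other threads can run in the meantime.
    time.sleep(seconds)

def timed_io_threads(n_threads, seconds):
    """Run n_threads simulated I/O waits concurrently and return the elapsed time."""
    threads = [Thread(target=fake_io, args=(seconds,)) for _ in range(n_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

if __name__ == "__main__":
    # Two 0.5 s waits overlap, so the total is close to 0.5 s rather than 1 s.
    print("Time taken in seconds -", timed_io_threads(2, 0.5))
```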

However, a program whose threads are entirely CPU-bound (for example, one that processes an image in parts using threads) will not only become effectively single-threaded because of the lock, but can also take longer to run than a version written to be fully single-threaded, as in the previous example.

This increase in execution time is the result of the acquire and release overhead added by the lock.

Why hasn't the GIL been removed yet?

The developers of Python receive a lot of complaints about this, but a language as popular as Python cannot make a change as significant as removing the GIL without causing backward incompatibility issues.

The GIL can obviously be removed, and this has been done multiple times in the past by developers and researchers. But all those attempts broke the existing C extensions, which depend heavily on the solution that the GIL provides.

There are, of course, other solutions to the problem that the GIL solves, but some of them decrease the performance of single-threaded and multi-threaded I/O-bound programs, and others are simply too difficult. After all, you would not want your existing Python programs to run slower after a new version comes out.

Guido van Rossum, the creator and BDFL of Python, gave an answer to the community in his September 2007 article "It isn't Easy to Remove the GIL":

"I'd welcome a set of patches into Py3k only if the performance for a single-threaded program (and for a multi-threaded but I/O-bound program) does not decrease."

None of the attempts made since has fulfilled this condition.

Why wasn't the GIL removed in Python 3?

Python 3 did have a chance to start many features from scratch, and in the process it broke some existing C extensions, which then had to be changed and updated to work with Python 3. This is why early versions of Python 3 saw slower adoption by the community.

But why wasn't the GIL removed at the same time?

Removing the GIL would have made Python 3 slower than Python 2 in single-threaded performance, and you can imagine what that would have resulted in. You cannot argue with the single-threaded performance benefits of the GIL, which is why the GIL is still in Python 3.

But Python 3 did bring a major improvement to the existing GIL.

So far we have discussed the impact of the GIL on programs that are only CPU-bound or only I/O-bound, but what about programs in which some threads are I/O-bound and others are CPU-bound?

In such programs, Python's GIL was known to starve the I/O-bound threads by not giving them a chance to acquire the GIL from the CPU-bound threads.

This was because of a mechanism built into Python that forced threads to release the GIL after a fixed interval of continuous use; if nobody else acquired the GIL, the same thread could continue to use it.

The problem with this mechanism was that, most of the time, the CPU-bound thread would reacquire the GIL itself before any other thread could acquire it. This behavior was researched by David Beazley, who has published visualizations of it.

Antoine Pitrou fixed this problem in Python 3.2 in 2009. He added a mechanism that looks at the number of GIL acquisition requests from other threads that were dropped, and does not allow the current thread to reacquire the GIL before the other threads have had a chance to run.
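Since Python 3.2, the interval after which a running thread is asked to give up the GIL is exposed through sys.getswitchinterval() and sys.setswitchinterval(). A minimal sketch for inspecting it and changing it temporarily; with_switch_interval is a hypothetical helper written for this example, not part of the standard library:

```python
import sys

# Interval in seconds after which the interpreter asks the running
# thread to release the GIL so that another thread can be scheduled.
interval = sys.getswitchinterval()
print("switch interval in seconds:", interval)

def with_switch_interval(seconds, func):
    """Call func() with a temporary switch interval, then restore the old one."""
    old = sys.getswitchinterval()
    sys.setswitchinterval(seconds)
    try:
        return func()
    finally:
        sys.setswitchinterval(old)

# Read the value back while the temporary setting is active.
print(with_switch_interval(0.001, sys.getswitchinterval))
```

Tuning this value rarely helps CPU-bound code; it mainly trades responsiveness of other threads against switching overhead.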

How do I deal with the GIL in Python?

If the GIL is causing you problems, here are a few approaches you can try:

Multi-processing vs multi-threading: The most popular approach is to use multiple processes instead of multiple threads. Each Python process gets its own Python interpreter and memory space, so the GIL is no longer a problem. Python has a multiprocessing module that lets us create processes easily:
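A sketch of the countdown split across processes, assuming a pool of two workers. COUNT and run_in_processes are illustrative names, and the timing you see will depend on your machine and on how your platform starts processes:

```python
import time
from multiprocessing import Pool

COUNT = 10_000_000  # illustrative workload size (an assumption, tune as needed)

def countdown(n):
    """Keep the CPU busy by counting down to zero."""
    while n > 0:
        n -= 1
    return n

def run_in_processes(total, workers=2):
    """Split the countdown across worker processes and return the elapsed time."""
    chunks = [total // workers] * workers
    with Pool(processes=workers) as pool:
        start = time.perf_counter()
        pool.map(countdown, chunks)  # each chunk runs in its own interpreter
        return time.perf_counter() - start

# The guard is required on platforms that start workers by re-importing
# this module (for example Windows and macOS).
if __name__ == "__main__":
    print("Time taken in seconds -", run_in_processes(COUNT))
```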

Running this on my system gave the following output:

Performance improved compared to the multi-threaded version.

However, the time did not drop to half of what we saw above, because process management has its own overhead. Multiple processes are heavier than multiple threads, so keep in mind that this can become a scaling bottleneck.

Alternative Python interpreters: Python has several interpreter implementations. CPython, Jython, IronPython, and PyPy, written in C, Java, C#, and Python respectively, are the most popular ones. The GIL discussed here belongs to CPython, the original Python implementation (PyPy also has a GIL of its own, while Jython and IronPython do not). If your program and its libraries are available for one of the other implementations, you can try them out as well.
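To find out which implementation is running your code, the standard library's platform module can be queried. A minimal sketch; describe_interpreter is a hypothetical helper name:

```python
import platform
import sys

def describe_interpreter():
    """Return the implementation name and version of the running interpreter."""
    # python_implementation() returns names such as "CPython", "PyPy",
    # "Jython", or "IronPython".
    impl = platform.python_implementation()
    version = ".".join(str(part) for part in sys.version_info[:3])
    return f"{impl} {version}"

print(describe_interpreter())
```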

Just wait it out: Many Python users benefit from the single-threaded performance advantage of the GIL. Multi-threading programmers need not fret, though, because some of the brightest minds in the Python community are working to remove the GIL from CPython. One such attempt is known as the Gilectomy.

The Python GIL is often regarded as a mysterious and difficult topic. But keep in mind that, as a Pythonista, you are usually only affected by it if you are writing C extensions or if you use CPU-bound multi-threading in your programs.

In that case, this article should give you everything you need to understand what the GIL is and how to deal with it in your own projects. If you want to understand the low-level inner workings of the GIL, I recommend watching David Beazley's talk "Understanding the Python GIL".

What the Python GIL Is, and How Multithreaded Performance Really Fares

When first learning Python, the author often heard the word GIL and found that it was frequently equated with Python's inability to implement multithreading efficiently. Wanting to understand not only the what but also the why, the author gathered material from many sources, spent a few hours of spare time over a week studying the GIL in depth, and summarized the results in this article, in the hope that readers will come away with a better and more objective understanding of the GIL.

Reprints of this article are welcome, but please retain this paragraph and place it at the top of the article. Author: Lu Junyi (Cenalulu). Article address: http://cenalulu.github.io/python/gil-in-python/

What is the GIL?

The first thing to make clear is that the GIL is not a feature of the Python language; it is a concept introduced by one implementation of the Python interpreter (CPython). Compare C++, which is a language (syntax) standard that can be compiled into executable code by different compilers, such as the well-known GCC, Intel C++, and Visual C++. Python is the same: one piece of code can be executed by different Python execution environments such as CPython, PyPy, and Psyco. Jython, for example, has no GIL. However, because CPython is the default Python execution environment in most settings, many people equate CPython with Python and take it for granted that the GIL is a defect of the Python language. So let's be clear here: the GIL is not a feature of Python, and Python can exist entirely without the GIL.

So what is the GIL in the CPython implementation? GIL stands for Global Interpreter Lock. To avoid being misleading, let's look at the official explanation:

In CPython, the global interpreter lock, or GIL, is a mutex that prevents multiple native threads from executing Python bytecodes at once. This lock is necessary mainly because CPython's memory management is not thread-safe. (However, since the GIL exists, other features have grown to depend on the guarantees that it enforces.)

Does that look bad? A mutex that prevents multiple threads from executing bytecode concurrently looks, at first glance, like a global lock that is practically a bug. Don't worry; we will analyze it step by step below.

Why does the GIL exist?

Because of physical constraints, the CPU vendors' race on core clock frequency has given way to multicore designs. To make more effective use of multicore processors, multi-threaded programming emerged, and with it came the difficulty of keeping data consistent and state synchronized between threads. Even the caches inside a CPU are no exception: vendors have spent a great deal of effort on keeping data synchronized across multiple caches, and some performance loss was still unavoidable.

Python could not escape this either. To take advantage of multiple cores, Python began to support multithreading, and the simplest way to solve data integrity and state synchronization between threads is to add a lock. Hence the GIL, one very large lock. As more and more library developers accepted this design, they began to rely heavily on it (that is, they assumed that Python's internal objects are thread-safe by default, so that no extra memory locks or synchronization had to be considered in their implementations).

Over time, this design turned out to be painful and inefficient. But when people tried to split up or remove the GIL, they found that a large amount of library code depended on it so heavily that removal was very difficult. How difficult? As an analogy, consider a "small project" such as MySQL: splitting the large Buffer Pool Mutex into smaller locks took nearly 5 years across the major versions 5.5, 5.6, and 5.7, and the work is still continuing. If this is so hard for MySQL, a product backed by a company with a dedicated development team, it is all the harder for a highly community-driven group of core developers and code contributors such as Python's.

So, simply put, the GIL exists largely for historical reasons. If it were done over again, the multithreading problem would still have to be faced, but it would at least be handled more elegantly than with the current GIL.
