Unladen Swallow Project Plan: Increase Python Speed by Five Times

Source: Internet
Author: User
Note: judging from the two released versions, the performance improvement is only about 10%, far short of the stated target...

--------------------------------------


Source: Linux Forum. Date: 2009.03.29
 
Google's Python engineers have announced a new project to accelerate Python by at least five times. The project, named Unladen Swallow, aims to build a new Python virtual machine with a new JIT compilation engine. The first-quarter goal, a performance improvement of 25-35%, has already been met, and the code is published on the Google Code website. For more information, see below.
Goals

We want to make Python faster, and at the same time we want Unladen Swallow to be usable by large, well-established applications without pain.

1. Produce a version of Python that is at least five times faster than CPython.
2. Python application performance should be highly stable.
3. Maintain source-level compatibility with Python applications.
4. Maintain source-level compatibility with CPython extension modules.
5. We do not want to maintain a Python implementation forever; we regard this project as a development branch, not a fork.

Project Overview

To achieve our performance and compatibility goals, we chose to modify CPython rather than build a new implementation from scratch. Notably, we chose to start from CPython 2.6.1: Python 2.6 sits between 2.4/2.5 (which most valuable applications currently use) and Python 3.0 (the eventual future). Starting from a CPython release means we avoid re-implementing a large number of built-in functions, objects, and standard-library modules, and we can reuse the existing, widely used CPython C extension API. Starting from 2.6 also makes it easier to migrate existing applications from a 2.x CPython; starting from 3.x and requiring the maintainers of large applications to port their programs first would be impractical for our intended audience.

Our main task is to improve the execution speed of Python code itself, rather than spending much effort on the Python runtime library. Our long-term plan is to replace the traditional CPython virtual machine with a JIT built on LLVM, while minimizing the impact on the rest of the Python runtime. We have observed that Python applications spend a large share of their running time in the main eval loop. In particular, even minor adjustments to virtual-machine components such as opcode dispatch have a measurable impact on Python performance. We believe that compiling Python code to machine code with LLVM's JIT engine will bring even greater benefits.
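As a small illustration of what the eval loop dispatches on, the standard-library dis module can list a function's bytecode. (The exact opcode names vary between CPython versions; this is only a sketch.)

```python
# Inspect the opcodes that CPython's main eval loop dispatches on.
# Opcode names differ across CPython versions, so we only print them.
import dis

def add(a, b):
    return a + b

ops = [ins.opname for ins in dis.get_instructions(add)]
print(ops)
```

Every one of these opcodes costs a fetch/dispatch step in the interpreter loop, which is the overhead a JIT removes.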

Some notable benefits:

* Moving to a JIT lets us convert Python from a stack-based machine to a register-based machine, a change that has been shown to improve performance in another similar language.
* If nothing else, simply eliminating opcode fetch and dispatch is a win. For more information, see http://bugs.python.org/issue4753.
* The current CPython virtual machine's opcode fetch/dispatch makes further optimization almost impossible. For example, we want to implement type feedback and dynamic recompilation (a la Self-93), but we believe that implementing polymorphic inline caches against CPython's compiled bytecode would be unacceptably difficult.
* LLVM in particular is worth noting because it makes it easy to generate code (codegen) for multiple platforms, and it can compile both C and C++ to the same intermediate representation, which is exactly what we want for Python: it makes inlining and analysis across the current Python/C boundary possible.
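As a rough Python-level illustration of the polymorphic inline cache idea mentioned above (the class and names here are invented for this sketch, not Unladen Swallow's actual machinery): the cache remembers, per receiver type, the result of an attribute lookup so that repeated calls skip the full dynamic lookup.

```python
# Hypothetical sketch of a polymorphic inline cache (PIC).
# The full getattr() lookup runs once per receiver type; later
# lookups for the same type hit the per-type cache instead.
class InlineCache:
    def __init__(self, attr):
        self.attr = attr
        self.cache = {}            # maps type -> unbound function

    def lookup(self, obj):
        tp = type(obj)
        fn = self.cache.get(tp)
        if fn is None:             # cache miss: slow path, once per type
            fn = getattr(tp, self.attr)
            self.cache[tp] = fn
        return fn.__get__(obj, tp) # bind to the instance

class A:
    def speak(self): return "a"

class B:
    def speak(self): return "b"

pic = InlineCache("speak")
results = [pic.lookup(x)() for x in (A(), B(), A())]
print(results)  # ['a', 'b', 'a']
```

A JIT would go further and inline the cached target directly into the generated machine code, guarded by a type check.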

With a framework for generating machine code in place, we can compile Python into a much more efficient form. Take the following code as an example:

for i in range(3):
    foo(i)

At present, this is translated into the inefficient equivalent of:

$x = range(3)

while True:
    try:
        i = $x.next()
    except StopIteration:
        break
    foo(i)

Once we have a way of knowing that range() here refers to the built-in range() function, we can turn it into something like:

for (i = 0; i < 3; i++)
    foo(i)

in C style, with unboxed data types used for the arithmetic. Or even unroll the loop entirely:

foo(0)

foo(1)

foo(2)
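The desugared while/try form shown above can be reproduced in Python itself with the iterator protocol (using Python 3's iter()/next() spelling; the hidden temporary is written as an ordinary variable here, and foo is a stand-in function):

```python
# Hand-desugared version of: for i in range(3): foo(i)
def foo(i):
    return i * 2

results = []
x = iter(range(3))           # the hidden iterator the for-loop creates
while True:
    try:
        i = next(x)          # Python 2 spelled this x.next()
    except StopIteration:    # loop exit is signalled by an exception
        break
    results.append(foo(i))
print(results)  # [0, 2, 4]
```

Every iteration pays for an attribute lookup, a call, and exception machinery on exit; the optimized counting loop pays for none of that.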

We deliberately designed Unladen Swallow's internals to support multiple cores. Servers will only gain more cores in the future, and we want to exploit this so that more work can be done in parallel. For example, one core could run a parallel optimizer that applies increasingly expensive (and valuable) optimizations while another core executes the code itself. We are also considering a parallel GC that uses another core to reclaim memory. Since most industrial servers have 4 to 32 cores, we believe this optimization is a potential gold mine. However, we must remain mindful of the needs of highly parallel applications, rather than blindly consuming cores ourselves.
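That division of labor can be sketched, very loosely, at the Python level: a background worker consumes "hot function" notifications and does expensive work while the main thread keeps running. (Purely illustrative; Unladen Swallow's optimizer would run inside the VM, and the "optimization" below is a placeholder.)

```python
# Rough analogy: main thread executes and flags hot code; a background
# worker "optimizes" it on another core/thread.
import threading
import queue

hot_functions = queue.Queue()
optimized = {}

def optimizer():
    while True:
        name = hot_functions.get()
        if name is None:                  # sentinel: shut down
            break
        optimized[name] = "fast_" + name  # stand-in for real codegen

worker = threading.Thread(target=optimizer)
worker.start()

for fn in ("foo", "bar"):                 # main "execution" thread
    hot_functions.put(fn)
hot_functions.put(None)
worker.join()

print(sorted(optimized))  # ['bar', 'foo']
```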

It should be emphasized that many of these areas have already been considered or implemented by other dynamic languages such as JRuby, Rubinius, and Parrot, and by other Python implementations such as Jython, PyPy, and IronPython. We are looking to these implementations for ideas on debugging information, regular-expression performance, and other ways to improve dynamic-language performance. This is a well-trodden path, and we want to avoid reinventing the wheel.

Plan Blueprint

Unladen Swallow will release a new version every three months, with bug-fix releases in between.

2009 Phase 1 (Q1)

Q1 is mainly devoted to relatively minor changes to the existing CPython implementation. Our goal is a 25-35% performance improvement over the current baseline. The goals for this phase are deliberately conservative: we want to give client applications visible performance gains as early as possible, rather than make them wait until the entire project is complete.

2009 Phase 2 (Q2)

Q2 will focus on retiring the current Python virtual machine and replacing it with an LLVM-based implementation with the same features. We expect some performance improvement, but that is not the main task of 2009 Q2: we mainly want to get something running on LLVM. Performance becomes the priority after this stage.

2009 Phase 3 (Q3) and beyond

The tasks from Q3 onward will be "simply" to do this work well. We do not want to do original research; rather, we intend to draw as much as possible on the last 30 years of published results. See the relevant-papers page for a partial list (far from all) of the papers we intend to implement.
Beyond the regular-expression engine, we plan to prioritize other extension modules identified as performance bottlenecks. Regular expressions, however, have already been identified as a good target and will be the first area optimized.
In addition, we intend to remove the GIL and fix the state of multithreading in Python. We believe this is achievable by implementing a more sophisticated GC, similar to IBM's Recycler.
Our long-term goal is to make Python fast enough that types currently implemented in C for speed can be reimplemented in Python.
The precise performance targets for 2009 Q3 will be determined during Q2. http://code.google.com/p/unladen-swallow/wiki/ProjectPlan

http://danmarner.yo2.cn/unladen-swallow-project-pl

Original: Google; Translator: danmarner

Reprints are welcome; please link to the original/translation.

======================================

Unladen Swallow Project Plan: a plan to optimize Python
Note: for links to all referenced materials, see the relevant-papers page.

Source: http://danmarner.yo2.cn/
