Python Greenlet Implementation Principles and Usage Examples

Source: Internet
Author: User
Tags: python

I recently began studying Python's concurrency techniques, including multithreading and multiprocessing, and have gradually collected some material from around the web; today I have organized the notes related to greenlet.

Technical background of concurrent processing

Parallel processing is highly valued these days, because in many cases parallel computation can greatly improve system throughput, especially in the current multi-core, multi-processor era; this is also why old languages such as Lisp have been picked up again and functional programming keeps growing in popularity. This article describes a Python library for concurrent processing: greenlet. Python also has a well-known project called Stackless that is used for concurrency, mainly through a kind of micro-thread called a tasklet. Is the biggest difference between greenlet and Stackless that greenlet is more lightweight? Not really. The biggest difference is that greenlet requires you to handle the switching yourself: you have to specify explicitly which greenlet runs now and which greenlet runs next.

The implementation mechanism of Greenlet

When developing web programs in Python I have been using FastCGI mode, where each process starts a number of threads to handle requests. The problem is that every request must be answered very quickly; otherwise, as soon as a few requests become slow, the server effectively denies service, because no threads are left to respond to new requests. Our services are normally load-tested before going online, so in ordinary situations this is not a big problem, but it is impossible to test every scenario, and when it does happen users wait for a long time with no response: a partial failure turns into total unavailability. That is what led me to greenlet in Python, so here is a brief look at its implementation mechanism.

Each greenlet is just a Python object (PyGreenlet) on the heap, so it is no problem for a single process to create millions of greenlets.
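As a rough illustration of that claim (a minimal sketch, assuming the greenlet package is installed; the worker function and the count are arbitrary), creating a greenlet only allocates a small heap object, and nothing touches the stack until the greenlet is actually switched to:

from greenlet import greenlet

def worker(i):
    # Runs only when this greenlet is first switched to.
    return i * 2

# Creating the objects is cheap: each one is just a PyGreenlet on the heap.
greenlets = [greenlet(worker) for _ in range(100000)]

# Only the greenlets we actually switch to ever occupy stack space.
print(greenlets[0].switch(1))    # 2
print(greenlets[1].switch(21))   # 42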

The PyGreenlet structure is defined in greenlet.c as follows:

typedef struct _greenlet {
    PyObject_HEAD
    char* stack_start;
    char* stack_stop;
    char* stack_copy;
    intptr_t stack_saved;
    struct _greenlet* stack_prev;
    struct _greenlet* parent;
    PyObject* run_info;
    struct _frame* top_frame;
    int recursion_depth;
    PyObject* weakreflist;
    PyObject* exc_type;
    PyObject* exc_value;
    PyObject* exc_traceback;
    PyObject* dict;
} PyGreenlet;

Each greenlet is really just a function together with the context needed to run it, and for a function that context is its stack. All greenlets of the same process share the single user stack allocated by the operating system, so at any given moment only greenlets whose stack data do not conflict can be live on this shared stack. Each greenlet records the bottom and top of its slice of the stack in stack_stop and stack_start. If the stack_stop of the greenlet about to run overlaps the stack range of greenlets currently on the stack, the overlapping data of those greenlets must be temporarily saved to the heap; stack_copy and stack_saved record where the data went, so that when those greenlets are resumed the data can be copied back from the heap into the region between stack_stop and stack_start. Otherwise the stack data would be corrupted. So the application achieves concurrency by constantly copying greenlet data from the stack to the heap and back again, which turns out to be very comfortable for I/O-bound applications using coroutines.
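The saving and restoring happen entirely at the C level, but their visible effect is that local variables survive a switch. A minimal sketch of that behaviour (assuming the greenlet package; the counter is just an illustration):

from greenlet import greenlet

def counter():
    # 'total' lives in this greenlet's slice of the stack; whenever we switch
    # away it is saved to the heap, and restored when we switch back.
    total = 0
    while True:
        value = main.switch(total)   # hand control back, wait for the next value
        total += value

main = greenlet.getcurrent()
gr = greenlet(counter)

print(gr.switch())     # 0  - the first switch starts counter()
print(gr.switch(5))    # 5
print(gr.switch(7))    # 12 - 'total' was preserved across the switches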

The following is a simple model of the greenlet stack layout (from the comments in greenlet.c):


A PyGreenlet is a range of C stack addresses that must be
saved and restored in such a way that the full range of the
stack contains valid data when we switch to it.

Stack layout for a greenlet:

               |     ^^^       |
               |  older data   |
               |               |
  stack_stop . |_______________|
        .      |               |
        .      | greenlet data |
        .      |   in stack    |
        .    * |_______________| . .  _____________  stack_copy + stack_saved
        .      |               |     |             |
        .      |     data      |     |greenlet data|
        .      |   unrelated   |     |    saved    |
        .      |      to       |     |   in heap   |
 stack_start . |      this     | . . |_____________| stack_copy
               |   greenlet    |
               |               |
               |  newer data   |
               |     vvv       |

The following is a simple greenlet example:


from greenlet import greenlet

def test1():
    print(12)
    gr2.switch()
    print(34)

def test2():
    print(56)
    gr1.switch()
    print(78)

gr1 = greenlet(test1)
gr2 = greenlet(test2)
gr1.switch()   # prints 12, 56, 34; 78 is never printed because gr2 is never resumed

Today's discussion of coroutines generally centers on support built into the programming language. The languages I know of that provide coroutine support include Python, Lua, Go, Erlang, Scala, and Rust. A coroutine differs from a thread in that it is not switched by the operating system but by program code; in other words, switching is controlled by the programmer, so there are no thread-style safety problems.
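For example (a hypothetical sketch, not from the original article), two greenlets can update shared data without any locking, because a switch happens only where the code explicitly asks for one and the interleaving is therefore fully deterministic:

from greenlet import greenlet

log = []

def ping():
    for i in range(3):
        log.append(("ping", i))   # no lock needed: nothing else can run here
        gr_pong.switch()

def pong():
    for i in range(3):
        log.append(("pong", i))
        gr_ping.switch()

gr_ping = greenlet(ping)
gr_pong = greenlet(pong)
gr_ping.switch()

# Entries strictly alternate because every switch point is explicit:
# [('ping', 0), ('pong', 0), ('ping', 1), ('pong', 1), ('ping', 2), ('pong', 2)]
print(log)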

All the coroutines in a process share the context of that process, so exchanging information between coroutines is also very convenient.

Compared with the second option in the summary below (I/O multiplexing), a program written with coroutines reads much more intuitively, instead of having one complete flow chopped into many small event handlers. The drawback of coroutines is that they cannot exploit multiple cores, but that can be solved by combining coroutines with processes, as sketched below.
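A rough sketch of that combination (hypothetical; it assumes only the greenlet package and the standard multiprocessing module): the processes supply the multi-core parallelism, and each process runs its own greenlets for cheap cooperative concurrency inside it:

import multiprocessing
from greenlet import greenlet

def run_greenlets(values):
    # Inside one worker process: one greenlet per value, switched to in turn.
    gls = [greenlet(lambda v: v * v) for _ in values]
    return [g.switch(v) for g, v in zip(gls, values)]

if __name__ == "__main__":
    # Two OS processes run in parallel on separate cores.
    with multiprocessing.Pool(processes=2) as pool:
        print(pool.map(run_greenlets, [[1, 2, 3], [4, 5, 6]]))
        # [[1, 4, 9], [16, 25, 36]]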

Coroutines can be used to handle concurrency and improve performance, or to implement state machines and simplify programming; I mostly use them for the latter. When I first looked at Python at the end of last year I learned about Python's yield-based notion of coroutines, and later, around PyCon China 2011, I came across greenlet, which is also a coroutine mechanism and, in my view, the more usable option, especially for implementing state machines.
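As a sketch of the state-machine use (a hypothetical example, not taken from the article): a greenlet is fed characters one at a time and groups runs of digits into numbers. The "state" (inside or outside a number) is simply the point where the greenlet is suspended, so no explicit state variable is needed:

from greenlet import greenlet

def number_parser():
    ch = main.switch()                    # wait for the first character
    while True:
        if ch.isdigit():
            digits = ch
            ch = main.switch(None)
            while ch.isdigit():           # state: inside a number
                digits += ch
                ch = main.switch(None)
            ch = main.switch(int(digits)) # number finished, report it
        else:
            ch = main.switch(None)        # state: outside a number

main = greenlet.getcurrent()
parser = greenlet(number_parser)
parser.switch()                           # start the parser; it waits for input

for c in "ab12 7x":
    result = parser.switch(c)
    if result is not None:
        print("number:", result)          # prints 12 and then 7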

At present this part of the work is basically done; I will write a fuller summary when I have time.

To sum up:

1. Multiple processes can exploit multiple cores, but inter-process communication is troublesome; in addition, increasing the number of processes degrades performance, because process switching is relatively expensive. On the other hand, the program flow is simpler than with I/O multiplexing.

2. I/O multiplexing handles multiple logical flows inside a single process, so there is no process switching, performance is high, and sharing information between the flows is simple. But it cannot exploit multiple cores, and the program flow is chopped into small event handlers, which makes the program more complex and harder to understand.

3. Threads run inside a process and are scheduled by the operating system, so switching costs are relatively low; they also share the process's virtual address space, so sharing information between threads is simple. However, thread-safety issues make the learning curve steep and the code error-prone.

4. Coroutines provided by the language are switched under the programmer's control, so there are no thread-safety problems; they can be used to implement state machines, handle concurrent requests, and so on. But on their own they cannot exploit multiple cores.

These four approaches can be combined; I am most optimistic about the process + coroutine model.
