Python Greenlet background introduction and implementation mechanism

Source: Internet
Author: User

Technical background of concurrent processing

Concurrency has become very important: in many cases concurrent processing can greatly improve system throughput, especially in today's multi-core, multi-processor era. This is part of why an old language like Lisp has been picked up again and functional programming keeps growing in popularity. This article introduces a Python library for concurrent processing: greenlet. Python also has a well-known project called Stackless, used for concurrent processing, whose main feature is a kind of micro-thread called a tasklet. The biggest difference between greenlet and Stackless is not just that greenlet is extremely lightweight; more importantly, greenlet requires you to handle the switching yourself, that is, you explicitly specify which greenlet runs now and which greenlet runs next.

The implementation mechanism of Greenlet

I used to develop web applications in Python running in FastCGI mode, with several threads started in each process to handle requests. The problem with this setup is that every request must be answered very quickly; otherwise, as soon as a few requests are slow, the server effectively refuses service because no thread is free to respond to new requests. Our services normally go through performance testing before going online, so under normal conditions this is not a big problem, but it is impossible to test every scenario. Once it does happen, users wait a long time with no response, and a partial outage turns into a total one. That is what led me to greenlet under Python, so here is a brief look at its implementation mechanism.
Each greenlet is just a Python object (PyGreenlet) allocated on the heap, so a single process can create hundreds of thousands, or even millions, of greenlets without any problem.

typedef struct _greenlet {
    PyObject_HEAD
    char* stack_start;
    char* stack_stop;
    char* stack_copy;
    intptr_t stack_saved;
    struct _greenlet* stack_prev;
    struct _greenlet* parent;
    PyObject* run_info;
    struct _frame* top_frame;
    int recursion_depth;
    PyObject* weakreflist;
    PyObject* exc_type;
    PyObject* exc_value;
    PyObject* exc_traceback;
    PyObject* dict;
} PyGreenlet;
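As a rough sketch of how cheap this makes greenlets (my own illustration, assuming only that the greenlet package is installed), an unstarted greenlet is just this small heap object and takes up no stack space until it is first switched to:

from greenlet import greenlet

def noop():
    pass

# Creating a large number of greenlets is fine; none of them occupies any
# stack space until it actually runs.
greenlets = [greenlet(noop) for _ in range(100000)]
print("created %d greenlets" % len(greenlets))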

Each greenlet is essentially a function together with the context in which that function executes; for a function, that context is its stack. All greenlets of the same process share the one user stack that the operating system assigns to the process, so at any moment only the greenlet whose data currently occupies the stack can run on it. A greenlet records the bottom and top of its stack region in stack_stop and stack_start. If the greenlet about to be executed has a stack region that overlaps with greenlets currently holding data on the stack, the data of those overlapping greenlets must be temporarily saved from the stack into the heap; stack_copy and stack_saved record where it was saved, so that it can later be copied back into the region between stack_stop and stack_start. Without this, the stack data would be destroyed. So the greenlets an application creates achieve concurrency by copying data from the stack into the heap, or from the heap back onto the stack, as they are switched. For I/O-bound applications, coroutines of this kind are genuinely comfortable to use.

Here is the stack layout model for a greenlet (from the comments in greenlet.c):

A PyGreenlet is a range of C stack addresses that must be
saved and restored in such a way that the full range of the
stack contains valid data when we switch to it.

Stack layout for a greenlet:

               |     ^^^       |
               |  older data   |
               |               |
  stack_stop . |_______________|
        .      |               |
        .      | greenlet data |
        .      |   in stack    |
        .    * |_______________| . .  _____________  stack_copy + stack_saved
        .      |               |     |             |
        .      |     data      |     |greenlet data|
        .      |   unrelated   |     |    saved    |
        .      |      to       |     |   in heap   |
 stack_start . |     this      | . . |_____________| stack_copy
               |   greenlet    |
               |               |
               |  newer data   |
               |     vvv       |
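To see the effect of this save/restore from Python, here is a small sketch (my own illustration, assuming the greenlet package is installed; the counter function and the names a and b are hypothetical). Each greenlet keeps its own local variable across switches, because its piece of the stack is copied to the heap while other greenlets run:

from greenlet import greenlet

def counter(name, limit):
    total = 0                          # local state living on this greenlet's stack
    for _ in range(limit):
        total += 1
        print("%s: total = %d" % (name, total))
        main.switch()                  # suspend; 'total' is preserved across the switch

main = greenlet.getcurrent()
a = greenlet(counter)
b = greenlet(counter)

a.switch("a", 3)                       # start a; it runs until its first main.switch()
b.switch("b", 3)                       # start b likewise
for _ in range(2):
    a.switch()                         # resume a: its 'total' continues where it stopped
    b.switch()                         # resume b likewise

Each counter prints 1, 2, 3 in turn, showing that its stack contents survive every switch.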

Here is a simple greenlet example:

from greenlet import greenlet

def test1():
    print("test1: first")
    gr2.switch()            # hand control to gr2
    print("test1: second")

def test2():
    print("test2: first")
    gr1.switch()            # hand control back to gr1
    print("test2: second")  # never reached

gr1 = greenlet(test1)
gr2 = greenlet(test2)
gr1.switch()
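Calling gr1.switch() starts test1, which prints its first line and switches to gr2; test2 prints its first line and switches back, so test1 prints its second line and returns. When test1 returns, execution falls back to the main greenlet, which is why test2's second print is never reached.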

Coroutines of the kind discussed here generally need support from the programming language. As far as I know, the languages that support them include Python, Lua, Go, Erlang, Scala and Rust. The difference between a coroutine and a thread is that coroutines are not switched by the operating system but by the programmer's own code; since the programmer controls every switch, there are none of the safety issues that come with threads.
All coroutines share the context of the whole process, so exchanging data between them is also very convenient.
Compared with the second scheme below (I/O multiplexing), a program written with coroutines is more intuitive, because a complete flow does not have to be chopped up into many separate event handlers.
The drawback of coroutines is that they cannot exploit multiple cores by themselves, but this can be solved by combining coroutines with processes.
Coroutines can be used to handle concurrency and improve performance, or to implement state machines and simplify programming; I have mostly used them for the latter. I first came into contact with Python at the end of last year and learned about Python coroutines; later, at PyCon China 2011, I learned about handling this with yield. Greenlet is another solution and, in my view, a more usable one, especially for state machines.
That work is now basically finished, and I will summarize it when I have time.
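As a rough illustration of the state-machine use (my own sketch, not the author's code; it assumes the greenlet package, and line_assembler is a hypothetical name), the "state" of the parser below lives in the greenlet's local variables and in its position in the code, instead of in an explicit state variable:

from greenlet import greenlet

def line_assembler():
    # All parsing state (the partial line in 'buffered') lives on this
    # greenlet's stack and survives every switch.
    buffered = ""
    while True:
        chunk = main.switch()             # suspend until the next chunk arrives
        buffered += chunk
        while "\n" in buffered:
            line, buffered = buffered.split("\n", 1)
            print("got line: %r" % line)

main = greenlet.getcurrent()
assembler = greenlet(line_assembler)
assembler.switch()                        # run until the first main.switch()

# Feed data in arbitrary chunks; the assembler resumes exactly where it left off.
for chunk in ["hello wo", "rld\nsecond li", "ne\n"]:
    assembler.switch(chunk)

With an explicit event handler you would have to store the partial line in some object between callbacks; with a greenlet, the natural local variable does that job.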

To summarize:
1) Multiple processes can take advantage of multiple cores, but inter-process communication is cumbersome; in addition, increasing the number of processes degrades performance, and process switching is relatively expensive. The program flow, however, is simpler than with I/O multiplexing.
2) I/O multiplexing handles multiple logical flows inside a single process, so there is no process switching, performance is high, and sharing information between the flows is simple. However, it cannot exploit multiple cores, and because the program flow is chopped into small pieces by event handling, the program becomes complicated and hard to understand.
3) Threads run inside a process and are scheduled by the operating system; switching is relatively cheap, and because threads share the process's virtual address space, sharing information between them is simple. But thread-safety issues make the learning curve steep and the code error-prone.
4) Coroutines are provided by the programming language and switched under the programmer's control, so there are no thread-safety issues; they can be used for state machines, concurrent requests, and so on. However, they cannot exploit multiple cores by themselves.
These four approaches can be combined. I am most optimistic about the process + coroutine model, sketched below.
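Here is a minimal sketch of what the process + coroutine combination can look like (my own illustration under the assumption that the greenlet package is available; worker, task_a and task_b are hypothetical names). The processes provide parallelism across cores, while the greenlets inside each process provide cheap, programmer-controlled switching:

import multiprocessing
from greenlet import greenlet

def worker(worker_id):
    # Two greenlets inside this process hand control back and forth explicitly.
    def task_a():
        print("worker %d: task A, step 1" % worker_id)
        gr_b.switch()                   # give control to task B
        print("worker %d: task A, step 2" % worker_id)

    def task_b():
        print("worker %d: task B, step 1" % worker_id)
        gr_a.switch()                   # give control back to task A

    gr_a = greenlet(task_a)
    gr_b = greenlet(task_b)
    gr_a.switch()                       # drive the greenlets; returns when task A finishes

if __name__ == "__main__":
    # One OS process per core (two here for brevity); each can run in parallel.
    procs = [multiprocessing.Process(target=worker, args=(i,)) for i in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()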
