Python Greenlet Usage Introduction and Implementation Principle Analysis

Source: Internet
Author: User
I recently began studying Python's concurrency techniques, including multi-threading, multi-processing, and so on, and have gradually collected material from around the Internet. Today I am organizing some notes on greenlet.

Technical background of concurrent processing

Parallelism is very important now, because in many cases parallel computation can greatly improve system throughput, especially in today's multi-core, multi-processor era. This is why an old language like Lisp has been picked up again and functional programming is becoming more and more popular. Greenlet is a Python library for this style of concurrent processing. Python also has a well-known project called Stackless, used for concurrency, whose main primitive is a micro-thread called a tasklet. The biggest difference between greenlet and Stackless is not that greenlet is more lightweight (it is, but that is not the point); the real difference is that greenlet requires you to handle switching yourself: you must explicitly specify which greenlet runs now and which greenlet runs next.

The implementation mechanism of Greenlet

I previously developed web programs in Python using FastCGI mode, starting multiple threads in each process to handle requests. The problem there is that every request must be answered quickly; otherwise, just a few slow requests can exhaust the thread pool and the server effectively denies service, because no thread is left to respond to new requests. Normally our services go through performance testing before going online, so this usually is not a big problem, but it is impossible to test every scenario. Once it does happen, users wait a long time with no response, and a partial outage becomes a total one. Greenlet under Python offers a way out of this, so let's take a quick look at its implementation mechanism.

Each greenlet is just a Python object (a PyGreenlet) on the heap, so a single process can create hundreds of thousands, or even millions, of greenlets without any problem.

typedef struct _greenlet {
    PyObject_HEAD
    char* stack_start;
    char* stack_stop;
    char* stack_copy;
    intptr_t stack_saved;
    struct _greenlet* stack_prev;
    struct _greenlet* parent;
    PyObject* run_info;
    struct _frame* top_frame;
    int recursion_depth;
    PyObject* weakreflist;
    PyObject* exc_type;
    PyObject* exc_value;
    PyObject* exc_traceback;
    PyObject* dict;
} PyGreenlet;

Each greenlet is really just a function plus the context in which that function executes; for a function, that context is its stack. All greenlets in a process share the single user stack that the operating system assigned to the process, so only the greenlet whose stack data is currently on that stack can run. Each greenlet records the bottom and top of its stack region in stack_stop and stack_start. If the stack region of the greenlet about to be switched in overlaps with the regions of greenlets currently on the stack, the data of those overlapping greenlets must be temporarily saved to the heap. The saved location is recorded via stack_copy and stack_saved, so that on resumption the data can be copied back from the heap into the region between stack_stop and stack_start; otherwise the stack data would be destroyed. In short, the greenlets an application creates achieve concurrency by copying stack data to the heap and back again. For I/O-bound applications, this kind of coroutine is very comfortable to use.

Here is a simple model of greenlet's stack layout (from greenlet.c):

A PyGreenlet is a range of C stack addresses that must be saved
and restored in such a way that the full range of the stack
contains valid data when we switch to it.

Stack layout for a greenlet:

               |     ^^^       |
               |  older data   |
               |               |
  stack_stop . |_______________|
        .      |               |
        .      | greenlet data |
        .      |   in stack    |
        .    * |_______________| . .  _____________  stack_copy + stack_saved
        .      |               |     |             |
        .      |     data      |     |greenlet data|
        .      |   unrelated   |     |    saved    |
        .      |      to       |     |   in heap   |
 stack_start . |     this      | . . |_____________| stack_copy
               |   greenlet    |
               |               |
               |  newer data   |
               |     vvv       |

The following is a simple greenlet example:

from greenlet import greenlet

def test1():
    print(12)
    gr2.switch()
    print(34)

def test2():
    print(56)
    gr1.switch()
    print(78)

gr1 = greenlet(test1)
gr2 = greenlet(test2)
gr1.switch()

This prints 12, 56, and 34. The final print(78) never runs: when test1 finishes, control returns to gr1's parent (the main greenlet), not to gr2.

Coroutines as discussed here are generally supported at the programming-language level. Languages I currently know to support them include Python, Lua, Go, Erlang, Scala, and Rust. Coroutines differ from threads in that switching is not done by the operating system but by the programmer's code; because the programmer controls every switch, the so-called thread-safety problems do not arise.

All coroutines share the context of the whole process, so exchanging information between them is also very convenient.

Compared with the second scheme (I/O multiplexing), a program written with coroutines is more intuitive, rather than one complete flow being split into many small event handlers. The drawback of coroutines is that they cannot exploit multiple cores; however, this can be solved by combining coroutines with processes.

Coroutines can be used to handle concurrency for better performance, or to implement state machines to simplify programming; I have mostly used the latter. I came into contact with Python at the end of last year and learned the concept of Python coroutines, later encountering yield-based coroutines through PyCon China 2011. Greenlet is another solution, and in my opinion a more usable one, especially for implementing state machines.

At present this part is basically complete; I will write a fuller summary when I have time.

To summarize:

1) Multi-process: can exploit multiple cores, but inter-process communication is cumbersome; in addition, increasing the number of processes degrades performance, and process switching is relatively expensive. Program flow is less complex than with I/O multiplexing.

2) I/O multiplexing: handles multiple logical flows within one process with no process switching, so performance is high, and sharing information between the logical flows is simple. However, it cannot exploit multiple cores, and the program flow is chopped into small pieces by event handling, making the program complicated and hard to understand.

3) Threads: run inside one process and are scheduled by the operating system; switching cost is lower, they share the process's virtual address space, and sharing information between threads is simple. But thread-safety issues make the learning curve steep and the code error-prone.

4) Coroutines: provided by the programming language and switched under the programmer's control, so there are no thread-safety problems; they can be used for state machines, concurrent requests, and so on. However, they cannot exploit multiple cores.

All four of the above schemes can be combined. I am most optimistic about the process + coroutine model.
