Gevent coroutine implementation principle

Source: Internet
Author: User

I started reading the greenlet source because I wanted to understand how the gevent library is implemented. gevent's coroutines are built on greenlet, so it made sense to look at greenlet first.

We will not go into greenlet's implementation here. The key idea is that it saves and restores stack data by copying it, and moves the stack pointer when switching.


gevent has its own event loop for I/O and timers, so it adds a layer of extensions on top of greenlet.

Here we will use the following code to give an example, and then analyze how gevent extends greenlet:

    import gevent

    def hello(fname):
        print("hello : %s" % fname)
        gevent.sleep(0)  # yield to the hub so other greenlets can run
        print("12321 : %s" % fname)

    task1 = gevent.spawn(hello, "fjs1")
    task2 = gevent.spawn(hello, "fjs2")
    task1.join()

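When this code runs, both "hello" lines are printed before either "12321" line: gevent.sleep(0) hands control back to the hub, which runs the other greenlet before resuming the first.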



Well, let's take a look at how the spawn method creates a coroutine:

    # Class method: part of the layer that gevent adds on top of greenlet
    @classmethod
    def spawn(cls, *args, **kwargs):
        """Return a new :class:`Greenlet` object, scheduled to start.

        The arguments are passed to :meth:`Greenlet.__init__`.
        """
        g = cls(*args, **kwargs)  # first construct the Greenlet object
        # start() registers a callback on the hub's loop; the callback calls
        # this greenlet's switch so that the greenlet starts executing
        g.start()
        return g

spawn is a class method that creates a Greenlet. Note, though, that this Greenlet is not the class defined in the greenlet library mentioned earlier; gevent extends it slightly.

Let's take a look at the constructor:

    # Inherits from greenlet, extending it with some extra features
    class Greenlet(greenlet):
        """A light-weight cooperatively-scheduled execution unit."""

        def __init__(self, run=None, *args, **kwargs):
            hub = get_hub()
            # the parent of every Greenlet created here is the single hub object
            greenlet.__init__(self, parent=hub)
            if run is not None:
                self._run = run  # record the function to run
            self.args = args
            self.kwargs = kwargs
            self._links = deque()
            self.value = None
            self._exception = _NONE
            self._notifier = None
            self._start_event = None

The class inherits directly from the greenlet type in the greenlet library, and the constructor contains one important detail: the parent of every coroutine created this way is a master coroutine called the hub.

This is critical. The hub is the main loop coroutine of the whole of gevent, and every business coroutine that is created depends on it for scheduling and management.

In the spawn code above, the start method is also called to start the coroutine. Here is its definition:

    # This just registers a callback on the hub's loop; the callback
    # switches to this greenlet so that it starts running
    def start(self):
        """Schedule the greenlet to run in this loop iteration"""
        if self._start_event is None:
            # register this greenlet's switch as a callback on the hub's loop;
            # when the loop calls it, the run method starts executing
            self._start_event = self.parent.loop.run_callback(self.switch)

The code is simple. It just registers a callback on the loop of the parent, that is, the hub, and the callback is the current coroutine's switch method. When the loop executes that callback, the coroutine starts running.

We will not look at the implementation of the hub and its loop here. For now, think of it as the main loop that manages all callbacks, timers, and I/O events.
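To make the "start only registers a callback" idea concrete, here is a minimal sketch. It is not gevent code: ToyLoop and ToyTask are hypothetical names, and a plain Python generator stands in for a greenlet, with a bare yield playing the role of gevent.sleep(0). It only mimics the scheduling order described above.

```python
from collections import deque

class ToyLoop:
    """Hypothetical stand-in for the hub's loop: a FIFO queue of callbacks."""
    def __init__(self):
        self._callbacks = deque()

    def run_callback(self, func):
        self._callbacks.append(func)

    def run(self):
        while self._callbacks:
            self._callbacks.popleft()()

class ToyTask:
    """Hypothetical stand-in for a Greenlet, built on a generator."""
    def __init__(self, loop, func, *args):
        self.loop = loop
        self._gen = func(*args)

    def start(self):
        # Like Greenlet.start(): nothing runs now, switch() is only scheduled.
        self.loop.run_callback(self.switch)

    def switch(self):
        # Resume the generator until its next yield or until it finishes.
        try:
            next(self._gen)
        except StopIteration:
            return
        # A bare yield models sleep(0): ask the loop to resume this task later.
        self.loop.run_callback(self.switch)

log = []

def hello(name):
    log.append("hello " + name)
    yield
    log.append("12321 " + name)

loop = ToyLoop()
ToyTask(loop, hello, "fjs1").start()
ToyTask(loop, hello, "fjs2").start()
started_early = list(log)  # still empty: start() did not run anything
loop.run()
print(log)
```

Nothing is executed until the loop runs, and the two tasks interleave at the yield because each resumption goes back through the loop's callback queue.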


Next, the join method. Anyone familiar with multithreading knows that join blocks the current thread until the target thread finishes. Here, of course, the unit is a coroutine rather than a thread.

Let's take a look at the code for this join method:

    # Suspend the current greenlet until the greenlet being joined has finished
    def join(self, timeout=None):
        """Wait until the greenlet finishes or *timeout* expires.
        Return ``None`` regardless.
        """
        if self.ready():  # already finished, return directly
            return
        else:
            switch = getcurrent().switch  # the switch of the current greenlet
            # register it as a callback; it is called back after the
            # greenlet being waited for has finished
            self.rawlink(switch)
            try:
                t = Timeout.start_new(timeout)  # create a timer object
                try:
                    # stop executing the current greenlet and let the hub schedule
                    result = self.parent.switch()
                    assert result is self, 'Invalid switch into Greenlet.join(): %r' % (result, )
                finally:
                    t.cancel()  # cancel the timeout
            except Timeout:
                self.unlink(switch)  # remove it from the registered callbacks
                if sys.exc_info()[1] is not t:
                    raise
            except:
                self.unlink(switch)
                raise

join calls getcurrent() to obtain the coroutine running in the current environment, takes its switch method, and adds it to the callback queue of the coroutine being joined. After that coroutine finishes, these callbacks are invoked, which resumes the current coroutine.

Further down, the switch method of the parent, that is, the hub, is called, so execution moves to the hub. The hub's loop will then get around to running the coroutine being joined. Note that join does not switch to that coroutine directly.
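The join flow can be mimicked with a small model. This is a sketch under assumptions, not gevent's implementation: ToyHub and ToyGreenlet are hypothetical, generators replace real greenlets, and yielding the tuple ("join", target) stands for "register my switch on the target with rawlink, then switch to the hub".

```python
from collections import deque

class ToyHub:
    """Hypothetical hub: runs queued callbacks in FIFO order."""
    def __init__(self):
        self.callbacks = deque()

    def run_callback(self, func):
        self.callbacks.append(func)

    def run(self):
        while self.callbacks:
            self.callbacks.popleft()()

class ToyGreenlet:
    """Hypothetical Greenlet modelled with a generator."""
    def __init__(self, hub, func, *args):
        self.hub = hub
        self._gen = func(*args)
        self._links = []
        self.done = False
        self.value = None

    def start(self):
        self.hub.run_callback(self.switch)

    def rawlink(self, callback):
        self._links.append(callback)

    def switch(self, _source=None):
        try:
            _kind, target = next(self._gen)  # run until the task asks to join
        except StopIteration as stop:
            self.done = True
            self.value = stop.value
            # Like _report_result: notify the links from the hub, not from here.
            self.hub.run_callback(self._notify_links)
            return
        if target.done:
            self.hub.run_callback(self.switch)  # nothing to wait for
        else:
            target.rawlink(self.switch)  # resume me when the target finishes

    def _notify_links(self):
        while self._links:
            self._links.pop(0)(self)

events = []

def worker():
    events.append("worker runs")
    return 42
    yield  # unreachable; it only makes this function a generator

def joiner(target):
    events.append("before join")
    yield ("join", target)
    events.append("after join, value=%r" % target.value)

hub = ToyHub()
w = ToyGreenlet(hub, worker)
j = ToyGreenlet(hub, joiner, w)
j.start()
w.start()
hub.run()
print(events)
```

The joiner never switches to the worker itself. It parks its own switch on the worker's link list, and the hub runs the worker in its own time.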

In addition, gevent's Greenlet class defines a run method, which is where execution of each Greenlet begins. The code is as follows:

    # The body of the greenlet: call the function that was passed in,
    # then arrange for the registered callbacks to be called
    def run(self):
        try:
            if self._start_event is None:
                self._start_event = _dummy_event
            else:
                self._start_event.stop()
            try:
                result = self._run(*self.args, **self.kwargs)  # execute the function that was passed in
            except:
                self._report_error(sys.exc_info())
                return
            self._report_result(result)  # mainly arranges for the registered callbacks to run
        finally:
            self.__dict__.pop('_run', None)
            self.__dict__.pop('args', None)
            self.__dict__.pop('kwargs', None)

After the function finishes, _report_result is called to arrange for all callbacks attached to the coroutine to run. The callback registered by join is among them, so this is the point at which join returns and the joining coroutine continues, which completes the picture of how join works. One more important detail: the callbacks attached to this coroutine, such as the join callback, are not run immediately inside the finishing coroutine. Instead, a callback is registered on the hub's loop. The code is as follows:

    # This registers a callback on the hub's loop that will run all callbacks
    # currently attached to this greenlet.
    # The pending callbacks are not executed immediately in this greenlet;
    # they are handed to the loop, so the current coroutine can return as
    # soon as possible.
    # Running them in the current coroutine would also be a problem: if one
    # callback switches to another coroutine, control can never come back to
    # this coroutine to run the remaining callbacks.
    # When the callbacks run on the loop, that is, in the hub, control always
    # returns to the hub sooner or later even if a callback switches to
    # another coroutine, so all callbacks are guaranteed to run.
    def _report_result(self, result):
        self._exception = None
        self.value = result
        if self._links and not self._notifier:
            self._notifier = self.parent.loop.run_callback(self._notify_links)

As for why it takes such a roundabout route, the comment above should make it clear.


Now let's analyze the implementation of sleep. The code is as follows:

    # The main purpose of sleep is to switch away from the current greenlet
    # and return to the hub's main loop
    def sleep(seconds=0, ref=True):
        """Put the current greenlet to sleep for at least *seconds*.

        *seconds* may be specified as an integer, or a float if fractional
        seconds are desired.

        If *ref* is false, the greenlet running sleep() will not prevent
        gevent.wait() from exiting.
        """
        hub = get_hub()  # obtain the hub object
        loop = hub.loop  # obtain the hub's loop object
        if seconds <= 0:  # no waiting time
            # create a Waiter, which manages the switching between
            # the greenlet and the hub
            waiter = Waiter()
            # register a callback on the loop; it resumes the greenlet
            # that is currently sleeping
            loop.run_callback(waiter.switch)
            # record the current greenlet, then switch to the hub
            waiter.get()
        else:
            hub.wait(loop.timer(seconds, ref=ref))  # wait with a timer

So there are two cases, depending on the seconds value passed to sleep: less than or equal to 0, or greater than 0.

In a multithreaded environment, Java's sleep for example, sleep blocks the current thread so that the JVM schedules other threads. In gevent, sleep can be understood as the current coroutine voluntarily giving up the CPU and being resumed later.

First, the case where the timeout is less than or equal to zero. The idea is simple: register a callback on the hub's loop that switches back to the current coroutine, then switch to the hub coroutine.


Note that the switch does not appear directly in the code. A Waiter object is used instead: the callback registered on the loop is the Waiter's switch method, and then the Waiter's get method is called.


gevent's own docstring explains this. A Waiter can be understood as gevent's wrapper for cooperation between coroutines: the actual switching is done by the Waiter, so that user code does not have to call switch itself, which is error-prone. Here is the definition of Waiter:

    # This object only manages the switching relationship between a greenlet
    # and the hub.
    # The Waiter's switch method is registered with the hub as a callback,
    # and that callback is then executed in the hub's loop.
    class Waiter(object):
        """A low level communication utility for greenlets.

        Wrapper around greenlet's ``switch()`` and ``throw()`` calls that makes them somewhat safer:

        * switching will occur only if the waiting greenlet is executing :meth:`get` method currently;
        * any error raised in the greenlet is handled inside :meth:`switch` and :meth:`throw`
        * if :meth:`switch`/:meth:`throw` is called before the receiver calls :meth:`get`, then :class:`Waiter`
          will store the value/exception. The following :meth:`get` will return the value/raise the exception.

        The :meth:`switch` and :meth:`throw` methods must only be called from the :class:`Hub` greenlet.
        The :meth:`get` method must be called from a greenlet other than :class:`Hub`.

            >>> result = Waiter()
            >>> timer = get_hub().loop.timer(0.1)
            >>> timer.start(result.switch, 'hello from Waiter')
            >>> result.get() # blocks for 0.1 seconds
            'hello from Waiter'

        If switch is called before the greenlet gets a chance to call :meth:`get` then
        :class:`Waiter` stores the value.

            >>> result = Waiter()
            >>> timer = get_hub().loop.timer(0.1)
            >>> timer.start(result.switch, 'hi from Waiter')
            >>> sleep(0.2)
            >>> result.get() # returns immediatelly without blocking
            'hi from Waiter'

        .. warning::

            This a limited and dangerous way to communicate between greenlets. It can easily
            leave a greenlet unscheduled forever if used incorrectly. Consider using safer
            :class:`Event`/:class:`AsyncResult`/:class:`Queue` classes.
        """

        __slots__ = ['hub', 'greenlet', 'value', '_exception']

        def __init__(self, hub=None):
            if hub is None:
                self.hub = get_hub()  # get the top-level hub object
            else:
                self.hub = hub
            self.greenlet = None
            self.value = None
            self._exception = _NONE

        def clear(self):
            self.greenlet = None
            self.value = None
            self._exception = _NONE

        def __str__(self):
            if self._exception is _NONE:
                return '<%s greenlet=%s>' % (type(self).__name__, self.greenlet)
            elif self._exception is None:
                return '<%s greenlet=%s value=%r>' % (type(self).__name__, self.greenlet, self.value)
            else:
                return '<%s greenlet=%s exc_info=%r>' % (type(self).__name__, self.greenlet, self.exc_info)

        def ready(self):
            """Return true if and only if it holds a value or an exception"""
            return self._exception is not _NONE

        def successful(self):
            """Return true if and only if it is ready and holds a value"""
            return self._exception is None

        @property
        def exc_info(self):
            "Holds the exception info passed to :meth:`throw` if :meth:`throw` was called. Otherwise ``None``."
            if self._exception is not _NONE:
                return self._exception

        # Schedule the greenlet for execution. This method may only run in the hub's loop.
        def switch(self, value=None):
            """Switch to the greenlet if one's available. Otherwise store the value."""
            greenlet = self.greenlet
            if greenlet is None:
                self.value = value
                self._exception = None
            else:
                # switch may only be called from the hub
                assert getcurrent() is self.hub, "Can only use Waiter.switch method from the Hub greenlet"
                switch = greenlet.switch
                try:
                    switch(value)  # resume the recorded greenlet
                except:
                    self.hub.handle_error(switch, *sys.exc_info())

        def switch_args(self, *args):
            return self.switch(args)

        def throw(self, *throw_args):
            """Switch to the greenlet with the exception. If there's no greenlet, store the exception."""
            greenlet = self.greenlet
            if greenlet is None:
                self._exception = throw_args
            else:
                assert getcurrent() is self.hub, "Can only use Waiter.switch method from the Hub greenlet"
                throw = greenlet.throw
                try:
                    throw(*throw_args)
                except:
                    self.hub.handle_error(throw, *sys.exc_info())

        # The main job of this method is to record the current greenlet
        def get(self):
            """If a value/an exception is stored, return/raise it. Otherwise until switch() or throw() is called."""
            if self._exception is not _NONE:
                if self._exception is None:
                    return self.value
                else:
                    getcurrent().throw(*self._exception)
            else:
                assert self.greenlet is None, 'This Waiter is already used by %r' % (self.greenlet, )
                # record the current greenlet; the hub's loop will call this
                # Waiter's switch callback, which resumes the greenlet
                self.greenlet = getcurrent()
                try:
                    # switch to the hub; this greenlet is paused here and
                    # continues when it is switched back to
                    return self.hub.switch()
                finally:
                    self.greenlet = None

        def __call__(self, source):
            if source.exception is None:
                self.switch(source.value)
            else:
                self.throw(source.exception)

        # can also have a debugging version, that wraps the value in a tuple (self, value) in switch()
        # and unwraps it in wait() thus checking that switch() was indeed called

This code should be easy to follow with the comments. The important part is the get method: it records the currently running coroutine and switches to the hub. The switch method then switches back to the recorded coroutine.
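One detail of Waiter that is easy to miss is how it tells "nothing stored yet" apart from "a value was stored" by using the _NONE sentinel. The sketch below isolates only that bookkeeping. ResultSlot, store_value and store_exception are hypothetical names, and no greenlet switching is involved.

```python
_NONE = object()  # sentinel meaning "nothing has been stored yet"

class ResultSlot:
    """Hypothetical helper that keeps only the state tracking of Waiter."""

    def __init__(self):
        self.value = None
        self._exception = _NONE

    def store_value(self, value):
        # What Waiter.switch does when no greenlet is waiting yet.
        self.value = value
        self._exception = None

    def store_exception(self, *exc_info):
        # What Waiter.throw does when no greenlet is waiting yet.
        self._exception = exc_info

    def ready(self):
        # True once either a value or an exception has been stored.
        return self._exception is not _NONE

    def successful(self):
        # True only when a value (possibly None) has been stored.
        return self._exception is None

empty = ResultSlot()

ok = ResultSlot()
ok.store_value(None)  # a stored None is still a result

failed = ResultSlot()
failed.store_exception(ValueError, ValueError("boom"), None)

states = [(slot.ready(), slot.successful()) for slot in (empty, ok, failed)]
print(states)
```

Because the sentinel is a distinct object, a stored value of None counts as ready, which a plain None default could not express.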


That covers the implementation of sleep without a timeout.

Next, the case with a timeout:

    hub.wait(loop.timer(seconds, ref=ref))  # wait with a timer

A timer object is created first, which can be understood as registering a timeout on the loop. Then look at the code of wait:

    # Register the watcher on the loop and wait for it
    def wait(self, watcher):
        waiter = Waiter()  # first create a Waiter object
        unique = object()
        # when the watcher times out, the Waiter's switch method is called
        watcher.start(waiter.switch, unique)
        try:
            # call the Waiter's get method: it records the greenlet that
            # called sleep and then switches to the hub
            result = waiter.get()
            assert result is unique, 'Invalid switch into %s: %r (expected %r)' % (getcurrent(), result, unique)
        finally:
            watcher.stop()

As before, a Waiter is created and its get method is called. The difference is that the Waiter's switch callback is registered on the newly created timer object rather than directly on the loop, so it is called when the timer expires and resumes the coroutine that called sleep.
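The two branches of sleep can be compared side by side in a toy loop. This is a sketch under assumptions, not gevent's loop: ToyLoop, sleep_request and spawn are hypothetical, generators stand in for greenlets, a yielded number stands for a sleep(seconds) call, and the clock is virtual so nothing really waits.

```python
import heapq
from collections import deque

class ToyLoop:
    """Hypothetical loop with a callback queue and a timer heap."""
    def __init__(self):
        self.now = 0.0  # virtual clock
        self._callbacks = deque()
        self._timers = []  # heap of (deadline, sequence, func)
        self._seq = 0

    def run_callback(self, func):
        self._callbacks.append(func)

    def timer(self, seconds, func):
        self._seq += 1
        heapq.heappush(self._timers, (self.now + seconds, self._seq, func))

    def run(self):
        while self._callbacks or self._timers:
            while self._callbacks:
                self._callbacks.popleft()()
            if self._timers:
                deadline, _, func = heapq.heappop(self._timers)
                self.now = deadline  # jump the clock to the next timeout
                func()

def sleep_request(loop, seconds, resume):
    # Mirrors the two branches of sleep(): plain callback or timer.
    if seconds <= 0:
        loop.run_callback(resume)
    else:
        loop.timer(seconds, resume)

def spawn(loop, func, *args):
    gen = func(*args)

    def resume():
        try:
            seconds = next(gen)  # run until the task "sleeps"
        except StopIteration:
            return
        sleep_request(loop, seconds, resume)

    loop.run_callback(resume)

log = []

def task(name, delay):
    log.append((name, "start", loop.now))
    yield delay
    log.append((name, "woke", loop.now))

loop = ToyLoop()
spawn(loop, task, "slow", 2.0)
spawn(loop, task, "fast", 0.5)
spawn(loop, task, "yield", 0)
loop.run()
print(log)
```

The task that sleeps for 0 is resumed from the callback queue in the same pass, while the other two are resumed only when their timers fire, in deadline order.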


That covers the general relationship between gevent's coroutines and how they switch.
