A detailed explanation of how Tornado coroutines work

How do coroutines work in Tornado?

Coroutine Definition
Coroutines are computer program components that generalize subroutines for nonpreemptive multitasking, by allowing multiple entry points for suspending and resuming execution at certain locations. -- [Wikipedia]

In everyday programming we are more used to subroutines, commonly called functions or procedures. A subroutine usually has a single entry (the function call, where the actual arguments are bound to the formal parameters and execution starts) and a single exit (the function returns, finishes, or raises an exception, and control goes back to the caller). A coroutine is a program component that generalizes the subroutine (a subroutine can be seen as a special case of a coroutine). It can have multiple entry points: execution can pause after one entry point and before the next, the execution state is saved, and at the right time the state is restored and execution resumes from the next entry point. This is what coroutines make possible.

Definition

Coroutine code block

The code between one entry point and the next entry point (or the exit point).

Coroutine Module

Consists of n entry points and n coroutine code blocks. The first entry point is usually the function entry point. It is organized as follows: function entry point -> coroutine code block -> entry point -> coroutine code block -> ..., with entry points and code blocks alternating.

Linear Module

The function body of a synchronous function is linearly executed. That is to say, each line of code in a module is executed successively. If the execution of a module is not completed, the code of other modules will not be executed. This code module is called a linear module.

If a coroutine module contains only a single entry point and a single coroutine code block (assuming that the block consists entirely of synchronous code), then it is of course a linear module. If it has multiple entry points and multiple coroutine code blocks, it is not. The execution of a coroutine module is scattered (its coroutine code blocks run in different time periods that do not overlap), but it is also ordered (a coroutine code block runs only after the previous one has finished). In the gap between two successive coroutine code blocks of the same coroutine module, many code blocks belonging to other coroutine modules may run.
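The "scattered but ordered" behaviour can be sketched with two plain Python generators. This is only an illustration: the names module, round_robin and trace are hypothetical, and the round-robin loop is a stand-in for a real scheduler.

```python
trace = []

def module(name, steps):
    """A toy coroutine module: each yield marks the next entry point."""
    for i in range(steps):
        trace.append(f"{name}-block{i}")  # one coroutine code block
        yield                             # pause here; resume on the next call

def round_robin(generators):
    """Resume each generator in turn until all of them are exhausted."""
    pending = list(generators)
    while pending:
        for gen in list(pending):
            try:
                next(gen)
            except StopIteration:
                pending.remove(gen)

round_robin([module("A", 2), module("B", 2)])
print(trace)  # ['A-block0', 'B-block0', 'A-block1', 'B-block1']
```

The blocks of module A still run in order, but a block of module B runs in the gap between them.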

Generator and yield Semantics

When talking about coroutines in Python, we have to talk about generators.

PEP 255 introduced "simple generators" and the "yield statement" (not yet the "yield expression"). The basic idea is to provide a kind of function that can return an intermediate result to its caller while keeping its local state, so that the function can be resumed after it has been left.

PEP 255 gives a simple example that generates the Fibonacci series:

    def fib():
        a, b = 0, 1
        while 1:
            yield b
            a, b = b, a+b

a and b are initialized to 0 and 1. When yield b is executed, 1 is returned to the caller. When fib resumes, a becomes 1 and b is also 1, and 1 is returned to the caller again. A generator is a natural way to write this: from fib's point of view its job never changes, it just keeps producing the next Fibonacci number. From the caller's point of view, fib behaves like an iterator over a list that keeps handing out the next Fibonacci number.

    def caller():
        for num in fib():
            print(num)
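Because fib never finishes, caller as written loops forever. A bounded variant makes the behaviour easier to check. The sketch below uses itertools.islice to take only the first six values; the name first_six is hypothetical.

```python
from itertools import islice

def fib():
    a, b = 0, 1
    while True:
        yield b              # hand the current value to the caller and pause
        a, b = b, a + b      # runs only when the caller asks for the next value

# islice stops pulling from the infinite generator after six items.
first_six = list(islice(fib(), 6))
print(first_six)  # [1, 1, 2, 3, 5, 8]
```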

A function that contains a yield expression is called a generator function. Calling a generator function never blocks, even if its body contains blocking code: the call only binds the arguments and returns a generator object of type types.GeneratorType, and none of the code in the function body is executed.

Each time the next method of the generator object is called, the generator function runs until it reaches the next yield statement, hits a return statement, or runs off the end of the function body.
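A small experiment makes both points visible: calling the generator function runs none of its body, and each next call advances it to the next yield. The names countdown and log are hypothetical.

```python
import types

log = []

def countdown(n):
    log.append("body started")
    while n > 0:
        yield n
        n -= 1
    log.append("body finished")

gen = countdown(2)                      # only creates the generator object
is_generator = isinstance(gen, types.GeneratorType)
ran_on_call = bool(log)                 # False: the body has not started yet

first = next(gen)                       # runs up to the first yield
second = next(gen)                      # resumes and runs to the second yield
try:
    next(gen)                           # runs off the end of the function body
    exhausted = False
except StopIteration:
    exhausted = True

print(is_generator, ran_on_call, first, second, exhausted, log)
```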

PEP 342 enhanced generators further by adding the send method to generator objects and giving yield expression semantics, so that yield can appear on the right-hand side of an assignment. Calling send(None) on a newly created generator starts the generator function from the beginning and runs it up to the first yield expression. The next call, send(argument), resumes the generator, and argument becomes the value of the yield expression inside the generator function.
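The send semantics can be tried out with a running total. This is a minimal sketch; accumulator and the variable names are hypothetical.

```python
def accumulator():
    total = 0
    while True:
        # The yield expression hands total to the caller, then evaluates
        # to whatever the caller passes to send().
        value = yield total
        total += value

acc = accumulator()
start = acc.send(None)   # starts the generator, runs to the first yield
after_5 = acc.send(5)    # value becomes 5 inside the generator
after_10 = acc.send(10)  # value becomes 10, total is now 15

print(start, after_5, after_10)  # 0 5 15
```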

As shown above, a generator already has some of the capabilities of a coroutine: its execution can be paused, its state is saved, it can be resumed, and calling it does not block.

However, a generator is not yet a full coroutine. A true coroutine can control which code runs next when it gives up control, whereas a generator that reaches a yield expression or statement can only hand control back to its caller.

However, it is still possible to implement coroutines on top of a generator facility, with the aid of a top-level dispatcher routine (a trampoline, essentially) that passes control explicitly to child generators identified by tokens passed back from the generators. -- [Wikipedia]

As Wikipedia says, a top-level dispatcher routine can be implemented to hand control back to the generator so that it continues execution. In Tornado, IOLoop is that top-level dispatcher. Each coroutine module communicates with IOLoop through the coroutine function decorator, so that when a coroutine module pauses, IOLoop can reschedule it at the right time.
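The trampoline idea from the quote can be sketched in a few lines. Here each generator yields a token naming the generator that should run next, and a top-level dispatcher passes control accordingly. This is not how IOLoop is implemented; the names ping, pong, trampoline and the string tokens are all hypothetical.

```python
trace = []

def ping(limit):
    for i in range(limit):
        trace.append(f"ping {i}")
        yield "pong"          # token: ask the dispatcher to run "pong" next

def pong(limit):
    for i in range(limit):
        trace.append(f"pong {i}")
        yield "ping"          # token: ask the dispatcher to run "ping" next

def trampoline(tasks, start):
    """Top-level dispatcher: resume whichever generator the last token named."""
    current = start
    while True:
        try:
            current = next(tasks[current])
        except StopIteration:
            return            # stop as soon as one generator finishes

trampoline({"ping": ping(2), "pong": pong(2)}, "ping")
print(trace)  # ['ping 0', 'pong 0', 'ping 1', 'pong 1']
```

Each generator still only returns control to its caller, but because the caller is always the dispatcher, the pair behaves as if control passed directly from one to the other.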

However, we cannot introduce coroutine and IOLoop just yet. Before doing so, we must first understand Future, a very important class in Tornado's coroutine environment.

Future Type

The Future class lives in the tornado.concurrent module. For its complete code, see the Tornado source; only the part needed for the analysis is shown here.

    class Future(object):
        def done(self):
            return self._done

        def result(self, timeout=None):
            self._clear_tb_log()
            if self._result is not None:
                return self._result
            if self._exc_info is not None:
                raise_exc_info(self._exc_info)
            self._check_done()
            return self._result

        def add_done_callback(self, fn):
            if self._done:
                fn(self)
            else:
                self._callbacks.append(fn)

        def set_result(self, result):
            self._result = result
            self._set_done()

        def _set_done(self):
            self._done = True
            for cb in self._callbacks:
                try:
                    cb(self)
                except Exception:
                    app_log.exception('exception calling callback %r for %r',
                                      cb, self)
            self._callbacks = None

Important Future member functions:

def done(self):

Returns whether the Future has completed, that is, whether its result has been set.

def result(self, timeout=None):

Returns the result of the Future object. If an exception was stored instead, it is re-raised.

def add_done_callback(self, fn):

Adds a callback function fn to the Future object. If the Future is already done, fn is executed immediately; otherwise fn is appended to the Future's callback list.

def _set_done(self):

An internal function that traverses the callback list and calls each callback in turn, that is, the functions previously registered with add_done_callback.

def set_result(self, result):

Sets the result of the Future object and calls _set_done. In other words, once the Future has its result, all callbacks registered through add_done_callback are executed.

Future encapsulates the result of an asynchronous operation. It is similar to the placeholder that an HTML page shows while an image loads asynchronously: once loading finishes, the placeholder becomes the complete image. A Future works the same way. Tornado hands it out expecting that set_result will eventually be called on it, which in turn runs the registered callbacks. The Future object is the channel through which the coroutine decorator and IOLoop communicate, so it plays a very important role.
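The callback behaviour described above can be tried without installing Tornado. The standard library's concurrent.futures.Future exposes methods with the same names (done, result, add_done_callback, set_result), so the sketch below uses it as a stand-in. That substitution is an assumption made for illustration: it is a different class from Tornado's Future, and the names events and future are hypothetical.

```python
from concurrent.futures import Future

events = []
future = Future()

# Registered before the result exists: stored until set_result is called.
future.add_done_callback(lambda f: events.append(("early", f.result())))
done_before = future.done()

future.set_result(5)          # marks the future done and runs stored callbacks
done_after = future.done()

# Registered after completion: executed immediately.
future.add_done_callback(lambda f: events.append(("late", f.result())))

print(done_before, done_after, events)
# False True [('early', 5), ('late', 5)]
```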

IOLoop class

IOLoop, the core low-level class of the Tornado framework, lives in the tornado.ioloop module. It is similar to the message loop of a Win32 window. Each window can be bound to a window procedure, and the program runs a message loop whose main job is to pull messages from the message queue with the PeekMessage API, determine the type of each message, and hand it to the corresponding message handler.

IOLoop in Tornado is very similar. It acts as the coroutine scheduler of the coroutine runtime. In essence it is an event loop: it waits for events and then runs the corresponding event handler. IOLoop mainly schedules I/O events (such as read, write, and error), but it can also schedule callback and timeout events.

In this post we focus only on callback events, because they are the most relevant to coroutine scheduling.

    def add_future(self, future, callback):
        assert is_future(future)
        callback = stack_context.wrap(callback)
        future.add_done_callback(
            lambda future: self.add_callback(callback, future))

The add_future function is implemented in the base class IOLoop. It takes a Future object and a callback function. When set_result is called on the Future, a done callback runs: a lambda that calls IOLoop's add_callback. This hands the callback passed to add_future over to IOLoop's scheduling, so that it is executed in the next iteration of IOLoop.

    def add_callback(self, callback, *args, **kwargs):
        with self._callback_lock:
            if self._closing:
                raise RuntimeError("IOLoop is closing")
            list_empty = not self._callbacks
            self._callbacks.append(functools.partial(
                stack_context.wrap(callback), *args, **kwargs))
            if list_empty and thread.get_ident() != self._thread_ident:
                self._waker.wake()

The add_callback function is implemented in PollIOLoop, a subclass of IOLoop, and is easy to follow.

It binds the callback function and its arguments into a partial function with functools.partial, which is really just a place to store the arguments the callback will need. The partial function is then appended to the callback list. In its next iteration, IOLoop traverses the callback list and calls each partial function without arguments, which is equivalent to calling the original callback with its real arguments.
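The partial-function trick is easy to reproduce in isolation. The sketch below is a toy, not Tornado's implementation: TinyLoop and run_once are hypothetical names, and locking, stack contexts and the waker are left out.

```python
import functools

class TinyLoop:
    """Hypothetical stand-in for IOLoop that only keeps a callback list."""

    def __init__(self):
        self._callbacks = []

    def add_callback(self, callback, *args, **kwargs):
        # Store the callback together with its arguments for later.
        self._callbacks.append(functools.partial(callback, *args, **kwargs))

    def run_once(self):
        # Take the current list so callbacks added while running
        # are left for the next iteration.
        callbacks, self._callbacks = self._callbacks, []
        for cb in callbacks:
            cb()  # no arguments needed: the partial already holds them

results = []
loop = TinyLoop()
loop.add_callback(lambda a, b: results.append(a + b), 2, 3)
queued_only = list(results)   # still empty: nothing has run yet
loop.run_once()

print(queued_only, results)  # [] [5]
```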

An IOLoop object runs its event loop when start is called. Each iteration of the event loop traverses the callback list and executes the callbacks, then traverses the timeout list and executes the timeout callbacks, and finally runs the I/O handlers.

Coroutine function decorator

A function decorator is itself a function. It takes a function object (a callable) as its parameter and returns a new function object defined inside the decorator. When the returned function object is called, it in turn calls the decorator's parameter (the original function), but it can do some work before or after that call. This extra work is the "decoration" that the inner function object adds around the original function.

Once a function has been decorated, its name refers to a different function: calling it actually calls the inner function object returned by the decorator. So understanding how a coroutine-decorated function executes in Tornado mainly means understanding what the new function object defined inside the coroutine function does.
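A minimal decorator shows the mechanism before looking at Tornado's own. The names logged, wrapper, add and calls are hypothetical and have nothing to do with Tornado.

```python
import functools

calls = []

def logged(func):
    """Decorator: returns a new function that wraps the original one."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        calls.append(f"before {func.__name__}")   # work done before the call
        result = func(*args, **kwargs)            # call the original function
        calls.append(f"after {func.__name__}")    # work done after the call
        return result
    return wrapper

@logged
def add(a, b):
    return a + b

total = add(2, 3)   # actually calls wrapper, which calls the original add
print(total, calls)  # 5 ['before add', 'after add']
```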

    def coroutine(func, replace_callback=True):
        return _make_coroutine_wrapper(func, replace_callback=True)

    class Runner(object):
        def __init__(self, gen, result_future, first_yielded):
            self.gen = gen
            self.result_future = result_future
            self.future = _null_future
            self.yield_point = None
            self.pending_callbacks = None
            self.results = None
            self.running = False
            self.finished = False
            self.had_exception = False
            self.io_loop = IOLoop.current()
            self.stack_context_deactivate = None
            if self.handle_yield(first_yielded):
                self.run()

        def run(self):
            if self.running or self.finished:
                return
            try:
                self.running = True
                while True:
                    future = self.future
                    if not future.done():
                        return
                    self.future = None
                    try:
                        try:
                            value = future.result()
                        except Exception:
                            self.had_exception = True
                            yielded = self.gen.throw(*sys.exc_info())
                        else:
                            yielded = self.gen.send(value)
                    except (StopIteration, Return) as e:
                        self.finished = True
                        self.future = _null_future
                        self.result_future.set_result(getattr(e, 'value', None))
                        self.result_future = None
                        return
                    except Exception:
                        self.finished = True
                        self.future = _null_future
                        self.result_future.set_exc_info(sys.exc_info())
                        self.result_future = None
                        return
                    if not self.handle_yield(yielded):
                        return
            finally:
                self.running = False

        def handle_yield(self, yielded):
            try:
                self.future = convert_yielded(yielded)
            except BadYieldError:
                self.future = TracebackFuture()
                self.future.set_exc_info(sys.exc_info())
            if not self.future.done() or self.future is moment:
                self.io_loop.add_future(
                    self.future, lambda f: self.run())
                return False
            return True

The code above is a simplified version of the source. By the time a call reaches the Runner constructor, the generator has already been run for the first time. handle_yield is then called to process the value yielded by that first run. The yielded value may be of several types: a Future object, a list, a dict, or some other kind of object. convert_yielded turns it into a Future, and self.future keeps a reference to that Future. If self.future has not had set_result called on it yet, a done callback (lambda f: self.run()) is registered for it through self.io_loop.add_future.

As described above, add_future hands its callback parameter to IOLoop for scheduling only after set_result has been called on its future parameter. In other words, self.run in the Runner class waits until some piece of code calls set_result on self.future; only then does IOLoop execute it, in a following iteration, and the coroutine is resumed. In self.run we can see that the next coroutine code block is resumed through the generator's send method. So the key question is: when does self.future in the Runner class get its set_result call?

This is where the importance of the Future class shows. The role of future.set_result is:

To send a signal that tells IOLoop to schedule the suspended coroutine so that it continues execution.

We can use the following code example to understand how the whole process of coroutine scheduling is implemented.

    import tornado.ioloop
    from tornado.gen import coroutine
    from tornado.concurrent import Future

    @coroutine
    def asyn_sum(a, b):
        print("begin calculate:sum %d+%d"%(a,b))
        future = Future()

        def callback(a, b):
            print("calculating the sum of %d+%d:"%(a,b))
            future.set_result(a+b)
        tornado.ioloop.IOLoop.instance().add_callback(callback, a, b)

        result = yield future

        print("after yielded")
        print("the %d+%d=%d"%(a, b, result))

    def main():
        asyn_sum(2,3)
        tornado.ioloop.IOLoop.instance().start()

    if __name__ == "__main__":
        main()

What actually happens at run time: when the coroutine (asyn_sum) reaches the yield expression, it is suspended, and IOLoop runs another piece of code (callback, defined inside asyn_sum). callback can access the future object of the suspended coroutine (the same object referenced by self.future in the Runner). It calls set_result on that future, so the suspended coroutine (asyn_sum) is resumed when IOLoop runs its callbacks in the next iteration.
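The same flow can be reproduced without Tornado to see every moving part in one place. The sketch below is a heavily simplified model, not Tornado's code: TinyLoop, TinyFuture, the coroutine decorator and step are hypothetical stand-ins for IOLoop, Future, tornado.gen.coroutine and Runner.run, with error handling and I/O left out.

```python
import functools

class TinyFuture:
    """Hypothetical stand-in for Future: a result plus done callbacks."""
    def __init__(self):
        self._done, self._result, self._callbacks = False, None, []

    def result(self):
        return self._result

    def done(self):
        return self._done

    def add_done_callback(self, fn):
        if self._done:
            fn(self)
        else:
            self._callbacks.append(fn)

    def set_result(self, result):
        self._result, self._done = result, True
        for cb in self._callbacks:
            cb(self)
        self._callbacks = []

class TinyLoop:
    """Hypothetical stand-in for IOLoop: only a callback queue."""
    def __init__(self):
        self._callbacks = []

    def add_callback(self, callback, *args):
        self._callbacks.append(functools.partial(callback, *args))

    def add_future(self, future, callback):
        # Queue callback for a later iteration once the future completes.
        future.add_done_callback(lambda f: self.add_callback(callback, f))

    def run_until_idle(self):
        while self._callbacks:
            callbacks, self._callbacks = self._callbacks, []
            for cb in callbacks:
                cb()

loop = TinyLoop()

def coroutine(func):
    """Simplified decorator: drives the generator one code block at a time."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result_future = TinyFuture()
        gen = func(*args, **kwargs)

        def step(value=None):
            try:
                yielded = gen.send(value)          # run the next code block
            except StopIteration as e:
                result_future.set_result(e.value)  # coroutine finished
                return
            # Resume only after the yielded future receives its result.
            loop.add_future(yielded, lambda f: step(f.result()))

        step()
        return result_future
    return wrapper

trace = []

@coroutine
def async_sum(a, b):
    future = TinyFuture()
    loop.add_callback(lambda: future.set_result(a + b))
    trace.append("before yield")
    result = yield future
    trace.append("after yield")
    return result

outcome = async_sum(2, 3)
pending = not outcome.done()   # True: the coroutine is suspended at yield
loop.run_until_idle()
print(pending, outcome.result(), trace)
```

The first loop iteration runs the callback that sets the future's result, and the second iteration resumes the generator, which mirrors the two-step hand-off described above.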

Summary

Coroutines in Tornado are built on Python generators combined with a global scheduler, IOLoop. The generator communicates with IOLoop through the coroutine function decorator. IOLoop cannot directly resume a paused coroutine on its own. Instead, the coroutine yields a future object; while the coroutine is paused, IOLoop schedules another piece of code, which can access that future object and call set_result on it, and the coroutine is then resumed indirectly through IOLoop. The different pieces of code share the future object and cooperate with each other, which is what keeps coroutine scheduling running smoothly.

In this sense, a future object plays a role similar to the Event kernel object in Windows. An Event in Windows is used for synchronization between threads. In a coroutine, yield future is comparable to WaitForSingleObject(event_object), and future.set_result(result) is comparable to SetEvent(event_object). The difference is that a coroutine uses a future to resume its own execution, whereas threads use an Event to synchronize with one another.
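For readers without a Windows background, the thread side of the analogy can be sketched with Python's threading.Event, whose wait and set methods play roughly the roles of WaitForSingleObject and SetEvent. This mapping is an assumption made for illustration, and worker and results are hypothetical names.

```python
import threading

event = threading.Event()
results = []

def worker():
    event.wait()                        # block until another thread signals
    results.append("worker resumed")

thread = threading.Thread(target=worker)
thread.start()

results.append("main signals")
event.set()                             # wake the waiting thread
thread.join()

print(results)  # ['main signals', 'worker resumed']
```

Here the waiting thread is blocked by the operating system, whereas a coroutine waiting on a future simply is not scheduled until set_result is called.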
