[gevent source code analysis] gevent's two workhorses: libev and greenlet

Source: Internet
Author: User

This article discusses how gevent's two workhorses, libev and greenlet, work together.

gevent's event loop is built on libev. Let's first look at how to use the event loop on its own, without the rest of gevent.

    # coding=utf8
    import socket
    import gevent
    from gevent.core import loop

    def f():
        s, address = sock.accept()
        print(address)
        s.send(b"hello world\r\n")

    loop = loop()
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("localhost", 8000))  # bind takes a single (host, port) tuple
    sock.listen(10)
    io = loop.io(sock.fileno(), 1)  # 1 means "read"
    io.start(f)
    loop.run()
The code is simple: core.loop creates a new loop instance, loop.io adds a read watcher for the socket, start sets the callback, and run starts the event loop. That is a complete hello-world server; run telnet localhost 8000 to see the response.
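
For comparison, the same "register a read event, attach a callback, run the loop" pattern can be sketched with Python's standard-library selectors module. This is not libev or gevent.core; the names on_readable and run_once are made up for the illustration, and a socketpair stands in for a real client connection so the sketch runs on its own.

```python
import selectors
import socket

sel = selectors.DefaultSelector()
received = []

def on_readable(sock):
    """Callback run by the loop when sock has data to read."""
    received.append(sock.recv(1024))
    sock.sendall(b"hello world\r\n")

def run_once(timeout=1.0):
    """One iteration of a tiny event loop: wait for events, run callbacks."""
    for key, _events in sel.select(timeout):
        callback = key.data
        callback(key.fileobj)

server_side, client_side = socket.socketpair()
# Register interest in "readable" and remember the callback, roughly what
# loop.io(fd, 1) followed by io.start(f) does in the gevent example.
sel.register(server_side, selectors.EVENT_READ, on_readable)

client_side.sendall(b"ping")      # makes server_side readable
run_once()                        # the loop notices and calls on_readable
reply = client_side.recv(1024)

sel.unregister(server_side)
sel.close()
server_side.close()
client_side.close()
```

The callback is stored as the data attached to the registration, so the loop itself stays generic: it only waits for events and calls whatever was registered.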

gevent's whole event loop is started in Hub.run:

    def run(self):
        assert self is getcurrent(), 'Do not call Hub.run() directly'
        while True:
            loop = self.loop
            loop.error_handler = self
            try:
                loop.run()
            finally:
                loop.error_handler = None  # break the refcount cycle
            self.parent.throw(LoopExit('This operation would block forever'))
The self.loop above is the same kind of loop object we created in the first example. Next, let's use the socket recv function to see how an event is registered with the loop.
gevent wraps the standard socket in its own socket object; the original socket is the self._sock seen below.
Let's look at what gevent's socket recv does.

gevent/socket.py

    def recv(self, *args):
        sock = self._sock  # keeping the reference so that fd is not closed during waiting
        while True:
            try:
                # 1. If data is already available on the socket, return it directly
                return sock.recv(*args)
            except error:
                # No data yet: an exception is raised and its errno is EWOULDBLOCK
                ex = sys.exc_info()[1]
                if ex.args[0] != EWOULDBLOCK or self.timeout == 0.0:
                    raise
                # QQQ without clearing exc_info test__refcount.test_clean_exit fails
                sys.exc_clear()
            # 2. Add the read event of the file descriptor to the loop
            self._wait(self._read_event)
            # self._wait calls hub.wait:
            #
            #     def wait(self, watcher):
            #         waiter = Waiter()
            #         unique = object()
            #         # watcher is the loop.io() instance mentioned above;
            #         # waiter.switch is the callback function
            #         watcher.start(waiter.switch, unique)
            #         try:
            #             result = waiter.get()
            #             assert result is unique, 'Invalid switch into %s: %r (expected %r)' % (getcurrent(), result, unique)
            #         finally:
            #             watcher.stop()
            #
            # When the loop sees a "readable" event it calls back waiter.switch,
            # and execution returns here. Because of the while loop,
            # sock.recv(*args) is executed again; this time data can generally
            # be read and is returned directly.
Here self._read_event is io(fileno, 1). After the wait, control returns to the top of the while loop and the result of sock.recv is returned directly. Note that sock.recv(1024) may return fewer than 1024 bytes, depending on how much data is in the receive buffer at that moment. The data may therefore not all be read in one call, and the read event may fire several times.
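
The try, wait, retry shape of this recv loop can be reproduced without gevent. The sketch below is only an illustration under stated assumptions: select.select stands in for self._wait(self._read_event), a socketpair stands in for a network connection, and recv_when_ready is a made-up helper name, not a gevent function.

```python
import errno
import select
import socket
import threading

def recv_when_ready(sock, bufsize, timeout=2.0):
    """Try recv; on "would block", wait until readable and try again."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return sock.recv(bufsize), attempts
        except socket.error as ex:
            # EWOULDBLOCK/EAGAIN mean "no data yet"; re-raise anything else.
            if ex.errno not in (errno.EWOULDBLOCK, errno.EAGAIN):
                raise
        # Stand-in for self._wait(self._read_event): block until readable.
        readable, _, _ = select.select([sock], [], [], timeout)
        if not readable:
            raise socket.timeout("no data within %s seconds" % timeout)

reader, writer = socket.socketpair()
reader.setblocking(False)
# Send a little later so the first recv usually finds the socket empty.
sender = threading.Timer(0.05, writer.sendall, [b"hello"])
sender.start()
data, attempts = recv_when_ready(reader, 1024)
sender.join()
reader.close()
writer.close()
```

In gevent the waiting step does not block the thread: it switches to the hub so that other greenlets can run until the loop reports the socket as readable.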

Because of this, data is read in several calls, and reading ends only when recv returns an empty string. Reading the complete data typically looks like this:

    buff = []
    while 1:
        s = sock.recv(1024)  # sock is a connected socket
        if not s:
            break
        else:
            buff.append(s)
    buff = "".join(buff)
You may be curious why gevent uses assert in several places to check the value returned by waiter.get(), for example in Hub.wait.

    class Hub(greenlet):

        def wait(self, watcher):
            waiter = Waiter()
            unique = object()
            watcher.start(waiter.switch, unique)
            try:
                result = waiter.get()
                assert result is unique, 'Invalid switch into %s: %r (expected %r)' % (getcurrent(), result, unique)
                # Why is the assert needed here?
                # In the normal case the loop calls waiter.switch(unique),
                # so waiter.get() must return unique.
                # If the result is not unique, waiter.switch was called from
                # somewhere else, which is abnormal.
            finally:
                watcher.stop()

The assert mainly guards against the callback being invoked by some other greenlet, because greenlets pass values through switch. Consider the following code:

    import gevent

    def f(t):
        gevent.sleep(t)

    p = gevent.spawn(f, 2)
    gevent.sleep(0)  # let p start; after 2 s libev would normally switch back into p
    switcher = gevent.spawn(p.switch, 'Hello')  # call p.switch ourselves, passing 'Hello'
    result = p.get()
The following exception is reported:

    AssertionError: Invalid switch into <...>: 'Hello' (expected <...>)
    <...> failed with AssertionError
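
The sentinel check can be tried out without gevent. In this minimal sketch a generator stands in for the waiting greenlet, and generator.send(value) plays the role of greenlet.switch(value); wait_for_event is a made-up name and the whole thing is only an analogy, not gevent's implementation.

```python
def wait_for_event():
    """Generator standing in for a greenlet blocked in Hub.wait()."""
    unique = object()
    # Hand the token to the "loop" and suspend. send() resumes us with a
    # value, much as greenlet.switch(value) does.
    result = yield unique
    if result is not unique:
        raise AssertionError('Invalid switch: %r (expected %r)' % (result, unique))
    yield 'resumed by the loop'

# Normal case: the loop sends back the very token it was given.
waiter = wait_for_event()
token = next(waiter)
normal = waiter.send(token)

# Abnormal case: some other code resumes the waiter with its own value.
waiter = wait_for_event()
next(waiter)
try:
    waiter.send('Hello')
    foreign = 'accepted'
except AssertionError:
    foreign = 'rejected'
```

Because unique is a fresh object() created inside the waiting code, no other caller can produce a value that passes the identity check.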
    

Now let's look at gevent's wrapper around greenlet:

    class Greenlet(greenlet):
        """A light-weight cooperatively-scheduled execution unit."""

        def __init__(self, run=None, *args, **kwargs):
            hub = get_hub()
            greenlet.__init__(self, parent=hub)
            if run is not None:
                self._run = run
We can see that the parent of every Greenlet is the hub. What is the advantage of this?
When a greenlet dies, control returns to its parent greenlet, that is, the hub, and the hub resumes the event loop from the point where the last callback ran. This is why the event loop runs in the hub.

Let's look at the life cycle of a Greenlet.

To start a Greenlet, call its start() method:

    def start(self):
        """Schedule the greenlet to run in this loop iteration"""
        if self._start_event is None:
            self._start_event = self.parent.loop.run_callback(self.switch)
That is, the Greenlet's own switch is added to the event loop as a callback. When the loop calls back self.switch, the run method (the entry point used by the underlying greenlet) is executed.

When subclassing, we can provide our own _run method.

    def run(self):
        try:
            if self._start_event is None:
                self._start_event = _dummy_event
            else:
                # Cancel the previously added callback; the loop removes it
                # from the callback chain.
                # libev watchers such as io and timer are wrapped as objects
                # with start and stop methods. A callback is different: it is
                # started through loop.run_callback.
                self._start_event.stop()
            try:
                result = self._run(*self.args, **self.kwargs)  # run the custom _run method
            except:
                self._report_error(sys.exc_info())
                return
            # Set the returned result. This is an important method; see below.
            self._report_result(result)
        finally:
            pass
If there is no exception, the _report_result method is called. Let's take a look:

    def _report_result(self, result):
        self._exception = None
        # Set the result, which can be obtained through get(). To obtain the
        # value, do not read .value directly; use get(), because get() returns
        # the real result, whereas when you read .value the Greenlet may not
        # have finished yet.
        self.value = result
        if self._links and not self._notifier:  # What is this? See get() below.
            self._notifier = self.parent.loop.run_callback(self._notify_links)
Why must the final result be obtained through get()? Because get() behaves like an asynchronous result: it is quite possible that we call Greenlet.get() before the Greenlet has produced a result, and if get() were not asynchronous it could not obtain the result. Let's look at how get() works:

    def get(self, block=True, timeout=None):
        """Return the result the greenlet has returned or re-raise the exception it has raised.

        If block is ``False``, raise :class:`gevent.Timeout` if the greenlet is still alive.
        If block is ``True``, unschedule the current greenlet until the result is available
        or the timeout expires. In the latter case, :class:`gevent.Timeout` is raised.
        """
        if self.ready():  # the Greenlet has already finished; return the result
            if self.successful():
                return self.value
            else:
                raise self._exception
        if block:  # the Greenlet has not finished yet
            switch = getcurrent().switch
            # Add the current greenlet's switch to this Greenlet's callback
            # chain, i.e. self._links.append(callback)
            self.rawlink(switch)
            try:
                t = Timeout.start_new(timeout)
                try:
                    # Switch to the hub. The current get() is now blocked; when
                    # the switch registered above is called back, we resume here.
                    # But we did not register the switch with the hub, so who
                    # calls it back? The answer is _report_result above: when
                    # the Greenlet ends, _report_result registers _notify_links
                    # as a loop callback, and _notify_links calls our switch:
                    #
                    #     def _notify_links(self):
                    #         while self._links:
                    #             link = self._links.popleft()
                    #             try:
                    #                 # self is passed to switch, so the result
                    #                 # is self (greenlets pass values through switch)
                    #                 link(self)
                    #             except:
                    #                 self.parent.handle_error((link, self), *sys.exc_info())
                    result = self.parent.switch()
                    # This is why result must be self
                    assert result is self, 'Invalid switch into Greenlet.get(): %r' % (result, )
                finally:
                    t.cancel()
            except:
                self.unlink(switch)
                raise
            # If we get here the Greenlet has finished, so self.ready() must be True
            if self.ready():
                if self.successful():
                    return self.value
                else:
                    raise self._exception
        else:
            # Not finished and we were told not to wait: there is no value to
            # return, so the only option is to raise
            raise Timeout
From the above we can see that get() returns the result asynchronously. When the Greenlet finishes, the result is delivered through the _report_result call at the end of run(), so _report_result is very important.
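
The interplay between rawlink, _report_result and _notify_links can be modelled in a few lines of plain Python. The classes below (TinyLoop, TinyResult) are hypothetical stand-ins written for this illustration and are not gevent classes; they only mimic the order in which things happen.

```python
from collections import deque

class TinyLoop(object):
    """Hypothetical stand-in for the hub's loop: a queue of callbacks."""
    def __init__(self):
        self._callbacks = deque()

    def run_callback(self, func):
        self._callbacks.append(func)

    def run(self):
        while self._callbacks:
            self._callbacks.popleft()()

class TinyResult(object):
    """Hypothetical result holder with links, loosely modelled on Greenlet."""
    def __init__(self, loop):
        self.loop = loop
        self.value = None
        self._done = False
        self._links = deque()

    def ready(self):
        return self._done

    def rawlink(self, callback):
        self._links.append(callback)

    def _report_result(self, result):
        self.value = result
        self._done = True
        if self._links:
            # Notification is deferred to the loop, not run inline.
            self.loop.run_callback(self._notify_links)

    def _notify_links(self):
        while self._links:
            link = self._links.popleft()
            link(self)  # each link receives the result holder itself

loop = TinyLoop()
res = TinyResult(loop)
seen = []
res.rawlink(seen.append)   # like get() registering getcurrent().switch
res._report_result(42)     # the "greenlet" finishes
before = list(seen)        # nothing has been delivered yet
loop.run()                 # the loop runs _notify_links
```

The waiter is only notified once the loop runs its callbacks, and what it receives is the result holder itself, which mirrors the assert result is self check in get().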

In fact, a Greenlet can also provide a switch_out method. In gevent, switch_out is the counterpart of switch: when execution switches into a Greenlet, its switch method is called; when execution switches away to the hub, the Greenlet's switch_out method is called. Together they let a Greenlet save and restore its state.

In gevent, backdoor.py (a backdoor that provides a Python interpreter) uses switch together with switch_out. Let's take a look.

    class SocketConsole(Greenlet):

        def switch(self, *args, **kw):
            self.saved = sys.stdin, sys.stderr, sys.stdout
            sys.stdin = sys.stdout = sys.stderr = self.desc
            Greenlet.switch(self, *args, **kw)

        def switch_out(self):
            sys.stdin, sys.stderr, sys.stdout = self.saved

This use of switch_out is neat. The interactive environment needs sys.stdin, sys.stdout and sys.stderr, so when switching into the Greenlet these three variables are replaced with our own socket descriptor, and when switching back to the hub they must be restored. They are therefore saved in switch and restored in switch_out. When switching to the hub, switch_out is called from the hub's switch:

    class Hub(greenlet):

        def switch(self):
            # The switch_out of the greenlet we are leaving is called first
            switch_out = getattr(getcurrent(), 'switch_out', None)
            if switch_out is not None:
                switch_out()
            return greenlet.switch(self)
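
The save-on-switch, restore-on-switch_out idea can be tried with plain objects. In this sketch ConsoleTask and switch_to_hub are invented names, an io.StringIO stands in for the socket descriptor, and only sys.stdout is swapped; it mimics the pattern rather than gevent's actual classes.

```python
import io
import sys

class ConsoleTask(object):
    """Hypothetical task that owns its own output stream while it runs."""
    def __init__(self):
        self.desc = io.StringIO()
        self.saved = None

    def switch(self):
        # Entering the task: remember the real stream, install ours.
        self.saved = sys.stdout
        sys.stdout = self.desc

    def switch_out(self):
        # Leaving the task: put the real stream back.
        sys.stdout = self.saved

def switch_to_hub(current):
    """Mimics Hub.switch: call the current task's switch_out if it has one."""
    switch_out = getattr(current, 'switch_out', None)
    if switch_out is not None:
        switch_out()

original = sys.stdout
task = ConsoleTask()
task.switch()
print("inside the task")            # goes to task.desc, not the terminal
inside = sys.stdout is task.desc
switch_to_hub(task)
restored = sys.stdout is original
captured = task.desc.getvalue()
```
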
You can start a Python backdoor interpreter with the following two lines; give it a try if you are interested.

    from gevent.backdoor import BackdoorServer
    BackdoorServer(('127.0.0.1', 9000)).serve_forever()

Connect with telnet and you can do whatever you want.

In gevent, almost every function has a timeout parameter, which is mainly implemented with libev's timer.

Usage:

The Timeout object has a pending attribute that tells whether the timeout is still scheduled and has not fired yet.

    import sys
    import time
    import gevent
    from gevent import Timeout

    t = Timeout(1)
    t.start()
    try:
        print('aaa')
        assert t.pending == True
        time.sleep(2)
        # Note that sleep(0) cannot be used here. sleep(0) does switch to the
        # hub and the timer has expired, but the callback registered by gevent
        # has a higher priority than the timer (the libev event loop runs
        # callbacks first, then timers).
        gevent.sleep(0.1)
    except Timeout as e:
        assert t.pending == False
        # Check that this is my timer. Same idea as the assert shown earlier:
        # it guards against something else switching into this greenlet.
        assert e is t
        print(sys.exc_info())
    finally:
        # Cancel the timer; this is safe whether or not it has fired
        t.cancel()
The Timeout object also supports the with statement:

    with Timeout(1) as t:
        assert t.pending
        gevent.sleep(0.5)
    assert not t.pending
The second parameter of Timeout lets you customize the exception. If it is False, the with block does not propagate any exception.

    with Timeout(1, False) as t:
        assert t.pending
        gevent.sleep(2)
    assert not sys.exc_info()[1]

We can see that no exception is thrown.

There is also a with_timeout shortcut:

    def f():
        import time
        time.sleep(2)
        gevent.sleep(0.1)  # gevent.sleep(0) cannot be used here
        print('fff')

    t = with_timeout(1, f, timeout_value=10)
    assert t == 10
Note that with_timeout must be given the timeout_value parameter for the Timeout exception not to be raised; in that case timeout_value is returned instead.


At this point we should be familiar with the lower layers of gevent. What we have not covered are the higher-level pieces of gevent, such as Event and Pool; those will be covered separately in the future.

Beyond that, I think the way libev is used still deserves attention, but that requires an in-depth analysis of core.pyx, the Cython extension that wraps libev, which in turn requires some Cython knowledge. I have been reading that source code recently and will share it later.

Why analyze the libev extension? Mainly because of some scheduled tasks in a game. gevent's current support for this is rather weak, whereas the timer provided by libev takes two parameters, after and repeat: after is how long to wait before the timer first fires, and repeat is the interval at which it fires again after that. This is exactly what I need.

The following is a simple scheduled-task script that I wrote. It is started through gfirefly and also provides a web interface.

    # coding: utf-8
    '''
    Created on
    @author: http://blog.csdn.net/yueguanghaidao
    '''
    import traceback
    import datetime
    from flask import request
    from gevent.hub import get_hub
    from gtwisted.utils import log
    from gfirefly.server.globalobject import webserviceHandle
    from app.models.role import Role

    '''
    Scheduled tasks: task name -> (start hour (0-24), interval), both in hours.
    The callback function is do_<name>.
    '''
    CRONTAB = {
        "energy": (0, 1),  # restore energy
        "god_surplustime": (0, 24),
        "arena_surplustime": (22, 24),
        "arena_rewrad": (21, 24),
        "sign_reset": (1, 24),
    }

    def log_except(fun):
        def wrapper(*args):
            try:
                log.msg(fun.__name__)
                return fun(*args)
            except:
                log.msg(traceback.format_exc())
        return wrapper

    class Task(object):
        """All scheduled tasks"""

        @classmethod
        @log_except
        def do_energy(cls):
            """Add 1 energy per hour (while energy is less than 8)"""
            Role.objects(energy__lt=8).update(inc__energy=1)

        @classmethod
        @log_except
        def do_god_surplustime(cls):
            """God of Wealth"""
            Role.objects(god__exists=True).update(set__god__surplustime=10)

    @webserviceHandle("/cron", methods=['GET', 'POST'])
    def cron():
        """Provides a web interface for triggering tasks"""
        action = request.args.get("action")
        if not action:
            return "action:\n\n" + "\n".join(a for a in CRONTAB)
        else:
            try:
                f = getattr(Task, "do_" + action)
                try:
                    f()
                except:
                    return traceback.format_exc()
                return "success"
            except AttributeError:
                return "action:\n\n" + "\n".join(a for a in CRONTAB)

    def timer(after, repeat):
        return get_hub().loop.timer(after, repeat)

    def run():
        log.msg("cron start")
        # Configure mongodb (config is the project's own module; its import
        # is not shown in the original)
        config.init_Mongo()
        for action, t in CRONTAB.items():
            log.msg("%s start" % action)
            f = getattr(Task, "do_" + action)
            now = datetime.datetime.now()
            other = now.replace(hour=t[0], minute=0, second=0)
            if other > now:
                after = (other - now).seconds
            else:
                after = 24 * 3600 - (now - other).seconds
            # after = t[0] * 3600
            timer(after, t[1] * 3600).start(f)

    run()
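
The only arithmetic in the script is the computation of after, the delay until the first run. Here is a minimal standalone sketch of that step; seconds_until is a made-up helper name and the date used is an arbitrary example. It gives the same result as the if/else in run(), written by rolling the target over to the next day.

```python
import datetime

def seconds_until(hour, now):
    """Seconds from now until the next time the clock reads hour:00:00."""
    target = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    if target <= now:
        # That time has already passed today, so aim for tomorrow.
        target += datetime.timedelta(days=1)
    return (target - now).total_seconds()

now = datetime.datetime(2024, 1, 1, 21, 30, 0)  # arbitrary example moment
after_22 = seconds_until(22, now)   # 22:00 is still ahead today
after_1 = seconds_until(1, now)     # 01:00 has passed, so tomorrow

# With after=after_22 and repeat=24 hours, a timer would fire at these
# offsets (in seconds) from now:
repeat = 24 * 3600
offsets = [after_22 + i * repeat for i in range(3)]
```
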


