Analysis of how aiohttp processes a request in Python


Anyone who has used Python's third-party aiohttp library knows that building a minimal web server with it is very easy; a few lines of code are enough:

from aiohttp import web
import asyncio


def index(request):
    # URL handler: returns the page the browser will display
    return web.Response(body=b'Hello world')


async def init(loop):
    app = web.Application(loop=loop)
    # Map GET /index to the index() handler
    app.router.add_route('GET', '/index', index)
    server = await loop.create_server(app.make_handler(), '127.0.0.1', 8000)
    return server


def main():
    loop = asyncio.get_event_loop()
    loop.run_until_complete(init(loop))
    loop.run_forever()


if __name__ == '__main__':
    main()

That is all it takes to implement a simple web server.

Run this Python file, open a browser, and enter http://127.0.0.1:8000/index in the address bar, and you will see Hello world. Isn't that neat? Some readers may wonder, though: when the user enters http://127.0.0.1:8000/index in the browser, how does the server route the request to our URL handler function index(request)? From the code, the answer clearly has something to do with this line:

app.router.add_route('GET', '/index', index)

Because of this line, the server knows that your request (method: GET, path: /index) should be handled by the index(request) function. But what does this line do internally, and how does the server respond to a request? Let's fire up the debugger and follow the server step by step to see what happens.

First, let's see how the server receives the request. After setting a few breakpoints it is not hard to find that when a request comes in, the server eventually enters the start() function of the ServerHttpProtocol class in aiohttp's server.py module:

@asyncio.coroutine
def start(self):
    """Start processing of incoming requests.

    It reads request line, request headers and request payload, then
    calls handle_request() method. Subclass has to override
    handle_request(). start() handles various exceptions in request
    or response handling. Connection is being closed always unless
    keep_alive(True) specified.
    """
    # Read the docstring first; the code that follows is discussed below

According to the docstring, this function is where the server starts processing a request.
Let's continue with the code of the start() function:

......
@asyncio.coroutine
def start(self):
    .......
    while True:
        message = None
        self._keep_alive = False
        self._request_count += 1
        self._reading_request = False

        payload = None
        try:
            # Read the HTTP request method
            # ....
            # several lines omitted here ...
            # ....
            yield from self.handle_request(message, payload)
            # ....

The last statement of this snippet hands the request over to handle_request(). If you look up handle_request() in the ServerHttpProtocol class, however, you will find that it is not a coroutine function. What is going on? Stepping into the call (F7 in the debugger) shows that the function being executed is not the one in ServerHttpProtocol but handle_request() of the RequestHandler class in web.py. RequestHandler inherits from ServerHttpProtocol, overrides handle_request(), and decorates it with @asyncio.coroutine. Let's look at its code:

@asyncio.coroutine
def handle_request(self, message, payload):
    if self.access_log:
        now = self._loop.time()

    app = self._app
    # The Request object is actually constructed here
    request = web_reqrep.Request(
        app, message, payload,
        self.transport, self.reader, self.writer,
        secure_proxy_ssl_header=self._secure_proxy_ssl_header)
    self._meth = request.method
    self._path = request.path
    try:
        # match_info is obtained here through self._router.resolve(request)
        match_info = yield from self._router.resolve(request)
        # match_info must be an object of type AbstractMatchInfo
        assert isinstance(match_info, AbstractMatchInfo), match_info
        resp = None
        request._match_info = match_info
        ......
        if resp is None:
            handler = match_info.handler  # is this the final handler for our request?
            for factory in reversed(self._middlewares):
                handler = yield from factory(app, handler)
            # The key point: this appears to wait for our URL handler to produce its result
            resp = yield from handler(request)
    except:
        ......
    # The next two statements send the response back to the client. The implementation
    # is fairly involved; I only skimmed it and will not go into it here.
    resp_msg = yield from resp.prepare(request)
    yield from resp.write_eof()
    ......
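
The hand-off from start() to the subclass's handle_request() described earlier is ordinary method overriding: the base class calls self.handle_request(), and Python dispatches to the most derived implementation. Here is a minimal sketch using only the standard library. It uses modern async/await instead of @asyncio.coroutine, and the class names BaseProtocol and AppRequestHandler are hypothetical stand-ins for ServerHttpProtocol and RequestHandler, not aiohttp code:

```python
import asyncio


class BaseProtocol:
    """Hypothetical stand-in for ServerHttpProtocol."""

    async def start(self, message):
        # The base class only knows that *some* handle_request() exists.
        return await self.handle_request(message)

    async def handle_request(self, message):
        # Default implementation; subclasses are expected to override it.
        return "base: no handler for " + message


class AppRequestHandler(BaseProtocol):
    """Hypothetical stand-in for RequestHandler in web.py."""

    async def handle_request(self, message):
        # Called by the inherited start(), because self is the subclass.
        return "subclass handled " + message


result = asyncio.run(AppRequestHandler().start("GET /index"))
print(result)
```

Even though start() is defined in the base class, the subclass version of handle_request() is the one that runs, which matches what the debugger showed.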





From the comments in the code above, we can pick out several key questions:

1. What exactly is match_info, how is it obtained, and what attributes does it contain?
2. What is handler, and how is it obtained?
3. handler(request) looks like the final processing function for our request; how is it executed?

Once these three points are understood, the whole request process is basically clear. Let's go through them step by step.

First, question 1: where does match_info come from?

Back to the code. Let's step into the source of self._router.resolve(request):

@asyncio.coroutine
def resolve(self, request):
    path = request.raw_path
    method = request.method
    allowed_methods = set()

    # Note that this is a for ... else loop
    for resource in self._resources:
        match_dict, allowed = yield from resource.resolve(method, path)
        if match_dict is not None:
            return match_dict
        else:
            allowed_methods |= allowed
    else:
        if allowed_methods:
            return MatchInfoError(HTTPMethodNotAllowed(method, allowed_methods))
        else:
            return MatchInfoError(HTTPNotFound())

There is not much code here. path and method are the URL and the HTTP method of the client's request (for example, /index and GET), as encapsulated in the request object. Note the "return match_dict" statement inside the loop: returning match_dict means that nothing went wrong, so match_dict is the normal result. So what is match_dict? It is obtained from resource.resolve(method, path). Before looking at the implementation of that function, let's first find out what type resource is. That cannot be told from the code alone; all we know is that it is an element of self._resources (a list). Running this step in the debugger shows that the elements stored in self._resources are resource objects (a PlainResource in our example, as the debugging information further below confirms). We won't go into the class in detail; for now it is enough to know that it has a resolve() member function:

@asyncio.coroutine
def resolve(self, method, path):
    allowed_methods = set()

    match_dict = self._match(path)
    if match_dict is None:
        return None, allowed_methods

    for route in self._routes:
        route_method = route.method
        allowed_methods.add(route_method)

        if route_method == method or route_method == hdrs.METH_ANY:
            # The return statement here is the normal result
            return UrlMappingMatchInfo(match_dict, route), allowed_methods
    else:
        return None, allowed_methods

We find that match_dict is actually a UrlMappingMatchInfo object. Careful readers will notice, however, that this function has a match_dict of its own, which is the return value of self._match(path). Let's look at what self._match(path) does. The debugging information shows that self is an instance of the PlainResource class, whose _match() method looks like this:

def _match(self, path):
    # string comparison is about 10 times faster than regexp matching
    if self._path == path:
        return {}
    else:
        return None

The code is very concise: it compares the incoming path (for example, /index) with the _path attribute of the PlainResource instance, returning an empty dictionary if they are equal and None otherwise. Since the successful result is just an empty dictionary, I guessed that the caller only uses it as the condition of an if statement, and that is indeed the case. As for what this PlainResource is: it is an object instantiated when you add a route while initializing the server, and it lives on as an attribute of the app. Keep it in mind for now; we will come back to it later.
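
One detail worth spelling out is why the caller tests the result with "is None" instead of plain truthiness. The sketch below is a standalone illustration, not aiohttp code; the helper name match() is hypothetical and only imitates the behaviour of _match() quoted above:

```python
def match(registered_path, request_path):
    """Imitate PlainResource._match: {} on a match, None otherwise."""
    if registered_path == request_path:
        return {}
    return None


hit = match("/index", "/index")
miss = match("/index", "/about")

# An empty dict is falsy, so a plain truthiness test cannot tell a
# successful match ({}) apart from a failed one (None).
assert not hit and not miss

# The identity test against None is what separates the two cases.
matched = [m is not None for m in (hit, miss)]
print(matched)
```

A check written as "if match_dict:" would treat every plain path as a miss, so the explicit comparison with None matters here.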

OK, let's go back to the resolve(self, method, path) function (note that there are two resolve functions; I tell them apart by their parameters). After obtaining match_dict it checks for None. If it is None, the request's path does not match this resource, so the function returns None directly, and the upper-level resolve(self, request) function moves on to the next resource object and tries to match again, and so on.
If match_dict is not None, the path of this resource object matches the path of the request, and then:

for route in self._routes:
    route_method = route.method
    allowed_methods.add(route_method)
    if route_method == method or route_method == hdrs.METH_ANY:
        # The return statement here is the normal result
        return UrlMappingMatchInfo(match_dict, route), allowed_methods
 
This step checks the method once the path has matched. If the method of the route is the same as the method of the request, or the route's method is "*" (the asterisk matches every method), a UrlMappingMatchInfo object is returned, constructed from match_dict and route. route is an object of type ResourceRoute, which encapsulates the PlainResource object. In other words, the UrlMappingMatchInfo object that is returned encapsulates the PlainResource object that matches both the path and the method of the request. A bit messy, isn't it? Blame it on the blogger's limited skill ...

So where does this UrlMappingMatchInfo go? Looking back, it turns out to be the match_dict object in the resolve(self, request) function. Remember that this object lives inside the for loop: once match_dict receives the return value, it is checked against None. If it is None, the loop goes on to match the next PlainResource (how these PlainResource objects come about is explained later, no rush). If it is not None, match_dict (a UrlMappingMatchInfo object) is returned directly. Returned to whom? Going up one more level, it is returned to match_info in the handle_request(self, message, payload) function. Looking back at the handle_request() code, match_info is required to be of type AbstractMatchInfo. There is no contradiction, because the UrlMappingMatchInfo class inherits from the AbstractMatchInfo class.
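
The two levels of lookup described above can be condensed into a small standalone model. This is only a sketch under simplifying assumptions: the Resource class and the resolve() function are hypothetical, handlers are plain strings, and the error results are strings instead of MatchInfoError objects. The control flow (path check, method check, for ... else, collecting the allowed methods) follows the code quoted in the article:

```python
METH_ANY = "*"  # wildcard method, as described in the article


class Resource:
    """Hypothetical, simplified plain resource: one path, several routes."""

    def __init__(self, path):
        self.path = path
        self.routes = []  # list of (method, handler) pairs

    def resolve(self, method, path):
        allowed = set()
        if self.path != path:
            return None, allowed
        for route_method, handler in self.routes:
            allowed.add(route_method)
            if route_method in (method, METH_ANY):
                return handler, allowed
        return None, allowed


def resolve(resources, method, path):
    """Dispatcher-level lookup, mirroring the for ... else in the article."""
    allowed_methods = set()
    for resource in resources:
        match, allowed = resource.resolve(method, path)
        if match is not None:
            return match
        allowed_methods |= allowed
    else:
        # Reached only when the loop finished without returning.
        if allowed_methods:
            return "405 Method Not Allowed"
        return "404 Not Found"


index = Resource("/index")
index.routes.append(("GET", "index handler"))
resources = [index]

print(resolve(resources, "GET", "/index"))
print(resolve(resources, "POST", "/index"))
print(resolve(resources, "GET", "/missing"))
```

The second call shows why the allowed methods are collected: a path that exists but is requested with the wrong method yields a "method not allowed" result, while an unknown path yields "not found".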

Well, the first question is now clear: we know what match_info is, where it comes from, and what it encapsulates.

Now let's see what handler is:

We continue with handle_request(self, message, payload):

# The returned match_info is stored in the request object here for later use; ignore it for now
request._match_info = match_info
......  # the ellipsis stands for code that is not our focus
if resp is None:
    # Here we get handler; let's see what it is
    handler = match_info.handler
    for factory in reversed(self._middlewares):
        handler = yield from factory(app, handler)
    resp = yield from handler(request)
 
We are finally back at our handler. As you can see, handler is actually an attribute of match_info, yet the debugging information shows no handler attribute on match_info. That is because the debugging window only displays non-function attributes. In Python a function is also an attribute of an object, and handler here happens to be a function, so the handler that is returned is a callable object. Enough digression; our goal is to find out what handler is. To find out what match_info.handler is, let's look inside the AbstractMatchInfo class:

class AbstractMatchInfo(metaclass=ABCMeta):
    ......
    @asyncio.coroutine  # pragma: no branch
    @abstractmethod
    def handler(self, request):
        """Execute matched request handler"""
    ......

Obviously, handler is an abstract method whose implementation must be in a subclass, so let's look at the UrlMappingMatchInfo class:

class UrlMappingMatchInfo(dict, AbstractMatchInfo):
    ......
    @property
    def handler(self):
        return self._route.handler
    ......
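
The pattern in these two classes, an abstract handler in the base class that is overridden by a property in the subclass, can be reproduced with the standard library alone. The following is a sketch with hypothetical class names (AbstractInfo, MappingInfo), not aiohttp's implementation. It also shows one plausible reason why a debugger's instance view has no handler entry: a property lives on the class, not in the instance's attribute dictionary.

```python
from abc import ABCMeta, abstractmethod


class AbstractInfo(metaclass=ABCMeta):
    """Hypothetical stand-in for AbstractMatchInfo."""

    @abstractmethod
    def handler(self, request):
        """Execute matched request handler."""


class MappingInfo(dict, AbstractInfo):
    """Hypothetical stand-in for UrlMappingMatchInfo."""

    def __init__(self, match_dict, route_handler):
        super().__init__(match_dict)
        self._route_handler = route_handler

    @property
    def handler(self):
        # Overrides the abstract method and returns the callable itself.
        return self._route_handler


def index(request):
    return "Hello world"


info = MappingInfo({}, index)
print(info.handler is index)       # the property hands back our function
print("handler" in vars(info))     # not stored as instance data
print(info.handler("request"))     # so info.handler(...) calls index
```

Because info.handler evaluates to the function object, writing handler(request) afterwards simply calls the URL handler that was stored earlier.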

So the handler() function returns self._route.handler of the UrlMappingMatchInfo object. What is this _route? When in doubt, check the debugging information. It shows that _route is an object of type ResourceRoute:
[Screenshot: debugging window]
Careful readers will notice that handler is not visible on _route either, which suggests that handler is a function in the ResourceRoute class as well. So we have to go and look at the ResourceRoute class too:

class ResourceRoute(AbstractRoute):
    """A route with resource"""
    ......
    # the rest is not shown
 
I searched for a long while and found no handler() function there. OK, then let's look in its parent class:

class AbstractRoute(metaclass=abc.ABCMeta):
    def __init__(self, method, handler, *,
                 expect_handler=None,
                 resource=None):
        self._method = method
        # _handler is assigned here
        self._handler = handler
        ......

    # returns self._handler
    @property
    def handler(self):
        return self._handler
    ...
 
Aha, so this is where it was hiding; we have finally found it. What the layers of handler ultimately return is the _handler of the AbstractRoute class, and _handler is assigned in the AbstractRoute constructor. So when is this AbstractRoute object instantiated?

Now we go back to where we started, namely:

app.router.add_route('GET', '/index', index)

It is worth pointing out that app.router actually returns a UrlDispatcher object: the Application class has a router() function decorated with @property that returns the Application object's _router attribute, and _router is a UrlDispatcher object. So the add_route() function above is actually a member function of the UrlDispatcher class. What exactly does add_route() do? Let's step inside:

class UrlDispatcher(AbstractRouter, collections.abc.Mapping):
    ......
    def add_route(self, method, path, handler, *, name=None, expect_handler=None):
        resource = self.add_resource(path, name=name)
        return resource.add_route(method, handler,
                                  expect_handler=expect_handler)
    ......

    def add_resource(self, path, *, name=None):
        if not path.startswith('/'):
            raise ValueError("path should be started with /")
        if not ('{' in path or '}' in path or self.ROUTE_RE.search(path)):
            # Note that the resource object constructed here is of type PlainResource
            resource = PlainResource(path, name=name)
            self._reg_resource(resource)
            return resource

For convenience I have pasted the next piece of code to be analyzed, add_resource(), here as well; it is also a member function of the UrlDispatcher class.

Looking at the comment above, add_resource() returns an object of type PlainResource. So here, at last, is the origin of the PlainResource mentioned earlier: the resource object is constructed from the path passed to add_route(), which it encapsulates. Then comes this:

return resource.add_route(method, handler,
                          expect_handler=expect_handler)

So the PlainResource class has an add_route() member function as well. Let's step (F7) into PlainResource's add_route():

class Resource(AbstractResource):
    ......
    def add_route(self, method, handler, *, expect_handler=None):
        for route in self._routes:
            if route.method == method or route.method == hdrs.METH_ANY:
                raise RuntimeError("Added route will never be executed, "
                                   "method {route.method} is "
                                   "already registered".format(route=route))
        route = ResourceRoute(method, handler, self, expect_handler=expect_handler)
        self.register_route(route)
        return route
    ......

This function instantiates a ResourceRoute object, route, and passes the method and handler (the real URL handler function) that we have been carrying along step by step into the ResourceRoute constructor. So let's take a look at the ResourceRoute class:

class ResourceRoute(AbstractRoute):
    """A route with resource"""
    def __init__(self, method, handler, resource, *, expect_handler=None):
        super().__init__(method, handler, expect_handler=expect_handler, resource=resource)

To our pleasant surprise, ResourceRoute is a subclass of AbstractRoute, and instantiating it calls the parent class's constructor. So the AbstractRoute object we were wondering about is instantiated at this point, and its internal _handler attribute is assigned at this point too; it corresponds to the index function in the following line:

app.router.add_route('GET', '/index', index)

So when we add a route, the three pieces of information GET, /index, and index are eventually encapsulated in a ResourceRoute object, which is then wrapped layer by layer and finally becomes an attribute inside the app object. If you call this method several times to add other routes, several ResourceRoute objects are encapsulated in the app.
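
To make the layering concrete, here is a small standalone model of the registration path. It is a sketch only: Route, Resource, and Dispatcher are hypothetical simplifications of ResourceRoute, PlainResource, and UrlDispatcher, and details such as names, expect handlers, and dynamic paths are left out.

```python
METH_ANY = "*"


class Route:
    """Hypothetical stand-in for ResourceRoute/AbstractRoute."""

    def __init__(self, method, handler, resource):
        self.method = method
        self._handler = handler  # the URL handler is stored here
        self.resource = resource

    @property
    def handler(self):
        return self._handler


class Resource:
    """Hypothetical stand-in for PlainResource."""

    def __init__(self, path):
        self.path = path
        self.routes = []

    def add_route(self, method, handler):
        # Refuse a route that could never be reached.
        for route in self.routes:
            if route.method in (method, METH_ANY):
                raise RuntimeError(
                    "method {} is already registered".format(route.method))
        route = Route(method, handler, self)
        self.routes.append(route)
        return route


class Dispatcher:
    """Hypothetical stand-in for UrlDispatcher."""

    def __init__(self):
        self.resources = []

    def add_route(self, method, path, handler):
        resource = Resource(path)
        self.resources.append(resource)
        return resource.add_route(method, handler)


def index(request):
    return "Hello world"


router = Dispatcher()
route = router.add_route("GET", "/index", index)
print(router.resources[0].routes[0].handler is index)
```

Following the chain dispatcher, resource, route, handler from the outside leads straight back to the index function that was passed in, which is the same path the lookup takes at request time.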

Well, we have finally figured out handler: it really does point to our final URL handler function.

So we go back to handle_request():

@asyncio.coroutine
def handle_request(self, message, payload):
    ......
    handler = match_info.handler
    for factory in reversed(self._middlewares):
        handler = yield from factory(app, handler)
    resp = yield from handler(request)
    .......

Now it makes sense: having obtained the handler that matches the request, we can safely call it.

Some readers may still wonder what the for loop in the middle is for, so here is a brief explanation. It concerns another parameter that can be passed when initializing the app, middlewares, like this:


app = web.Application(loop=loop, middlewares=[
    data_factory, response_factory, logger_factory])
 
middlewares is in effect an interceptor mechanism: it lets you do some processing before and after a request is handled, such as uniformly logging every request. It works on the same principle as Python decorators (readers unfamiliar with decorators can look them up). middlewares takes a list whose elements are the interceptor functions you have written, and the for loop wraps the URL handler function with each interceptor in reverse order. The result is the handler decorated with all the interceptors, which lets you do some extra processing before the URL handler is finally called.
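
The wrapping loop can be tried out without aiohttp. The sketch below keeps the factory(app, handler) shape used in the article but is otherwise hypothetical: it uses async/await instead of "yield from", make_factory() and handle_request() are illustrative helpers, and the request is just a string. It records the order in which each piece runs:

```python
import asyncio

calls = []  # records the order in which the pieces run


def make_factory(name):
    """Build a hypothetical middleware factory labelled `name`."""

    async def factory(app, handler):
        async def middleware(request):
            calls.append(name + " before")
            response = await handler(request)
            calls.append(name + " after")
            return response
        return middleware

    return factory


async def index(request):
    calls.append("index")
    return "Hello world"


async def handle_request(app, middlewares, request):
    handler = index
    # Wrap in reverse so the first factory in the list ends up outermost.
    for factory in reversed(middlewares):
        handler = await factory(app, handler)
    return await handler(request)


middlewares = [make_factory("logger"), make_factory("response")]
result = asyncio.run(handle_request(None, middlewares, "GET /index"))
print(result)
print(calls)
```

Iterating in reverse is what makes the list read naturally: the first entry in middlewares sees the request first and the response last, just like the outermost decorator on a function.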
