A Detailed Look at the application.py Module of the Python web.py Framework
This article analyzes the code in the application.py module of the web.py library. Broadly, this module implements a WSGI-compatible interface so that applications can be called by a WSGI application server. WSGI stands for Web Server Gateway Interface; for details, refer to the WSGI wiki page.
Interface Usage
Using the HTTP Server Built into web.py
The following example comes from the official Hello World tutorial. Code like this is typically the entry point of an application:
import web

urls = ("/.*", "hello")
app = web.application(urls, globals())

class hello:
    def GET(self):
        return 'Hello, world!'

if __name__ == "__main__":
    app.run()
The preceding example shows the most basic components of a web.py application:
- A URL route table
- A web.application instance, app
- A call to app.run()
The call to app.run() initializes the WSGI interface and starts a built-in HTTP server that connects to it. The code is as follows:
def run(self, *middleware):
    return wsgi.runwsgi(self.wsgifunc(*middleware))
Connecting to a WSGI Application Server
If your application needs to run behind a WSGI application server such as uWSGI or gunicorn, the entry code should be written differently:
import web

class hello:
    def GET(self):
        return 'Hello, world!'

urls = ("/.*", "hello")
app = web.application(urls, globals())
application = app.wsgifunc()
In this scenario, the application code does not need to start an HTTP server; it only needs to expose a WSGI-compatible interface for the WSGI server to call. The web.py framework implements this interface for us: calling application = app.wsgifunc() is enough. The application variable obtained here is the WSGI interface (as the code analysis below shows).
Implementation Analysis of WSGI Interface
The analysis mainly involves the following two lines of code:
app = web.application(urls, globals())
application = app.wsgifunc()
web.application Instantiation
Initializing this instance requires two arguments: the URL routing tuple and the result of globals().
A third argument, autoreload, can also be passed to specify whether Python modules are automatically re-imported. This is useful during debugging, but we can ignore it when analyzing the main flow.
The application class initialization code is as follows:
class application:
    def __init__(self, mapping=(), fvars={}, autoreload=None):
        if autoreload is None:
            autoreload = web.config.get('debug', False)
        self.init_mapping(mapping)
        self.fvars = fvars
        self.processors = []

        self.add_processor(loadhook(self._load))
        self.add_processor(unloadhook(self._unload))

        if autoreload:
            ...
The autoreload-related code is omitted. The rest of the code mainly does the following:
- self.init_mapping(mapping): initializes the URL route mapping.
- self.add_processor(): adds two processors.
Initializing the URL Route Mapping
def init_mapping(self, mapping):
    self.mapping = list(utils.group(mapping, 2))
This function calls the utility function utils.group. Its effect is as follows. If the tuple passed during initialization is:
urls = ("/", "Index", "/hello/(.*)", "Hello", "/world", "World")
then after init_mapping is called:
self.mapping = [["/", "Index"], ["/hello/(.*)", "Hello"], ["/world", "World"]]
The framework later traverses this list when routing URLs.
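The grouping step can be illustrated with a small stand-in. The group_pairs helper below is a hypothetical name, not web.py's utils.group; it only assumes that the flat tuple is split into consecutive chunks of two, which is what the result above shows.

```python
def group_pairs(seq, size=2):
    """Split a flat sequence into consecutive chunks of `size` items.

    Hypothetical stand-in for illustration; not web.py's utils.group.
    """
    return [list(seq[i:i + size]) for i in range(0, len(seq), size)]


urls = ("/", "Index", "/hello/(.*)", "Hello", "/world", "World")
mapping = group_pairs(urls)

# Routing can then walk the (pattern, handler) pairs in order.
for pattern, handler in mapping:
    print(pattern, "->", handler)
```

Keeping the pairs in a list preserves the order in which routes were declared, so the first matching pattern wins.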
Adding Processors
self.add_processor(loadhook(self._load))
self.add_processor(unloadhook(self._unload))
These two lines add two processors, self._load and self._unload, each wrapped by a decorator function. Processors run before and after HTTP request handling. They do not handle the request itself but do additional work around it. For example, the official tutorial's approach to adding a session to a sub-application uses a processor:
def session_hook():
    web.ctx.session = session

app.add_processor(web.loadhook(session_hook))
The definition and use of processors are fairly involved, so they are analyzed separately later in this article.
wsgifunc Function
Calling wsgifunc returns a WSGI-compatible function that internally implements URL routing and other features.
def wsgifunc(self, *middleware):
    """Returns a WSGI-compatible function for this application."""
    ...
    for m in middleware:
        wsgi = m(wsgi)

    return wsgi
Apart from the definition of an inner function, wsgifunc is this simple. If no middleware is passed, the wsgi function defined inside wsgifunc is returned directly.
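The middleware loop is easier to see with a toy example. This is a minimal sketch, assuming each middleware is a callable that takes a WSGI function and returns a new one; base_app, upper_middleware, suffix_middleware and wrap_with_middleware are hypothetical names made up for the illustration.

```python
def wrap_with_middleware(app, *middleware):
    """Apply each middleware in order; the last one ends up outermost."""
    for m in middleware:
        app = m(app)
    return app


def base_app(environ, start_response):
    # Hypothetical minimal WSGI callable used only for this illustration.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


def upper_middleware(app):
    # Hypothetical middleware: upper-cases every chunk of the response body.
    def wrapped(environ, start_response):
        return [chunk.upper() for chunk in app(environ, start_response)]
    return wrapped


def suffix_middleware(app):
    # Hypothetical middleware: appends one extra chunk to the body.
    def wrapped(environ, start_response):
        return list(app(environ, start_response)) + [b"-end"]
    return wrapped


wrapped_app = wrap_with_middleware(base_app, upper_middleware, suffix_middleware)
body = b"".join(wrapped_app({}, lambda status, headers: None))
print(body)
```

Because suffix_middleware is applied last, it wraps the already upper-cased application, so its suffix stays lower-case.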
wsgi Function
This function implements the WSGI-compatible interface and URL routing.
def wsgi(env, start_resp):
    # clear threadlocal to avoid inteference of previous requests
    self._cleanup()

    self.load(env)
    try:
        # allow uppercase methods only
        if web.ctx.method.upper() != web.ctx.method:
            raise web.nomethod()

        result = self.handle_with_processors()
        if is_generator(result):
            result = peep(result)
        else:
            result = [result]
    except web.HTTPError, e:
        result = [e.data]

    result = web.safestr(iter(result))

    status, headers = web.ctx.status, web.ctx.headers
    start_resp(status, headers)

    def cleanup():
        self._cleanup()
        yield '' # force this function to be a generator

    return itertools.chain(result, cleanup())

for m in middleware:
    wsgi = m(wsgi)

return wsgi
Next, we analyze this function step by step:
self._cleanup()
self.load(env)
self._cleanup() internally calls utils.ThreadedDict.clear_all(), which clears all thread-local data. This keeps data left over from a previous request from interfering with the current one and avoids memory leaks (much of the data in the web.py framework is stored in thread-local variables).
self.load(env) uses the values in env to initialize the web.ctx variables. These variables hold information about the current request and may be used by the application, for example web.ctx.fullpath.
try:
    # allow uppercase methods only
    if web.ctx.method.upper() != web.ctx.method:
        raise web.nomethod()

    result = self.handle_with_processors()
    if is_generator(result):
        result = peep(result)
    else:
        result = [result]
except web.HTTPError, e:
    result = [e.data]
This section mainly calls self.handle_with_processors(), which routes the requested URL, finds a suitable class or sub-application to handle the request, and also calls the added processors to do other work (processors are covered later). The returned result is handled in one of three ways:
- If a generator is returned, it is passed through peep so that it can be iterated safely.
- If any other value is returned, it is wrapped in a list.
- If an HTTPError exception is raised (for example, when we return a result with raise web.OK("hello, world")), the data carried by the exception, e.data, is wrapped in a list.
result = web.safestr(iter(result))

status, headers = web.ctx.status, web.ctx.headers
start_resp(status, headers)

def cleanup():
    self._cleanup()
    yield '' # force this function to be a generator

return itertools.chain(result, cleanup())
The code above converts the items of the result into strings to obtain the HTTP response body. It then does two things required by the WSGI specification:
- Call the start_resp function.
- Return the result as an iterator.
Now we can see that application = app.wsgifunc(), mentioned earlier, assigns the wsgi function to the application variable so that an application server can connect to our application through the WSGI standard.
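To make the calling convention concrete, the sketch below plays the role of a WSGI server for a single request. It does not use web.py: the application function here is a hypothetical stand-in for the callable returned by app.wsgifunc(), and call_wsgi is a made-up helper name.

```python
def application(environ, start_response):
    # Hypothetical stand-in for the callable returned by app.wsgifunc().
    body = ("Hello from " + environ.get("PATH_INFO", "/")).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]


def call_wsgi(app, path):
    """Play the role of a WSGI server for a single request."""
    captured = {}

    def start_response(status, headers):
        # A real server would send the status line and headers here.
        captured["status"] = status
        captured["headers"] = headers

    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": path}
    chunks = app(environ, start_response)
    return captured["status"], b"".join(chunks)


status, body = call_wsgi(application, "/hello")
print(status, body)
```

A server such as uWSGI or gunicorn does the same two things: it passes in an environ dictionary and a start_response callback, then iterates over the returned body.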
Processing HTTP Requests
The code analyzed above explains how the web.py framework implements the WSGI-compatible interface; that is, we now know how an HTTP request reaches the framework and how the response returns from the framework to the application server. But how does the framework call our application code to handle a request? Answering this requires a detailed look at how processors, which we skipped earlier, are added and called.
loadhook and unloadhook Decorators
These two functions are decorators for the actual processor functions (although they are not applied with the @ decorator syntax). The resulting processors run before request handling (loadhook) and after request handling (unloadhook), respectively.
loadhook
def loadhook(h):
    def processor(handler):
        h()
        return handler()

    return processor
This function returns a processor that first calls the hook function h you provided and then calls handler, the next step in the chain.
unloadhook
def unloadhook(h):
    def processor(handler):
        try:
            result = handler()
            is_generator = result and hasattr(result, 'next')
        except:
            # run the hook even when handler raises some exception
            h()
            raise

        if is_generator:
            return wrap(result)
        else:
            h()
            return result

    def wrap(result):
        def next():
            try:
                return result.next()
            except:
                # call the hook at the and of iterator
                h()
                raise

        result = iter(result)
        while True:
            yield next()

    return processor
This function also returns a processor. It first calls the handler passed in as the argument and then calls the hook function you provided.
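The difference between the two wrappers can be checked with a short experiment. The following is a simplified sketch in Python 3, not the library code: unloadhook here leaves out the generator branch and uses try/finally, and events and handler are hypothetical names used only to record the call order.

```python
events = []


def loadhook(h):
    def processor(handler):
        h()                 # hook first
        return handler()    # then the rest of the chain
    return processor


def unloadhook(h):
    # Simplified: the generator branch of the original is left out.
    def processor(handler):
        try:
            return handler()
        finally:
            h()             # hook runs after the handler, even if it raised
    return processor


def handler():
    events.append("handler")
    return "response"


before = loadhook(lambda: events.append("load"))
after = unloadhook(lambda: events.append("unload"))

result_before = before(handler)
result_after = after(handler)
print(events)
```

In both cases the processor returns whatever handler returns; only the position of the hook relative to the handler call differs.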
handle_with_processors Function
def handle_with_processors(self):
    def process(processors):
        try:
            if processors:
                p, processors = processors[0], processors[1:]
                return p(lambda: process(processors))
            else:
                return self.handle()
        except web.HTTPError:
            raise
        except (KeyboardInterrupt, SystemExit):
            raise
        except:
            print >> web.debug, traceback.format_exc()
            raise self.internalerror()

    # processors must be applied in the resvere order. (??)
    return process(self.processors)
This function looks complicated; its core is implemented recursively (I suspect the same behavior could be implemented without recursion). For clarity, let's walk through an example.
As mentioned above, when an application instance is initialized, two processors are added to self.processors:
self.add_processor(loadhook(self._load))
self.add_processor(unloadhook(self._unload))
Therefore, self.processors currently looks like this:
self.processors = [loadhook(self._load), unloadhook(self._unload)]

# For convenience in the following explanation, we abbreviate this as:
self.processors = [load_processor, unload_processor]
When the framework executes handle_with_processors, these processors are executed one by one. Let's break the code down. First, we can simplify the handle_with_processors function:
def handle_with_processors(self):
    def process(processors):
        try:
            if processors:  # position 2
                p, processors = processors[0], processors[1:]
                return p(lambda: process(processors))  # position 3
            else:
                return self.handle()  # position 4
        except web.HTTPError:
            raise
        ...

    # processors must be applied in the resvere order. (??)
    return process(self.processors)  # position 1
Execution starts at position 1, which calls the internally defined function process(processors).
If position 2 finds that the processor list is not empty, the if branch is entered.
At position 3, the processor to be executed this time is called with a lambda function as its argument, and its result is returned.
If position 2 finds that the processor list is empty, self.handle() is executed. This function actually calls our application code (as described below).
In the preceding example, there are currently two processors:
self.processors = [load_processor, unload_processor]
After entering the code at position 1, position 2 finds that there are still processors to execute, so control goes to position 3. The code executed at this point is:
return load_processor(lambda: process([unload_processor]))
The load_processor function is the result of wrapping with loadhook, so during execution it behaves as if it were defined like this (pseudocode):
def load_processor(lambda: process([unload_processor])):
    self._load()
    return process([unload_processor])  # this is the lambda passed in as the argument
self._load() is executed first, and then the process function runs again. It again reaches position 3, and the code executed is:
return unload_processor(lambda: process([]))
The unload_processor function is the result of wrapping with unloadhook, so during execution it behaves as if it were defined like this (pseudocode):
def unload_processor(lambda: process([])):
    try:
        result = process([])  # this is the lambda passed in as the argument
        is_generator = result and hasattr(result, 'next')
    except:
        # run the hook even when handler raises some exception
        self._unload()
        raise

    if is_generator:
        return wrap(result)
    else:
        self._unload()
        return result
At this point process([]) is executed first, which goes to position 4 (calling self.handle()) and obtains the application's result; then the current processor's hook, self._unload(), is called.
To summarize, the execution order is:
self._load()
self.handle()
self._unload()
With more processors, the same pattern applies. Among processors wrapped with loadhook, the one added first runs first; among processors wrapped with unloadhook, the one added last runs first.
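The ordering with more than two processors can be verified with a simplified recursion. This is a sketch under stated assumptions: run_chain mirrors only the recursive core of handle_with_processors (no error handling), and make_load, make_unload and log are hypothetical helpers that record when each hook fires.

```python
def run_chain(processors, handle):
    """Simplified stand-in for handle_with_processors, without error handling."""
    def process(remaining):
        if remaining:
            p, rest = remaining[0], remaining[1:]
            return p(lambda: process(rest))
        return handle()
    return process(processors)


def make_load(name, log):
    # Behaves like a loadhook processor: hook first, then the rest of the chain.
    def processor(handler):
        log.append("load " + name)
        return handler()
    return processor


def make_unload(name, log):
    # Behaves like an unloadhook processor: rest of the chain first, then the hook.
    def processor(handler):
        result = handler()
        log.append("unload " + name)
        return result
    return processor


log = []
processors = [
    make_load("A", log), make_unload("A", log),
    make_load("B", log), make_unload("B", log),
]


def handle():
    log.append("handle")
    return "body"


result = run_chain(processors, handle)
print(log)
```

Each processor receives a lambda that runs the remainder of the list, so the calls nest like layers: hooks that run before the handler fire in the order they were added, and hooks that run after it fire in reverse order.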
handle Function
After all this, we finally reach the place where the code we wrote is actually called. After all the load processors have run, self.handle() is executed, and it internally calls the application code we wrote (for example, code that returns a hello-world response). self.handle is defined as follows:
def handle(self):
    fn, args = self._match(self.mapping, web.ctx.path)
    return self._delegate(fn, self.fvars, args)
This function is easy to understand. The first line calls self._match, the routing step, to find the corresponding class or sub-application. The second line calls self._delegate, which invokes that class or passes the request on to the sub-application.
_match Function
The _match function is defined as follows:
def _match(self, mapping, value):
    for pat, what in mapping:
        if isinstance(what, application):  # position 1
            if value.startswith(pat):
                f = lambda: self._delegate_sub_application(pat, what)
                return f, None
            else:
                continue
        elif isinstance(what, basestring):  # position 2
            what, result = utils.re_subm('^' + pat + '$', what, value)
        else:  # position 3
            result = utils.re_compile('^' + pat + '$').match(value)

        if result:  # it's a match
            return what, [x for x in result.groups()]
    return None, None
In this function's parameters, mapping is self.mapping, the URL route mapping table, and value is web.ctx.path, the request path. The function traverses self.mapping and handles each entry according to the type of its handler object:
- Position 1: the handler object is an application instance, that is, a sub-application. An anonymous function is returned, which calls self._delegate_sub_application to do the processing.
- Position 2: the handler object is a string, so utils.re_subm is called. It replaces the part of value (that is, web.ctx.path) that matches pat with what (the handler string specified for the URL pattern), then returns the replaced result and the match (a re.MatchObject instance).
- Position 3: all other cases, for example when a class object is given directly as the handler. The compiled pattern is matched against the path.
If result is not empty, the handler object and an argument list are returned (this list holds the arguments passed to the GET function we implement).
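The string-handler case can be sketched with the standard re module. This is a simplified stand-in, not web.py's _match: sub-applications and class handlers are left out, the match function name and the example routes are hypothetical, and Match.expand is used as an approximation of the substitution that utils.re_subm performs.

```python
import re


def match(mapping, path):
    """Simplified stand-in for _match: handlers are plain strings only."""
    for pat, what in mapping:
        m = re.compile('^' + pat + '$').match(path)
        if m:
            # Expand backreferences such as \1 in the handler string.
            # Assumption: this approximates what utils.re_subm returns.
            return m.expand(what), list(m.groups())
    return None, None


mapping = [
    ["/", "Index"],
    ["/hello/(.*)", "Hello"],
    ["/old/(.*)", r"redirect /new/\1"],
]

print(match(mapping, "/hello/bob"))
print(match(mapping, "/old/page"))
print(match(mapping, "/missing"))
```

The captured groups become the arguments for the handler method, and a handler string that begins with "redirect " is what the redirect branch of _delegate looks for.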
_delegate Function
The result returned by the _match function is passed as arguments to the _delegate function:
fn, args = self._match(self.mapping, web.ctx.path)
return self._delegate(fn, self.fvars, args)
Where:
- fn: the object that will handle the current request, usually a class name.
- args: the arguments to pass to the request handler.
- self.fvars: the global namespace passed in when the application was instantiated, used to look up the handler object.
The _delegate function is implemented as follows:
def _delegate(self, f, fvars, args=[]):
    def handle_class(cls):
        meth = web.ctx.method
        if meth == 'HEAD' and not hasattr(cls, meth):
            meth = 'GET'
        if not hasattr(cls, meth):
            raise web.nomethod(cls)
        tocall = getattr(cls(), meth)
        return tocall(*args)

    def is_class(o): return isinstance(o, (types.ClassType, type))

    if f is None:
        raise web.notfound()
    elif isinstance(f, application):
        return f.handle_with_processors()
    elif is_class(f):
        return handle_class(f)
    elif isinstance(f, basestring):
        if f.startswith('redirect '):
            url = f.split(' ', 1)[1]
            if web.ctx.method == "GET":
                x = web.ctx.env.get('QUERY_STRING', '')
                if x:
                    url += '?' + x
            raise web.redirect(url)
        elif '.' in f:
            mod, cls = f.rsplit('.', 1)
            mod = __import__(mod, None, None, [''])
            cls = getattr(mod, cls)
        else:
            cls = fvars[f]
        return handle_class(cls)
    elif hasattr(f, '__call__'):
        return f()
    else:
        return web.notfound()
This function handles the request differently depending on the type of the parameter f:
- If f is None, 404 Not Found is returned.
- If f is an application instance, the sub-application's handle_with_processors() is called.
- If f is a class object, the inner function handle_class is called.
- If f is a string, either a redirect is performed (when the string starts with "redirect "), or the class is looked up by name and handle_class is called (the code we write is generally invoked through this branch).
- If f is a callable object, it is called directly.
- In all other cases, 404 Not Found is returned.
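The class-dispatch step at the heart of _delegate can be sketched without web.py. This is a minimal illustration in Python 3, assuming the HTTP method is passed in explicitly instead of being read from web.ctx; MethodNotAllowed is a hypothetical stand-in for web.nomethod, and the fvars dictionary plays the role of globals().

```python
class MethodNotAllowed(Exception):
    """Hypothetical stand-in for web.nomethod."""


def handle_class(cls, method, args=()):
    # HEAD falls back to GET when the class has no HEAD method.
    if method == "HEAD" and not hasattr(cls, method):
        method = "GET"
    if not hasattr(cls, method):
        raise MethodNotAllowed(method)
    # Instantiate the handler class and call the method named after the HTTP verb.
    return getattr(cls(), method)(*args)


class Hello:
    def GET(self, name):
        return "Hello, " + name + "!"


fvars = {"Hello": Hello}      # plays the role of globals()
cls = fvars["Hello"]          # lookup by name, as in the string branch

print(handle_class(cls, "GET", ["world"]))
```

A new handler instance is created for every request, and the arguments come from the groups captured by the URL pattern.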