This article walks through how Python's http module works, with examples. It has been a long time since my last blog post: I started another internship this year, and between studying and project work my time has been full. Thanks to these two internships I have come into contact with Golang and Python. Learning different languages lets you step outside the limits of C/C++ thinking, pick up the strengths of Golang and Python, and see which scenarios each one suits. I had worked with Linux and C/C++ before, so I got productive in Golang and Python quickly.
Beyond learning how to use a tool, I like to read its source code and understand how it runs, so that using the language or framework becomes as natural as eating and sleeping. I have recently been using the bottle and flask web frameworks and wanted to read their source code, but both are built on Python's own http support, hence this article.
A simple Python http example
Python's http framework consists of a server and a handler. The server sets up the network model, for example using epoll to listen on sockets; the handler processes each socket that becomes ready. Let's start with a simple use of Python http:
    import sys
    from http.server import HTTPServer, SimpleHTTPRequestHandler

    ServerClass = HTTPServer
    HandlerClass = SimpleHTTPRequestHandler

    if __name__ == '__main__':
        # Usage: python3 myhttp.py <host> <port>
        port = int(sys.argv[2])
        server_address = (sys.argv[1], port)
        httpd = ServerClass(server_address, HandlerClass)

        sa = httpd.socket.getsockname()
        print("Serving HTTP on", sa[0], "port", sa[1], "...")

        try:
            # Block here and handle requests until interrupted.
            httpd.serve_forever()
        except KeyboardInterrupt:
            print("\nKeyboard interrupt received, exiting.")
            httpd.server_close()
            sys.exit(0)
Run the example as follows:
    python3 myhttp.py 127.0.0.1 9999
If you create an index.html file in the current directory, you can then open it at http://127.0.0.1:9999/index.html.
In this example the server class is HTTPServer and the handler class is SimpleHTTPRequestHandler, so when HTTPServer detects an incoming request it hands the request to SimpleHTTPRequestHandler for processing. With that in mind, let's analyze the server and the handler in turn.
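To try the same server/handler pairing without a terminal or a browser, here is a minimal sketch that serves a temporary directory from a background thread and fetches index.html once. The function name fetch_index and the use of port 0 (let the OS pick a free port) are my own choices for illustration; the directory= argument requires Python 3.7 or later.

```python
import functools
import http.client
import tempfile
import threading
from http.server import HTTPServer, SimpleHTTPRequestHandler
from pathlib import Path


def fetch_index(body: str = "<h1>hello</h1>") -> str:
    """Serve a temporary directory once and return the fetched index.html."""
    with tempfile.TemporaryDirectory() as root:
        Path(root, "index.html").write_text(body, encoding="utf-8")
        # Bind the handler class to the temporary directory (Python 3.7+).
        handler = functools.partial(SimpleHTTPRequestHandler, directory=root)
        # Port 0 asks the OS for any free port.
        httpd = HTTPServer(("127.0.0.1", 0), handler)
        worker = threading.Thread(target=httpd.serve_forever, daemon=True)
        worker.start()
        conn = http.client.HTTPConnection(
            "127.0.0.1", httpd.server_address[1], timeout=5)
        try:
            conn.request("GET", "/index.html")
            return conn.getresponse().read().decode("utf-8")
        finally:
            conn.close()
            httpd.shutdown()      # stop the serve_forever loop
            httpd.server_close()  # close the listening socket
            worker.join()


if __name__ == "__main__":
    print(fetch_index())
```

The request log line that SimpleHTTPRequestHandler writes to stderr is expected.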
HTTP server
The http module makes full use of object-oriented inheritance and polymorphism. I had read the code of the TFS file system before, so reading Python http was not too daunting. First, here is the inheritance hierarchy of the server classes:
    +------------+         +-----------------------------+
    |            |         | TCPServer base class        |
    | BaseServer +-------->| enable event loop listening |
    |            |         | process client requests     |
    +-----+------+         +-----------------------------+
          |
          v
    +------------+         +-----------------------+
    |            |         | HTTPServer base class |
    | TCPServer  +-------->| set listening socket  |
    |            |         | enable listening      |
    +-----+------+         +-----------------------+
          |
          v
    +------------+
    | HTTPServer |
    +------------+
The diagram above shows the inheritance relationship. BaseServer and TCPServer live in socketserver.py, and HTTPServer lives in http/server.py. Let's look at BaseServer first.
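The hierarchy can also be checked from the interpreter. This short sketch just prints the method resolution order of HTTPServer; the variable name chain is only for illustration.

```python
import socketserver
from http.server import HTTPServer

# Walk the method resolution order from the most derived class upward.
chain = [cls.__name__ for cls in HTTPServer.__mro__]
print(chain)  # ['HTTPServer', 'TCPServer', 'BaseServer', 'object']

# The two base classes are defined in the socketserver module.
print(socketserver.TCPServer.__module__, socketserver.BaseServer.__module__)
```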
BaseServer
BaseServer is the base class of all servers, so it abstracts as much of their common behavior as possible. Running the event listening loop, for example, is something every server does, and it is also BaseServer's main job. Here is the core code of BaseServer:
    def serve_forever(self, poll_interval=0.5):
        self.__is_shut_down.clear()
        try:
            with _ServerSelector() as selector:
                selector.register(self, selectors.EVENT_READ)
                while not self.__shutdown_request:
                    ready = selector.select(poll_interval)
                    if ready:
                        self._handle_request_noblock()
                    self.service_actions()
        finally:
            self.__shutdown_request = False
            self.__is_shut_down.set()
The selector in this code wraps I/O multiplexing mechanisms such as select, poll, and epoll. The server registers its own listening socket with the selector to start listening for events, and when a client connects it calls self._handle_request_noblock() to process the request. Next, let's see what that function does.
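The register-then-select pattern can be tried on its own, outside socketserver. The sketch below is an assumption-laden illustration: it uses selectors.DefaultSelector (socketserver chooses its own selector class internally), and the function name wait_for_client is made up for this example.

```python
import selectors
import socket


def wait_for_client(poll_interval: float = 0.5) -> bool:
    """Return True if the listener is idle first and readable after a connect."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: any free port
    listener.listen(5)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        with selectors.DefaultSelector() as selector:
            # Same idea as selector.register(self, EVENT_READ) in serve_forever.
            selector.register(listener, selectors.EVENT_READ)
            idle = selector.select(0)  # nothing is pending yet
            client.connect(listener.getsockname())
            ready = selector.select(poll_interval)  # the pending connection
            if ready:
                conn, _addr = listener.accept()
                conn.close()
            return not idle and bool(ready)
    finally:
        client.close()
        listener.close()


if __name__ == "__main__":
    print(wait_for_client())
```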
    def _handle_request_noblock(self):
        try:
            request, client_address = self.get_request()
        except OSError:
            return
        if self.verify_request(request, client_address):
            try:
                self.process_request(request, client_address)
            except:
                self.handle_error(request, client_address)
                self.shutdown_request(request)
        else:
            self.shutdown_request(request)
_handle_request_noblock is an internal function. It first accepts the client connection through get_request (which ultimately wraps the accept system call), then verifies the request, and finally calls process_request to handle it. get_request is implemented by subclasses because TCP and UDP receive client requests differently (TCP is connection-oriented, UDP is connectionless).
Next, let's look at what process_request does:
    def process_request(self, request, client_address):
        self.finish_request(request, client_address)
        self.shutdown_request(request)

    # -------------------------------------------------
    def finish_request(self, request, client_address):
        self.RequestHandlerClass(request, client_address, self)

    def shutdown_request(self, request):
        self.close_request(request)
process_request first calls finish_request to handle the connection and, once that returns, calls shutdown_request to close it. finish_request instantiates the handler class and passes in the client's socket and address, which means the handler class completes the whole request inside its initialization function. We will come back to this when analyzing the handler.
That is everything BaseServer does. BaseServer cannot be used directly, because some of its functions are not implemented yet; it only serves as the abstraction layer above TCP and UDP. In summary:
Call serve_forever to start listening for events;
When a client request arrives, hand it to the handler for processing;
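The two summary steps can be seen end to end with a tiny TCP server. This is only a sketch: UpperHandler and round_trip are hypothetical names, the server runs in a background thread, and a single recv is assumed to be enough for a short payload.

```python
import socket
import socketserver
import threading


class UpperHandler(socketserver.BaseRequestHandler):
    """Hypothetical handler: read once, reply with the upper-cased bytes."""

    def handle(self):
        data = self.request.recv(1024)
        self.request.sendall(data.upper())


def round_trip(payload: bytes) -> bytes:
    # Port 0 lets the OS choose a free port; the with block closes the server.
    with socketserver.TCPServer(("127.0.0.1", 0), UpperHandler) as server:
        # Step 1: serve_forever starts the event listening loop.
        worker = threading.Thread(target=server.serve_forever, daemon=True)
        worker.start()
        try:
            # Step 2: the incoming request is handed to the handler.
            with socket.create_connection(server.server_address, timeout=5) as conn:
                conn.sendall(payload)
                return conn.recv(1024)
        finally:
            server.shutdown()  # asks the serve_forever loop to exit
            worker.join()


if __name__ == "__main__":
    print(round_trip(b"ping"))
```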
TCPServer
From what BaseServer abstracts, we can tell what TCPServer or UDPServer still has to do: create the listening socket, bind it and start listening, and accept the client when a request arrives. Here is the code:
    # BaseServer
    def __init__(self, server_address, RequestHandlerClass):
        """Constructor.  May be extended, do not override."""
        self.server_address = server_address
        self.RequestHandlerClass = RequestHandlerClass
        self.__is_shut_down = threading.Event()
        self.__shutdown_request = False

    # --------------------------------------------------------------------------------
    # TCPServer
    def __init__(self, server_address, RequestHandlerClass, bind_and_activate=True):
        BaseServer.__init__(self, server_address, RequestHandlerClass)
        self.socket = socket.socket(self.address_family, self.socket_type)
        if bind_and_activate:
            try:
                self.server_bind()
                self.server_activate()
            except:
                self.server_close()
                raise
When TCPServer is initialized, it first calls the initialization function of its base class BaseServer, which stores the server address and the handler class. It then creates its own listening socket, and finally calls server_bind to bind the socket and server_activate to start listening on it.
    def server_bind(self):
        if self.allow_reuse_address:
            self.socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        self.socket.bind(self.server_address)
        self.server_address = self.socket.getsockname()

    def server_activate(self):
        self.socket.listen(self.request_queue_size)
TCPServer also implements one more function, the one that accepts client requests:
    def get_request(self):
        return self.socket.accept()
If you have done Linux network programming, this code will look familiar, because the function names match the system calls that Linux provides.
TCPServer already provides the main framework of a TCP-based server, so HTTPServer, which inherits from it, only overrides server_bind and sets allow_reuse_address.
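The bind and activate steps can be driven by hand by passing bind_and_activate=False. The sketch below does that for HTTPServer and reports what its server_bind leaves behind; describe_server is a hypothetical helper name, and the server_name and server_port attributes reflect my reading of the standard library source.

```python
from http.server import HTTPServer, SimpleHTTPRequestHandler


def describe_server() -> dict:
    # bind_and_activate=False skips server_bind() and server_activate().
    httpd = HTTPServer(("127.0.0.1", 0), SimpleHTTPRequestHandler,
                       bind_and_activate=False)
    try:
        httpd.server_bind()      # HTTPServer override: bind, then record name and port
        httpd.server_activate()  # TCPServer: socket.listen(request_queue_size)
        return {
            "reuse": bool(httpd.allow_reuse_address),
            "port_matches": httpd.server_port == httpd.socket.getsockname()[1],
            "has_name": isinstance(httpd.server_name, str),
        }
    finally:
        httpd.server_close()


if __name__ == "__main__":
    print(describe_server())
```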
With that, we can trace the startup of the example program above:
httpd = ServerClass(server_address, HandlerClass): initializing HTTPServer mainly runs the initialization method of the base class TCPServer, which creates the listening socket, binds it, and starts listening;
httpd.serve_forever(): this calls the serve_forever method of the base class BaseServer, which starts the listening loop and waits for client connections;
If you have read the source code of redis or of other backend components, this concurrency model will be familiar. Having analyzed the server, let's look at how the handler processes client requests.
HTTP handler
This part covers the handler at the TCP layer and the handler at the HTTP application layer. The TCP-layer handler cannot be used on its own: the TCP layer is only responsible for transferring bytes and does not know how to parse or process the bytes it receives. To work over TCP, the application layer must therefore inherit from the TCP handler and implement the handle function. The HTTP-layer handler, for example, implements handle to parse the HTTP protocol, process the business request, and return the result to the client. Let's look at the TCP-layer handler first.
TCP-layer handler
The TCP-layer handlers are BaseRequestHandler and StreamRequestHandler (both in socketserver.py). Here is the BaseRequestHandler code:
    class BaseRequestHandler:

        def __init__(self, request, client_address, server):
            self.request = request
            self.client_address = client_address
            self.server = server
            self.setup()
            try:
                self.handle()
            finally:
                self.finish()

        def setup(self):
            pass

        def handle(self):
            pass

        def finish(self):
            pass
When looking at the server, we saw that a client request is processed entirely inside the handler class's initialization function. This base-class initialization function shows that request processing goes through three steps:
setup: applies settings to the client socket;
handle: the function that actually processes the request;
finish: closes the socket's read and write sides;
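The order of the three steps, and the fact that finish runs even when handle fails, can be traced with a small sketch. TracingHandler and trace_lifecycle are hypothetical names, and the handler is instantiated directly with placeholder arguments instead of a real socket and server.

```python
import socketserver


class TracingHandler(socketserver.BaseRequestHandler):
    """Hypothetical handler that records which hook runs, in order."""

    calls = []

    def setup(self):
        self.calls.append("setup")

    def handle(self):
        self.calls.append("handle")
        raise RuntimeError("simulated failure while handling")

    def finish(self):
        self.calls.append("finish")


def trace_lifecycle():
    TracingHandler.calls = []
    try:
        # Constructing the handler is what runs setup/handle/finish.
        TracingHandler(request=None, client_address=("127.0.0.1", 0), server=None)
    except RuntimeError:
        pass  # the error from handle propagates after finish has run
    return list(TracingHandler.calls)


if __name__ == "__main__":
    print(trace_lifecycle())  # ['setup', 'handle', 'finish']
```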
BaseRequestHandler is the top-level base class of all handlers. It only defines the overall handler skeleton and does no real processing. Next, the TCP handler:
    class StreamRequestHandler(BaseRequestHandler):

        timeout = None
        disable_nagle_algorithm = False

        def setup(self):
            self.connection = self.request
            if self.timeout is not None:
                self.connection.settimeout(self.timeout)
            if self.disable_nagle_algorithm:
                self.connection.setsockopt(socket.IPPROTO_TCP,
                                           socket.TCP_NODELAY, True)
            self.rfile = self.connection.makefile('rb', self.rbufsize)
            self.wfile = self.connection.makefile('wb', self.wbufsize)

        def finish(self):
            if not self.wfile.closed:
                try:
                    self.wfile.flush()
                except socket.error:
                    pass
            self.wfile.close()
            self.rfile.close()
The TCP handler implements setup and finish. setup applies the timeout if one is set, disables the Nagle algorithm when disable_nagle_algorithm is true (by setting TCP_NODELAY), and creates the read and write file objects rfile and wfile over the socket. finish flushes wfile and closes both file objects.
Given these two TCP-layer handlers, implementing an HTTP server handler only requires inheriting from StreamRequestHandler and implementing the handle function. That is the main job of the HTTP-layer handler.
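To see how little a subclass of StreamRequestHandler has to do, here is a sketch that only implements handle and relies on the inherited setup and finish for rfile and wfile. UpperLineHandler and serve_one_line are hypothetical names, and socket.socketpair stands in for an accepted client connection so that no server is needed.

```python
import socket
import socketserver


class UpperLineHandler(socketserver.StreamRequestHandler):
    """Hypothetical handler: read one line, write it back upper-cased."""

    def handle(self):
        line = self.rfile.readline()    # rfile was created by setup()
        self.wfile.write(line.upper())  # wfile is flushed by finish()


def serve_one_line(line: bytes) -> bytes:
    # A connected socket pair plays the role of (accepted socket, client).
    server_side, client_side = socket.socketpair()
    try:
        client_side.sendall(line)
        # Instantiating the handler runs setup, handle, and finish.
        UpperLineHandler(server_side, ("local", 0), None)
        return client_side.recv(1024)
    finally:
        server_side.close()
        client_side.close()


if __name__ == "__main__":
    print(serve_one_line(b"hello\n"))
```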
HTTP-layer handler
From the TCP-layer handler we know that the HTTP-layer handler inherits from it and mainly implements the handle function to process client requests. Straight to the code:
    def handle(self):
        self.close_connection = True

        self.handle_one_request()
        while not self.close_connection:
            self.handle_one_request()
This is BaseHTTPRequestHandler's handle function. It calls handle_one_request to process one request. The default is a short-lived connection: close_connection starts out True, so after one request the while loop is skipped and no further request is read from the same connection. The flag can be changed while the request is parsed inside handle_one_request: if the request's Connection header is keep-alive, or the HTTP version is 1.1 or higher, the connection can be kept open as a persistent connection. Next, let's look at what handle_one_request does:
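Connection reuse can be observed with a small sketch. As far as I can tell from the standard library source, the handler itself must also declare protocol_version = "HTTP/1.1" (the default is HTTP/1.0) and send a Content-Length, otherwise the connection is still closed after each response. KeepAliveHandler and two_requests_one_connection are hypothetical names; the client port is recorded to show that both requests arrive on the same connection.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class KeepAliveHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # assumption: needed for persistent connections
    seen_ports = []

    def do_GET(self):
        # Record the client's source port for each request.
        type(self).seen_ports.append(self.client_address[1])
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def two_requests_one_connection():
    KeepAliveHandler.seen_ports = []
    httpd = HTTPServer(("127.0.0.1", 0), KeepAliveHandler)
    worker = threading.Thread(target=httpd.serve_forever, daemon=True)
    worker.start()
    conn = http.client.HTTPConnection(
        "127.0.0.1", httpd.server_address[1], timeout=5)
    try:
        for _ in range(2):
            conn.request("GET", "/")
            conn.getresponse().read()
        return list(KeepAliveHandler.seen_ports)
    finally:
        conn.close()      # ends the handler's while loop on the server side
        httpd.shutdown()
        httpd.server_close()
        worker.join()


if __name__ == "__main__":
    ports = two_requests_one_connection()
    print(ports[0] == ports[1])
```

The client is closed before shutdown is called because the single-threaded server stays inside handle for as long as the connection is open.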
    def handle_one_request(self):
        try:
            self.raw_requestline = self.rfile.readline(65537)
            if len(self.raw_requestline) > 65536:
                self.requestline = ''
                self.request_version = ''
                self.command = ''
                self.send_error(HTTPStatus.REQUEST_URI_TOO_LONG)
                return
            if not self.raw_requestline:
                self.close_connection = True
                return
            if not self.parse_request():
                return
            mname = 'do_' + self.command
            if not hasattr(self, mname):
                self.send_error(
                    HTTPStatus.NOT_IMPLEMENTED,
                    "Unsupported method (%r)" % self.command)
                return
            method = getattr(self, mname)
            method()
            self.wfile.flush()
        except socket.timeout as e:
            self.log_error("Request timed out: %r", e)
            self.close_connection = True
            return
handle_one_request runs as follows:
First, call parse_request to parse the client's HTTP request;
Build the name of the request-processing function as "do_" + command;
Call that function to process the business logic and return the response to the client;
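The "do_" + command lookup means a method without a matching do_ function is answered with 501 Not Implemented. A minimal sketch, with the hypothetical names GetOnlyHandler and statuses, that sends one GET and one POST to a handler defining only do_GET:

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class GetOnlyHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that defines do_GET and nothing else."""

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def statuses(methods):
    """Return {method: HTTP status} using one fresh connection per request."""
    httpd = HTTPServer(("127.0.0.1", 0), GetOnlyHandler)
    worker = threading.Thread(target=httpd.serve_forever, daemon=True)
    worker.start()
    results = {}
    try:
        for method in methods:
            conn = http.client.HTTPConnection(
                "127.0.0.1", httpd.server_address[1], timeout=5)
            try:
                conn.request(method, "/")
                response = conn.getresponse()
                response.read()
                results[method] = response.status
            finally:
                conn.close()
        return results
    finally:
        httpd.shutdown()
        httpd.server_close()
        worker.join()


if __name__ == "__main__":
    print(statuses(["GET", "POST"]))  # {'GET': 200, 'POST': 501}
```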
BaseHTTPRequestHandler is the base class of HTTP handlers, so it cannot be used directly: it does not define any request-processing function, that is, the do_ method looked up above. Fortunately, Python ships a simple SimpleHTTPRequestHandler that inherits from BaseHTTPRequestHandler and implements the request functions. Let's look at do_GET:
    # SimpleHTTPRequestHandler
    # ---------------------------------------------
    def do_GET(self):
        """Serve a GET request."""
        f = self.send_head()
        if f:
            try:
                self.copyfile(f, self.wfile)
            finally:
                f.close()
do_GET first calls send_head, which sends the response headers to the client and returns the requested file. It then calls copyfile to send the file's content to the client over the connection.
That covers the basics of the http module. To finish, here is a summary of the handler side of the example program:
The server passes the request to SimpleHTTPRequestHandler's initialization function;
During initialization, SimpleHTTPRequestHandler sets up the client connection;
It then calls the handle function to process the request;
handle in turn calls handle_one_request;
handle_one_request parses the request and looks up the request-processing function;
The request in the example is a GET, so do_GET is called directly and returns the index.html file to the client;
That concludes the analysis of Python's http module. You may have noticed that the module is not very convenient to use: it dispatches to a request function by request method, so when one method, such as GET or POST, is used for many different requests, that single function has to branch on every case, grows very large, and becomes hard to write. Of course, SimpleHTTPRequestHandler is only a simple example that Python provides.
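One common way to keep do_GET from growing is to look the path up in a table instead of branching inside the function. The sketch below is only an illustration of that idea and is not part of the http module: ROUTES, dispatch, hello, health, and RoutedHandler are all hypothetical names, and query strings are ignored.

```python
from http.server import BaseHTTPRequestHandler


def hello() -> str:
    return "hello"


def health() -> str:
    return "ok"


# Hypothetical routing table: request path -> function producing the body.
ROUTES = {"/hello": hello, "/health": health}


def dispatch(path: str):
    """Return (status, text) for a path using the table above."""
    view = ROUTES.get(path)
    if view is None:
        return 404, "not found"
    return 200, view()


class RoutedHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # do_GET stays small; adding a page means adding a table entry.
        status, text = dispatch(self.path)
        body = text.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    print(dispatch("/hello"), dispatch("/missing"))
```

RoutedHandler can be passed to HTTPServer in place of SimpleHTTPRequestHandler in the example program.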
Python also officially provides a better framework for HTTP: the WSGI server and WSGI application. The following articles will first analyze the wsgiref module that ships with Python and bottle, and then flask.