Python experience: using the urllib3 HTTP connection pool

Last Update: 2018-12-03 Source: Internet

Author: User


You can download the library and related materials from http://code.google.com/p/urllib3.

First, here is how it is used:

    # coding=utf8
    import time

    import urllib3

    # Create a connection pool bound to one specific host
    http_pool = urllib3.HTTPConnectionPool('ent.qq.com')

    # Record the start time
    strStart = time.strftime('%X %x %Z')

    for i in range(0, 100, 1):
        print(i)
        # Build the URL string
        url = 'http://ent.qq.com/a/20151116/%06d.htm' % i
        print(url)
        # Fetch the content synchronously through the pool
        r = http_pool.urlopen('GET', url, redirect=False)
        print(r.status, r.headers, len(r.data))

    # Print the start and end times
    print('start time:', strStart)
    print('end time:', time.strftime('%X %x %Z'))

This is fairly simple: first create the connection pool http_pool, then repeatedly fetch URL resources from the same host ('ent.qq.com').
A packet capture in Wireshark shows the following:

All requests for the http://ent.qq.com/a/20151116/ pages use source port 13136, which shows that the same TCP connection is being reused.
This matches the keep-alive behavior described in the urllib3 documentation, and the Connection header of every response is keep-alive.
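The same observation can be reproduced without a packet capture. The sketch below is a minimal Python 3 illustration that uses only the standard library as a stand-in for urllib3: a local test server records the client's source port for each request, and a single `http.client.HTTPConnection` sends several requests over it. The names `PortRecordingHandler` and `source_ports_for` are hypothetical and exist only for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class PortRecordingHandler(BaseHTTPRequestHandler):
    # HTTP/1.1 makes the stdlib server keep connections alive by default.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        # client_address is (ip, source_port) of the underlying TCP connection.
        self.server.seen_ports.append(self.client_address[1])
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def source_ports_for(n_requests):
    """Send n_requests over one connection; return the source ports the server saw."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), PortRecordingHandler)
    server.seen_ports = []
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
        for i in range(n_requests):
            conn.request("GET", "/page%d" % i)
            response = conn.getresponse()
            response.read()  # drain the body so the socket can be reused
        conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return server.seen_ports


if __name__ == "__main__":
    print(source_ports_for(3))
```

If the connection is kept alive, every entry in the returned list is the same port number, which is the behavior seen in the capture.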

How is this connection pool implemented?


    def urlopen(self, method, url, body=None, headers=None, retries=3,
                redirect=True, assert_same_host=True):
        # Many condition checks have been removed here
        try:
            # Obtain a connection
            conn = self._get_conn()
            # Build and send the request
            self.num_requests += 1
            conn.request(method, url, body=body, headers=headers)
            # Set the timeout
            conn.sock.settimeout(self.timeout)
            httplib_response = conn.getresponse()
            # ......
            # Parse the HTTP response
            response = HTTPResponse.from_httplib(httplib_response)
            # Put the current connection back into the queue for reuse
            self._put_conn(conn)
        except Exception:
            # Error handling ...
            raise
        # Redirect handling; this is where the recursion happens
        if (redirect and
                response.status in [301, 302, 303, 307] and
                'location' in response.headers):  # Redirect, retry
            log.info("Redirecting %s -> %s" %
                     (url, response.headers.get('location')))
            return self.urlopen(method, response.headers.get('location'),
                                body, headers, retries - 1, redirect,
                                assert_same_host)
        # Return the response
        return response
As the simplified code above shows, urlopen first obtains a connection, then builds and sends the request, and then reads the response.
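The redirect branch at the end of urlopen can be read in isolation. The sketch below is a minimal Python 3 illustration, not urllib3's code: `follow_redirects` and the `fetch` callable are hypothetical names, and the explicit check on the remaining budget is an assumption, since the excerpt omits most of the condition checks.

```python
REDIRECT_STATUSES = (301, 302, 303, 307)


def follow_redirects(fetch, url, retries=3):
    """Call fetch(url) -> (status, headers) and follow Location headers.

    Each redirect consumes one unit of the retries budget, mirroring the
    retries - 1 argument in the recursive call of the excerpt.
    """
    status, headers = fetch(url)
    if status in REDIRECT_STATUSES and "location" in headers:
        if retries <= 0:
            # Assumed stopping rule; the excerpt does not show this check.
            raise RuntimeError("redirect budget exhausted at %s" % url)
        return follow_redirects(fetch, headers["location"], retries - 1)
    return url, status


if __name__ == "__main__":
    # A fake site: /a redirects to /b, which redirects to /c.
    site = {
        "/a": (302, {"location": "/b"}),
        "/b": (301, {"location": "/c"}),
        "/c": (200, {}),
    }
    print(follow_redirects(site.__getitem__, "/a"))
```

Passing the fetch step in as a callable keeps the recursion visible without any network access.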
Note that every request obtains its connection by calling _get_conn.
Once the response has been received, _put_conn is called to put the connection back into the pool. The related code is as follows:
    def _new_conn(self):
        # Create a new connection
        return HTTPConnection(host=self.host, port=self.port)

    def _get_conn(self, timeout=None):
        # Try to get a connection from the pool
        conn = None
        try:
            conn = self.pool.get(block=self.block, timeout=timeout)
            # Check whether the connection is still established
            if conn and conn.sock and select([conn.sock], [], [], 0.0)[0]:
                # Either data is buffered (bad), or the connection is dropped.
                log.warning("Connection pool detected dropped "
                            "connection, resetting: %s" % self.host)
                conn.close()
        except Empty, e:
            pass  # Oh well, we'll create a new connection then
        # If the queue is empty or the queued connection was dropped,
        # create a new connection to the same host and port
        return conn or self._new_conn()

    def _put_conn(self, conn):
        # Put the connection back into the queue. The queue's default
        # maximum size is 1; a connection that does not fit is discarded.
        try:
            self.pool.put(conn, block=False)
        except Full, e:
            # This should never happen if self.block == True
            log.warning("HttpConnectionPool is full, discarding connection: %s"
                        % self.host)
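The get-or-create and put-or-discard pattern can be tried out on its own. The following is a simplified Python 3 sketch under stated assumptions: `TinyPool`, its `factory` argument, and its counters are hypothetical names, and the dropped-connection check done with select in the excerpt is left out.

```python
import queue


class TinyPool:
    """A bounded pool: reuse a queued connection or create a new one."""

    def __init__(self, factory, maxsize=1):
        self._factory = factory              # callable that builds a connection
        self._queue = queue.Queue(maxsize)   # holds idle connections
        self.created = 0
        self.discarded = 0

    def get(self):
        try:
            # Reuse an idle connection if one is queued.
            return self._queue.get(block=False)
        except queue.Empty:
            # Nothing idle: fall back to a brand-new connection.
            self.created += 1
            return self._factory()

    def put(self, conn):
        try:
            self._queue.put(conn, block=False)
            return True
        except queue.Full:
            # The pool is already at maxsize, so this connection is dropped.
            self.discarded += 1
            return False


if __name__ == "__main__":
    pool = TinyPool(factory=object, maxsize=1)
    first = pool.get()
    pool.put(first)
    print(pool.get() is first, pool.created, pool.discarded)
```

With a maximum size of 1, a sequential client keeps getting the same connection back, while a second connection returned at the same time is simply discarded.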
I compared this pool with the ordinary urllib library by fetching different pages under the same domain name one after another. The speed did not improve noticeably, probably because the server is close to my machine: the pool mainly saves TCP handshakes and slow-start phases, and that benefit does not show up well over a short network path.
Does anyone have suggestions for a good way to test the performance?
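One possible starting point, offered as a rough sketch and not as a rigorous benchmark, is to time every request separately so that the first request (which pays for the TCP handshake) can be compared with the later ones. The helper name `time_fetches` and the keys of its result are hypothetical; `fetch` is any callable that downloads one URL, and `urls` must not be empty.

```python
import statistics
import time


def time_fetches(fetch, urls):
    """Time fetch(url) for each URL and summarize the per-request latencies."""
    samples = []
    for url in urls:
        start = time.perf_counter()
        fetch(url)
        samples.append(time.perf_counter() - start)
    return {
        "count": len(samples),
        "total": sum(samples),
        "first": samples[0],  # includes connection setup when nothing is pooled yet
        "median": statistics.median(samples),
    }
```

To compare the two approaches, the same URL list could be passed once with a callable that goes through the pool, such as `lambda u: http_pool.urlopen('GET', u).data`, and once with a callable that opens a new connection per request, such as `lambda u: urllib.request.urlopen(u).read()`. Running against a more distant server, or repeating the runs and alternating their order, should make the handshake savings easier to see than a nearby server does.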
The urllib3 project also raises the question of whether to provide a pool of connection pools that automatically creates one pool per host when different websites are accessed, namely HTTPOcean.
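Current urllib3 releases provide this kind of per-host pool management through PoolManager. The idea itself is small enough to sketch in Python 3 with the standard library only. In the sketch below, `HostPools` and `pool_factory` are hypothetical names; the factory stands in for whatever builds a real pool, for example `urllib3.HTTPConnectionPool`.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


class HostPools:
    """Keep one pool per (scheme, host, port), created on first use."""

    def __init__(self, pool_factory):
        self._pool_factory = pool_factory  # callable(host, port) -> pool
        self._pools = {}

    def pool_for(self, url):
        parts = urlsplit(url)
        port = parts.port or DEFAULT_PORTS.get(parts.scheme, 80)
        key = (parts.scheme, parts.hostname, port)
        if key not in self._pools:
            # First request to this host: create its pool lazily.
            self._pools[key] = self._pool_factory(parts.hostname, port)
        return self._pools[key]
```

Requests to different pages of the same host then share one pool, while a request to another host transparently gets a pool of its own.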

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
