Python Asynchronous IO---Easily manage 10k+ concurrent connections

Last Update:2015-04-16 Source: Internet

Author: User

Tags benchmark

Foreword

Asynchronous operation is a universal concept in both software and hardware systems. It stems from the large differences in processing speed among the entities that have to cooperate. In software development, the most common case is the speed mismatch between the CPU and IO, which is why asynchronous IO appears in so many programming frameworks, on the client side (browsers) and on the server side (for example, Node.js). This article focuses on asynchronous IO in Python.

Python 3.4 added a new standard library module, asyncio, to support asynchronous IO. Its API status is currently provisional, which means backward compatibility is not guaranteed and the module could even be removed from the standard library (though that is very unlikely). Anyone who follows the PEPs and python-dev will know that the module was a long time in the making, and the API and implementation may still be adjusted. Even so, asyncio is practical and powerful, and well worth studying in depth.

Example

asyncio mainly deals with TCP/UDP socket communication. It manages a large number of connections without creating a large number of threads, which improves system efficiency. Here, an example from the official documentation is lightly adapted into a benchmark tool for HTTP persistent (keep-alive) connections, used to assess how well a web server handles long-lived connections.

Feature overview: the tool opens 10 connections every 10 milliseconds until the target number of connections (for example, 10k) is reached, and each connection periodically sends a HEAD request to the server to maintain HTTP keep-alive.

The code is as follows:

import argparse
import asyncio
import functools
import logging
import random
import urllib.parse

# This listing uses the generator-based asyncio API of Python 3.3/3.4
# (@asyncio.coroutine, "yield from", asyncio.async). Later Python releases
# renamed or removed these names.

loop = asyncio.get_event_loop()


@asyncio.coroutine
def print_http_headers(no, url, keepalive):
    url = urllib.parse.urlsplit(url)
    # Every network operation below is bounded by a 3-second timeout.
    wait_for = functools.partial(asyncio.wait_for, timeout=3, loop=loop)
    query = ('HEAD {url.path} HTTP/1.1\r\n'
             'Host: {url.hostname}\r\n'
             '\r\n').format(url=url).encode('utf-8')
    rd, wr = yield from wait_for(asyncio.open_connection(url.hostname, 80))
    while True:
        wr.write(query)
        while True:
            line = yield from wait_for(rd.readline())
            if not line:  # end of connection
                wr.close()
                return no
            line = line.decode('utf-8').rstrip()
            if not line:  # end of header
                break
            logging.debug('(%d) HTTP header> %s' % (no, line))
        # Wait a random interval shorter than the keepalive timeout.
        yield from asyncio.sleep(random.randint(1, keepalive // 2))


@asyncio.coroutine
def do_requests(args):
    conn_pool = set()
    waiter = asyncio.Future()

    def _on_complete(fut):
        conn_pool.remove(fut)
        # Check the exception first: fut.result() re-raises it if one is set.
        exc = fut.exception()
        if exc is not None:
            logging.info('conn#{} exception'.format(exc))
        else:
            logging.info('conn#{} result'.format(fut.result()))
        if not conn_pool:
            waiter.set_result('event loop is done')

    for i in range(args.connections):
        fut = asyncio.async(print_http_headers(i, args.url, args.keepalive))
        fut.add_done_callback(_on_complete)
        conn_pool.add(fut)
        if i % 10 == 0:
            # Pause 10 ms after every 10 connections.
            yield from asyncio.sleep(0.01)
    logging.info((yield from waiter))


def main():
    parser = argparse.ArgumentParser(description='asyncli')
    parser.add_argument('url', help='page address')
    parser.add_argument('-c', '--connections', type=int, default=1,
                        help='number of connections simultaneously')
    parser.add_argument('-k', '--keepalive', type=int, default=60,
                        help='HTTP keepalive timeout')
    args = parser.parse_args()
    logging.basicConfig(level=logging.INFO, format='%(asctime)s %(message)s')
    loop.run_until_complete(do_requests(args))
    loop.close()


if __name__ == '__main__':
    main()
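
The listing above targets the Python 3.3/3.4 API. For readers on a current interpreter, the same idea might be sketched with async/await as shown here. This is only a sketch, assuming Python 3.7 or later; the names keep_alive and run, and the rounds and timeout parameters, are hypothetical and do not appear in the original tool. The sleep between requests is also simplified to a random float.

```python
import asyncio
import logging
import random
import urllib.parse


async def keep_alive(no, url, keepalive, rounds=None, timeout=3):
    """Hold one connection open and send a HEAD request per round.

    `rounds` and `timeout` are hypothetical parameters added for this sketch;
    rounds=None repeats until the server closes the connection.
    """
    parts = urllib.parse.urlsplit(url)
    query = ('HEAD {path} HTTP/1.1\r\n'
             'Host: {host}\r\n'
             '\r\n').format(path=parts.path or '/',
                            host=parts.hostname).encode('utf-8')
    rd, wr = await asyncio.wait_for(
        asyncio.open_connection(parts.hostname, parts.port or 80), timeout)
    sent = 0
    try:
        while rounds is None or sent < rounds:
            wr.write(query)
            sent += 1
            while True:
                line = await asyncio.wait_for(rd.readline(), timeout)
                if not line:  # end of connection
                    return no, sent
                text = line.decode('utf-8').rstrip()
                if not text:  # end of header
                    break
                logging.debug('(%d) HTTP header> %s', no, text)
            if rounds is None or sent < rounds:
                # Assumption: a float delay of at most keepalive / 2 seconds
                # (the original listing uses random.randint(1, keepalive // 2)).
                await asyncio.sleep(random.uniform(0, keepalive / 2))
    finally:
        wr.close()
    return no, sent


async def run(url, connections, keepalive, rounds=None):
    """Start the connections, pausing 10 ms after every 10th one."""
    tasks = []
    for i in range(connections):
        tasks.append(asyncio.create_task(
            keep_alive(i, url, keepalive, rounds)))
        if i % 10 == 0:
            await asyncio.sleep(0.01)
    # Exceptions are returned in the result list instead of being raised.
    return await asyncio.gather(*tasks, return_exceptions=True)

# Possible invocation, mirroring the benchmark run in this article:
# asyncio.run(run('http://10.211.55.8/', 10000, 60))
```

Collecting the results with asyncio.gather replaces the hand-written waiter Future and done callback of the original listing; whether that is preferable depends on whether per-connection logging at completion time is needed.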

Test and Analysis

Hardware: 2.3 GHz CPU with 2 cores, 2 GB RAM
Software: CentOS 6.5 (kernel 2.6.32), Python 3.3 (pip install asyncio), nginx 1.4.7
Parameter settings: ulimit -n 10240; nginx worker connections set to 10240

Start the web server with a single worker process:

# ../sbin/nginx
# ps ax | grep nginx
2007 ?        Ss     0:00 nginx: master process ../sbin/nginx
2008 ?        S      0:00 nginx: worker process

Start the benchmark tool and open 10k connections; the target URL is the default nginx test page:

$ python asyncli.py http://10.211.55.8/ -c 10000

Average number of requests per second, computed from the nginx access log:

# tail -1000000 access.log | awk '{print $4}' | sort | uniq -c | awk '{cnt+=1; sum+=$1} END {printf "avg = %d\n", sum/cnt}'
avg = 548
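
The shell pipeline above groups log lines by their fourth whitespace-separated field (the timestamp, with one-second resolution), counts the lines per second, and averages those counts. The same calculation could be sketched in Python as follows. The function name and the sample log lines are hypothetical, and the sample assumes the default nginx access log layout.

```python
from collections import Counter


def average_requests_per_second(lines):
    """Average number of log lines per distinct value of the 4th field."""
    per_second = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) >= 4:  # skip malformed lines (awk would count them under an empty key)
            per_second[fields[3]] += 1
    if not per_second:
        return 0
    # Integer division mirrors the %d truncation in the awk printf.
    return sum(per_second.values()) // len(per_second)


# Hypothetical sample: three requests in one second, one in the next.
sample = [
    '10.0.0.1 - - [16/Apr/2015:10:00:01 +0800] "HEAD / HTTP/1.1" 200 0',
    '10.0.0.1 - - [16/Apr/2015:10:00:01 +0800] "HEAD / HTTP/1.1" 200 0',
    '10.0.0.1 - - [16/Apr/2015:10:00:01 +0800] "HEAD / HTTP/1.1" 200 0',
    '10.0.0.1 - - [16/Apr/2015:10:00:02 +0800] "HEAD / HTTP/1.1" 200 0',
]
print('avg = %d' % average_requests_per_second(sample))  # avg = 2
```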

Partial output of top:

VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
657m 115m 3860 R 60.2  6.2  4:30.02  python
54208 10m  848 R  7.0  0.6  0:30.79  nginx
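
The resource ratios discussed in the summary can be recomputed directly from the two rows above. The snippet below is only a worked calculation on the %CPU and %MEM columns; the variable names are illustrative.

```python
# %CPU and %MEM values copied from the top output above.
python_cpu, python_mem = 60.2, 6.2
nginx_cpu, nginx_mem = 7.0, 0.6

cpu_ratio = python_cpu / nginx_cpu   # about 8.6
mem_ratio = python_mem / nginx_mem   # about 10.3
combined = cpu_ratio * mem_ratio     # about 89, i.e. close to two orders of magnitude

print('CPU x%.1f, RAM x%.1f, combined x%.0f' % (cpu_ratio, mem_ratio, combined))
```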

Summary

1. Python is simple and straightforward to develop in. The tool takes fewer than 80 lines of code, uses only the standard library, and its logic is intuitive. Imagine implementing the same functionality with only the C/C++ standard library, and the saying comes to mind: "Life is too short, I use Python."

2. Python is inefficient at run time. Once the connections are established, the data transmission logic on the client and the server is similar, yet in the top output Python's CPU and RAM usage are each roughly 10 times those of nginx (60.2% vs. 7.0% CPU, 6.2% vs. 0.6% memory), which suggests an efficiency gap of roughly 100 times (CPU x RAM). This indirectly illustrates the efficiency gap between Python and C. The comparison is somewhat extreme, since nginx is not only written in C but also deeply optimized for CPU and RAM usage. Still, a difference of two orders of magnitude on similar tasks, unless it is a bug, reflects different architectural starting points: Python puts readability and ease of use first and performance second, whereas nginx is a highly optimized web server in which developing even a module is troublesome, and reusing its asynchronous framework is harder still. The trade-off between development efficiency and run-time efficiency is always there.

3. Single-threaded asynchronous IO vs. multithreaded synchronous IO. The example above uses single-threaded asynchronous IO. Even without writing a demo, it is clear that multithreaded synchronous IO would be much less efficient. One thread per connection means 10k threads, whose stacks alone would take 600+ MB of memory (64 KB * 10000). Add thread context switching and the GIL, and it is basically a nightmare.

asyncio Core Concepts

The following four core concepts need to be understood when learning asyncio; see reference 1 for more details.

1. Event loop. The key to implementing asynchrony in a single thread is the high-level event loop, which itself executes synchronously.
2. Future. Asynchronous IO involves many asynchronous tasks, and each task is controlled by a Future.
3. Coroutine. The specific execution logic of each asynchronous task is expressed as a coroutine.
4. Generator (yield and yield from). These are used extensively in asyncio and are a syntactic detail that cannot be ignored.

References

1. asyncio: Asynchronous I/O, event loop, coroutines and tasks, https://docs.python.org/3/library/asyncio.html
2. PEP 3156, Asynchronous IO Support Rebooted: the "asyncio" Module, http://legacy.python.org/dev/peps/pep-3156/
3. PEP 380, Syntax for Delegating to a Subgenerator, http://legacy.python.org/dev/peps/pep-0380/
4. PEP 342, Coroutines via Enhanced Generators, http://legacy.python.org/dev/peps/pep-0342/
5. PEP 255, Simple Generators, http://legacy.python.org/dev/peps/pep-0255/
6. asyncio source code, http://hg.python.org/cpython/file/3.4/Lib/asyncio/
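
The four core concepts listed above can be made concrete with a short sketch. The first half uses plain generators to show how yield from delegates to a sub-generator and receives its return value (PEP 380). The second half shows an event loop, a Future, and a coroutine working together. It assumes Python 3.7 or later and therefore uses async/await in place of @asyncio.coroutine and yield from; all function names here are illustrative.

```python
import asyncio


# Generator: "yield from" delegates to a sub-generator and
# evaluates to the value that sub-generator returns.
def inner():
    received = yield 'ready'   # suspends here; resumes with the sent value
    return received * 2


def outer():
    result = yield from inner()
    return result + 1


def drive():
    gen = outer()
    first = next(gen)          # runs until inner() yields 'ready'
    try:
        gen.send(20)           # inner() returns 40, outer() returns 41
    except StopIteration as stop:
        return first, stop.value


# Coroutine: the execution logic of one asynchronous task.
async def worker(fut):
    await asyncio.sleep(0.01)  # hands control back to the event loop
    fut.set_result('done')     # completes the Future


async def main():
    loop = asyncio.get_running_loop()        # the event loop
    fut = loop.create_future()               # a Future tracking the result
    task = asyncio.create_task(worker(fut))  # schedule the coroutine
    result = await fut                       # suspend until the Future is set
    await task
    return result


print(drive())              # ('ready', 41)
print(asyncio.run(main()))  # done
```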

This article is an English version of an article originally written in Chinese on aliyun.com.
