Python has two notable characteristics:
It is an interpreted language
It has the GIL (Global Interpreter Lock)
The former means its performance naturally lags far behind that of compiled languages. The latter, in the era of multi-core parallel computing, greatly limits the scenarios where Python can be applied.
However, with a suitable web framework, Python can play to its strengths and avoid its weaknesses: it keeps its high development productivity in the multi-core era while still performing well. The Tornado framework is one example.
The Tornado framework mainly does the following things:
It uses a single-threaded approach, avoiding the performance overhead of thread switching as well as the thread-safety problems of some function interfaces
It supports an asynchronous, non-blocking network IO model, avoiding blocking the main process while it waits
A prior experiment
There are many web frameworks based on the Python language, but the two mainstream ones, Django and Tornado, are representative enough of the underlying design approaches.
Because the focus of this article is the comparison between synchronous and asynchronous IO, for the performance comparison across different web frameworks I refer to the experimental results from another person's post.
Reference article [1]: Lightweight Web Server Tornado Code Analysis
Some parts of that article are written rather tersely, but it is reasonable to assume the author used each Python web framework to implement the most basic HelloWorld code.
The Tornado implementation from the reference is as follows:
import tornado.ioloop
import tornado.web

class MainHandler(tornado.web.RequestHandler):
    def get(self):
        self.write("Hello, World")

application = tornado.web.Application([
    (r"/", MainHandler),
])

if __name__ == "__main__":
    application.listen(8888)
    tornado.ioloop.IOLoop.instance().start()
Finally, Apache Benchmark (ab) was run from another machine to load-test with the following command:
ab -n 100000 -c http://10.0.1.x/
On an AMD Opteron 2.4GHz quad-core machine, the results are shown in the following illustration:
Tornado's throughput was 4 times that of the second-fastest server. Even running bare on only one CPU core, Tornado still had a 33% advantage.
In the words of the cited author: Tornado completely crushes the other web frameworks.
This article's comment: that experiment only gives a rough, macro-level sense of the relative performance of different web frameworks. Its credibility is questionable, because the report is not written very rigorously and omits too many details. The point made here is that if synchronous code is used throughout, the performance gap between Tornado and Django should not be that large. But this matters less; the synchronous-versus-asynchronous comparison below is what matters.
The following is the focus of this article: performance tests and comparisons of synchronous and asynchronous network IO.
Test environment
Environment
CPU: Core i3
Operating system: Ubuntu 14.0
Python version: 2.7
Web server: Tornado 4.2.0, with only one core enabled
Content
Write a delayed-response endpoint in both synchronous and asynchronous styles, then stress-test each with ApacheBench:
Concurrency: 40
Total requests: 200
Because this article only compares the two models against each other rather than probing the performance ceiling, relatively light load is used throughout.
Synchronous and asynchronous code
class SyncSleepHandler(RequestHandler):
    """
    Synchronous handler: an endpoint with a 1-second delay
    """
    def get(self):
        time.sleep(1)
        self.write("when I sleep 5s")


class SleepHandler(RequestHandler):
    """
    Asynchronous handler: an endpoint with a 1-second delay
    """
    @tornado.gen.coroutine
    def get(self):
        yield tornado.gen.Task(
            tornado.ioloop.IOLoop.instance().add_timeout,
            time.time() + 1
        )
        self.write("when I sleep 5s")
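Running the handlers above requires Tornado, but the core idea can be sketched with the standard library alone: in modern Python, asyncio coroutines give the same kind of non-blocking delay. In this sketch (delay and scale shrunk for a quick run), 40 concurrent 0.1-second delays finish in roughly 0.1 seconds of wall time rather than 4 seconds, mirroring the asynchronous handler:

```python
import asyncio
import time

async def delayed(i):
    # Non-blocking delay: while this coroutine waits, the event
    # loop is free to run all the other coroutines.
    await asyncio.sleep(0.1)
    return i

async def main():
    start = time.monotonic()
    # 40 "requests", matching the benchmark's concurrency level
    results = await asyncio.gather(*(delayed(i) for i in range(40)))
    elapsed = time.monotonic() - start
    return results, elapsed

results, elapsed = asyncio.run(main())
```

A synchronous version of `delayed` using `time.sleep(0.1)` would take about 4 seconds for the same 40 calls.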
Synchronous test results
ab -n 200 -c 40 http://localhost:8009/demo/syncsleep-handler/
This is ApacheBench, Version 2.3 <$Revision: 1528965 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient)
Completed requests
Completed requests
Finished requests

Server Software:        TornadoServer/4.2.1
Server Hostname:        localhost
Server Port:            8009

Document Path:          /demo/syncsleep-handler/
Document Length:        15 bytes

Concurrency Level:      40
Time taken for tests:   200.746 seconds
Complete requests:      200
Failed requests:        0
Total transferred:      42000 bytes
HTML transferred:       3000 bytes
Requests per second:    1.00 [#/sec] (mean)
Time per request:       40149.159 [ms] (mean)
Time per request:       1003.729 [ms] (mean, across all concurrent requests)
Transfer rate:          0.20 [Kbytes/sec] received

Connection Times (ms)
              min   mean[+/-sd]  median     max
Connect:        0     0     0.2       0       1
Processing:  1005 36235 18692.2   38133  200745
Waiting:     1005 36234 18692.2   38133  200745
Total:       1006 36235 18692.2   38133  200746

Percentage of the requests served within a certain time (ms)
50% 38133
66% 38137
75% 38142
80% 38161
90% 38171
95% 38176
98% 38179
99% 199742
100% 200746 (Longest request)
Asynchronous test results
ab -n 200 -c 40 http://localhost:8009/demo/sleep-handler/
This is ApacheBench, Version 2.3 <$Revision: 1528965 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient)
Completed requests
Completed requests
Finished requests

Server Software:        TornadoServer/4.2.1
Server Hostname:        localhost
Server Port:            8009

Document Path:          /demo/sleep-handler/
Document Length:        15 bytes

Concurrency Level:      40
Time taken for tests:   5.083 seconds
Complete requests:      200
Failed requests:        0
Total transferred:      42000 bytes
HTML transferred:       3000 bytes
Requests per second:    39.35 [#/sec] (mean)
Time per request:       1016.611 [ms] (mean)
Time per request:       25.415 [ms] (mean, across all concurrent requests)
Transfer rate:          8.07 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0    0.4      0     2
Processing:  1001 1010   12.0   1005  1053
Waiting:     1001 1010   12.0   1005  1053
Total:       1001 1010   12.3   1005  1055

Percentage of the requests served within a certain time (ms)
50% 1005
66% 1009
75% 1011
80% 1015
90% 1032
95% 1044
98% 1045
99% 1054
100% 1055 (Longest request)
Results comparison
In a simple stress test with a concurrency of 40 and 200 total requests, the two network IO models compare as follows:
Synchronous vs. asynchronous performance comparison:

Performance index                                 | Synchronous blocking | Asynchronous non-blocking
Requests per second                               | 1                    | 39
Time per request, mean (ms)                       | 40149                | 1017
Time per request, mean across all concurrent (ms) | 1003                 | 25
These results match the theoretical expectations for the test program, since each endpoint does nothing but wait through a 1-second delay.
Clearly, the asynchronous non-blocking model outperforms the synchronous blocking model by a wide margin.
Looking at the synchronous IO model's numbers in the table above: as soon as the server enters the processing of one request and goes into the kernel-side sleep, the entire process blocks and every other request can only wait; processing is essentially serial. So the average processing time per request is about 1000 ms (1 second), and with a concurrency of 40 the average wait to complete a request is about 40149 ms.
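The numbers in the table can be reproduced directly from ab's definitions (a sketch; ab's two "Time per request" figures differ only by the concurrency factor):

```python
total_requests = 200
concurrency = 40
service_time = 1.0  # each request sleeps 1 second

# A single-core synchronous server handles requests strictly one at a
# time, so the whole run takes about total_requests * service_time.
total_time = total_requests * service_time            # ~200 s, "Time taken for tests"

requests_per_second = total_requests / total_time     # ~1.0 req/s

# ab's "Time per request (mean)" is measured per client, so the
# concurrency level multiplies in:
time_per_request_mean = concurrency * total_time * 1000.0 / total_requests    # ~40000 ms

# "...across all concurrent requests" is just wall time divided by
# the total number of requests:
time_per_request_across = total_time * 1000.0 / total_requests                # ~1000 ms
```

The small differences from the measured 40149 ms and 1003 ms are per-request overhead on top of the 1-second sleep.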
The meaning of these numbers can be explained with a simple analogy.
Take the following scenario: customers going to a bank to transact business at its service windows.
Degree of parallelism: the number of service windows (tellers) the bank has open.
This corresponds to the number of CPU cores, i.e. true parallel capacity, as opposed to the "illusion of parallelism" produced by time-slicing.
Concurrency: the number of customers waiting in the lobby.
This corresponds to the concurrency level, i.e. the amount of work waiting to be handled.
Total requests: the cumulative number of customers who join the lobby queue from outside.
Kernel-side operations: the parts of the banking business that only the teller can handle.
User-side operations: the customer's own work, such as getting their ID card ready, going outside to photocopy documents, or phoning colleagues to confirm information.
The conceptual analogy for synchronous and asynchronous is then as follows:
Synchronous blocking system: the bank has no ticketing system. The customer (web server process) can only stand in the crowd and wait dumbly for their turn, unable to do anything else while queuing. As people keep entering the hall, each new arrival must wait for the entire queue ahead to be processed (40149 ms) before the teller (CPU) spends 1003 ms handling their business.
Asynchronous non-blocking system: the bank has a ticketing system. Instead of standing in the crowd, customers take a number and sit in the waiting area handling other things. A customer spends, say, 5 ms submitting their prepared materials (initiating the kernel-side operation), then answers email or watches a video until the system calls their number, at which point they step up and spend at most 20 ms wrapping things up. The customer actually spends only about 25 ms of their own time, although the bank teller (CPU) still spends about 1000 ms handling the task.
In this hypothetical scenario, the teller (CPU) works at full capacity either way, but the asynchronous setup saves the customer (web server process) a great deal of time. The customer can spend the time otherwise wasted waiting for the teller on other work, which greatly improves overall efficiency.
As is well known, Python has the GIL, so multithreading in Python is really pseudo-multithreading. Tornado therefore uses a single process with a single thread, avoiding thread switching altogether, and achieves concurrency entirely through asynchrony. As soon as a request enters a time-consuming kernel-side IO operation, the Tornado main process does only what is needed to initiate that kernel IO, then immediately returns to listening so it can respond to other requests. When the kernel-side IO completes, a callback into the user-side main process handles the result. With a synchronous model, a single process with multiple threads incurs thread-switching overhead, while a single process with a single thread (as in Django) means that if one request is slow, the next request queues behind it; the web service process spends most of its time blocked, and performance drops sharply.
Finally, alongside the 1-second-delay examples above, here is an endpoint that responds immediately:
class JustNowHandler(tornado.web.RequestHandler):
    def get(self):
        self.write("i hope just now")
Interested readers can run the experiment themselves. Agree on the following names first:
the synchronous 1-second-delay endpoint is A
the asynchronous 1-second-delay endpoint is B
the instant-response endpoint is C
Run the web server in single-core mode, then request the endpoints from a browser in different orders:
Instant first, then delayed:
C then A: after about 1 s in total, both C and A have completed; C responds immediately
C then B: after about 1 s in total, both C and B have completed; C responds immediately
Delayed first, then instant:
A then C: after about 1 s in total, both A and C have completed; C must wait for A to finish processing, so its response also comes about 1 s later
B then C: after about 1 s in total, both B and C have completed; C responds immediately
In the synchronous model, once the process blocks, the time wasted waiting severely reduces the program's efficiency.
Summary
Interested readers can study further in UNIX Network Programming, Volume 1: The Sockets Networking API (W. Richard Stevens), Chapter 6, Section 2, "I/O Models".
Among Python web frameworks, Tornado, as an efficient asynchronous non-blocking framework, can provide high-performance web application services in the Python language.
Lightweight Web server Tornado code Analysis
A recent project used Tornado, the web server framework written in Python by Facebook. It really is a lightweight framework: a script of just a few lines can be run directly, and you have a server. Tornado uses epoll, which in a Linux environment always gets plenty of attention, and which is more efficient than the poll model used by the C++ Poco library in our SMS product. Combined with the powerful scripting capabilities of the Python language itself, our grayscale-release server is very small in terms of lines of code. With no braces, and block structure expressed by indentation, the Tornado source code is comfortable to read, and it all comes from the hands of experts, making it a great subject for learning network programming.
Tornado adopts a multi-process + non-blocking + epoll model, which provides very strong network response performance. In our project, a single instance of the grayscale-release server can sustain 1500 request responses per second. Deploying Tornado behind Nginx allows multiple instances to run at the same time, multiplying the request throughput and meeting the current upgrade demand from Wangwang users. The following figure shows the Wangwang grayscale-release architecture:
Below I share some of the contents of Tornado, along with some useful material; interested readers can experiment with it.
1 Tornado History
Tornado is an open-source web server framework that grew out of FriendFeed, a real-time social aggregation site. In 2007, four former Google software engineers co-founded FriendFeed to make it easy for users to track their friends' activity across multiple social networking sites such as Facebook and Twitter. Two years later, Facebook announced its acquisition of FriendFeed for roughly 50 million dollars; at that point FriendFeed had only 12 employees. It is said that the team later returned to Google and built what is now Google App Engine...
Tornado, written in Python, differs from other mainstream web server frameworks by using epoll-based non-blocking IO. It responds quickly, handles thousands of concurrent connections, and is especially suited to real-time web services. The current Tornado version is 2.1.1; its official site is http://www.tornadoweb.org/. Interested readers can go try it.
2 Tornado Overview
Tornado mainly consists of the following four parts. The official help documentation is really just a collection of source-code comments, so we can read the source directly.
Core web framework
tornado.web - RequestHandler and Application classes
tornado.httpserver - non-blocking HTTP server
tornado.template - flexible output generation
tornado.escape - escaping and string manipulation
tornado.locale - internationalization support
Asynchronous networking
tornado.ioloop - main event loop
tornado.iostream - convenient wrappers for non-blocking sockets
tornado.httpclient - non-blocking HTTP client
tornado.netutil - miscellaneous network utilities
Integration with other services
tornado.auth - third-party login with OpenID and OAuth
tornado.database - simple MySQL client wrapper
tornado.platform.twisted - run code written for Twisted on Tornado
tornado.websocket - bidirectional communication with the browser
tornado.wsgi - interoperability with other Python frameworks and servers
Utilities
tornado.autoreload - automatically detect code changes in development
tornado.gen - simplify asynchronous code
tornado.httputil - manipulate HTTP headers and URLs
tornado.options - command-line parsing
tornado.process - utilities for multiple processes
tornado.stack_context - exception handling across asynchronous callbacks
tornado.testing - unit testing support for asynchronous code
Today I will share the HTTP server part of Tornado.
2.1 Tornado HTTP Server
With Tornado you can easily construct all kinds of web servers. Let's start with the HTTP server and look at its implementation. The following figure should look familiar; it is the general working model of every web server.
- The server binds to a port and starts listening.
- When a client connects, it sends a request to the server.
- The server processes the request and returns the response to the client.
With that, one request has been handled end to end. However, when we need to handle thousands of connections, more must be considered on top of this basic model. This is the familiar C10K problem. Generally, there are the following choices:
- One thread serves multiple clients, using non-blocking I/O and level-triggered readiness notification
- One thread serves multiple clients, using non-blocking I/O and readiness-change (edge-triggered) notification
- One thread serves multiple clients, using asynchronous I/O
- One thread serves one client, using blocking I/O
- Compile the service code into the kernel
Tornado's choice: the multi-process + non-blocking + epoll model.
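For contrast, the simplest of the models listed above, one thread serving one client over blocking I/O, takes only a few lines with the stdlib socket module (a minimal sketch; port 0 asks the OS for any free port):

```python
import socket
import threading

def serve_one(srv):
    # Blocking accept-read-respond cycle for a single connection
    conn, addr = srv.accept()
    data = conn.recv(1024)
    conn.sendall(b"echo:" + data)
    conn.close()
    srv.close()

def demo():
    # bind + listen
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    port = srv.getsockname()[1]
    t = threading.Thread(target=serve_one, args=(srv,))
    t.start()
    # client side: connect, send, receive
    cli = socket.create_connection(("127.0.0.1", port))
    cli.sendall(b"hi")
    reply = cli.recv(1024)
    cli.close()
    t.join()
    return reply
```

While serve_one sits in recv, the serving thread can do nothing else; that per-connection blocking is exactly what the epoll-based design avoids.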
The following diagram shows essentially everything in Tornado that relates to the web:
2.2 First HTTP Server Example
Here is the Hello World example code provided by the official website.
import tornado.ioloop
import tornado.web

class MainHandler(tornado.web.RequestHandler):
    def get(self):
        self.write("Hello, World")

application = tornado.web.Application([
    (r"/", MainHandler),
])

if __name__ == "__main__":
    application.listen(8888)
    tornado.ioloop.IOLoop.instance().start()
The implementation is very simple: you only need to define your own handler method, and Tornado takes care of everything else. First create the web application and pass in our handler, MainHandler. Then start listening on port 8888. Finally, start the event loop, which begins listening for network events, mainly socket reads and writes. Python is convenient enough that the code above can be pasted into a file and run directly, with no compilation needed, and a server appears.
2.3 Module Analysis
We will then analyze this part of the code one by one. First of all, there is a comprehensive understanding of tornado. The tornado server has 3 core modules:
(1) IOLoop
As the code above suggests, to achieve high concurrency and high performance, Tornado uses an IOLoop to dispatch socket read and write events. The IOLoop is based on epoll and can respond to network events efficiently; this is the guarantee of Tornado's efficiency.
(2) IOStream
To read and write sockets asynchronously while handling requests, Tornado implements the IOStream class, which wraps asynchronous socket reads and writes.
(3) HTTPConnection
This class handles HTTP requests: it reads the HTTP request headers, reads any POST data, invokes the user-defined handler, and writes the response data back to the client socket.
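A toy version of the header-reading step (not Tornado's actual parser, just a sketch of what HTTPConnection has to do with the raw bytes):

```python
def parse_request_head(raw):
    """Split a raw HTTP request into (method, path, version, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("latin1").split("\r\n")
    # Request line: e.g. "GET /demo HTTP/1.1"
    method, path, version = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, path, version, headers, body

raw = b"GET /demo HTTP/1.1\r\nHost: localhost\r\n\r\n"
method, path, version, headers, body = parse_request_head(raw)
```

A real parser must also handle chunked bodies, header continuation, and malformed input, which this sketch ignores.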
The following diagram depicts the overall processing flow of a Tornado server. Next we analyze the implementation of each step in detail.
3 Source Analysis
3.1 bind and listen
The first step for the server is to bind. The bind function in httpserver.py shows a standard server startup sequence:
def bind(self, port, address=None, family=socket.AF_UNSPEC):
    if address == "":
        address = None
    # look up the network interface information
    for res in socket.getaddrinfo(address, port, family, socket.SOCK_STREAM,
                                  0, socket.AI_PASSIVE | socket.AI_ADDRCONFIG):
        af, socktype, proto, canonname, sockaddr = res
        sock = socket.socket(af, socktype, proto)
        flags = fcntl.fcntl(sock.fileno(), fcntl.F_GETFD)
        flags |= fcntl.FD_CLOEXEC
        fcntl.fcntl(sock.fileno(), fcntl.F_SETFD, flags)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        if af == socket.AF_INET6:
            if hasattr(socket, "IPPROTO_IPV6"):
                sock.setsockopt(socket.IPPROTO_IPV6,
                                socket.IPV6_V6ONLY, 1)
        sock.setblocking(0)
        # bind and listen
        sock.bind(sockaddr)
        sock.listen(128)
        self._sockets[sock.fileno()] = sock
        # add to the io_loop
        if self._started:
            self.io_loop.add_handler(sock.fileno(), self._handle_events,
                                     ioloop.IOLoop.READ)
The for loop ensures that requests on every network interface are listened for. For each interface, the code first creates the socket, then binds and listens, and finally hands the socket to the io_loop, registering for the IOLoop.READ event, i.e. read-readiness. Handling for IPv6 is also included. The callback is _handle_events: once the listening socket becomes readable, a client request has arrived, and _handle_events accepts the client's connection. Next, let's look at how _handle_events works.
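The getaddrinfo loop can be tried on its own: with a wildcard host and AI_PASSIVE, it yields one entry per address family a server could bind, which is exactly the enumeration the bind() code above iterates over (a sketch; the port number is arbitrary):

```python
import socket

def candidate_addresses(port):
    """List (family, sockaddr) pairs a server could bind for this port."""
    pairs = []
    for res in socket.getaddrinfo(None, port, socket.AF_UNSPEC,
                                  socket.SOCK_STREAM, 0, socket.AI_PASSIVE):
        af, socktype, proto, canonname, sockaddr = res
        # sockaddr is (host, port) for IPv4, (host, port, flow, scope) for IPv6
        pairs.append((af, sockaddr))
    return pairs

pairs = candidate_addresses(8888)
```

On a dual-stack machine this typically returns both an IPv4 and an IPv6 wildcard entry, which is why the original loop creates more than one listening socket.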
3.2 accept
The _handle_events function in httpserver.py implements the accept step. The code is as follows:
def _handle_events(self, fd, events):
    while True:
        try:
            connection, address = self._sockets[fd].accept()
        except socket.error, e:
            if e.args[0] in (errno.EWOULDBLOCK, errno.EAGAIN):
                return
            raise
        if self.ssl_options is not None:
            # (a fairly long section of SSL-handling code is omitted here)
            pass
        try:
            stream = iostream.IOStream(connection, io_loop=self.io_loop)
            HTTPConnection(stream, address, self.request_callback,
                           self.no_keep_alive, self.xheaders)
        except Exception:
            logging.error("Error in connection callback", exc_info=True)
The accept call returns the client's socket and address. An IOStream object is then created to handle asynchronous reads and writes on that socket; this step calls ioloop.add_handler to add the client socket to the io_loop. An HTTPConnection is then created to process the user's request. Next, let's look at IOStream and HTTPConnection.
3.3 IOStream
To read and write the client socket asynchronously, two buffers are created for it: _read_buffer and _write_buffer. We never read or write the socket directly; everything goes through the buffers, which is what makes asynchronous reads and writes possible. These operations are encapsulated in the IOStream class. In short, IOStream wraps socket reads and writes in a layer that, via the two buffers, turns them into asynchronous operations.
Each IOStream corresponds to exactly one socket. The IOStream __init__ method can be found in iostream.py:
def __init__(self, socket, io_loop=None, max_buffer_size=104857600,
             read_chunk_size=4096):
    self.socket = socket
    self.socket.setblocking(False)
    self.io_loop = io_loop or ioloop.IOLoop.instance()
    self._read_buffer = collections.deque()
    self._write_buffer = collections.deque()
    self._state = self.io_loop.ERROR
    with stack_context.NullContext():
        self.io_loop.add_handler(
            self.socket.fileno(), self._handle_events, self._state)
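The deque-of-chunks buffering idea can be shown standalone. A minimal sketch (not Tornado's actual code) of a read buffer like _read_buffer, where consume() hands back exactly n bytes and keeps the remainder queued:

```python
from collections import deque

class ReadBuffer(object):
    """Toy deque-of-chunks read buffer, in the spirit of IOStream's."""
    def __init__(self):
        self._chunks = deque()
        self._size = 0

    def append(self, chunk):
        # Data arriving from the socket is appended chunk by chunk
        self._chunks.append(chunk)
        self._size += len(chunk)

    def consume(self, n):
        # Pull bytes off the front until n bytes are gathered; if the
        # front chunk is too big, split it and push the tail back.
        out = []
        while n > 0 and self._chunks:
            chunk = self._chunks.popleft()
            if len(chunk) > n:
                out.append(chunk[:n])
                self._chunks.appendleft(chunk[n:])
                self._size -= n
                n = 0
            else:
                out.append(chunk)
                self._size -= len(chunk)
                n -= len(chunk)
        return b"".join(out)
```

Because appends and pops only touch the ends of the deque, both operations are cheap even when many chunks are buffered.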
You can see that initialization sets up the two buffers and then adds the socket to the io_loop. When the socket becomes readable or writable, the registered callback self._handle_events is invoked. _handle_events is easy to follow; the code is as follows:
def _handle_events(self, fd, events):
    if not self.socket:
        logging.warning("Got events for closed stream %d", fd)
        return
    try:
        if events & self.io_loop.READ:
            self._handle_read()
        if events & self.io_loop.WRITE:
            self._handle_write()
        if events & self.io_loop.ERROR:
            self.io_loop.add_callback(self.close)
            return
        state = self.io_loop.ERROR
        if self.reading():
            state |= self.io_loop.READ
        if self.writing():
            state |= self.io_loop.WRITE
        if state != self._state:
            self._state = state
            self.io_loop.update_handler(self.socket.fileno(), self._state)
    except Exception:
        logging.error("Uncaught exception, closing connection.",
                      exc_info=True)
        self.close()
        raise
3.4 IOLoop
In the Tornado server, the IOLoop is the core scheduling module. The server registers all socket descriptors with the IOLoop along with callback handlers; the IOLoop monitors IO events internally, and as soon as it finds a socket readable or writable, it invokes the callback registered for it. The IOLoop uses the singleton pattern.
Throughout a Tornado process's lifetime there is only one IOLoop instance, and one instance is enough to handle all IO events. The IOLoop.instance() method, used many times above, returns this singleton.
The code below shows how such a singleton can be defined in Python. It uses cls, which, like self, is not a keyword but a naming convention: self refers to an instance of the class, while cls refers to the class itself. The methods below take cls as their first argument because they operate on the class, which also shows that the first argument of a Python method is not required to be named self.
class IOLoop(object):
    @classmethod
    def instance(cls):
        if not hasattr(cls, "_instance"):
            cls._instance = cls()
        return cls._instance

    @classmethod
    def initialized(cls):
        return hasattr(cls, "_instance")

    def start(self):
        if self._stopped:
            self._stopped = False
            return
        self._running = True
        while True:
            poll_timeout = 0.2
            callbacks = self._callbacks
            self._callbacks = []
            for callback in callbacks:
                self._run_callback(callback)
            try:
                event_pairs = self._impl.poll(poll_timeout)
            except Exception, e:
                if (getattr(e, 'errno', None) == errno.EINTR or
                    (isinstance(getattr(e, 'args', None), tuple) and
                     len(e.args) == 2 and e.args[0] == errno.EINTR)):
                    continue
                else:
                    raise
            if self._blocking_signal_threshold is not None:
                signal.setitimer(signal.ITIMER_REAL,
                                 self._blocking_signal_threshold, 0)
            self._events.update(event_pairs)
            while self._events:
                fd, events = self._events.popitem()
                self._handlers[fd](fd, events)
            # (the loop's exit checks are omitted in this excerpt)
        self._stopped = False
        if self._blocking_signal_threshold is not None:
            signal.setitimer(signal.ITIMER_REAL, 0, 0)
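As a self-contained illustration of this classmethod-based singleton pattern (a hypothetical Config class, runnable on modern Python):

```python
class Config(object):
    """Hypothetical class reusing IOLoop's singleton pattern."""
    @classmethod
    def instance(cls):
        # Created on the first call, then cached on the class itself,
        # so every later call returns the same object.
        if not hasattr(cls, "_instance"):
            cls._instance = cls()
        return cls._instance

    @classmethod
    def initialized(cls):
        return hasattr(cls, "_instance")
```

Because the cached instance lives on the class, `Config.instance() is Config.instance()` is always True, just as there is only ever one IOLoop per process.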
The poll implementation here supports three modes, select, epoll, and kqueue, depending on the platform. The following is the epoll-mode handler:
class _EPoll(object):
    _EPOLL_CTL_ADD = 1
    _EPOLL_CTL_DEL = 2
    _EPOLL_CTL_MOD = 3

    def __init__(self):
        self._epoll_fd = epoll.epoll_create()

    def fileno(self):
        return self._epoll_fd

    def register(self, fd, events):
        epoll.epoll_ctl(self._epoll_fd, self._EPOLL_CTL_ADD, fd, events)

    def modify(self, fd, events):
        epoll.epoll_ctl(self._epoll_fd, self._EPOLL_CTL_MOD, fd, events)

    def unregister(self, fd):
        epoll.epoll_ctl(self._epoll_fd, self._EPOLL_CTL_DEL, fd, 0)

    def poll(self, timeout):
        return epoll.epoll_wait(self._epoll_fd, int(timeout * 1000))
4 Performance Comparison
This section is from the official site's description:
"The performance of a web application depends primarily on its overall architecture, not just the performance of the frontend. Tornado is much faster than other Python web frameworks. We tested some of the popular Python web frameworks (Django, web.py, CherryPy) with the simplest Hello, world example. For Django and web.py we used Apache/mod_wsgi; CherryPy was run bare. These are also the common production deployment scenarios for each framework. For Tornado, the deployment used Nginx as a reverse-proxy frontend driving Tornado in 4-process mode, which is also our recommended production deployment (depending on hardware; we recommend one Tornado server instance per CPU core, and our load test used a quad-core processor). We used Apache Benchmark (ab) on another machine to load-test with the following command:"
ab -n 100000 -c http://10.0.1.x/
On an AMD Opteron 2.4GHz quad-core machine, the results are shown in the following illustration:
In our tests, Tornado's throughput was 4 times that of the second-fastest server. Even running bare on only one CPU core, Tornado still had a 33% advantage.
Using the same parameters against our Wangwang grayscale-release server gives:
ab -n 20000 -c 'http://10.20.147.160:8080/redirect?uid=cnalichntest&ver=6.05.10&ctx=alitalk&site=cnalichn'
With Nginx + 1 Tornado server: Requests per second: 672.55 [#/sec] (mean)
With Nginx + 4 Tornado servers: Requests per second: 2187.45 [#/sec] (mean)