Basic distinctions between the web.py / flup and Tornado web process handling models (TBC)

Last updated: 2018-12-04 Source: Internet

Uploaded by: User


Tornado is known for its ability to handle many concurrent connections with the help of OS event notification mechanisms such as epoll and kqueue.

Web.py is a web framework for Python. It relies on other server packages to act as a complete web server.

Tornado can be put to work on its own, but the common setup is to place it behind an nginx server (via proxy_pass): nginx handles static resources and other matters, and the reverse proxy leaves Tornado to deal with dynamic requests.

In contrast, web.py usually requires flup to run as a FastCGI service, which is then connected to nginx via the fastcgi_pass directive.

To a new user the two setups appear similar to some extent. I wrote a few very simple scripts [1] / [2] and tested them on the same server, behind the same nginx configuration. Each ran two processes (web.py via spawn-fcgi, Tornado via tornado.process.fork_processes) and returned a simple string from the GET handler. On average nginx + Tornado gave 75 - 125 ms serve time per request while nginx + web.py took ~3 sec per request, both at 50 concurrent clients (ab -c50). With fewer concurrent clients the time difference could be as much as 10 times.

Then I added a small delay in the GET handler of both scripts (with time.sleep(0.1)) to simulate some system processing time. Before I started looking at the two solutions, my web service was dealing with relatively time-consuming filesystem requests, so this simulation is quite similar to the kind of problem I am looking at. Surprisingly, the nginx + Tornado script slowed to 5 sec+ per request and became much, much slower than web.py.

I understood how Tornado works, based on my understanding of epoll / I/O multiplexing theory. However, since web.py was something of a mystery, I had to look into the source code. There I saw that the web.py snippet calls into flup to create a "runwsgi" function, which in turn creates a ThreadedServer within flup. ThreadedServer has an addJob method that looked very familiar, and within a minute I could see that, for each client socket returned from the select call (ThreadedServer.run), a new "job", and hence a new thread in the pool, is created. The legendary one-thread-per-client model. Even without looking at how web.py (and my code) is called back from flup, I know that:

blocking calls (either blocking I/O operations or other matters like the time.sleep call here) are handled by threads / the OS scheduler
with a large number of simple, non-blocking / one-off requests, this model must be slower than the epoll approach.
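The dispatch pattern described above (a select call on the listening socket, then one thread per accepted client) can be sketched with the standard library alone. This is not flup's actual code: the names handle, serve and start_server are hypothetical, and the sketch starts a fresh thread per connection where flup uses a thread pool.

```python
import select
import socket
import threading


def handle(conn):
    """Hypothetical request handler: read once, send a fixed reply, close."""
    with conn:
        conn.recv(1024)
        conn.sendall(b"hello\n")


def serve(listener, stop_event):
    """Accept loop: select() waits until the listening socket is readable,
    then each accepted client is handed to its own thread."""
    while not stop_event.is_set():
        readable, _, _ = select.select([listener], [], [], 0.1)
        for sock in readable:
            conn, _addr = sock.accept()
            threading.Thread(target=handle, args=(conn,), daemon=True).start()


def start_server():
    """Bind to an ephemeral local port and run the accept loop in the background."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", 0))
    listener.listen(16)
    stop_event = threading.Event()
    thread = threading.Thread(target=serve, args=(listener, stop_event), daemon=True)
    thread.start()
    return listener, stop_event, thread
```

Because every client gets its own thread, a handler that blocks only stalls that one client; the price is the per-thread overhead paid even for trivial requests.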

However, when blocking operations appear (such as my sleep call, filesystem access, DB calls, etc.), epoll will NOT help. The OS waits for such an operation to finish before returning control to the script. Since there are only two Tornado processes running, no more than two clients can be served at the same time, even if both handlers are merely sleeping. With flup, threads are created and scheduled by the OS, so other requests can keep running as long as the CPU isn't completely hogged.
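The effect can be reproduced in miniature without either framework. The sketch below uses asyncio from the standard library as a stand-in for a single-threaded event loop (it is not Tornado code), and a thread pool as a stand-in for the thread-per-client model. DELAY, blocking_handler and the other names are made up for illustration.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

DELAY = 0.05  # simulated processing time per request, in seconds


def blocking_handler():
    """Handler that blocks its thread, like time.sleep() in the test scripts."""
    time.sleep(DELAY)
    return "ok"


async def event_loop_style(n):
    """n requests on one event loop; each blocking call stalls the whole loop."""
    async def request():
        return blocking_handler()

    start = time.perf_counter()
    await asyncio.gather(*(request() for _ in range(n)))
    return time.perf_counter() - start


def thread_style(n):
    """n requests, each on its own thread; the sleeps overlap."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n) as pool:
        list(pool.map(lambda _: blocking_handler(), range(n)))
    return time.perf_counter() - start


def compare(n=8):
    """Return (event-loop elapsed, threaded elapsed) for n simulated requests."""
    return asyncio.run(event_loop_style(n)), thread_style(n)
```

With blocking handlers, the event loop version takes roughly n times DELAY because the requests run one after another, while the threaded version takes roughly one DELAY.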

If we look at the packages available for Tornado, apart from the server package there are HTTP client packages, async MongoDB packages and some authentication packages built around the HTTP client package. Clearly, to make better use of Tornado, an application needs to use epoll / the IOLoop as its core. The Tornado framework handles all network waiting time (using epoll), and a carefully crafted app then responds to all events in a timely manner. It's very different from the traditional CGI style of request handling, but it's definitely a step in the right direction.
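As a rough illustration of an application built around the loop, the sketch below again uses asyncio as a stand-in for the IOLoop. It is an assumption-laden example, not Tornado API usage: cooperative_handler waits without blocking, and offloaded_handler pushes a blocking call onto the loop's default thread pool via run_in_executor. All function names are hypothetical.

```python
import asyncio
import time

DELAY = 0.05  # simulated wait per request, in seconds


async def cooperative_handler():
    """Waits without blocking: control returns to the loop during the delay."""
    await asyncio.sleep(DELAY)
    return "ok"


def blocking_work():
    """Stand-in for a blocking filesystem or DB call."""
    time.sleep(DELAY)
    return "ok"


async def offloaded_handler():
    """Runs the blocking call on the loop's default thread pool."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, blocking_work)


async def serve_many(handler, n):
    """Run n simulated requests concurrently; return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = await asyncio.gather(*(handler() for _ in range(n)))
    return results, time.perf_counter() - start
```

For example, asyncio.run(serve_many(cooperative_handler, 8)) should finish in roughly one DELAY rather than eight, because no handler ever holds the loop while it waits.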

Issues left over:

Tornado didn't have an async MySQL package available, and FriendFeed (the original author) mentioned [3] that:
We experimented with different async DB approaches, but settled on
synchronous at FriendFeed because generally if our DB queries were
backlogging our requests, our backends couldn't scale to the load
anyway. Things that were slow enough were abstracted to separate
backend services which we fetched asynchronously via the async HTTP
module.
Question: how should resources be arranged to run other services that handle blocking operations? On what principles should such design decisions be made?

When testing the response speed of raw Tornado (without nginx) using the ab shipped with OS X ML, requests failed from time to time. I saw mentions that these failures are caused by bugs in the version of ab shipped with OS X. Should re-test with palb (a Python implementation of ab) or other implementations.

Bug: http://simon.heimlicher.com/articles/2012/07/08/fix-apache-bench-ab-on-os-x-lion

Tested with palb: with or without set_header('Connection', 'Keep-Alive'), the connection reset errors do not occur.

Nginx speaks HTTP/1.0 when used as a (reverse) proxy server, which closes the connection after each request. How does this affect the performance of the Tornado server? I suppose epoll is designed for comet usage (a large number of stale connections)?

Answer: Nginx actually supports HTTP/1.1 and keepalive for upstream proxy settings. See http://nginx.org/en/docs/http/ngx_http_upstream_module.html#keepalive

[4] mentions using HAProxy instead of nginx. Might be worth looking at.

With the Tornado code as written, even though two Python processes are running after the server starts, only one is accepting requests. Possible solution: run on two ports and load balance with nginx, but that's not ideal. The fork_processes model should have worked around this problem.

Solution:

fork_processes(0) / start(0) creates worker processes based on the number of CPUs in the system. Observing two Python processes means only one worker process was created, so only one process is running request handlers. The testing VM was a single-core system. Specifying start(2) results in three Python processes, two of which share the load.
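The process counts observed above can be written down as a small model. This is only a sketch of the arithmetic, not Tornado code: process_layout is a hypothetical helper. It assumes that the parent process only supervises the forked workers, and that start(1) serves in the current process without forking, which is how I understand Tornado's behaviour.

```python
import os


def process_layout(num_processes, cpu_count=None):
    """Model the (total processes, serving workers) produced by start(num_processes).

    Assumptions: 0 or a negative value means one worker per CPU, the parent
    does not serve requests after forking, and 1 means no fork at all.
    """
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    if num_processes == 1:
        return (1, 1)  # assumed: a single process that serves requests itself
    workers = num_processes if num_processes > 0 else cpu_count
    return (workers + 1, workers)  # the forked workers plus the parent
```

On the single-core test VM this gives two processes with one worker for start(0), and three processes with two workers for start(2), matching what was observed.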

Links:

[1] Web.py test script: https://gist.github.com/4371628

[2] Tornado test script: https://gist.github.com/4363542

[3] http://news.ycombinator.com/item?id=3025475

[4] "Need help on putting tornado apps on production", great info packed - https://groups.google.com/forum/?fromgroups=#!topic/python-tornado/62TLw_gmp94
