Evolution of Python coroutine technology


1. Introduction

1.1. Memory Mountain

The memory mountain is a concept proposed by Randal Bryant in the book Computer Systems: A Programmer's Perspective.

For reasons of cost and efficiency, computer memory is designed as a multi-level pyramid. At the top of the tower sit the fastest and most expensive components, the CPU's internal registers (typically a few KB) and caches; at the bottom lies the cheapest and slowest storage, such as cloud storage reached over a WAN (for example, Baidu Cloud's free 2 TB).


The guiding significance of the memory mountain is that a well-designed program must exhibit good locality:

Temporal locality: the more often the same address is accessed within a short period of time, the better the temporal locality;

Spatial locality: the next memory access is adjacent to the previous one.

1.2. A CPU's View of Time

Operation                           Real latency    As the CPU experiences it
Execute one instruction             0.38 ns         1 s
Read from L1 cache                  0.5 ns          1.3 s
Branch misprediction                5 ns            13 s
Read from L2 cache                  7 ns            18.2 s
Lock/unlock a mutex                 25 ns           1 min 5 s
Main memory reference               100 ns          4 min 20 s
Context switch / system call        1.5 µs          1 h 5 min
Send 2 KB over a 1 Gbps network     20 µs           14.4 h
Read 1 MB sequentially from RAM     250 µs          7.5 days
Ping a host in the same IDC         500 µs          15 days
Read 1 MB from SSD                  1 ms            1 month
Read 1 MB from HDD                  20 ms           20 months
Ping a host in another city         ~150 ms         12.5 years
Virtual machine restart             4 s             300 years
Physical server restart             5 min           25,000 years

We scale the latencies of an ordinary 2.6 GHz CPU up to a scale humans can feel (data from public sources): executing a single register instruction at the top of the memory mountain takes 1 second; reading 1 MB from a mechanical disk takes 20 months; and a network packet pinging a host in another city is gone for 12.5 years.
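The scaled figures above can be sanity-checked with a few lines of arithmetic, assuming, as the table does, that 0.38 ns of real time maps to one perceived second:

```python
# Sanity-check the scaling: 0.38 ns of real time = 1 perceived second
# (one instruction per cycle on a ~2.6 GHz CPU).
SCALE = 1 / 0.38e-9          # perceived seconds per real second

def perceived(real_seconds):
    """Convert a real latency into its human-scale equivalent."""
    return real_seconds * SCALE

MONTH = 30 * 24 * 3600       # seconds in a (30-day) month
YEAR = 365 * 24 * 3600       # seconds in a year

print(round(perceived(20e-3) / MONTH))   # 1 MB from HDD (20 ms): ~20 months
print(round(perceived(0.15) / YEAR, 1))  # cross-city ping (~150 ms): ~12.5 years
```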

If a program sends an HTTP packet and then blocks, synchronously waiting for the response, the computer must wait the equivalent of 12.5 years before it can handle anything else; such low hardware utilization inevitably means low program efficiency.

1.3. Synchronous programming

As the data above shows, memory reads and writes, disk seeks, and NIC reads and writes are all I/O operations, and the bottleneck of a synchronous program lies in long I/O waits. To improve program efficiency we must reduce I/O wait time, starting from improving the program's locality.

The classic improvements to synchronous programming are multi-process and multi-thread designs, but neither solves the C10K problem well: the operating system can only schedule a fairly low number of processes, inter-process context switches are expensive, and inter-process communication is complex.

As for Python's multi-threading, because of the well-known GIL its performance gains are unstable, and it only copes with I/O-bound workloads on the order of hundreds to thousands of connections. Multi-threading has another drawback: the OS schedules threads preemptively, which creates race conditions and may require locks, queues, and similar tools to protect atomic operations.

1.4. Asynchronous Programming

Speaking of asynchronous non-blocking calls, the bywords today are epoll and kqueue; select/poll have largely been replaced because of their efficiency problems.

epoll is an I/O event notification mechanism that entered the Linux kernel in the 2.6 era. It lets a program hand a large number of file descriptors over to the kernel, which turns the lowest-level I/O state changes into read/write events, sparing the programmer the repetitive work of actively polling for state changes. The programmer registers callback functions against epoll states, and a callback fires when the corresponding file descriptor is detected to have changed state.

Event loops are the underlying cornerstone of asynchronous programming.


The basic working principle of an event loop is as follows:

The user creates two socket connections, and the system returns two file descriptors, fd3 and fd4; read/write events for them are registered with epoll through system calls;

When the NIC receives a TCP packet, the kernel finds the matching file descriptor by its five-tuple, automatically triggers the corresponding ready event, and adds the descriptor to the ready list;

The program calls epoll.poll(), which returns the set of readable/writable events;

The program iterates over the events and invokes the registered callback functions;

One iteration of the event loop ends, and the cycle repeats.
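The steps above can be sketched with Python's standard selectors module, which transparently picks epoll, kqueue, or select for the current platform; a socketpair stands in here for a real client connection, so this is an illustrative sketch rather than a production loop:

```python
import selectors
import socket

# The "register a callback for a file descriptor" idea maps to the
# `data` slot of register(); DefaultSelector uses epoll where available.
sel = selectors.DefaultSelector()
results = []

def on_readable(conn):
    # The event fired, so this recv() will not block.
    results.append(conn.recv(1024))
    sel.unregister(conn)

# socketpair stands in for a connection accepted by a server.
r, w = socket.socketpair()
sel.register(r, selectors.EVENT_READ, on_readable)

w.sendall(b"ping")                 # make the read end ready

# One iteration of the event loop: wait, then dispatch callbacks.
for key, mask in sel.select(timeout=1):
    callback = key.data
    callback(key.fileobj)

print(results)  # [b'ping']
r.close(); w.close()
sel.close()
```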

epoll is not a silver bullet. If the user works at a very low level, directly driving epoll to build and maintain the event cycle, then every layer from the bottom up to the high-level business logic needs a callback, producing callback hell and poor readability. This tedious register-callback and fire-callback plumbing can therefore be encapsulated and abstracted into an EventLoop: the EventLoop hides the concrete epoll system calls, and the user, treating the different I/O states as event triggers, focuses only on the callback behavior of each event at a higher level. High-performance asynchronous event libraries written in C, such as libev and libevent, have taken over this trivial work.

These event loops are typically seen in Python frameworks:

libevent/libev: the event libraries used by gevent (greenlet + libevent early on, libev later), widely deployed;

tornado: the tornado framework implements its own IOLoop;

picoev: the event library used by meinheld (greenlet + picoev); small and light, and faster than libevent thanks to improvements in its data structures and event-detection model, though judging from GitHub it has been unmaintained for several years;

uvloop: the newcomer of the Python 3 era. Guido van Rossum built the asyncio library, and asyncio accepts pluggable event loops as long as they satisfy the relevant API requirements; uvloop is built on libuv, wrapping some of its lower-level structures and functions in Python objects. The Sanic framework is currently based on this library.

1.5. Coroutines

EventLoop libraries simplify event handling across platforms, but handling callbacks when events fire is still cumbersome, and writing reactive asynchronous programs is a heavy mental burden for the programmer.

Coroutines were therefore introduced to replace callbacks and simplify the problem. The coroutine model is superior to the callback model in the following ways:

It replaces the asynchronous callback pattern with a programming style that reads like synchronous code. Real business logic usually unfolds as a linear, synchronous sequence, so such code is easier to write. Underneath, the callbacks are still callback hell, but that dirty work has been delegated to the compiler and interpreter, where the programmer cannot easily get it wrong.

Exception handling is more robust, reusing the language's own error-handling machinery; the traditional asynchronous callback pattern must decide success or failure by itself, which makes error handling convoluted.

Context management is simpler. Callback-style code relies heavily on closures for context, couples different callback functions to one another, and splits one piece of context-handling logic across them. A coroutine represents state simply by where its code is executing, whereas callbacks must maintain a pile of data structures to track state.

Concurrency is easier to handle: coroutines are very cheap, each needing only a lightweight user-space stack.

1.6. A Brief History of Event Loops and Coroutines

In 2004 the event-driven Nginx was born and spread rapidly; after 2006 it spread from Russian-speaking countries to the rest of the world. Over the same period event loops took concrete and diverse shapes, implemented one after another in different programming languages.

Over the past decade, the ancient idea of the subroutine combined with the event loop, and coroutines (cooperative routines) developed rapidly in the backend world, reshaping and even spawning languages: Golang's goroutines, LuaJIT's coroutines, Python's gevent, Erlang's processes, Scala's actors, and so on.

Among the concurrency-oriented designs in these languages, the actor model of Scala and Erlang and Golang's goroutines are more mature than Python's: their lightweight processes share memory by communicating, which optimizes away races, conflicts, and inconsistency. At bottom, however, the concepts do not differ: all are driven by an event loop in user space.

With less historical baggage, the asynchronous facilities of the various backend languages have no callback hell, Python's Twisted being the exception; the other designs encapsulate the callback plumbing and hand it to library code, compilers, and interpreters.

With coroutines and event loop libraries, the traditional C10K problem is no longer a challenge; it has escalated to the C1M problem.

2. Gevent


In the Python 2 era, coroutine technology mainly meant gevent, with meinheld serving a relatively small audience. gevent draws both praise and criticism. The negative view is that its implementation is not pythonic enough: it implements a black-box scheduler outside the interpreter, and its monkey patching confuses users who do not understand it. The positive view is that precisely by hiding all the details it is uniquely simple to use.

gevent is based on greenlet and libev. A greenlet is a kind of micro-thread or coroutine, with coarser scheduling granularity than Python 3's native coroutines. Greenlets live inside a thread container, behave like threads, and each has its own independent stack space; switching between greenlets resembles thread switching at the OS level.

gevent's hub is itself an object inheriting from the native greenlet, and it is the parent of all other greenlets, responsible for task scheduling. When a greenlet finishes part of its routine and reaches a switch point, it hands control to the hub via greenlet.switch(), and the hub performs the context switch: it backs up the current greenlet's stack contents from the registers and cache into memory, and restores another greenlet's previously backed-up stack data into the registers.

The hub encapsulates a loop object; the loop wraps libev's operations and exposes an interface upward. All greenlets are scheduled under the hub, driven by the loop.

3. From yield to async/await

3.1. Evolution of the generator

Generators were first introduced in Python 2.2. A generator implements lazy, multi-valued returns: it can build an iteration chain, or produce values one at a time via next().

A generator has memory: the next time a value is taken from it, execution resumes at the location where the generator last performed a yield.

The earlier generators were all about constructing iterators; in Python 2.5, the generator also gained a send() method, used in conjunction with yield.

At this point a generator can not only yield its state outward, it can also have its state changed from outside: send() passes a value in at the point where the generator stopped.

A simple example, to get familiar with how a generator interacts with the outside world through yield and send:

def jump_range(up_to):
    step = 0
    while step < up_to:
        jump = yield step
        print("jump", jump)
        if jump is None:
            jump = 1
        step += jump
        print("step", step)

if __name__ == '__main__':
    iterator = jump_range(10)
    print(next(iterator))        # 0
    print(iterator.send(4))      # jump 4; step 4; 4
    print(next(iterator))        # jump None; step 5; 5
    print(iterator.send(-1))     # jump -1; step 4; 4

Python 3.3 introduced the yield from keyword, which lets one generator invoke another from inside itself. This makes it easy to refactor generators, for example connecting several generators together for execution.

def gen_3():
    yield 3

def gen_234():
    yield 2
    yield from gen_3()
    yield 4

def main():
    yield 1
    yield from gen_234()
    yield 5

for element in main():
    print(element)  # 1, 2, 3, 4, 5


This shows the character of yield from. Compare it with itertools: itertools.chain combines multiple generators into one minimal flat chain, and itertools.cycle takes a single generator that has reached its end and builds a looping chain from it.

With yield from, a generator can yield values produced by other generators, so different generators can communicate with one another. The resulting chains can be more complex, yet the granularity of composition stays as fine as a single yielded object.
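A quick sketch of the comparison: itertools.chain runs several generators back to back, itertools.cycle loops a single one, and yield from (as above) lets one generator delegate to another:

```python
import itertools

def letters():
    yield from "ab"        # yield from: delegate to another iterable

def numbers():
    yield from (1, 2)

# chain: flatten several generators into one sequence
combined = list(itertools.chain(letters(), numbers()))
print(combined)  # ['a', 'b', 1, 2]

# cycle: when the generator is exhausted, replay its output endlessly
looped = list(itertools.islice(itertools.cycle(letters()), 5))
print(looped)  # ['a', 'b', 'a', 'b', 'a']
```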

3.2. The short-lived asyncio.coroutine and yield from

With yield from available since Python 3.3, Python 3.4 added the new asyncio library along with a default event loop. At that point Python had sufficient basic tools for asynchronous concurrent programming.

Concurrent programming executes multiple independent logical flows at the same time, each with its own separate stack space, even if they all live in the same thread. Sample code:

import asyncio
import aiohttp

@asyncio.coroutine
def fetch_page(session, url):
    response = yield from session.get(url)
    if response.status == 200:
        text = yield from response.text()
        print(text)

loop = asyncio.get_event_loop()
session = aiohttp.ClientSession(loop=loop)
tasks = [
    asyncio.ensure_future(
        fetch_page(session, "http://bigsec.com/products/redq/")),
    asyncio.ensure_future(
        fetch_page(session, "http://bigsec.com/products/warden/"))
]
loop.run_until_complete(asyncio.wait(tasks))
session.close()
loop.close()

In Python 3.4, the asyncio.coroutine decorator is the syntax that turns a function into a coroutine; this is the first generator-based coroutine Python provided, and only through this decorator does a generator implement the coroutine interface. Inside the coroutine, the yield from keyword passes an asyncio.Future object down to the event loop, suspending the coroutine while the future is not yet resolved so that other tasks can be handled. Once the future completes, the event loop detects the state change, returns the future's result to the generator coroutine via the send() method, and the generator resumes its work.

In the example code above, we first instantiate an event loop and pass it to aiohttp.ClientSession, so that the session does not have to create its own.

Two tasks are explicitly created here; only when fetch_page has fetched and printed the data from the two bigsec.com URLs are all tasks complete, after which the session and loop are closed and the connection resources freed.

When execution reaches response = yield from session.get(url), fetch_page is suspended, a future object is implicitly handed to the event loop, and the task completes only when session.get() completes.

session.get() is itself a coroutine, transmitting data across the slowest network layer at the bottom of the memory mountain. When session.get() completes, a response object is obtained and passed back to the original fetch_page generator, restoring its working state.

To improve speed, the get method splits fetching the HTTP headers and the body into two tasks, reducing the amount of data transferred at once; response.text() is the asynchronous request for the HTTP body.

Using the dis module to inspect fetch_page's bytecode, GET_YIELD_FROM_ITER and YIELD_FROM are the opcodes implementing yield from:

In [4]: import dis

In [5]: dis.dis(fetch_page)
      0 LOAD_FAST                0 (session)
      2 LOAD_ATTR                0 (get)
      4 LOAD_FAST                1 (url)
      6 CALL_FUNCTION            1
      8 GET_YIELD_FROM_ITER
     10 LOAD_CONST               0 (None)
     12 YIELD_FROM
     14 STORE_FAST               2 (response)
     16 LOAD_FAST                2 (response)
     18 LOAD_ATTR                1 (status)
     20 LOAD_CONST               1 (200)
     22 COMPARE_OP               2 (==)
     24 POP_JUMP_IF_FALSE       48
     26 LOAD_FAST                2 (response)
     28 LOAD_ATTR                2 (text)
     30 CALL_FUNCTION            0
     32 GET_YIELD_FROM_ITER
     34 LOAD_CONST               0 (None)
     36 YIELD_FROM
     38 STORE_FAST               3 (text)
     40 LOAD_GLOBAL              0 (print)
     42 LOAD_FAST                3 (text)
     44 CALL_FUNCTION            1
     46 POP_TOP
 >>  48 LOAD_CONST               0 (None)
     50 RETURN_VALUE

3.3. Async and Await keywords

Python 3.5 introduced these two keywords to replace asyncio.coroutine and yield from, defining native coroutine keywords at the semantic level and saving users from confusing coroutines with generators. Very few people used Python in the 3.0-3.4 stage, so the historical burden was light and such a large innovation was possible.

await behaves like yield from, but they wait on different kinds of objects asynchronously: yield from waits on a generator object, whereas await accepts any awaitable, i.e. an object that defines an __await__ method.

In Python, a coroutine object is also a collections.abc.Coroutine, which itself inherits from collections.abc.Awaitable.
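A minimal sketch of that distinction (the class name Doubled is invented here for illustration): any object whose __await__ method returns an iterator can be awaited, not only coroutines:

```python
import asyncio

class Doubled:
    """A hand-made awaitable: no async def, just __await__."""
    def __init__(self, value):
        self.value = value

    def __await__(self):
        yield from ()          # an (empty) iterator: no suspension needed here
        return self.value * 2  # becomes the result of `await`

async def main():
    return await Doubled(21)   # legal because Doubled defines __await__

print(asyncio.run(main()))     # 42
```

await simply drives the iterator returned by __await__, the same protocol yield from spoke, which is why the semantics carry over so directly.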

Therefore, the sample code from the previous section is rewritten as:

import asyncio
import aiohttp

async def fetch_page(session, url):
    response = await session.get(url)
    if response.status == 200:
        text = await response.text()
        print(text)

loop = asyncio.get_event_loop()
session = aiohttp.ClientSession(loop=loop)
tasks = [
    asyncio.ensure_future(
        fetch_page(session, "http://bigsec.com/products/redq/")),
    asyncio.ensure_future(
        fetch_page(session, "http://bigsec.com/products/warden/"))
]
loop.run_until_complete(asyncio.wait(tasks))
session.close()
loop.close()

From the point of view of the language's development, async/await is not a great leap; it imports semantics already mature in other languages. The cornerstones of coroutines are the growth of EventLoop libraries and the perfection of generators. Structurally, asyncio plays the role of an asynchronous framework, and async/await are the API it exposes: users currently cannot write coroutine code with async/await detached from asyncio or some other asynchronous library. Even if the user can avoid explicitly instantiating an event loop, as with curio, a network library that supports the async/await syntax, the async/await keywords by themselves do nothing once cut off from the heartbeat-like drive of an EventLoop.

4. Use of async/await


4.1. Future

After asynchronous code is written without callbacks, a future object is introduced in order to obtain the result of the asynchronous call. The future encapsulates the interaction with the loop: its add_done_callback method registers a callback function, and when the result attribute receives a value, the previously registered callback runs and the value propagates up to the coroutine. But every role has its own responsibilities, and it would be inappropriate for the future itself to send the result into the generator to restore its working state: a future has a short life cycle, its work done once the register-callback, fire-event, run-callback sequence completes. Therefore a new object, Task, is introduced between the generator and the future to manage the state of the generator coroutine.

4.2. Task

A Task, as the name suggests, is a task: it maintains the execution logic and state of the generator coroutine. The task's internal _step method is responsible for the state transitions in the interaction between the generator and the EventLoop: it sends a value into the coroutine to resume its working state; when the coroutine runs to its next breakpoint and yields a new future object, _step then handles the callback registration between that future and the loop.
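A toy sketch of this division of labor (the names ToyFuture and ToyTask are invented here; asyncio's real implementation is considerably more involved): the task's _step drives the generator, and each yielded future calls back into _step when it resolves:

```python
class ToyFuture:
    """Holds a pending result and the callbacks to run when it arrives."""
    def __init__(self):
        self.result = None
        self._callbacks = []

    def add_done_callback(self, fn):
        self._callbacks.append(fn)

    def set_result(self, value):
        self.result = value
        for fn in self._callbacks:
            fn(self)

class ToyTask:
    """Drives a generator coroutine: send a value, get the next future."""
    def __init__(self, gen):
        self.gen = gen
        self._step(None)           # prime the generator

    def _step(self, value):
        try:
            fut = self.gen.send(value)   # resume coroutine at its breakpoint
        except StopIteration:
            return                       # coroutine finished
        # When the future completes, feed its result back into the coroutine.
        fut.add_done_callback(lambda f: self._step(f.result))

def coro(log):
    fut = ToyFuture()
    pending.append(fut)       # hand the future to "the event loop"
    result = yield fut        # suspend until the future resolves
    log.append(result)

pending = []
log = []
ToyTask(coro(log))
pending[0].set_result("response")   # the event loop detects completion
print(log)  # ['response']
```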

4.3. Loop

The way event loops work deviates somewhat from users' assumptions. It seems natural that every thread could have its own independent loop, but at runtime only the main thread can create a new loop through asyncio.get_event_loop(); in other threads get_event_loop() raises an error, and the correct approach is asyncio.set_event_loop() to explicitly bind the current thread to a loop. Because the loop runs in ways not controlled by Python code, coroutine logic cannot be reliably extended to multi-threaded operation.

While a coroutine is working it does not know which loop is scheduling it, and even calling asyncio.get_event_loop() will not necessarily obtain the loop that is actually running it. Therefore, in library code of all kinds, an object must be explicitly passed the current loop for binding when it is instantiated.

4.4. Another Future

The other future object in Python is concurrent.futures.Future, which is incompatible with asyncio.Future but easily confused with it. concurrent.futures is a thread-level future: when doing multi-threaded programming with a concurrent.futures.Executor, this future passes results between different threads.
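A minimal sketch of the thread-level future (the worker function and numbers are arbitrary examples): the executor returns a concurrent.futures.Future immediately, and .result() blocks the caller until a worker thread fills it in:

```python
import concurrent.futures
import time

def slow_square(n):
    time.sleep(0.01)       # stand-in for blocking I/O
    return n * n

# The executor hands back futures at once; the work runs in worker threads.
with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
    futs = [pool.submit(slow_square, n) for n in (2, 3)]
    results = sorted(f.result() for f in futs)   # .result() blocks the caller

print(results)  # [4, 9]
```

When the two worlds do have to meet, asyncio provides bridges such as asyncio.wrap_future() and loop.run_in_executor() to turn a thread-level future into an awaitable one.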

4.5. Current difficulties in the asyncio ecosystem

The async/await keywords were only introduced in Python 3.5 in 2015, so their history is short, and in an environment fragmented between Python 2 and Python 3 the ecosystem remains incomplete;

What users hope for is to import a library, invoke it, and fetch results without caring about the third-party library's internal logic. But when writing asynchronous code with coroutines, the interaction with the event loop must be handled, so asynchronous libraries cannot be encapsulated as thoroughly as synchronous ones. In asynchronous programming a user typically picks a single third-party library to handle all HTTP logic, yet the various asynchronous implementations are inconsistent and incompatible, and those differences hinder the community's growth;

Asynchronous code is fast, but it must never block: one blocking call stalls the entire program. Unlike multi-threading or multi-processing, where scheduling is handed to the operating system, asynchronous code has no such self-protection.
