High-Performance Network Programming 6: The Reactor Model and Timer Management

Last Update:2015-08-19 Source: Internet

Author: User

Tags epoll

Most high-performance servers choose the reactor development model, and the IO multiplexing described in the previous article is the basis for implementing it. Timed triggering is usually a must-have for a server, so the reactor model often has to include timer management as well. This article introduces the characteristics and usage of the reactor model.

First we need to talk about why network programming needs the reactor at all. With IO multiplexing, with epoll, we can already make a server hold hundreds of thousands of concurrent connections while maintaining high TPS. Isn't that enough?

My answer: it is enough at the technical level, but not at the software engineering level.

Where does the difficulty of programming with IO multiplexing lie? One request is completed through multiple IO operations, and compared with the traditional approach in which a single thread handles the whole lifetime of a request, IO multiplexing does not match how people naturally think. When handling request A, the programmer assumes that A must go through several IO operations A1 to An (two of which may be separated by a long interval). After each IO operation the program calls the IO multiplexing interface again, and when that call returns it very likely no longer returns A but request B. That is, request A is constantly interrupted by request B, and while B is being processed it is interrupted by C. Programming in this way of thinking is error-prone.

To put it vividly, the traditional programming method is like a bank's business hall: a long queue forms in front of each window, and the clerk behind the window resolves customer requests one at a time. A clerk can concentrate on customer A's questions in order, for example:

"I want to buy 20,000 of the XX wealth-management product."

"Read carefully: the minimum purchase is 50,000."

"Wait, check my current balance."

"Your balance is 50,000."

"Then I'll buy 50,000."

The clerk begins to enter the information.

"By the way, the annual interest rate of the XX product is 8%?"

"8% is the expected rate; at worst you keep your principal with no interest."

"Why didn't you say so earlier? Bye, I'm going to buy Yu'e Bao instead."

The clerk, expressionless, deletes the information already entered and rolls the transaction back.

"Next!"

Using IO multiplexing is like a master clerk pushing the limit. In a huge business hall every customer is given a numbered tag, and a dense crowd of customers fills the lobby. A customer with a question raises the tag to ask, the sharp-eyed master calls on someone by name, and that customer quickly gets the master's reply. After some time to think, check things, or consult someone else, the customer moves on to the next question, until a fully satisfactory answer has been received and the customer leaves the hall. For example: the master has just told A how to fill in one item of a transfer slip when B asks to exchange money for Thai baht; after B has been handed an exchange form, C wants to handle a fixed-term deposit transfer; then D and F start quarrelling over the limited supply of ballpoint pens, so the master pauses business and waits a moment.

This is one design difficulty of an event-driven IO multiplexing program compared with the traditional one-thread-per-request approach: the customer is king, no mistakes are allowed, and no one may be favored.

Without a reactor, we could design it like this: the master records every customer's questions. When customer A asks something, the master first looks up what A has asked and done before (this is called the context), then consults the relevant bank rules and regulations based on the context and the current question, gives A a targeted answer, and writes the answer down. Once all of A's questions have been answered satisfactorily, all records of A are deleted.

Back in the programmer's world: at some moment the server holds 100,000 concurrent connections, and one call of the IO multiplexing interface returns 100 active connections waiting to be processed. Finding the object that corresponds to each of these 100 connections is not hard, because it can be stored in the per-connection data structure that epoll returns. We then loop over the connections: for each one, find the object's current context state, use network IO such as read and write to obtain the content of this operation, combine it with the context state to decide which business method should handle it, and call that method to complete the operation. If the request has ended, the object and its context are deleted.
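The loop described above can be sketched without real sockets. This is only an illustration under stated assumptions: the request types ("register", "avatar"), the state names, and the `handle_active` helper are all hypothetical, and the list of `(fd, payload)` pairs stands in for what successive IO multiplexing calls would return.

```python
from dataclasses import dataclass, field

@dataclass
class Context:
    """Per-connection state the main loop must remember between IO events."""
    kind: str                      # hypothetical request type
    state: str = "WAIT_HEADER"     # hypothetical state name
    received: list = field(default_factory=list)

def handle_active(fd, payload, contexts):
    """Handle one active connection; return a reply once the request ends."""
    ctx = contexts[fd]             # look up the saved context for this connection
    ctx.received.append(payload)
    # The main program has to know every (type, state) combination.
    if ctx.kind == "register":
        if ctx.state == "WAIT_HEADER":
            ctx.state = "WAIT_BODY"
            return None
        if ctx.state == "WAIT_BODY":
            del contexts[fd]       # request finished: drop its context
            return "registered " + payload
    elif ctx.kind == "avatar":
        if ctx.state == "WAIT_HEADER":
            del contexts[fd]
            return "avatar for " + payload
    raise ValueError(f"unexpected state {ctx.state!r} for {ctx.kind!r}")

# Events for fd 7 and fd 9 arrive interleaved, as they would in practice.
contexts = {7: Context("register"), 9: Context("avatar")}
events = [(7, "hdr"), (9, "bob"), (7, "alice")]
replies = [handle_active(fd, data, contexts) for fd, data in events]
print(replies)    # [None, 'avatar for bob', 'registered alice']
```

Every new request type or state adds another branch to `handle_active`, which is exactly the growth in main-program complexity that the reactor model is meant to avoid.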

This traps us in a procedural style of programming. In the mobile internet era, where applications drive everything and fast response to change is king, we would sooner or later dig our own grave this way. The main program has to pay attention to every type of request, in every state, and choose a different business method for each request command. As request types, request states, and request commands multiply, the complexity of the main program balloons and maintenance gets harder, so it becomes more and more difficult for programmers to take on new requirements or to refactor.

The reactor is one way to solve this software engineering problem. It may not be the most elegant, and its development efficiency is not the highest, but its execution efficiency is almost the same as using IO multiplexing in a procedural style. That is why Nginx, memcached, Redis, and other components that are synonymous with high performance have all embraced the reactor.

The reactor model separates the event-driven framework from the specific business logic at the software engineering level, and separates the different types of requests using OO ideas. Typically, a reactor not only uses IO multiplexing to drive network events but also implements a timer to drive time events (request timeouts or scheduled tasks), as shown in the following figure:

This figure makes five points:

(1) Application processing is based on OO thinking, and the handling of different request types is separated. For example, if type A is a user registration request and type B is a query for a user's avatar, then when we add multiple image resolutions for avatars and change the processing logic of type B requests, the code for type A requests is not touched at all.

(2) The application logic that processes requests is completely separated from the event dispatch framework. What does that mean? When you write application handlers, you do not call the IO multiplexing interface yourself, and you do not call epoll_wait and deal with the multiple socket connections it returns. Application code only cares about how to read and send data on a socket and how to handle business logic. The event dispatch framework defines an abstract event interface that every application handler must implement, and this interface is what separates the application from the framework.

(3) The reactor provides methods for registering and removing events, which application code uses, and a method for dispatching events, which is usually called in a loop. Whether the dispatch method is exposed for application code to call or is simply invoked directly by the framework is up to the framework.

(4) IO multiplexing is also an abstraction. It can be a concrete select or epoll; all it has to provide is the set of connections, among all monitored connections, that are active at a given moment.

(5) The timer is also used by the reactor object, and it must provide at least four methods. Adding and removing timer events are called by application code. Getting the time until the nearest timeout is used by the reactor object to determine the blocking timeout passed to select or epoll_wait, so that waiting on IO does not delay the processing of timed events. Traversing the timers is also used by the reactor framework, to handle timed events.
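The five points can be put together in a small sketch. It is an illustration, not the design of any particular framework: the names `EventHandler`, `SimpleTimers`, `Reactor`, and their methods are assumptions made for this example, only readable events are handled, and the timer set is deliberately the simplest unordered dictionary. Python's standard `selectors` module plays the role of the IO multiplexing abstraction.

```python
import selectors
import socket
import time
from abc import ABC, abstractmethod

class EventHandler(ABC):
    """Abstract event interface that every application handler implements."""
    @abstractmethod
    def on_readable(self, reactor, sock): ...

class SimpleTimers:
    """Unordered timer set exposing the four methods from point (5)."""
    def __init__(self):
        self._timers = {}              # timer id -> (deadline, callback)
        self._next_id = 0

    def add(self, delay, callback):
        self._next_id += 1
        self._timers[self._next_id] = (time.monotonic() + delay, callback)
        return self._next_id

    def remove(self, timer_id):
        self._timers.pop(timer_id, None)

    def nearest_timeout(self):
        """Seconds until the earliest deadline, or None if nothing is pending."""
        if not self._timers:
            return None
        earliest = min(deadline for deadline, _ in self._timers.values())
        return max(0.0, earliest - time.monotonic())

    def run_expired(self):
        now = time.monotonic()
        due = [t for t, (deadline, _) in self._timers.items() if deadline <= now]
        for timer_id in due:
            entry = self._timers.pop(timer_id, None)  # may already be removed
            if entry is not None:
                entry[1]()

class Reactor:
    def __init__(self):
        self._selector = selectors.DefaultSelector()  # epoll where available
        self.timers = SimpleTimers()
        self._running = False

    def register(self, sock, handler):
        self._selector.register(sock, selectors.EVENT_READ, handler)

    def remove(self, sock):
        self._selector.unregister(sock)

    def stop(self):
        self._running = False

    def run(self):
        self._running = True
        while self._running:
            # Block no longer than the time to the nearest timer
            # (None means block until IO arrives).
            timeout = self.timers.nearest_timeout()
            for key, _ in self._selector.select(timeout):
                key.data.on_readable(self, key.fileobj)   # dispatch IO event
            self.timers.run_expired()                     # dispatch timed events

class Echo(EventHandler):
    """Application code: knows nothing about select or epoll."""
    def on_readable(self, reactor, sock):
        data = sock.recv(1024)
        log.append(data)
        sock.sendall(data.upper())

log = []
client, server = socket.socketpair()
reactor = Reactor()
reactor.register(server, Echo())
reactor.timers.add(0.05, reactor.stop)    # timed event: stop after 50 ms
client.sendall(b"ping")
reactor.run()
reply = client.recv(1024)
reactor.remove(server)
client.close()
server.close()
print(reply)    # b'PING'
```

The `Echo` handler only reads and writes on its socket. Registration, the blocking wait, and the timer bookkeeping all live in `Reactor`, so adding another handler type would not change the dispatch loop.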

The following minimal flow illustrates how the reactor handles a request; the orange part in the middle is the reactor's event dispatch flow:

As you can see, dispatching IO and timer events is done entirely by the reactor framework, and application code only focuses on how to handle readable and writable events.

Of course, this flow is extremely simplified and does not include the exceptions that have to be handled in practice.

The flow also shows why the timer set needs to provide the time from now until the nearest timeout event: when calling epoll_wait or select, we cannot always pass 1 as the timeout parameter. The server's main business is usually processing network requests, and if there are very few of them, all of the CPU's time would be taken up by frequent but unnecessary epoll_wait calls. Reducing the process's CPU usage when the server is idle is worthwhile: it gives other processes on the server more chances to run, prolongs the life of the server, and saves power. So we need to pass an accurate maximum blocking time to epoll_wait.

What timeout is accurate? This is equivalent to asking: under what circumstances can the process really rest and go to sleep?

A trivially true answer is that the process can rest during any period in which it has no task to perform.

This requires us to think carefully about what kinds of tasks the process performs, for example:

1. Processing of all network packets, such as establishing TCP connections, reading, writing, and closing. Basically every normal request is driven by network packets. For such tasks, any period in which no new network packet arrives is a period in which the process can rest.

2. Timer management. It has nothing to do with the network or IO multiplexing, although the timers themselves may be tied to the business. Events in the timer must run on time and must not be delayed for other reasons, for example because the process is blocked in epoll_wait. During any period in which no timed event reaches its trigger condition (this is exactly why an interface for querying the time from now to the nearest timed event is provided), the process can rest as far as scheduled tasks are concerned.

3. Other types of tasks, such as the completion of disk IO or receiving signals from other processes. During any period in which none of these tasks needs to run, the process can rest.

Thus, in the process code of the reactor model, every call other than the IO multiplexing call such as epoll_wait is usually made in a non-blocking manner. The timeout passed to epoll_wait is therefore the sleep time that all tasks other than network IO can tolerate. If only ordinary timer tasks are considered, as in the flow above, the timer set only needs to provide the time from now until the nearest timeout event.
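As a small worked example, the conversion from "nearest timer deadline" to the timeout argument can be sketched as below. The function name `epoll_timeout_ms` is hypothetical. The sketch assumes that only timer tasks are considered and relies on the documented behavior of epoll_wait, whose timeout is in milliseconds, with -1 meaning block indefinitely and 0 meaning return immediately.

```python
import math

def epoll_timeout_ms(now, next_deadline):
    """Translate the nearest timer deadline into an epoll_wait timeout.

    now and next_deadline are seconds on the same monotonic clock;
    next_deadline is None when no timer is pending.
    """
    if next_deadline is None:
        return -1                        # nothing scheduled: sleep until IO arrives
    remaining = next_deadline - now
    if remaining <= 0:
        return 0                         # a timer is already due: just poll
    return math.ceil(remaining * 1000)   # round up so we do not wake too early

print(epoll_timeout_ms(10.0, None))      # -1
print(epoll_timeout_ms(10.0, 9.5))       # 0
print(epoll_timeout_ms(10.0, 10.25))     # 250
```

Rounding up rather than down is a design choice in this sketch: waking a fraction of a millisecond early would find no timer due and force one more pointless pass through the loop.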

It also follows that the timer set typically uses an ordered container as its data structure. The benefits are:

1. It is easy to get the time until the nearest timeout event.

2. Processing can start from the nearest timeout event and loop over the events that have timed out, stopping at the first event that has not, so there is no need to traverse all of them.

Therefore, crudely using an unordered data structure such as an ordinary linked list is usually not worthwhile. But nothing is absolute: Redis uses an unordered list. Why? Because Redis client connections have no timeout concept, so thousands of concurrent connections add no timer events. The only purpose of the Redis timer is to periodically flush in-memory data to disk, so there are usually only a handful of timed events and their performance is irrelevant.

If there are many timed events, then weighing the frequency of insertion, traversal, and deletion, a tree is the most common choice, such as a min-heap (libevent) or a balanced binary tree (the red-black tree in Nginx). Of course, in special scenarios it can also be implemented with a sorted array, a skip list, and so on.
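A minimal sketch of an ordered timer set, here backed by a binary min-heap from Python's `heapq`, shows both benefits: the nearest deadline sits at the top of the heap, and expiry processing stops at the first timer that is not yet due. The class and method names are assumptions for illustration, and removal uses lazy deletion (marking an entry as cancelled), which is one common technique rather than what libevent or Nginx necessarily do.

```python
import heapq
import itertools

class HeapTimers:
    """Timer set ordered by deadline, backed by a binary min-heap."""
    def __init__(self):
        self._heap = []                  # entries: [deadline, timer_id, callback]
        self._ids = itertools.count()    # unique ids also break deadline ties
        self._live = {}                  # timer_id -> heap entry

    def add(self, deadline, callback):
        timer_id = next(self._ids)
        entry = [deadline, timer_id, callback]
        self._live[timer_id] = entry
        heapq.heappush(self._heap, entry)
        return timer_id

    def remove(self, timer_id):
        entry = self._live.pop(timer_id, None)
        if entry is not None:
            entry[2] = None              # lazy deletion: mark as cancelled

    def nearest_deadline(self):
        while self._heap and self._heap[0][2] is None:
            heapq.heappop(self._heap)    # discard cancelled entries at the top
        return self._heap[0][0] if self._heap else None

    def run_expired(self, now):
        """Fire every timer due at `now`; stop at the first one that is not."""
        fired = 0
        while self._heap and self._heap[0][0] <= now:
            _, timer_id, callback = heapq.heappop(self._heap)
            if callback is not None:
                del self._live[timer_id]
                callback()
                fired += 1
        return fired

fired = []
timers = HeapTimers()
timers.add(5.0, lambda: fired.append("c"))
timers.add(1.0, lambda: fired.append("a"))
cancelled = timers.add(2.0, lambda: fired.append("x"))
timers.add(3.0, lambda: fired.append("b"))
timers.remove(cancelled)

first = timers.nearest_deadline()
count = timers.run_expired(now=3.5)
remaining = timers.nearest_deadline()
print(first, count, fired, remaining)    # 1.0 2 ['a', 'b'] 5.0
```

Insertion and popping the earliest timer cost O(log n), and the timer due at 5.0 is never examined during `run_expired(now=3.5)`.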

In summary, developing with the reactor model is more efficient than using IO multiplexing directly. A reactor is usually single-threaded and designed to make full use of a single CPU, with the added benefit that in many cases event handlers need not worry about mutually exclusive access to shared resources. But the drawback is also obvious. Hardware no longer follows Moore's law: CPU frequency is constrained by material limits and no longer rises much, and performance gains come instead from increasing the number of cores. When a program needs to use multiple cores, the reactor model runs into trouble. Why?

If the program's business logic is simple, for example it merely accesses some services that themselves support concurrent access, you can simply run multiple reactors, one per CPU core. The requests running on these reactors are unrelated to each other, so multiple cores can be fully used. An HTTP static server such as Nginx is an example.

If the program is more complex, for example when the processing of one piece of in-memory data is expected to be spread across multiple cores, the reactor model struggles: it comes at a high cost and requires introducing many complex mechanisms. So you can understand why services such as Redis and Node.js can only be single-threaded, and why a simple service like memcached can be multi-threaded.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
