High-Performance Network Programming (5): IO Multiplexing and Concurrent Programming

What we need from server concurrency is this: in every millisecond, the server must promptly handle the hundreds of different TCP connections that receive messages in that millisecond, while at the same time there may be 100,000 relatively inactive connections that have not sent or received anything in the last few seconds. Handling multiple connection events simultaneously is concurrency; handling 100,000 connections simultaneously is high concurrency. The goal of concurrent server programming is to handle an ever-growing number of concurrent connections while keeping resources such as the CPU efficiently used, so that physical resources are the first thing to run out, not our software.
There are many implementation models for concurrent programming. The simplest is to bind connections to threads: one thread handles the entire lifecycle of one connection. Its advantages: the model is simple enough to support complex business logic, and the number of threads can far exceed the number of CPUs. However, the number of threads cannot grow without bound. Why not? Because when a thread runs is decided by the kernel's scheduling algorithm, and the scheduler does not care that a thread may exist only to serve one connection; it treats them all alike, handing out time slices, so a thread gets woken up only to go straight back to sleep when its connection has nothing to do. Waking and sleeping a thread a few times is cheap, but when the operating system has a huge number of threads, this scheduling cost is amplified and eats into the time the business code actually gets to run. For example, the threads serving mostly inactive connections are like state-owned enterprises: they do little real work, endlessly waking up and going back to sleep, yet every wakeup consumes CPU, which means the "private enterprise" threads serving active connections get fewer chances at the CPU, and their reduced efficiency drags down the system's overall GDP, its throughput. What we want is to handle hundreds of thousands of connections concurrently; once thousands of threads appear, the system's execution efficiency can no longer meet the demands of high concurrency.
For high-concurrency programming there is currently only one model, and it is essentially the only effective approach. As the first four articles in this series showed, processing a message on a connection can be divided into two stages: waiting for the message to be ready, and processing the message. With the default blocking sockets (for example, the one-thread-per-connection model above), the two stages are usually merged into one: the code operating on the socket puts its thread to sleep while waiting for the message to become ready, so under high concurrency threads sleep and wake constantly, hurting CPU efficiency.
The high-concurrency approach is, of course, to separate the two stages. That is, the code that waits for a message to become ready is detached from the code that processes the message. This also requires that the sockets be non-blocking; otherwise the message-processing code can easily hit an unsatisfied condition and put the thread into a sleeping wait anyway. So the question becomes: how do we implement the "waiting for the message to be ready" stage? It is still waiting, which means the thread still sleeps somewhere. The answer is to have one thread wait on all the connections at once. This is IO multiplexing: the multiplexer still does the waiting for messages to become ready, but it can watch many connections at the same time. It may still "wait", and so it may still put the thread to sleep, but that no longer matters, because it is one-to-many; it monitors every connection. When our thread wakes up, there are guaranteed to be connections ready for our code to process, which is efficient; there is no longer a crowd of threads scrambling over the "waiting for messages to be ready" stage, and the whole world is finally clean.
Multiplexing has many implementations. On Linux, the 2.4 kernel mainly offered select and poll; today the mainstream is epoll. Their interfaces look very different, but their essence is the same. Their efficiency, however, is very different, which is why epoll has completely replaced select.
Let me briefly explain why epoll replaced select. As mentioned earlier, the core of high concurrency is having one thread handle "waiting for messages to be ready" on all connections, and on this point epoll and select agree. But select got one estimate wrong: as we said at the outset, when hundreds of thousands of concurrent connections exist, only a few hundred may be active in any given millisecond, while the remaining hundreds of thousands are inactive during that millisecond. select is used like this: returned active connections = select(all connections to monitor). It is called whenever you need to find the connections on which messages have arrived, so under high concurrency select is called very frequently. Anything invoked that frequently had better be efficient, because even a slight inefficiency will be amplified by the word "frequently". Does select have such an inefficiency? Obviously: it is handed all 100,000 monitored connections on every call, only to return a few hundred active ones, which is inherently wasteful. Amplified, that waste means that once you are dealing with tens of thousands of connections, select is completely overwhelmed.
Look at some figures. When there are fewer than 1,000 concurrent connections, select is not executed all that frequently, and there seems to be little difference between it and epoll:
However, once the concurrency rises, select's disadvantage is amplified without bound by "frequent execution", and the more concurrent connections there are, the more obvious it becomes:
Now let's see how epoll solves this. It cleverly splits what the single select call does into three calls:
Create an epoll descriptor: epoll descriptor = epoll_create()
Add or remove connections to monitor: epoll_ctl(epoll descriptor, add or remove a connection)
Return the active connections: active connections = epoll_wait(epoll descriptor)
The advantage of doing this is that it separates the frequently invoked operation from the infrequently invoked ones. For example, epoll_ctl is called rarely, while epoll_wait is called very frequently, and epoll_wait takes almost nothing as input. This makes it far more efficient than select, and its arguments do not grow with the number of concurrent connections, so more connections do not degrade the kernel's execution efficiency.
How does epoll achieve this? It is actually quite simple. From the three calls above you can see that, unlike select, it avoids being handed all the monitored connections every time it is asked "which connections have entered the message-ready stage". This means it maintains a data structure in kernel space holding all the connections being monitored. That data structure is a red-black tree, whose nodes are added and removed by epoll_ctl. This is easy to see from the diagram I drew in chapter 8 of "Deep Understanding Nginx":
The red-black tree at the lower left of the figure is made up of all the connections being monitored. The linked list at the upper left holds the currently active connections. So all epoll_wait has to do is check that upper-left linked list and return the connections it contains to the user. With that design, how could epoll_wait not perform well?
Finally, take a look at the two modes epoll offers, ET and LT, usually translated as edge-triggered and level-triggered. The two names are actually quite apt. The difference between the modes is again an efficiency question: it comes down to how precisely the connections returned by epoll_wait match what the user can act on. For example, suppose we monitor whether a connection's write buffer has free space, so that when it reaches the "writable" state we can call write to send the response to the client. But when the connection becomes writable, our "response" content may still be on disk, and if the disk read has not yet completed (and we must not let the thread block), the response cannot be sent yet. Still, the next epoll_wait may return this connection again, and we have to check whether there is anything to do with it; perhaps our program has a separate module dedicated to disk IO that will send the response once the disk read completes. So, when epoll_wait keeps returning this "writable" connection that we cannot process immediately, does that match the user's expectations?
Hence the ET and LT modes. LT returns, on every epoll_wait, every connection that currently satisfies the expected state; it treats them all equally, all on one level, hence level-triggered. ET does not; it tends to return connections more precisely. In the example above, once the connection becomes writable for the first time, if the program writes no data to it, subsequent epoll_wait calls will not return that connection. ET is called edge-triggered because epoll_wait returns a connection only when it transitions from one state to another. As you can see, programming with ET is considerably more complex: at minimum, the application must take care to avoid the situation where epoll_wait returns a connection, the program writes no data yet expects the next "writable" event, or leaves data unread yet expects the next "readable" event.
Of course, in typical application scenarios there is no big performance difference between them. ET's potential advantage is that the number of epoll_wait calls is reduced, and in some scenarios connections avoid unnecessary wakeups (where "wakeup" means an epoll_wait return). But as in the example above, sometimes it is not purely a networking question; it depends on the application scenario. That said, most open-source frameworks are written on top of ET; a framework is pursuing the purely technical problem, and naturally strives for perfection.
Finally, a quick request for votes: Http://vote.blog.csdn.net/blogstaritem/blogstar2013/russell_tao
