Design and Development of Efficient Asynchronous I/O
From: http://hippoweilin.mobile.spaces.live.com/arc.aspx
In my understanding, efficient asynchronous I/O means not only a program that runs efficiently but also efficient development, and that includes quality. For many people who have never done asynchronous I/O, the first attempt will certainly run into difficulties, because it tests more than programming ability: the developer also needs a good understanding of the operating system's I/O operations, including their mechanisms and some of the principles behind them. Asynchronous I/O itself can be efficient or inefficient, depending mainly on the specific mechanism the application uses. The well-known select call, for example, is very common and works across platforms, but because select spends a lot of time maintaining the set of I/O handles, its efficiency drops sharply as the number of handles grows. For asynchronous I/O with low concurrency, such as ordinary clients or low-concurrency servers, it is usually sufficient. How much is expected of select can be seen from the definition of FD_SETSIZE on various platforms: it is 64 on Windows and 1024 on Linux. In other words, the platform providers do not expect select to deliver much concurrent throughput. Still, because select is simple and widespread, it remains widely used, and in many cases not much concurrency is needed anyway. As I said, efficient asynchronous I/O is about development efficiency and software quality as well as run-time efficiency, and even a mechanism as simple as select is sometimes not so easy to use correctly; many errors are possible.
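As a rough illustration of the bookkeeping select requires, here is a minimal sketch for POSIX systems. The helper name wait_readable is made up for this example and does not come from any framework discussed here. Note that the fd_set is rebuilt from the handle list on every call, and that a descriptor at or above FD_SETSIZE cannot be tracked at all.

```cpp
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>
#include <cassert>
#include <vector>

// Hypothetical helper: returns the descriptors in `fds` that become readable
// within timeout_ms. The fd_set is rebuilt on every call, which is the
// repeated maintenance cost discussed in the text.
std::vector<int> wait_readable(const std::vector<int>& fds, int timeout_ms)
{
    fd_set readset;
    FD_ZERO(&readset);
    int maxfd = -1;
    for (int fd : fds) {
        // POSIX select cannot track descriptors >= FD_SETSIZE.
        assert(fd >= 0 && fd < FD_SETSIZE);
        FD_SET(fd, &readset);
        if (fd > maxfd) maxfd = fd;
    }

    timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    std::vector<int> ready;
    int n = select(maxfd + 1, &readset, nullptr, nullptr, &tv);
    if (n <= 0) return ready;  // timeout or error: nothing to report

    // select only says "something is ready"; we must scan to find out what.
    for (int fd : fds)
        if (FD_ISSET(fd, &readset)) ready.push_back(fd);
    return ready;
}
```

Both loops run over every handle on every call, so the cost grows with the number of handles even when only one of them is active.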
The overhead of repeatedly rebuilding the handle set for select can be reduced, and a good solution improves efficiency considerably, although some repetitive work still has to be done. For example, when select returns 0, or when we know that no I/O handles need to be added or removed, we can simply copy a previously saved fd_set over the working one instead of regenerating it. A memory copy is obviously cheaper than traversing the handle queue again. Of course, for highly concurrent I/O the gain from this method is limited. In the end, adopting asynchronous I/O does not by itself guarantee high efficiency; many other factors matter. One very important point: do not forget to set the listening socket to non-blocking mode. The program may appear to run normally without it, but CPU usage is noticeably high, and there are other problems too, which are complicated and will not be described here. The efficiency of concurrent operation also depends on how often connections cycle through "connect -> disconnect -> connect". Frequent cycling can produce a lot of overhead, although this overhead is much lower than the cost of the actual business operations, and there are many ways to avoid it. Different asynchronous I/O implementations differ greatly, and so does their efficiency. For us, it comes down to choosing the method that fits the model at hand. As the front-end "generator" of asynchronous I/O events, the framework should avoid consuming much CPU itself and leave the CPU to the code that implements the business logic.
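The two points above, reusing a saved fd_set and putting the listening socket into non-blocking mode, might look like the following sketch on a POSIX system. ReadSet and set_nonblocking are hypothetical names chosen for this example, and the sketch only covers the read set.

```cpp
#include <fcntl.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>
#include <cassert>

// Puts a descriptor (for example a listening socket) into non-blocking mode.
bool set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0) return false;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == 0;
}

// Hypothetical helper: a master read set that changes only when a handle is
// added or removed. Each select round works on a copy of it.
class ReadSet {
public:
    ReadSet() : maxfd_(-1) { FD_ZERO(&master_); }

    void add(int fd)
    {
        assert(fd >= 0 && fd < FD_SETSIZE);
        FD_SET(fd, &master_);
        if (fd > maxfd_) maxfd_ = fd;
    }

    void remove(int fd)
    {
        FD_CLR(fd, &master_);
        // Walk down to the highest descriptor that is still registered.
        while (maxfd_ >= 0 && !FD_ISSET(maxfd_, &master_)) --maxfd_;
    }

    // select() overwrites the set it is given, so it gets a copy of the
    // master set (a plain struct assignment) rather than a rebuilt set.
    // Returns select's result; `out` holds the ready descriptors.
    int wait(fd_set* out, int timeout_ms)
    {
        *out = master_;
        timeval tv;
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        return select(maxfd_ + 1, out, nullptr, nullptr, &tv);
    }

private:
    fd_set master_;
    int maxfd_;
};
```

The copy costs the same no matter how many handles are registered, but the caller still has to scan the result set, which is why the gain stays limited at high concurrency.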
Edge-triggered and level-triggered are very important concepts in asynchronous I/O. Edge-triggered means a notification is raised when the state changes; afterwards, if the state stays the same and the application does not ask the system for a new notification, the application is not notified again. Level-triggered means a notification is raised whenever the handle is in a given state; as long as it remains in that state, the notification keeps firing. Each mode has its uses, and the choice should follow the application. select is level-triggered, while epoll supports both modes and defaults to level-triggered. By this definition, Microsoft's IOCP belongs on the edge-triggered side. For an encapsulated asynchronous I/O framework, which mode is used underneath hardly matters, because whichever it is, the framework must implement it correctly internally, and ideally its users should not have to care about the trigger mode at all.
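The difference is easy to observe on Linux. The sketch below registers a descriptor that already has unread data and counts how many epoll_wait calls in a row report it, without ever reading the data. count_notifications is a hypothetical helper written for this illustration.

```cpp
#include <sys/epoll.h>
#include <unistd.h>
#include <cassert>
#include <cstdint>

// Registers `fd` for EPOLLIN in a fresh epoll instance and counts how many of
// `rounds` consecutive epoll_wait calls report it. The data on `fd` is never
// read. Pass 0 for level-triggered or EPOLLET for edge-triggered.
int count_notifications(int fd, std::uint32_t extra_flags, int rounds)
{
    int ep = epoll_create1(0);
    assert(ep >= 0);

    epoll_event ev{};
    ev.events = EPOLLIN | extra_flags;
    ev.data.fd = fd;
    int rc = epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev);
    assert(rc == 0);
    (void)rc;

    int seen = 0;
    for (int i = 0; i < rounds; ++i) {
        epoll_event out{};
        // 10 ms timeout so a missing notification does not block forever.
        if (epoll_wait(ep, &out, 1, 10) == 1) ++seen;
    }
    close(ep);
    return seen;
}
```

With one unread byte sitting in a pipe, the level-triggered registration is reported on every round, while the EPOLLET registration is reported once and then stays silent until new data arrives.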
void EpollReactor::NotifyMeWrite(SOCKET handle, SvcHandler *handler)
{
    DIAMON_ASSERT(handle != INVALID_SOCKET);
    epoll_mod_handle_events(handle, EPOLLOUT /* | EPOLLET */);
}
In the epoll version of NotifyMeWrite above, the application has to ask the asynchronous I/O framework each time it wants to be notified that data can be written. The epoll_mod_handle_events function tells the system to register the EPOLLOUT event for handle, so that once handle becomes writable the system notifies the framework, which in turn delivers the write notification. Whether to add EPOLLET depends on the contract between the framework and the application, which is essentially the interface and calling convention the framework provides. In diamon::Ace, level-triggered mode is used.
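The source does not show epoll_mod_handle_events itself. Assuming it is a thin wrapper over epoll_ctl with EPOLL_CTL_MOD, it might look something like this Linux sketch. The names mod_handle_events, add_handle and wait_one are hypothetical, and the epoll descriptor is passed explicitly here, whereas the real framework presumably keeps it as a member.

```cpp
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the framework's epoll_mod_handle_events:
// replaces the event mask registered for `fd` with `events`.
bool mod_handle_events(int epfd, int fd, std::uint32_t events)
{
    assert(epfd >= 0 && fd >= 0);
    epoll_event ev{};
    ev.events = events;
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev) == 0;
}

// Registers `fd` for read events only, as a reactor might do on accept.
bool add_handle(int epfd, int fd)
{
    assert(epfd >= 0 && fd >= 0);
    epoll_event ev{};
    ev.events = EPOLLIN;
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == 0;
}

// Waits up to timeout_ms and returns the event mask of one ready
// descriptor, or 0 if nothing became ready.
std::uint32_t wait_one(int epfd, int timeout_ms)
{
    epoll_event out{};
    int n = epoll_wait(epfd, &out, 1, timeout_ms);
    return n == 1 ? out.events : 0;
}
```

An idle connected socket registered for EPOLLIN alone produces no events. After mod_handle_events adds EPOLLOUT, the next wait reports it as writable, and in level-triggered mode it keeps doing so until the mask is changed back.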
void IOCPReactor::NotifyMeWrite(SOCKET handle, SvcHandler *handler)
{
    DIAMON_ASSERT(handle != INVALID_SOCKET);
    IOCPSvcHandler *iocphandler = (IOCPSvcHandler *)handler;
    iocphandler->event_ &= (~IOCP_EVENT_READ);
    iocphandler->event_ |= IOCP_EVENT_WRITE;
}
With IOCP, NotifyMeWrite does not notify the system at all; it only tells the asynchronous I/O framework to handle write events for this handler. Triggering by the system is left entirely to WSAWrite... and similar calls (in Winsock, the overlapped send calls such as WSASend). Think of it this way: each call to WSAWrite... effectively tells the system, "I am registering a write event on this handle; notify me when it completes." You can try it: if you do not call WSAWrite..., no write notification arrives.
Synchronization is a very important but often overlooked problem in the design of efficient asynchronous I/O (though it should not be overused either). A poor synchronization scheme can restrict what the application layer is able to do. I once made the following mistake. select and the FD_SET operations on its sets must never run at the same time; they have to be serialized, or the consequences are unpredictable. After writing data we usually want the application layer to receive a write notification, so the handle has to be set in the fd_set after the write. However, the write method and select may be called from two different threads. When I first ran into this, I simply wrapped a mutex around the select call. If the application calls write directly while handling a read notification, there is no problem, because write and select then run in the same thread and are necessarily sequential. On the surface the problem was solved, but some serious problems were hidden. When write is called from another "service processor", for instance another thread, FD_SET and select can run concurrently, and then the program obviously cannot work correctly. The only way for the application to solve this is to know about the mutex around select and to wrap its own write calls in the same mutex. That does synchronize FD_SET with select, but it actually makes matters more complicated. For example: 1. The application has to know about internal logic that it does not need and should not know, which complicates application-layer development and, if misused, causes even more serious problems. 2. There will always be situations that lead to deadlock.
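One common way around this, which is not necessarily what the author ended up with, is to let only the select thread touch the fd_set. Other threads put their request into a small queue guarded by a short-lived mutex and wake the select thread through a pipe. The sketch below assumes a POSIX system; WriteNotifier and its methods are hypothetical names, and duplicate requests for the same descriptor are not collapsed.

```cpp
#include <fcntl.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>
#include <cassert>
#include <mutex>
#include <vector>

// Hypothetical reactor fragment: only the thread calling poll_once()
// ever builds or reads an fd_set.
class WriteNotifier {
public:
    WriteNotifier()
    {
        int rc = pipe(wake_);
        assert(rc == 0);
        (void)rc;
        // Non-blocking ends: waking never stalls, draining never hangs.
        fcntl(wake_[0], F_SETFL, fcntl(wake_[0], F_GETFL, 0) | O_NONBLOCK);
        fcntl(wake_[1], F_SETFL, fcntl(wake_[1], F_GETFL, 0) | O_NONBLOCK);
    }
    ~WriteNotifier() { close(wake_[0]); close(wake_[1]); }

    // Callable from any thread. The mutex guards only the small queue and
    // is never held across select() or across application callbacks.
    void request_write_notify(int fd)
    {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            pending_.push_back(fd);
        }
        char byte = 1;
        ssize_t n = write(wake_[1], &byte, 1);  // interrupt a blocked select
        (void)n;
    }

    // Reactor thread only: one select round. Returns the descriptors that
    // are writable and drops them from the interest list (one-shot).
    std::vector<int> poll_once(int timeout_ms)
    {
        {   // Move queued requests into the reactor-owned interest list.
            std::lock_guard<std::mutex> guard(mutex_);
            interest_.insert(interest_.end(), pending_.begin(), pending_.end());
            pending_.clear();
        }
        fd_set rd, wr;
        FD_ZERO(&rd);
        FD_ZERO(&wr);
        FD_SET(wake_[0], &rd);
        int maxfd = wake_[0];
        for (int fd : interest_) {
            assert(fd >= 0 && fd < FD_SETSIZE);
            FD_SET(fd, &wr);
            if (fd > maxfd) maxfd = fd;
        }
        timeval tv;
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;

        std::vector<int> writable;
        if (select(maxfd + 1, &rd, &wr, nullptr, &tv) <= 0) return writable;
        if (FD_ISSET(wake_[0], &rd)) {
            char buf[64];
            while (read(wake_[0], buf, sizeof buf) > 0) {}  // drain wake-ups
        }
        std::vector<int> still_waiting;
        for (int fd : interest_) {
            if (FD_ISSET(fd, &wr)) writable.push_back(fd);
            else still_waiting.push_back(fd);
        }
        interest_.swap(still_waiting);
        return writable;
    }

private:
    int wake_[2];
    std::mutex mutex_;
    std::vector<int> pending_;   // shared, guarded by mutex_
    std::vector<int> interest_;  // owned by the reactor thread
};
```

The application never sees the mutex, and the mutex is never held while select blocks, so neither of the two problems listed above arises from it.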
The second problem may be a little complicated, so let me first explain deadlock. Suppose two tasks both use mutexA and mutexB. Task 1 acquires them in the order ..., mutexA.lock(), ..., mutexB.lock(), ..., while Task 2 acquires them in the order ..., mutexB.lock(), ..., mutexA.lock(), .... When Task 1 holds mutexA and waits for mutexB while Task 2 holds mutexB and waits for mutexA, the two are obviously deadlocked (a cross lock). The second problem is exactly this kind of deadlock, caused by calls of that shape. In short, this thoughtless solution causes serious problems at the application layer.
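For illustration, here is the same two-mutex situation in standard C++11, with the opposite orderings kept but the acquisition done through std::lock, which takes both mutexes using a deadlock-avoidance algorithm. This is a general technique for the cross-lock pattern, not something the source describes, and the function names and counter are invented for the example.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

std::mutex mutexA;
std::mutex mutexB;
int shared_counter = 0;  // protected by holding both mutexes

// Task 1 names the mutexes A then B. With two plain lock() calls this order,
// combined with task2's opposite order, is the cross lock from the text.
void task1(int iterations)
{
    for (int i = 0; i < iterations; ++i) {
        std::lock(mutexA, mutexB);  // acquires both without deadlocking
        std::lock_guard<std::mutex> ga(mutexA, std::adopt_lock);
        std::lock_guard<std::mutex> gb(mutexB, std::adopt_lock);
        ++shared_counter;
    }
}

// Task 2 names them B then A.
void task2(int iterations)
{
    for (int i = 0; i < iterations; ++i) {
        std::lock(mutexB, mutexA);
        std::lock_guard<std::mutex> gb(mutexB, std::adopt_lock);
        std::lock_guard<std::mutex> ga(mutexA, std::adopt_lock);
        ++shared_counter;
    }
}

// Runs both tasks concurrently and returns the final counter value.
int run_both(int iterations)
{
    shared_counter = 0;
    std::thread t1(task1, iterations);
    std::thread t2(task2, iterations);
    t1.join();
    t2.join();
    return shared_counter;
}
```

This only helps when one piece of code takes both locks together. In the select case the two locks are taken in different layers, framework and application, which is why exposing the framework's mutex to the application is so hard to make safe.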