One: Several common I/O models and their differences:
Blocking I/O
Nonblocking I/O
I/O multiplexing (select and poll)
Signal-driven I/O (SIGIO)
Asynchronous I/O (the POSIX aio_ functions) ————— the defining feature of the asynchronous I/O model is that notification comes only after completion.
Whether a model blocks depends on how the I/O exchange is implemented.
Asynchronous blocking is exemplified by select: the select call itself blocks, but its advantage is that it can monitor multiple file handles at the same time.
Asynchronous non-blocking notifies only after completion: the user process initiates an I/O operation and returns immediately; when the I/O operation truly completes, the application is notified that it is done. At that point the user process only needs to process the data; it does not perform the actual read or write, because the kernel has already completed the actual I/O operation.
1. Blocking I/O
This needs little explanation: blocking sockets. The call sequence is as follows:
The key point: the application calls recvfrom(), which traps into the kernel. Note that the kernel goes through two phases: waiting for the data, then copying the data from kernel space to user space. recvfrom() does not return until that final copy completes, so the process is blocked the entire time.
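A minimal sketch of this model in C (a UDP socket on an arbitrary example port; error handling omitted for brevity):

```c
#include <arpa/inet.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(9000);   /* arbitrary example port */
    bind(fd, (struct sockaddr *)&addr, sizeof(addr));

    char buf[1024];
    /* Blocks through both kernel phases: waiting for a datagram
     * to arrive, then copying it from kernel space into buf. */
    ssize_t n = recvfrom(fd, buf, sizeof(buf), 0, NULL, NULL);
    if (n > 0)
        printf("received %zd bytes\n", n);
    close(fd);
    return 0;
}
```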
2. Nonblocking I/O
In contrast to blocking I/O, with non-blocking sockets the call sequence is as follows:
As you can see, using it directly amounts to polling: the call is repeated until the kernel buffer has data.
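A minimal sketch of that polling pattern in C, assuming fd is an already-connected socket (the 1 ms sleep is an arbitrary choice to soften the busy-wait):

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* With O_NONBLOCK set, read() returns -1 with EAGAIN/EWOULDBLOCK
 * instead of sleeping, so the caller ends up polling until the
 * kernel buffer has data. */
ssize_t read_by_polling(int fd, char *buf, size_t len) {
    int flags = fcntl(fd, F_GETFL, 0);
    fcntl(fd, F_SETFL, flags | O_NONBLOCK);

    for (;;) {
        ssize_t n = read(fd, buf, len);
        if (n >= 0)
            return n;                 /* data copied (or 0 on EOF) */
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;                /* real error */
        usleep(1000);                 /* the wasteful polling step */
    }
}
```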
3. I/O multiplexing (select and poll)
The most common I/O multiplexing model is select.
select blocks first and returns once there is an active socket. Compared with blocking I/O, select costs an extra system call per round, but it can handle multiple sockets.
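A minimal select() loop in C to make both points concrete: one blocking select() call per round followed by the reads, over an arbitrary array of sockets:

```c
#include <sys/select.h>
#include <unistd.h>

void select_loop(const int *fds, int nfds) {
    char buf[1024];
    for (;;) {
        fd_set rset;
        int maxfd = -1;
        FD_ZERO(&rset);
        for (int i = 0; i < nfds; i++) {
            FD_SET(fds[i], &rset);
            if (fds[i] > maxfd)
                maxfd = fds[i];
        }
        /* First system call: blocks until some socket is active. */
        if (select(maxfd + 1, &rset, NULL, NULL, NULL) <= 0)
            continue;
        for (int i = 0; i < nfds; i++)      /* the full traversal */
            if (FD_ISSET(fds[i], &rset))
                read(fds[i], buf, sizeof(buf));  /* second call */
    }
}
```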
4. Signal-driven I/O (SIGIO)
Only UNIX systems support this; interested readers can look up the relevant material.
Compared with I/O multiplexing (select and poll), its advantage is that it eliminates select's blocking and polling: when a socket becomes active, the registered signal handler takes over.
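A hedged sketch of enabling this mode in C: the process claims ownership of the fd via F_SETOWN and turns on O_ASYNC, so the kernel raises SIGIO when the socket becomes readable (enable_sigio is an illustrative helper name, not a standard API):

```c
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t io_ready = 0;

static void on_sigio(int signo) {
    (void)signo;
    io_ready = 1;       /* do the actual read outside the handler */
}

/* Route SIGIO for fd to this process and switch the fd into
 * signal-driven mode; the kernel then raises SIGIO when the socket
 * becomes readable, so no blocking wait or polling is needed. */
void enable_sigio(int fd) {
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigio;
    sigaction(SIGIO, &sa, NULL);

    fcntl(fd, F_SETOWN, getpid());          /* deliver SIGIO to us */
    int flags = fcntl(fd, F_GETFL, 0);
    fcntl(fd, F_SETFL, flags | O_ASYNC);    /* enable SIGIO generation */
}
```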
5. Asynchronous I/O (the POSIX aio_ functions)
Few *nix systems support this; Windows IOCP is this model.
A fully asynchronous mechanism: in all four models above, the application is still blocked while the kernel copies data to it. This model notifies the application only after the copy is complete, so the application is purely asynchronous. The Windows completion port seems to be the only mainstream implementation of this model, and it is very efficient.
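A minimal sketch using the POSIX aio_ functions in C, assuming fd refers to a regular file (POSIX aio is typically used on files; on older glibc this links with -lrt). The completion polling here is only for brevity; real code would use aio_suspend() or a sigevent notification:

```c
#include <aio.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* aio_read() submits the request and returns at once; the kernel
 * performs both the wait and the copy, and the application only
 * checks (or is notified) afterwards. */
ssize_t async_read(int fd, char *buf, size_t len) {
    struct aiocb cb;
    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_buf    = buf;
    cb.aio_nbytes = len;
    cb.aio_offset = 0;

    if (aio_read(&cb) != 0)
        return -1;                    /* submission failed */

    /* Completion is polled here only to keep the sketch short. */
    while (aio_error(&cb) == EINPROGRESS)
        usleep(1000);

    return aio_return(&cb);           /* bytes transferred by the kernel */
}
```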
6. Below is a comparison of the above five models
As can be seen, the later the model in the list, the less it blocks, and the better its theoretical efficiency.
===================== Split Line ==================================
The comparison between the five models is clear; what remains is to match select, epoll, IOCP, and kqueue to their seats.
select and IOCP correspond to the 3rd and 5th models respectively, so what about epoll and kqueue? In fact they belong to the same model as select, just more advanced; they can be seen as having certain characteristics of the 4th model, such as the callback mechanism.
Why are epoll and kqueue more advanced than select?
The answer is that they do not poll: they replace polling with callbacks. Think about it: when there are many sockets, each select() call completes its dispatch by traversing up to FD_SETSIZE sockets, scanning all of them no matter which are active. This wastes a lot of CPU time. If instead a callback can be registered on each socket so that the related actions happen automatically when it becomes active, polling is avoided, and that is exactly what epoll and kqueue do.
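A minimal epoll loop in C, for contrast with the select sketch above: descriptors are registered once, and epoll_wait hands back only the sockets whose in-kernel callbacks fired:

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Registration happens once, not once per round; each epoll_wait
 * returns only the active sockets, with no full traversal. */
void epoll_loop(const int *fds, int nfds) {
    int epfd = epoll_create1(0);
    for (int i = 0; i < nfds; i++) {
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = fds[i] };
        epoll_ctl(epfd, EPOLL_CTL_ADD, fds[i], &ev);
    }

    struct epoll_event ready[64];
    char buf[1024];
    for (;;) {
        /* Blocks until the kernel's ready queue is non-empty. */
        int n = epoll_wait(epfd, ready, 64, -1);
        for (int i = 0; i < n; i++)         /* only active sockets */
            read(ready[i].data.fd, buf, sizeof(buf));
    }
}
```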
Windows or *nix (IOCP or kqueue/epoll)?
Admittedly, Windows IOCP is very good, and few systems support asynchronous I/O, but because of the limitations of the system itself, large servers still run on UNIX. Also, as mentioned above, kqueue/epoll have one more layer of blocking than IOCP, namely copying data from the kernel to the application layer, so they cannot be counted as asynchronous I/O. However, that small extra blockage is insignificant; kqueue and epoll do a very good job.
Providing a consistent interface: I/O design patterns
In fact, any of these models can be abstracted behind a consistent interface. The well-known ACE and libevent (based on the Reactor pattern) do exactly this: they are cross-platform and automatically choose the best available I/O multiplexing mechanism, so the user simply calls the interface. This brings us to two design patterns, Reactor and Proactor (see: Reactor pattern vs. Proactor pattern). libevent follows the Reactor model, while ACE provides a Proactor model; both essentially wrap the various I/O multiplexing mechanisms.
What is the I/O mechanism of the Java NIO package?
It can now be determined that current Java NIO is essentially the select() model, which can be verified by inspecting /jre/bin/nio.dll. As for why Java servers are nevertheless very efficient, I don't know; maybe it's just good design. -_-
===================== Split Line ==================================
To summarize some highlights:
- Only IOCP is asynchronous I/O; the other mechanisms all block to some degree.
- select is inefficient because it must poll every time. But inefficiency is relative: it depends on the situation and can be mitigated through good design.
- epoll, kqueue, and select follow the Reactor pattern; IOCP follows the Proactor pattern.
- The Java NIO package uses the select model.
Two: The difference between epoll and select
1. Use multiple processes or multiple threads. This approach complicates the program and also incurs heavy overhead for creating and maintaining processes and threads. (The Apache server uses child processes this way; the advantage is that it isolates users.) (Synchronous blocking I/O.)
2. A better way is I/O multiplexing (also translated as "multiplexed I/O"): first construct a list of descriptors (epoll keeps a queue), then call a function that does not return until one of these descriptors is ready, and on return it tells the process which I/O is ready. select and epoll are both multiplexed I/O mechanisms; select is in the POSIX standard, while epoll is Linux-specific.
The differences (the advantages of epoll over select) are mainly three:
1. The number of handles select can monitor is limited. The Linux/posix_types.h header file contains the declaration #define __FD_SETSIZE 1024, meaning select monitors at most 1024 fds at a time. epoll has no such limit; its ceiling is the maximum number of open file handles.
2. epoll's biggest benefit is that its efficiency does not fall as the number of fds grows. select processes by polling over a data structure similar to an array, whereas epoll maintains a queue and simply checks whether the queue is empty. epoll only operates on "active" sockets, because in the kernel epoll is implemented with a callback function on each fd: only an "active" socket invokes its callback (placing its handle into the queue), while idle handles do not. In this sense epoll implements a "pseudo" AIO. However, if most I/O is "active" and each port's utilization is high, epoll is not necessarily more efficient than select (possibly because of the cost of maintaining the queue).
3. mmap is used to accelerate message passing between kernel and user space. select, poll, and epoll all need the kernel to deliver fd events to user space, and avoiding unnecessary memory copies matters; epoll achieves this by having the kernel and user space mmap the same piece of memory.
About epoll's working modes, ET and LT
epoll has two working modes:
ET: edge-triggered. epoll_wait returns (and you are notified) only when the state changes; in other words, each event is reported only once. Only non-blocking sockets are supported.
LT: level-triggered (the default mode). Similar to select/poll: as long as an event has not been handled, you keep being notified. Calling the epoll interface in LT mode is equivalent to a faster poll. Both blocking and non-blocking sockets are supported.
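A sketch of the read handler an ET-mode registration requires (events = EPOLLIN | EPOLLET on a non-blocking fd): since the notification arrives only once per state change, the handler must drain the socket until EAGAIN:

```c
#include <errno.h>
#include <unistd.h>

/* ET-mode handler: leftover bytes would never be reported again,
 * so keep reading until the kernel buffer is empty. */
void handle_et_readable(int fd) {
    char buf[4096];
    for (;;) {
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n > 0)
            continue;                               /* keep draining */
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
            break;                                  /* fully drained */
        break;                                      /* EOF or real error */
    }
}
```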
Three: Linux concurrent network programming models
1. Apache model, abbreviated PPC (Process Per Connection): assign a process to each connection. The time and space spent per connection are expensive, and as the number of connections grows, the overhead of switching between processes grows too. It is hard to handle a large number of concurrent client connections.
2. TPC model (Thread Per Connection): one thread per connection. Similar to PPC.
3. select model: I/O multiplexing.
.1 Each connection corresponds to a descriptor. The select model is limited by FD_SETSIZE. On Linux (e.g. 2.6.35) the number of descriptors a process may open is limited only by memory size, yet select still refers to the FD_SETSIZE value fixed when the system call was designed. The value can be changed by recompiling the kernel, but that does not cure the problem: for millions of user connection requests, even an increased limit is a drop in the bucket.
.2 select scans a set of file descriptors each time, and the size of the set is the value passed as select's first parameter. As the number of file descriptors a process opens increases, scanning efficiency decreases.
.3 From kernel to user space, memory copying is used to deliver the events that occur on the file descriptors.
4. poll model: I/O multiplexing. The poll model is not limited by FD_SETSIZE, because the size of the file-descriptor set the kernel scans is user-specified, namely poll's second parameter (see the poll sketch after this list). However, the scanning-efficiency and memory-copy issues remain.
5. pselect model: I/O multiplexing. Same as select.
6. epoll model:
.1 No file-descriptor size limit; it is related only to memory size.
.2 epoll returns knowing exactly which socket fds had which events, with no select-style comparison needed.
.3 Kernel-to-user space messaging uses shared memory.
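The poll sketch referenced in item 4 above: the set size is poll's second parameter rather than FD_SETSIZE, but the kernel still scans and copies the whole array on every call:

```c
#include <poll.h>
#include <unistd.h>

/* Caller supplies a pollfd array of any size; poll is not bound by
 * FD_SETSIZE, yet the whole array is scanned each round. */
void poll_loop(struct pollfd *pfds, nfds_t nfds) {
    char buf[1024];
    for (;;) {
        for (nfds_t i = 0; i < nfds; i++)
            pfds[i].events = POLLIN;
        if (poll(pfds, nfds, -1) <= 0)      /* blocks until activity */
            continue;
        for (nfds_t i = 0; i < nfds; i++)
            if (pfds[i].revents & POLLIN)
                read(pfds[i].fd, buf, sizeof(buf));
    }
}
```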
Four: FAQs
1. A single epoll does not solve all problems, especially when each of your operations is time-consuming, because epoll processing is serial. You need to build a thread pool to use it more effectively.
2. If an fd is registered with two epoll instances, both will trigger when an event occurs.
3. If an fd registered with epoll is closed, it is automatically removed from the epoll interest list.
4. If multiple events fire at the same time, epoll returns them together.
5. epoll_wait always listens for the EPOLLHUP event, so it does not need to be added to events.
6. To avoid the situation in ET mode where heavy I/O keeps only one fd busy while other fds starve, Linux recommends adding a ready flag to the structure the fd is linked to: on an epoll_wait trigger, only set it to ready, then poll the ready-fd list at a lower layer (a sketch follows below).
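A hedged sketch of that ready-flag idea in C; struct conn and its fields are illustrative, not a real API, and each connection is assumed to be registered with epoll via ev.data.ptr = &conns[i]:

```c
#include <sys/epoll.h>

struct conn {
    int fd;
    int ready;      /* set by the event loop, cleared once drained */
};

/* One round: epoll_wait only MARKS connections ready; a separate
 * round-robin pass services each ready connection a bounded amount,
 * so one busy fd cannot starve the rest under ET. */
void event_loop_round(int epfd, struct conn *conns, int nconns) {
    struct epoll_event evs[64];
    int n = epoll_wait(epfd, evs, 64, 0);
    for (int i = 0; i < n; i++) {
        struct conn *c = evs[i].data.ptr;
        c->ready = 1;                       /* mark only, don't read yet */
    }
    for (int i = 0; i < nconns; i++)
        if (conns[i].ready) {
            /* read a bounded chunk from conns[i].fd here; clear the
             * ready flag only once read() reports EAGAIN. */
        }
}
```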
Transferred from: http://blog.csdn.net/wenbingoon/article/details/9004512