(1) Copy the fd_set from user space into kernel space;
(2) Register the callback function __pollwait;
(3) Traverse all fds and invoke each device's poll file operation (this poll takes two arguments: the file's fd itself and the callback function __pollwait, which is called when the device is not yet ready; __pollwait hangs the current process on the device's own wait queue so the kernel can wake it later);
(4) When a device becomes ready, it wakes up everything sleeping on its wait queue and the current process is woken. The poll file operation returns a standard mask in which each bit indicates a different ready state (all zeros means no event has fired), and the fd_set is filled in according to this mask;
(5) If the masks returned by all the devices show no event at all, the callback's function pointer is removed and the process enters a timed sleep; on waking it polls all the devices again, then sleeps again, repeating until one of the devices fires an event.
Whenever an event fires, the system call returns: the fd_set is copied from kernel space back to user space, execution returns to user mode, and the user can then read or write the fds that are ready.
The select() system call is used to monitor an array containing multiple file descriptors; when select returns, the kernel has set flag bits on the ready file descriptors in the array.
select's cross-platform support is excellent: it is available on almost every platform.
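To make this concrete, here is a minimal sketch of a select() call from user space; the descriptor name listen_fd and the five-second timeout are illustrative assumptions, not details from the discussion above.

```c
#include <sys/select.h>

/* Sketch: wait up to five seconds for listen_fd to become readable.
   Returns 1 if readable, 0 on timeout or error. */
int wait_readable(int listen_fd)
{
    fd_set readfds;
    struct timeval tv;

    FD_ZERO(&readfds);            /* clear the set */
    FD_SET(listen_fd, &readfds);  /* watch listen_fd for readability */

    tv.tv_sec = 5;                /* five-second timeout */
    tv.tv_usec = 0;

    /* select() copies the fd_set into the kernel, polls every fd,
       and sleeps until an event fires or the timeout expires. */
    int ready = select(listen_fd + 1, &readfds, NULL, NULL, &tv);

    /* On return, the kernel has set the flag bit for each ready fd. */
    return ready > 0 && FD_ISSET(listen_fd, &readfds);
}
```

Note that the fd_set must be rebuilt before every call, which is exactly the copy overhead listed under disadvantage (2) below.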
select has the following four disadvantages:
(1) There is a hard limit on the number of file descriptors a single process can monitor.
(2) The data structure maintained by select() holds a large number of file descriptors, and as that number grows, the cost of copying it between the user-space and kernel address spaces grows linearly.
(3) Every call to select must pass all the fds into the kernel and traverse them there, which is very expensive when there are many fds.
(4) Network latency leaves a large number of TCP connections inactive, yet a call to select() still scans all the sockets linearly, which incurs real overhead.
Poll
-
poll's implementation is very similar to select's; the only difference is that poll describes the fd set with a pointer to an array of pollfd structures rather than select's fd_set, which removes the compile-time FD_SETSIZE limit.
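A minimal sketch of the pollfd-based interface, with the same illustrative single descriptor and timeout as the select sketch above:

```c
#include <poll.h>

/* Sketch: wait up to 5000 ms for fd to become readable using poll().
   The caller passes an array of struct pollfd rather than an fd_set. */
int wait_readable_poll(int fd)
{
    struct pollfd pfd;
    pfd.fd = fd;             /* descriptor to watch */
    pfd.events = POLLIN;     /* interested in readability */
    pfd.revents = 0;         /* filled in by the kernel */

    int ready = poll(&pfd, 1, 5000);
    return ready > 0 && (pfd.revents & POLLIN);
}
```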
- Epoll Principle Overview
When epoll_create is called, the following things are done:
The kernel builds a file node for us in the epoll filesystem;
A red-black tree is built in the kernel cache to store the sockets that will arrive via epoll_ctl;
A ready list (a linked list) is created to store the events that become ready.
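A sketch of this first step; epoll_create1(0) is the modern spelling of epoll_create, and the error handling is only illustrative:

```c
#include <stdio.h>
#include <sys/epoll.h>

/* Sketch: ask the kernel to build the epoll file node. The red-black
   tree and the ready list described above live behind this one fd. */
int make_epoll(void)
{
    int epfd = epoll_create1(0);
    if (epfd == -1)
        perror("epoll_create1");
    return epfd;
}
```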
When epoll_ctl is called, the following things are done:
The socket is placed on the red-black tree of the file object in the epoll filesystem;
A callback function is registered with the kernel interrupt handler, telling the kernel: if this handle gets an interrupt, put it on the ready list.
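A sketch of the registration step; the names epfd and sock are illustrative:

```c
#include <sys/epoll.h>

/* Sketch: EPOLL_CTL_ADD places sock on the red-black tree and
   registers the callback that moves it to the ready list. */
int watch_socket(int epfd, int sock)
{
    struct epoll_event ev;
    ev.events = EPOLLIN;   /* notify when the socket becomes readable */
    ev.data.fd = sock;     /* handed back to us by epoll_wait */
    return epoll_ctl(epfd, EPOLL_CTL_ADD, sock, &ev);
}
```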
When epoll_wait is called, the following things are done:
It checks whether the ready list contains any data: if there is data it returns; if not it sleeps, and once the timeout expires it returns even if the list is still empty. Crucially, even if we are monitoring millions of handles, most of the time only a small number are ready, so epoll_wait only has to copy that small number of handles from kernel space to user space.
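A sketch of the wait loop this behavior suggests; MAX_EVENTS is an arbitrary illustrative batch size:

```c
#include <stdio.h>
#include <sys/epoll.h>

#define MAX_EVENTS 64

/* Sketch: regardless of how many fds are registered, epoll_wait
   copies out only the handles sitting on the ready list. */
void event_loop(int epfd)
{
    struct epoll_event events[MAX_EVENTS];
    for (;;) {
        int n = epoll_wait(epfd, events, MAX_EVENTS, -1); /* block */
        if (n == -1) {
            perror("epoll_wait");
            break;
        }
        for (int i = 0; i < n; i++) {
            int fd = events[i].data.fd;
            (void)fd;  /* fd is ready: read/write it here */
        }
    }
}
```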
Epoll Advantages
(1) It supports a process opening a large number of socket descriptors (fds)
What is most unbearable about select is that the number of fds a single process can open is limited, set by FD_SETSIZE, whose default value is 2048. That is clearly too small for an IM server that must support tens of thousands of connections. At this point you can either modify the macro and recompile the kernel, although reports indicate this degrades network efficiency, or adopt a multi-process solution (the traditional Apache scheme); but although the cost of creating a process on Linux is relatively small, it still cannot be ignored, and data synchronization between processes is far less efficient than between threads, so this is not a perfect solution either.
epoll has no such restriction: the fd limit it supports is the maximum number of open files, which is generally far greater than 2048 (roughly 100,000 on a machine with 1 GB of memory, for example). The exact number can be checked with cat /proc/sys/fs/file-max, and in general it is closely tied to system memory.
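For completeness, a sketch of checking the per-process descriptor ceiling from C; getrlimit(RLIMIT_NOFILE) reports the per-process limit, while /proc/sys/fs/file-max mentioned above is the system-wide one:

```c
#include <stdio.h>
#include <sys/resource.h>

/* Sketch: print the soft and hard per-process open-file limits. */
void print_fd_limit(void)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) == 0)
        printf("soft: %llu, hard: %llu\n",
               (unsigned long long)rl.rlim_cur,
               (unsigned long long)rl.rlim_max);
}
```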
(2) I/O efficiency does not decrease linearly as the number of fds increases
The other Achilles' heel of traditional select/poll is that when you hold a large socket set, network latency means only some of the sockets are "active" at any one moment, yet every call to select/poll linearly scans the entire set, causing efficiency to fall off linearly.
epoll doesn't have this problem: it only operates on the "active" sockets. This is because epoll's kernel implementation hangs a callback function on each fd, so only the "active" sockets actively invoke their callbacks while idle sockets do not. In this sense epoll implements a "pseudo" AIO, because here the driving force is inside the OS kernel. In some benchmarks, if basically all the sockets are active (a high-speed LAN environment, for example), epoll is no more efficient than select/poll; on the contrary, using epoll_ctl too heavily brings a slight drop in efficiency. But once idle connections are used to simulate a WAN environment, epoll is far more efficient than select/poll.
(3) Use of mmap to accelerate message passing between kernel and user space
This point concerns epoll's concrete implementation. select, poll, and epoll all need the kernel to deliver fd messages to user space, so avoiding unnecessary memory copies is essential; for this, epoll has the kernel and user space mmap the same piece of memory. And if, like me, you have followed epoll since the 2.5 kernel, you will certainly remember the step of calling mmap by hand.
(4) Kernel fine-tuning
(5) In addition to the level-triggered I/O events of select/poll, epoll also provides edge-triggered events, which makes it possible for a user-space program to cache I/O state, reducing the number of epoll_wait/epoll_pwait calls and improving application efficiency.
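A sketch of edge-triggered registration and the drain-until-EAGAIN read pattern it requires; making the fd non-blocking is mandatory with EPOLLET, while the buffer size and names here are illustrative:

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Sketch: register sock as edge-triggered. ET fds must be
   non-blocking, because each edge is reported only once. */
int watch_edge_triggered(int epfd, int sock)
{
    fcntl(sock, F_SETFL, fcntl(sock, F_GETFL, 0) | O_NONBLOCK);

    struct epoll_event ev;
    ev.events = EPOLLIN | EPOLLET;  /* edge-triggered readability */
    ev.data.fd = sock;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, sock, &ev);
}

/* Sketch: when epoll_wait reports sock, drain it until EAGAIN,
   otherwise the remaining bytes produce no further wakeups. */
void drain(int sock)
{
    char buf[4096];
    for (;;) {
        ssize_t n = read(sock, buf, sizeof buf);
        if (n > 0)
            continue;   /* process the n bytes read here */
        if (n == 0)
            break;      /* peer closed the connection */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            break;      /* buffer drained for now */
        break;          /* real error */
    }
}
```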
Disadvantages of Epoll:
Compared with the other network models in current use, select and IOCP, epoll has the following disadvantages:
1. Compared to select, epoll's cross-platform reach is limited: it only works on Linux, whereas select can be used on Windows, Linux, and Apple platforms, as well as on mobile Android and iOS. (Android uses the Linux kernel, but earlier versions still did not support epoll.)
2. It is still somewhat more complicated than select; on the other hand, compared with IOCP it adds only a little complexity while basically reaching IOCP's concurrency and performance, and it remains far less complex than IOCP.
3. Compared to IOCP, its multi-core/multi-thread support is not as good, so under the most stringent performance requirements its performance falls short of IOCP's.
This article is from the "Plato's Eternal" blog; please be sure to keep this source: http://ab3813.blog.51cto.com/10538332/1793437