For practical reasons, development on other platforms has failed to keep pace with the fast-moving Linux kernel. This article explains the main differences between the traditional UNIX select/poll interfaces and epoll on Linux.
1. Support for a large number of open socket descriptors (FDs) per process
The most intolerable limitation of select is that the number of FDs a process may open is capped by FD_SETSIZE, whose default value is 2048. For an IM server that needs to support tens of thousands of connections, that is obviously far too few. One option is to modify this macro and recompile the kernel, but various sources note that this causes a decline in network efficiency. Another option is a multi-process solution (the traditional Apache model); however, although the cost of creating a process on Linux is relatively small, it is not negligible, and data synchronization between processes is far less efficient than synchronization between threads, so this is not a perfect solution either. epoll has no such limit: the FD ceiling it supports is the maximum number of files that can be opened, which is generally much larger than 2048. On a machine with 1 GB of memory it is roughly 100,000; you can check the value on your machine with cat /proc/sys/fs/file-max. In general this number depends heavily on the amount of system memory.
2. IO efficiency does not decline linearly as the number of FDs grows
Another fatal weakness of traditional select/poll is that when you hold a large set of sockets but, due to network latency, only some of them are "active" at any given moment, each select/poll call still linearly scans the entire set, so efficiency declines linearly. epoll does not have this problem: it only operates on the "active" sockets. This is because, in the kernel implementation, epoll is built on a callback function attached to each FD; only "active" sockets invoke the callback, while idle sockets do not. In this respect epoll implements a kind of "pseudo" AIO, with the driving force inside the OS kernel. In some benchmarks, if essentially all sockets are active, for example in a high-speed LAN environment, epoll is no more efficient than select/poll; on the contrary, heavy use of epoll_ctl can make it slightly slower. But once idle connections are used to simulate a WAN environment, epoll is far more efficient than select/poll.
3. Using mmap to accelerate message passing between kernel and user space
This point concerns the concrete implementation of epoll. select, poll, and epoll all require the kernel to notify user space of FD events, so avoiding unnecessary memory copies is important; epoll achieves this by having user space mmap the same memory as the kernel. If, like me, you have followed epoll since the 2.5 kernel, you will not have forgotten the manual mmap step.
4. Kernel fine-tuning
This is not strictly an advantage of epoll but of the Linux platform as a whole. You may have your doubts about Linux, but you cannot deny that it gives you the ability to fine-tune the kernel. For example, the kernel TCP/IP stack uses a memory pool to manage sk_buff structures, and you can adjust the size of this pool (skb_head_pool) dynamically at runtime via echo XXXX > /proc/sys/net/core/hot_list_length. Likewise, the second parameter of listen() (the length of the queue of connections that have completed the TCP three-way handshake) can be tuned dynamically according to your platform's memory size. On a special system that handles a huge number of packets, each of which is itself small, you can even try the latest NAPI NIC driver architecture.
Reference: the original document is no longer available.