Epoll's improvements to poll

Source: Internet
Author: User
Tags epoll

For objective reasons, the new developments in Linux have failed to keep up with the new development of the Linux kernel, especially in the ever-changing Linux kernel. This article explains the main differences between select/poll in UNIX and epoll in Linux.

1. support a process to open a large number of socket Descriptors (FD)

The most intolerable thing about select isThe FD opened by a process has certain limitations.,Fd_setsizeSet, the default value is 2048. For im servers that need to support tens of thousands of connections, there are obviously too few. At this time, you areYou can choose to modify this macro and then re-compile the kernel.,However, the materials also pointed out that this will bring about a decline in network efficiency.Second, you can select a multi-process solution (the traditional Apache solution). However, although the cost of creating a process on Linux is relatively small, it cannot be ignored, in addition, data synchronization between processes is far less efficient than inter-thread synchronization, so it is not a perfect solution. However, epoll does not have this limit. The FD limit supported by the lock is the maximum number of files that can be opened. This number is generally greater than 2048. For example, the size of a machine with 1 GB of memory is about 0.1 million. You can check the number of machines with CAT/proc/sys/fs/file-max. Generally, this number has a great relationship with the system memory.

 2. Io efficiency does not decrease linearly as the number of FD increases

Another weakness of the traditional select/poll is that when you have a large set of sockets, but due to network latency, only some of the sockets are "active" at any time, however, each select/poll call will linearly scan all sets, resulting in a linear decline in efficiency. However, epoll does not have this problem. It only performs operations on "active" sockets-this is because epoll is implemented based on the callback function on each FD in kernel implementation. Then, only the "active" socket will actively call the callback function, and other idle status socket will not. At this point, epoll implements a "pseudo" AIO, this is because the driver is in the OS kernel. In some benchmarks, if all the sockets are basically active-for example, in a high-speed LAN environment, epoll is not more efficient than select/poll. On the contrary, if epoll_ctl is used too much, the efficiency is also slightly lower. However, once idle connections is used to simulate the WAN environment, epoll is far more efficient than select/poll.

 3. Use MMAP to accelerate message transmission between the kernel and user space

This actually involves the specific implementation of epoll. Both select, poll, and epoll require the kernel to notify users of FD messages. It is important to avoid unnecessary memory copies, epoll is implemented through the same memory of the user space MMAP kernel. If you focus on epoll from the 2.5 kernel like me, you will not forget the manual MMAP step.

 4. kernel fine-tuning

This is not an advantage of epoll, but an advantage of the entire Linux platform. Maybe you can doubt the Linux platform, but you cannot avoid the Linux platform giving you the ability to fine-tune the kernel. For example, if the Kernel TCP/IP protocol stack uses a memory pool to manage the sk_buff structure, you can dynamically adjust the memory pool (skb_head_pool) during runtime) by echo XXXX>/proc/sys/NET/CORE/hot_list_length. For example, the listen function's 2nd parameters (TCP completes the length of the packet queue after three handshakes) can also be dynamically adjusted based on the memory size of your platform. Even in a special system with a large number of data packets but the size of each data packet itself is small, try the latest napi NIC driver architecture.

 

Reference: the original document is no longer available.

 

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.