Epoll and select I/O model research

Source: Internet
Author: User

Level trigger and edge trigger

Both are common event-notification modes in I/O models, and the two terms are borrowed from computer hardware design. The difference is that a level trigger notifies for as long as the handle is in a given state, whereas an edge trigger notifies only when the handle's state changes. For example, suppose a socket receives 100 KB of data after a long wait: both modes send a ready notification to the program. If the program then reads only 50 KB from the socket and calls the listener again, the level trigger still issues a ready notification, but the edge trigger does not, because the socket's "has readable data" state has not changed, and the program may stay blocked for a long time.

So when using an edge-triggered API, be careful to keep reading on every notification until the read on the socket returns EWOULDBLOCK. This requires the socket to be in non-blocking mode.
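
The read-until-EWOULDBLOCK rule can be sketched as follows. This is a minimal illustration, not a definitive implementation: the helper names set_nonblocking and drain_fd are hypothetical, the 4096-byte buffer is an arbitrary choice, and a real program would process the data instead of only counting it.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: put fd into non-blocking mode. Returns 0 or -1. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Hypothetical helper: read from a non-blocking fd until nothing is left.
 * Returns the total number of bytes consumed, or -1 on a real error.
 * *eof is set to 1 if the peer closed the connection, otherwise 0.
 */
ssize_t drain_fd(int fd, int *eof)
{
    char buf[4096];
    ssize_t total = 0;

    *eof = 0;
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) {
            total += n;             /* a real program would handle buf here */
        } else if (n == 0) {
            *eof = 1;               /* orderly shutdown by the peer */
            return total;
        } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
            return total;           /* buffer is empty: safe to wait again */
        } else if (errno != EINTR) {
            return -1;              /* genuine error */
        }
        /* EINTR: interrupted by a signal, simply retry */
    }
}
```

Calling drain_fd after each edge-triggered notification leaves no unread data behind, so the next state change produces a new notification.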

Select Model

Select was born in 4.2BSD and is supported on almost all platforms. This good cross-platform support is one of its few major advantages.

The disadvantages of select:
Descriptor limit: the number of file descriptors a single process can monitor is capped, usually at 1024 (FD_SETSIZE). The number can be changed, but because select scans the descriptors by polling, the more descriptors there are, the worse the performance;
Memory copying: select has to copy a large number of handle data structures between kernel and user space on every call, which causes significant overhead;
Full traversal: select returns the whole handle set, and the application must traverse all of it to discover which handles have events;
Level-triggered only: if the application does not perform I/O on a file descriptor that is already ready, every subsequent select call reports that descriptor to the process again.
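
A small sketch of these points, under stated assumptions: the function names wait_readable_select and select_level_trigger_demo are made up for this illustration, and a pipe stands in for a socket. The set is rebuilt before the call, every descriptor is tested afterwards, and an unread descriptor is reported on every call.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait up to timeout_ms for any of fds[0..n-1] to
 * become readable. On success ready[i] is 1 if fds[i] is readable, else 0.
 * Returns the number of ready descriptors (0 on timeout) or -1 on error.
 */
int wait_readable_select(const int *fds, int n, int *ready, int timeout_ms)
{
    fd_set rset;
    struct timeval tv;
    int maxfd = -1;

    /* select() overwrites the set, so it is rebuilt for every call. */
    FD_ZERO(&rset);
    for (int i = 0; i < n; i++) {
        if (fds[i] < 0 || fds[i] >= FD_SETSIZE)
            return -1;              /* fd_set cannot represent this fd */
        FD_SET(fds[i], &rset);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    int rc = select(maxfd + 1, &rset, NULL, NULL, &tv);
    if (rc < 0)
        return -1;

    /* The caller has to test every descriptor to find the ready ones. */
    for (int i = 0; i < n; i++)
        ready[i] = FD_ISSET(fds[i], &rset) ? 1 : 0;
    return rc;
}

/*
 * Demo: one byte is written to a pipe and never read. Returns how many of
 * two consecutive select() calls report the pipe as readable, or -1.
 */
int select_level_trigger_demo(void)
{
    int p[2], ready = 0, hits = 0;

    if (pipe(p) == -1)
        return -1;
    if (write(p[1], "x", 1) == 1) {
        for (int round = 0; round < 2; round++) {
            if (wait_readable_select(&p[0], 1, &ready, 0) == 1 && ready)
                hits++;
        }
    } else {
        hits = -1;
    }
    close(p[0]);
    close(p[1]);
    return hits;
}
```

Because the byte is never consumed, both calls in the demo report the pipe as readable, which is the level-triggered behavior described above.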

Poll

Poll was born in Unix System V Release 3. By then AT&T had stopped licensing the UNIX source code, so it evidently would not use BSD's select directly. Instead, AT&T implemented its own poll, which differs little from select.

Poll and select are twins under different names. The one exception is that poll has no limit on the number of descriptors to monitor: the caller passes an array of struct pollfd of whatever length it needs instead of a fixed-size set. The last three disadvantages of select listed above also apply to poll.
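
For comparison, here is a hedged sketch of the same wait built on poll(). The names wait_readable_poll and poll_demo are hypothetical and pipes stand in for sockets. The descriptor list is a caller-sized array of struct pollfd, but the result still has to be scanned entry by entry.

```c
#include <poll.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait up to timeout_ms for any of fds[0..n-1] to
 * become readable. On success ready[i] is 1 if fds[i] has POLLIN set.
 * Returns poll()'s result: entries with events, 0 on timeout, -1 on error.
 */
int wait_readable_poll(const int *fds, int n, int *ready, int timeout_ms)
{
    if (n <= 0)
        return -1;

    /* The array length is chosen by the caller; FD_SETSIZE does not apply. */
    struct pollfd *pfds = calloc((size_t)n, sizeof *pfds);
    if (pfds == NULL)
        return -1;

    for (int i = 0; i < n; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;
    }

    int rc = poll(pfds, (nfds_t)n, timeout_ms);
    if (rc >= 0) {
        /* As with select(), every entry is checked to find the ready ones. */
        for (int i = 0; i < n; i++)
            ready[i] = (pfds[i].revents & POLLIN) ? 1 : 0;
    }
    free(pfds);
    return rc;
}

/*
 * Demo: three pipes, data written only to the last one.
 * Returns the index of the pipe reported readable, or -1 on failure.
 */
int poll_demo(void)
{
    int pipes[3][2], rfds[3], ready[3] = { 0, 0, 0 };
    int created = 0, found = -1;

    while (created < 3 && pipe(pipes[created]) == 0) {
        rfds[created] = pipes[created][0];
        created++;
    }
    if (created == 3 && write(pipes[2][1], "x", 1) == 1 &&
        wait_readable_poll(rfds, 3, ready, 0) == 1) {
        for (int i = 0; i < 3; i++)
            if (ready[i])
                found = i;
    }
    for (int i = 0; i < created; i++) {
        close(pipes[i][0]);
        close(pipes[i][1]);
    }
    return found;
}
```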

Epoll

Epoll was born in the Linux 2.6 kernel and is generally considered the best-performing I/O multiplexing method on Linux 2.6.

Epoll features:
No maximum concurrent connection limit: the upper bound is the maximum number of open files, which is generally far greater than 2048. This number depends heavily on system memory, and the system-wide value can be viewed with cat /proc/sys/fs/file-max;
Efficiency: the greatest advantage of epoll is that its cost depends only on the "active" connections and has nothing to do with the total number of connections, so in a real-world network environment epoll is much more efficient than select and poll;
Memory copies: descriptors are registered with the kernel once through epoll_ctl, so the full descriptor set is not copied on every wait, and epoll_wait copies only the ready events back to user space. This is often described as "shared memory", but the Linux implementation does not map memory between the kernel and the process.

The epoll interface is very simple and consists of three functions:

1. int epoll_create(int size);

Creates an epoll handle. The size argument tells the kernel roughly how many descriptors will be monitored. It differs from the first parameter of select(), which is the value of the highest monitored fd plus 1. (Since Linux 2.6.8 the kernel ignores size, but it must still be greater than zero.) Note that once the epoll handle has been created it occupies an fd of its own; on Linux you can see it under /proc/<pid>/fd/. After you have finished with epoll you must therefore call close() on the handle, otherwise fds may be exhausted.
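
A minimal sketch of the handle's life cycle, assuming Linux. The function name epoll_handle_demo is invented for this example, and fcntl(F_GETFD) is used only as a cheap way to ask whether a descriptor number is currently open.

```c
#include <fcntl.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Demo: returns 1 if the epoll handle behaves like an ordinary descriptor
 * (valid while open, invalid after close()), 0 if not, -1 on error.
 */
int epoll_handle_demo(void)
{
    int epfd = epoll_create(1);     /* size must be greater than zero */
    if (epfd == -1)
        return -1;

    /* While open, the handle is a valid entry in the process fd table. */
    int open_ok = fcntl(epfd, F_GETFD) != -1;

    /* Closing it releases the fd, just like any other descriptor. */
    if (close(epfd) == -1)
        return -1;
    int closed_ok = fcntl(epfd, F_GETFD) == -1;

    return open_ok && closed_ok;
}
```

A long-running program that creates epoll handles without closing them leaks one fd per handle in exactly the same way as leaked sockets or files.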

2. int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);

The epoll event registration function. Unlike select(), which is told what type of event to listen for at the moment it waits, epoll requires the event types to be registered beforehand. The first parameter is the return value of epoll_create(), and the second parameter is the operation, expressed by one of three macros:
EPOLL_CTL_ADD: register a new fd with epfd;
EPOLL_CTL_MOD: modify the monitored events of an already registered fd;
EPOLL_CTL_DEL: remove an fd from epfd.

The third parameter is the fd to be monitored, and the fourth parameter tells the kernel which events to monitor. The struct epoll_event structure is as follows:

struct epoll_event {
    __uint32_t events;   /* Epoll events */
    epoll_data_t data;   /* User data variable */
};

events can be a combination of the following macros:
EPOLLIN: the corresponding file descriptor can be read (this includes the peer socket shutting down normally);
EPOLLOUT: the corresponding file descriptor can be written;
EPOLLPRI: the corresponding file descriptor has urgent data to read (this should indicate the arrival of out-of-band data);
EPOLLERR: an error occurred on the corresponding file descriptor;
EPOLLHUP: the corresponding file descriptor was hung up;
EPOLLET: puts epoll into edge-triggered mode for this descriptor, as opposed to the default level-triggered mode;
EPOLLONESHOT: report only one event. After that event has been delivered, the socket must be re-armed with EPOLL_CTL_MOD if you want to keep monitoring it.
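
The three operations and the EPOLLONESHOT re-arm step might look like this in practice. It is only a sketch: watch_fd, rearm_fd, unwatch_fd and oneshot_demo are hypothetical names, and a pipe stands in for a socket.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical wrapper around EPOLL_CTL_ADD. */
int watch_fd(int epfd, int fd, uint32_t events)
{
    struct epoll_event ev;
    memset(&ev, 0, sizeof ev);
    ev.events = events;
    ev.data.fd = fd;                /* handed back by epoll_wait() */
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Hypothetical wrapper around EPOLL_CTL_MOD, used to re-arm a one-shot fd. */
int rearm_fd(int epfd, int fd, uint32_t events)
{
    struct epoll_event ev;
    memset(&ev, 0, sizeof ev);
    ev.events = events;
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev);
}

/* Hypothetical wrapper around EPOLL_CTL_DEL. */
int unwatch_fd(int epfd, int fd)
{
    return epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL);
}

/*
 * Demo: a pipe with one unread byte is watched with EPOLLONESHOT.
 * results[] receives three epoll_wait() return values: after registration,
 * after the one-shot has fired, and after re-arming. Returns 0 or -1.
 */
int oneshot_demo(int results[3])
{
    int p[2], rc = -1;
    struct epoll_event out;

    int epfd = epoll_create(1);
    if (epfd == -1)
        return -1;
    if (pipe(p) == -1) {
        close(epfd);
        return -1;
    }
    if (write(p[1], "x", 1) == 1 &&
        watch_fd(epfd, p[0], EPOLLIN | EPOLLONESHOT) == 0) {
        results[0] = epoll_wait(epfd, &out, 1, 0);  /* event delivered */
        results[1] = epoll_wait(epfd, &out, 1, 0);  /* fd is now disabled */
        if (rearm_fd(epfd, p[0], EPOLLIN | EPOLLONESHOT) == 0) {
            results[2] = epoll_wait(epfd, &out, 1, 0);  /* delivered again */
            rc = unwatch_fd(epfd, p[0]);
        }
    }
    close(p[0]);
    close(p[1]);
    close(epfd);
    return rc;
}
```

On Linux the demo should fill results with 1, 0, 1: the byte stays unread throughout, yet the middle call reports nothing because the one-shot registration is disabled until it is re-armed.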

3. int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout);

Waits for events to occur, similar to a select() call. The events parameter receives the set of events from the kernel, and maxevents tells the kernel how large that array is. The original guidance was that maxevents must not exceed the size passed to epoll_create(); in any case it must be greater than zero. The timeout parameter is in milliseconds: 0 returns immediately, and -1 blocks indefinitely. The function returns the number of events that need to be handled; a return value of 0 means the timeout expired, and -1 indicates an error.
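
Putting the three calls together gives a small experiment that reproduces the level/edge difference described at the start of the article, scaled down from 100 KB and 50 KB to 100 and 50 bytes. This is a sketch under assumptions: events_after_partial_read is a hypothetical name and a pipe stands in for a socket.

```c
#include <stdint.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Writes 100 bytes to a pipe, waits for the first notification, reads only
 * 50 bytes, then calls epoll_wait() again with a zero timeout.
 * mode_flags is 0 for level-triggered or EPOLLET for edge-triggered.
 * Returns the result of the second epoll_wait(), or -1 on error.
 */
int events_after_partial_read(uint32_t mode_flags)
{
    int p[2], result = -1;
    char data[100], half[50];
    struct epoll_event ev, out;

    int epfd = epoll_create(1);
    if (epfd == -1)
        return -1;
    if (pipe(p) == -1) {
        close(epfd);
        return -1;
    }

    memset(data, 'x', sizeof data);
    memset(&ev, 0, sizeof ev);
    ev.events = EPOLLIN | mode_flags;
    ev.data.fd = p[0];

    if (epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == 0 &&
        write(p[1], data, sizeof data) == (ssize_t)sizeof data &&
        epoll_wait(epfd, &out, 1, 1000) == 1 &&     /* first notification */
        read(p[0], half, sizeof half) == (ssize_t)sizeof half) {
        /* 50 bytes are still buffered: is the descriptor reported again? */
        result = epoll_wait(epfd, &out, 1, 0);
    }

    close(p[0]);
    close(p[1]);
    close(epfd);
    return result;
}
```

On Linux, events_after_partial_read(0) should return 1 because the level-triggered registration keeps reporting the leftover data, while events_after_partial_read(EPOLLET) should return 0 because nothing new has arrived since the first notification.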

From the analysis above we can see why select is less efficient than epoll: select polls every descriptor on each call, whereas epoll is event-triggered, so epoll is more efficient.

The inefficiency of select follows from how select is defined and does not depend on the operating system implementation: to learn the state of the sockets, any kernel has to poll all of them to implement select, and that consumes CPU. In addition, when you have a large set of sockets, only a small portion is "active" at any one time, yet you still have to put all of them into an fd_set on every call, which also costs CPU. When select returns, you may also need to do a "context mapping" from each ready descriptor back to the connection it belongs to before handling it, which has some performance impact as well. For these reasons select is less efficient than epoll.
Epoll is best suited to scenarios with a large number of sockets of which only a small fraction is active.
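
The "many sockets, few active" scenario can be imitated with a small experiment. It is a sketch only: the name one_active_among_idle and the count of 100 idle descriptors are arbitrary choices for illustration, and pipes stand in for sockets.

```c
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

#define IDLE_PIPES 100

/*
 * Registers IDLE_PIPES silent pipes plus one pipe that has data, then
 * calls epoll_wait() once. Returns the number of events delivered, or -1
 * on error. *hit is set to 1 if the only event is for the active pipe.
 */
int one_active_among_idle(int *hit)
{
    int pipes[IDLE_PIPES + 1][2];
    struct epoll_event ev, out[IDLE_PIPES + 1];
    int created = 0, n = -1;

    *hit = 0;
    int epfd = epoll_create(1);
    if (epfd == -1)
        return -1;

    for (; created <= IDLE_PIPES; created++) {
        if (pipe(pipes[created]) == -1)
            goto done;
        memset(&ev, 0, sizeof ev);
        ev.events = EPOLLIN;
        ev.data.fd = pipes[created][0];
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, pipes[created][0], &ev) == -1) {
            created++;              /* this pipe still has to be closed */
            goto done;
        }
    }

    /* Only the last pipe becomes active. */
    if (write(pipes[IDLE_PIPES][1], "x", 1) != 1)
        goto done;

    /* The array has room for every descriptor, but only ready ones appear. */
    n = epoll_wait(epfd, out, IDLE_PIPES + 1, 0);
    if (n == 1 && out[0].data.fd == pipes[IDLE_PIPES][0])
        *hit = 1;

done:
    for (int i = 0; i < created; i++) {
        close(pipes[i][0]);
        close(pipes[i][1]);
    }
    close(epfd);
    return n;
}
```

The application receives a single event and never looks at the 100 idle descriptors, whereas a select- or poll-based loop would have to pass in and then scan all 101 of them on every call.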
