select and epoll: I/O Multiplexing in Linux

Source: Internet
Author: User

select and epoll are both I/O multiplexing mechanisms. I/O multiplexing lets a single process monitor multiple descriptors; once a descriptor becomes ready (usually readable or writable), the program is notified so that it can perform the corresponding read or write. However, select, poll, and epoll are all essentially synchronous I/O: after the readiness event arrives, the program still has to do the read or write itself, and that call blocks while the data is copied. With asynchronous I/O the program does not do the read or write itself; the asynchronous I/O implementation is responsible for copying the data from kernel space to user space.
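The "notify, then read it yourself" pattern can be sketched with Python's select module, which wraps the same system call. This is a minimal illustration, not code from the original article; the function name wait_and_read and the use of socket pairs are assumptions made for the example.

```python
import select
import socket


def wait_and_read(timeout=1.0):
    """Watch two socket pairs with select() and read from whichever is ready."""
    a_recv, a_send = socket.socketpair()
    b_recv, b_send = socket.socketpair()
    try:
        b_send.sendall(b"hello")  # only the second pair has data pending
        # select() blocks until a descriptor is readable or the timeout expires.
        readable, _, _ = select.select([a_recv, b_recv], [], [], timeout)
        # Readiness is only a notification: the program still issues the read
        # itself, which is why this model counts as synchronous I/O.
        return [(sock is b_recv, sock.recv(1024)) for sock in readable]
    finally:
        for sock in (a_recv, a_send, b_recv, b_send):
            sock.close()


if __name__ == "__main__":
    print(wait_and_read())
```

Only the descriptor that actually has data is reported, and the data is fetched by an explicit recv call after select returns.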

How a select call proceeds:

(1) Copy fd_set from user space to kernel space.

(2) Register the callback function __pollwait.

(3) Traverse all fds and call each one's poll method (for a socket, this is sock_poll, which calls tcp_poll, udp_poll, or datagram_poll depending on the socket type).

(4) Taking tcp_poll as an example, its core is a call to __pollwait, the callback registered above.

(5) The main job of __pollwait is to add current (the current process) to the device's wait queue. Different devices have different wait queues; for tcp_poll, it is sk->sk_sleep. (Note that adding the process to a wait queue does not put it to sleep.) When the device receives a message (a network device) or finishes filling in file data (a disk device), it wakes the processes sleeping on its wait queue, and current is woken up.

(6) The poll method returns a mask describing whether reads and writes are ready, and fd_set is filled in from that mask.

(7) If no fd returned a readable or writable mask after the full traversal, select calls schedule_timeout to put the calling process (that is, current) to sleep. When a device driver finds that its resource has become readable or writable, it wakes the processes sleeping on its wait queue. If the timeout given to schedule_timeout expires and nothing has woken the process, the process that called select is woken anyway, gets the CPU again, and traverses the fds once more to check whether any is ready.

(8) Copy fd_set from kernel space back to user space.
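The scan-sleep-rescan loop in steps (3) to (7) can be modelled in user space. The following is a toy model only, not kernel code: FakeDevice, toy_select, and the sleep callback are hypothetical names invented for this sketch, and a boolean stands in for the poll mask.

```python
class FakeDevice:
    """Hypothetical stand-in for a file that has a poll method and a wait queue."""

    def __init__(self):
        self.readable = False
        self.wait_queue = []

    def poll(self, waiter):
        # Step (5): hook the caller onto the wait queue; this does not sleep.
        if waiter not in self.wait_queue:
            self.wait_queue.append(waiter)
        # Step (6): report readiness (a boolean stands in for the mask).
        return self.readable


def toy_select(fds, devices, max_rounds, sleep):
    """Return (ready_fds, number_of_poll_calls) for a select-style scan loop."""
    waiter = object()  # stands in for `current`
    scans = 0
    ready = []
    for _ in range(max_rounds):
        for fd in fds:  # step (3): every fd is visited on every round
            scans += 1
            if devices[fd].poll(waiter):
                ready.append(fd)
        if ready:
            break
        sleep()  # step (7): stands in for schedule_timeout
    for fd in fds:  # unhook from every wait queue before returning
        if waiter in devices[fd].wait_queue:
            devices[fd].wait_queue.remove(waiter)
    return ready, scans
```

With five devices, one of which becomes readable during the first sleep, the model polls all five fds twice before it can report the single ready one.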

Disadvantages of select:

(1) Every call to select copies the fd set from user space to kernel space. This overhead becomes significant when there are many fds.

(2) Every call to select also makes the kernel traverse all the fds passed in. This overhead, too, becomes significant when there are many fds.

(3) select supports too few file descriptors; the default limit is 1024.

(4) select has to poll every fd, even those that are not ready.
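The descriptor limit in point (3) can be observed from Python. This sketch assumes a typical Linux build of CPython, where FD_SETSIZE is 1024 and select.select raises ValueError for any descriptor number at or above it; the helper name select_accepts is made up for the example.

```python
import select


def select_accepts(fd):
    """Return False if select() rejects fd as too large for an fd_set."""
    try:
        select.select([fd], [], [], 0)
    except ValueError:
        return False  # fd number does not fit in fd_set
    except OSError:
        return True   # fd number fits, but the descriptor is not open
    return True


if __name__ == "__main__":
    print(select_accepts(1024))
```

Note that the limit applies to the descriptor's numeric value, not only to how many descriptors are being watched.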

So how does epoll differ? Start with the calling interface. select and poll each provide a single function, select or poll. epoll provides three: epoll_create, epoll_ctl, and epoll_wait. epoll_create creates an epoll handle, epoll_ctl registers the event types to monitor, and epoll_wait waits for events to occur.
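The three calls map onto Python's select.epoll object (Linux only): creating the object corresponds to epoll_create, register to epoll_ctl with EPOLL_CTL_ADD, and poll to epoll_wait. A minimal sketch follows; the function name epoll_roundtrip and the socket pair are assumptions for illustration.

```python
import select
import socket


def epoll_roundtrip():
    """Create an epoll handle, register one fd, and wait for it to be readable."""
    recv_sock, send_sock = socket.socketpair()
    ep = select.epoll()  # epoll_create
    try:
        # epoll_ctl(EPOLL_CTL_ADD): register interest in readability
        ep.register(recv_sock.fileno(), select.EPOLLIN)
        before = ep.poll(0)  # epoll_wait with zero timeout: nothing pending yet
        send_sock.sendall(b"ping")
        after = ep.poll(1.0)  # epoll_wait: now reports one event
        fired = [(fd == recv_sock.fileno(), bool(ev & select.EPOLLIN))
                 for fd, ev in after]
        return before, fired, recv_sock.recv(1024)
    finally:
        ep.close()
        recv_sock.close()
        send_sock.close()


if __name__ == "__main__":
    print(epoll_roundtrip())
```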

For the first disadvantage, epoll's solution lies in epoll_ctl. Each time a new event is registered on the epoll handle (by passing EPOLL_CTL_ADD to epoll_ctl), the fd is copied into the kernel at that point, rather than being copied again on every epoll_wait. epoll ensures that each fd is copied only once over its whole lifetime.

For the second disadvantage, epoll does not, as select and poll do, add current to every fd's device wait queue in turn on each call. Instead, current is hooked up only once, during epoll_ctl (this one time is unavoidable), and a callback is specified for each fd. When a device becomes ready and wakes the waiters on its queue, this callback is invoked and adds the ready fd to a ready list. The job of epoll_wait is then simply to check whether the ready list contains any fd (it uses schedule_timeout() to alternate between sleeping for a while and checking for a while, much like step (7) of the select implementation).
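From user space, the effect of the ready list is that epoll_wait hands back only the descriptors that are ready, and that registration does not have to be repeated between waits. The sketch below illustrates this with Python's select.epoll; the function name ready_only and the choice of 20 socket pairs are assumptions for the example.

```python
import select
import socket


def ready_only(total=20, active=(3, 17)):
    """Register `total` sockets once, then wait twice; return ready indexes per wait."""
    pairs = [socket.socketpair() for _ in range(total)]
    ep = select.epoll()
    try:
        index_of = {}
        for i, (recv_sock, _send_sock) in enumerate(pairs):
            ep.register(recv_sock.fileno(), select.EPOLLIN)  # done once per fd
            index_of[recv_sock.fileno()] = i
        results = []
        for _ in range(2):  # two waits, no re-registration in between
            for i in active:
                pairs[i][1].sendall(b"x")
            events = ep.poll(1.0)  # returns only the fds that are ready
            ready = sorted(index_of[fd] for fd, _ev in events)
            for i in ready:
                pairs[i][0].recv(1024)  # drain so the next wait starts idle
            results.append(ready)
        return results
    finally:
        ep.close()
        for recv_sock, send_sock in pairs:
            recv_sock.close()
            send_sock.close()


if __name__ == "__main__":
    print(ready_only())
```

The caller never scans the 18 idle sockets; it iterates only over the events returned.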

For the third disadvantage, epoll has no such limit. The upper bound on fds it supports is the maximum number of files that can be opened, which is generally far larger than 2048; for example, on a machine with 1 GB of memory it is about 100,000. The exact value can be checked with cat /proc/sys/fs/file-max. In general, this number depends heavily on the amount of system memory.
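The same value can be read programmatically. The sketch below also reports the per-process RLIMIT_NOFILE limit, which is not mentioned in the text above but in practice bounds how many descriptors a single process can open; the function name fd_limits is an assumption for this example.

```python
import resource


def fd_limits(path="/proc/sys/fs/file-max"):
    """Return (system-wide file-max, per-process soft limit, per-process hard limit)."""
    with open(path) as f:
        system_wide = int(f.read().strip())
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return system_wide, soft, hard


if __name__ == "__main__":
    print(fd_limits())
```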

Summary:

(1) select and poll have to keep polling the entire fd set themselves until a device becomes ready, possibly alternating between sleeping and waking several times along the way. epoll also calls epoll_wait to keep polling the ready list, and it too may alternate between sleeping and waking several times; but when a device becomes ready, epoll invokes the callback, which puts the ready fd on the ready list and wakes the process sleeping in epoll_wait. Both alternate between sleeping and waking, but while "awake" select and poll must traverse the entire fd set, whereas epoll only has to check whether the ready list is empty. This saves a great deal of CPU time and is the performance gain that the callback mechanism brings.

(2) With select and poll, every call copies the fd set from user space to kernel space and adds current to the device wait queues again. epoll copies each fd only once and adds current to a wait queue only once (at the start of epoll_wait; note that this wait queue is not a device wait queue but one defined inside epoll). This also saves considerable overhead.

Copyright Disclaimer: This article is an original article by the blogger and cannot be reproduced without the permission of the blogger.
