Introduction to the select, poll, and epoll models of Linux I/O multiplexing, and a comparison of their advantages and disadvantages

About I/O multiplexing:

I/O multiplexing (also described as event-driven I/O) rests on one operating system feature: the kernel can notify you when one of your sockets becomes readable or writable. Combined with non-blocking sockets, this means you call read only after the system has reported that a descriptor is readable, so a read normally returns real data instead of uselessly failing with -1 and errno set to EAGAIN. Write operations work the same way. The operating system exposes this feature through system calls such as select, poll, and epoll, which monitor the read and write readiness of many descriptors at the same time, so that a single thread can service the I/O of multiple descriptors in turn. This is what I/O multiplexing means: the thing being "multiplexed" (reused) is the thread.
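The "useless" read mentioned above can be seen with a small sketch. It assumes a pipe instead of a socket (only to keep the example self-contained), and the function name is made up for illustration. A read on an empty non-blocking descriptor fails at once with EAGAIN, which is the call that readiness notification lets you avoid.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a read on an empty non-blocking pipe
   fails with EAGAIN/EWOULDBLOCK, 0 if it does not, -1 on setup failure. */
int empty_nonblocking_read_gives_eagain(void)
{
    int p[2];
    char buf[16];

    if (pipe(p) == -1)
        return -1;

    /* Switch the read end to non-blocking mode. */
    int flags = fcntl(p[0], F_GETFL, 0);
    if (flags == -1 || fcntl(p[0], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    /* Nothing has been written yet, so this read has no data to return. */
    ssize_t n = read(p[0], buf, sizeof buf);
    int saved_errno = errno;

    close(p[0]);
    close(p[1]);
    return n == -1 && (saved_errno == EAGAIN || saved_errno == EWOULDBLOCK);
}
```

Calling the read only after select, poll, or epoll has reported the descriptor as readable avoids spinning on this error.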

I. I/O multiplexing: select

1. Introduction:
The select system call monitors, for a specified period of time, the readable, writable, and exceptional events on the file descriptors the user is interested in. select and poll can both be described as system calls that block while watching a group of non-blocking I/O devices at once, until one of the devices triggers an event or the specified wait time is exceeded. In other words, their job is not to perform I/O but to help the caller find the devices that are currently ready.
[Figure: schematic of select]

2. The select system call API is as follows:

#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);

The fd_set structure holds a set of file descriptors. It is actually an integer array used as a bitmap: each bit of each array element marks one file descriptor. The number of file descriptors an fd_set can hold is given by FD_SETSIZE, which is typically 1024. This limits how many file descriptors select can handle at the same time, and it also means a descriptor whose value is FD_SETSIZE or greater cannot be placed in the set.

3. The meaning of each parameter:
1) nfds specifies the range of file descriptors to check. It is usually set to the highest-numbered file descriptor monitored by select, plus 1;
2) readfds, writefds, and exceptfds point to the sets of file descriptors to watch for readable, writable, and exceptional events respectively. All three are in/out (value-result) parameters. Before calling select, the user adds the descriptors of interest to readfds, writefds, and exceptfds with FD_SET (described below). select then monitors the descriptors in those sets and, when any of them are ready, overwrites the three sets so that they tell the application which descriptors are ready. As a result, after select returns the three sets are no longer the ones that were passed in, and the descriptors of interest must be set again before select is called again.
3) timeout specifies how long select may wait (see also the discussion of the return of select below).

struct timeval {
    long tv_sec;     // seconds
    long tv_usec;    // microseconds
};

4. The following functions (implemented as macros) are used to manipulate file descriptor sets:

void FD_SET(int fd, fd_set *set);      // sets the bit for fd in set
void FD_CLR(int fd, fd_set *set);      // clears the bit for fd in set
int  FD_ISSET(int fd, fd_set *set);    // tests whether the bit for fd is set in set
void FD_ZERO(fd_set *set);             // clears all bits in set (do this before using a file descriptor set)
(Note the difference between FD_CLR and FD_ZERO: the first clears one bit, the second clears all bits.)
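As a quick check of how the four macros interact, here is a minimal sketch. The function name and the descriptor numbers 3 and 5 are arbitrary choices for illustration; no descriptor is actually opened, since the macros only manipulate bits in the set.

```c
#include <sys/select.h>

/* Hypothetical helper: returns a bitmask of what the macros reported.
   bit 0: fd 3 is set after FD_SET
   bit 1: fd 5 is set after FD_SET
   bit 2: fd 3 is still set after FD_CLR(3)
   bit 3: fd 5 is still set after FD_ZERO */
int fd_set_macro_demo(void)
{
    fd_set set;
    int result = 0;

    FD_ZERO(&set);              /* always start from an empty set */
    FD_SET(3, &set);
    FD_SET(5, &set);
    if (FD_ISSET(3, &set)) result |= 1;
    if (FD_ISSET(5, &set)) result |= 2;

    FD_CLR(3, &set);            /* clears only the bit for fd 3 */
    if (FD_ISSET(3, &set)) result |= 4;

    FD_ZERO(&set);              /* clears every bit */
    if (FD_ISSET(5, &set)) result |= 8;

    return result;
}
```

Only bits 0 and 1 end up set in the result: FD_CLR removed descriptor 3 and FD_ZERO then removed descriptor 5.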

5. The return of select:
1) If timeout is NULL, select waits until a file descriptor is ready and only then returns;
2) If timeout specifies a time of 0, select does not wait at all and returns immediately;
3) If a non-zero time is specified, select returns as soon as a monitored file descriptor becomes ready within that period, and returns anyway once the period has elapsed.
4) Return values:
a) If file descriptors become ready within the timeout period, select returns the total number of ready file descriptors (readable, writable, and exceptional combined); if no file descriptor becomes ready, select returns 0;
b) When the call fails, select returns -1 and sets errno; if the call is interrupted by a signal, select returns -1 and sets errno to EINTR.
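The return values listed above can be exercised with a short sketch. It is an illustration under assumptions, not a reference implementation: wait_readable and select_pipe_demo are hypothetical names, and a pipe stands in for a socket.

```c
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: waits up to timeout_ms for fd to become readable.
   Returns 1 if readable, 0 on timeout, -1 on error (errno is set). */
int wait_readable(int fd, int timeout_ms)
{
    if (fd < 0 || fd >= FD_SETSIZE) {
        errno = EINVAL;                 /* fd does not fit in an fd_set */
        return -1;
    }
    for (;;) {
        fd_set readfds;
        struct timeval tv;

        /* Rebuild the set and the timeout on every pass:
           select may modify both of them. */
        FD_ZERO(&readfds);
        FD_SET(fd, &readfds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;

        int n = select(fd + 1, &readfds, NULL, NULL, &tv);
        if (n == -1 && errno == EINTR)
            continue;                   /* interrupted by a signal: retry */
        if (n <= 0)
            return n;                   /* 0 = timed out, -1 = error */
        return FD_ISSET(fd, &readfds) ? 1 : 0;
    }
}

/* Returns 1 if an empty pipe times out and a pipe with data is readable. */
int select_pipe_demo(void)
{
    int p[2];
    if (pipe(p) == -1)
        return -1;

    int before = wait_readable(p[0], 50);       /* nothing written yet */
    int wrote = write(p[1], "x", 1) == 1;
    int after = wrote ? wait_readable(p[0], 50) : -1;

    close(p[0]);
    close(p[1]);
    return before == 0 && after == 1;
}
```

For simplicity the retry after EINTR restarts the full timeout; code that needs an exact deadline would have to track the remaining time itself.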

6. Readiness conditions for file descriptors:
In network programming:
1) A socket is readable in the following cases:
a) The number of bytes in the socket's kernel receive buffer is greater than or equal to its low-water mark SO_RCVLOWAT;
b) A new connection request is pending on a listening socket;
c) There is an unhandled error on the socket.
2) A socket is writable in the following cases:
a) The number of bytes available in the socket's kernel send buffer is greater than or equal to its low-water mark SO_SNDLOWAT;
b) A non-blocking connect on the socket has completed, whether it succeeded or failed, or there is an unhandled error on the socket.

II. I/O multiplexing: poll

1. The poll system call is similar to select in both principle and prototype: it also polls a given number of file descriptors for a specified time to test whether any of them is ready.

2. The poll system call API is as follows:

#include <poll.h>

int poll(struct pollfd *fds, nfds_t nfds, int timeout);

3. The meaning of each parameter:
1) The first parameter is a pointer to the first element of an array of structures, each of which is a pollfd structure that specifies the conditions to test for a given descriptor.

struct pollfd {
    int   fd;         // the file descriptor to monitor
    short events;     // the events to monitor on fd
    short revents;    // set on return to the events that actually occurred on fd
};

The events to monitor are specified in the events member, and the function returns the state of the descriptor in the corresponding revents member. Each file descriptor therefore has two event fields: events is the input and revents is the output, which avoids value-result parameters (note the difference from select). revents tells the application which events actually occurred on the fd. Both events and revents can be the bitwise OR of several events.
2) The second parameter is the number of file descriptors to monitor, that is, the number of elements in the fds array;
3) The third parameter is the timeout and plays the same role as in select, but it is an int giving the time in milliseconds: -1 blocks until an event occurs and 0 returns immediately.
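The split between events (input) and revents (output) can be sketched as follows. This is a minimal example under assumptions: poll_pipe_demo is a made-up name, and the two ends of a pipe stand in for two sockets.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical demo: polls the read end of a pipe for POLLIN and the write
   end for POLLOUT after one byte has been written. Returns 1 if both are
   reported ready and the events fields are left untouched, 0 otherwise,
   -1 on failure. */
int poll_pipe_demo(void)
{
    int p[2];
    if (pipe(p) == -1)
        return -1;

    int result = -1;
    if (write(p[1], "x", 1) == 1) {
        struct pollfd fds[2];
        fds[0].fd = p[0]; fds[0].events = POLLIN;  fds[0].revents = 0;
        fds[1].fd = p[1]; fds[1].events = POLLOUT; fds[1].revents = 0;

        int n = poll(fds, 2, 100);      /* wait at most 100 ms */
        if (n >= 0)
            result = n == 2
                  && (fds[0].revents & POLLIN)      /* output field */
                  && (fds[1].revents & POLLOUT)
                  && fds[0].events == POLLIN        /* input field unchanged */
                  && fds[1].events == POLLOUT;
    }

    close(p[0]);
    close(p[1]);
    return result;
}
```

Because poll writes only to revents, the same array can be passed to the next call without being rebuilt.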

4. poll event types:

When using POLLRDHUP, define _GNU_SOURCE at the beginning of the code.

5. The return of poll:
Same as select.

III. I/O multiplexing: epoll

1. Introduction:
epoll differs considerably from select and poll in both usage and implementation. First, epoll uses a set of functions instead of a single function. Second, epoll stores the events on the file descriptors the user cares about in an event table inside the kernel, so the file descriptor set or event set does not have to be passed in again on every call, as it is with select and poll.

2. Create a file descriptor that identifies the event table in the kernel:

#include <sys/epoll.h>

int epoll_create(int size);    // returns a file descriptor on success; returns -1 and sets errno on failure

The size parameter has no real effect; it only gives the kernel a hint about how big the event table needs to be (it must still be greater than zero). The file descriptor returned by the function identifies the kernel event table to access and is the handle passed to all other epoll system calls.

3. Operate on the kernel event table:

#include <sys/epoll.h>

int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);    // returns 0 on success; returns -1 and sets errno on failure

epfd is the file descriptor returned by epoll_create and identifies the event table; op specifies the type of operation. The following three operations are available:

a) EPOLL_CTL_ADD: register the events on fd in the event table;
b) EPOLL_CTL_MOD: modify the events registered on fd;
c) EPOLL_CTL_DEL: delete the events registered on fd.

The event parameter specifies the events; struct epoll_event is defined as follows:

typedef union epoll_data {
    void     *ptr;
    int       fd;
    uint32_t  u32;
    uint64_t  u64;
} epoll_data_t;

struct epoll_event {
    uint32_t     events;    // epoll events
    epoll_data_t data;      // user data
};

epoll_ctl adds fd to the kernel event table, modifies the events registered for it, or removes its events from the kernel event table. When adding an entry, set events to the events of interest and use data to identify the descriptor: either store the descriptor directly in data.fd, or point data.ptr at user memory that contains the descriptor. epoll_data is a union, so only one of its members can be used at a time.
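The three operations can be walked through in order with a small sketch. It assumes Linux and uses a pipe as the monitored descriptor; epoll_ctl_lifecycle_demo is a hypothetical name. The error codes checked (EEXIST for a duplicate add, ENOENT for modifying an unregistered descriptor) are the ones documented for epoll_ctl.

```c
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if every step behaved as expected,
   0 otherwise, -1 on setup failure. */
int epoll_ctl_lifecycle_demo(void)
{
    int p[2];
    if (pipe(p) == -1)
        return -1;

    int epfd = epoll_create(1);     /* size is only a hint but must be > 0 */
    if (epfd == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    struct epoll_event ev = {0};
    ev.events = EPOLLIN;
    ev.data.fd = p[0];              /* keep the descriptor in the user data */

    int ok = 1;
    /* Register p[0] for readable events. */
    ok = ok && epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == 0;
    /* Registering the same descriptor again fails with EEXIST. */
    ok = ok && epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == -1
            && errno == EEXIST;
    /* Change the events registered for p[0]. */
    ev.events = EPOLLIN | EPOLLET;
    ok = ok && epoll_ctl(epfd, EPOLL_CTL_MOD, p[0], &ev) == 0;
    /* Remove p[0] from the event table. */
    ok = ok && epoll_ctl(epfd, EPOLL_CTL_DEL, p[0], &ev) == 0;
    /* Once removed, modifying it fails with ENOENT. */
    ok = ok && epoll_ctl(epfd, EPOLL_CTL_MOD, p[0], &ev) == -1
            && errno == ENOENT;

    close(epfd);
    close(p[0]);
    close(p[1]);
    return ok;
}
```

The registration lives in the kernel between calls, which is why nothing has to be passed in again until it changes.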

4. The epoll_wait function
The key function in the epoll family of system calls is epoll_wait, which waits for events on a group of file descriptors for up to a specified period of time.

#include <sys/epoll.h>

int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout);    // returns the number of ready file descriptors on success; returns -1 and sets errno on failure

The timeout parameter has the same meaning as in poll and specifies the time-out period. maxevents specifies the maximum number of events that a single call may return. events is an output parameter: when epoll_wait detects ready events, it copies all of them from the kernel event table (the one identified by epfd) into the array that events points to. This array holds only the ready events that epoll_wait detected, whereas select and poll hand back the entire registered set and leave the application to scan it. This is the biggest difference between epoll and the other two, and it comes up again in the comparison of the three.
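The point that the output array holds only ready events can be sketched like this. The struct watch_ctx type, its id field, and the function name are assumptions made for illustration; the sketch also uses data.ptr to carry a per-descriptor context, and pipes stand in for sockets.

```c
#include <sys/epoll.h>
#include <unistd.h>

#define MAX_EVENTS 8

/* Hypothetical per-descriptor context reached through data.ptr. */
struct watch_ctx {
    int fd;
    int id;
};

/* Hypothetical demo: watches two pipes, makes only the second one readable,
   and returns the id carried by the single ready event, or -1 on failure. */
int epoll_wait_demo(void)
{
    int a[2], b[2];
    if (pipe(a) == -1)
        return -1;
    if (pipe(b) == -1) {
        close(a[0]);
        close(a[1]);
        return -1;
    }

    int result = -1;
    int epfd = epoll_create(1);
    if (epfd != -1) {
        struct watch_ctx ctx_a = { a[0], 1 };
        struct watch_ctx ctx_b = { b[0], 2 };
        struct epoll_event ev = {0};
        struct epoll_event ready[MAX_EVENTS];

        ev.events = EPOLLIN;
        ev.data.ptr = &ctx_a;       /* union: use ptr instead of fd */
        int ok = epoll_ctl(epfd, EPOLL_CTL_ADD, a[0], &ev) == 0;
        ev.data.ptr = &ctx_b;
        ok = ok && epoll_ctl(epfd, EPOLL_CTL_ADD, b[0], &ev) == 0;
        ok = ok && write(b[1], "x", 1) == 1;    /* only pipe b has data */

        if (ok) {
            /* Two descriptors are registered, but only ready ones come back. */
            int n = epoll_wait(epfd, ready, MAX_EVENTS, 100);
            if (n == 1 && (ready[0].events & EPOLLIN)) {
                struct watch_ctx *ctx = ready[0].data.ptr;
                result = ctx->id;
            }
        }
        close(epfd);
    }

    close(a[0]); close(a[1]);
    close(b[0]); close(b[1]);
    return result;
}
```

The returned id is 2: the idle pipe never appears in the ready array, so the application does not scan descriptors that have nothing to report.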

IV. Comparison of the three groups of I/O multiplexing functions

Similarities:
1) All three require the user to register the events of interest on each fd;
2) All three take a timeout parameter that specifies the time-out period;
Differences:
1) select:
a) select takes three file descriptor sets, for readable, writable, and exceptional events, so it cannot distinguish the possible events in any finer detail;
b) When select detects ready events, it modifies the original file descriptor sets in place to tell the application which events occurred on which descriptors, so the sets must be reset before select is called again;
c) select works by polling every registered file descriptor and returns the whole user-registered set, so the application needs O(n) time to find the ready file descriptors;
d) The maximum number of file descriptors select can monitor is limited, typically to 1024; raising the limit beyond 1024 makes select perform significantly worse;
e) select works only in LT (level-triggered) mode.
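Point b) can be observed directly. The sketch below, with a made-up function name and pipes in place of sockets, registers two descriptors, makes one of them readable, and then inspects the set that select hands back.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if, after select, the idle descriptor has
   been cleared from the set and the ready one is still present,
   0 otherwise, -1 on failure. */
int select_overwrites_set_demo(void)
{
    int a[2], b[2];
    if (pipe(a) == -1)
        return -1;
    if (pipe(b) == -1) {
        close(a[0]);
        close(a[1]);
        return -1;
    }

    int result = -1;
    if (a[0] < FD_SETSIZE && b[0] < FD_SETSIZE && write(b[1], "x", 1) == 1) {
        fd_set readfds;
        struct timeval tv;

        FD_ZERO(&readfds);
        FD_SET(a[0], &readfds);         /* idle: nothing written to pipe a */
        FD_SET(b[0], &readfds);         /* ready: one byte is waiting */
        tv.tv_sec = 0;
        tv.tv_usec = 100000;            /* 100 ms */

        int maxfd = a[0] > b[0] ? a[0] : b[0];
        int n = select(maxfd + 1, &readfds, NULL, NULL, &tv);
        if (n == 1)
            result = !FD_ISSET(a[0], &readfds) && FD_ISSET(b[0], &readfds);
    }

    close(a[0]); close(a[1]);
    close(b[0]); close(b[1]);
    return result;
}
```

Since the idle descriptor is gone from the set after the call, passing the same set to select again would silently stop monitoring it, which is why the set is rebuilt before each call.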

2) poll:
a) poll binds a file descriptor and its events together in one structure. Events are specified per descriptor and can be the bitwise OR of several events, so event registration is finer-grained. poll also uses a separate member (revents) to hold the ready results, so the previously registered events do not have to be reset before the next call to poll;
b) poll also works by polling every registered file descriptor and returns the whole user-registered event array, so the application needs O(n) time to find the ready file descriptors;
c) poll uses the nfds parameter to specify the number of file descriptors and events to monitor, which can reach the maximum number of file descriptors the system allows to be opened (65535 in the author's example; the actual limit depends on system configuration);
d) poll works only in LT mode.

3) epoll:
a) epoll registers the user's file descriptors and events in an event table in the kernel and provides a separate system call, epoll_ctl, to manage them. epoll works through callbacks: once a registered file descriptor becomes ready, a callback is triggered that records the descriptor as ready, and the ready descriptors and events are copied into the user-space events array, so the application finds the ready file descriptors in O(1) time;
b) epoll_wait uses maxevents to specify the maximum number of file descriptors and events to handle, which can reach the maximum number of file descriptors the system allows to be opened (65535 in the author's example; the actual limit depends on system configuration);
c) epoll works not only in LT mode but also supports the more efficient ET (edge-triggered) mode, enabled with the EPOLLET flag (EPOLLONESHOT is a related event type that the reader can look up).
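The difference between LT and ET can be sketched by counting notifications for data that is never read. This is an illustration under assumptions: count_notifications is a hypothetical helper, a pipe stands in for a socket, and a real ET consumer would drain a non-blocking descriptor until read fails with EAGAIN.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: registers the read end of a pipe with EPOLLIN plus
   extra_flags (0 for LT, EPOLLET for ET), writes one byte that is never
   read, and returns how many of two consecutive epoll_wait calls report
   the descriptor. Returns -1 on failure. */
int count_notifications(uint32_t extra_flags)
{
    int p[2];
    if (pipe(p) == -1)
        return -1;

    int count = -1;
    int epfd = epoll_create(1);
    if (epfd != -1) {
        struct epoll_event ev = {0};
        struct epoll_event out;

        ev.events = EPOLLIN | extra_flags;
        ev.data.fd = p[0];
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == 0
            && write(p[1], "x", 1) == 1) {
            /* The byte stays in the pipe across both waits. */
            int first = epoll_wait(epfd, &out, 1, 100);
            int second = epoll_wait(epfd, &out, 1, 100);
            if (first >= 0 && second >= 0)
                count = first + second;
        }
        close(epfd);
    }

    close(p[0]);
    close(p[1]);
    return count;
}
```

With extra_flags set to 0 (LT) both waits report the descriptor because the data is still unread; with EPOLLET only the first wait reports it, since no new data arrives afterwards.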
