Linux I/O multiplexing

Source: Internet
Author: User

I/O multiplexing is a mechanism that lets a program monitor multiple descriptors at once. As soon as a descriptor becomes ready (usually ready for reading or ready for writing), the program is notified so that it can perform the corresponding read or write. The technique addresses the problem of a process or thread blocking inside a single I/O system call: instead of being stuck waiting on one descriptor, the process waits on all of them together.

I/O multiplexing with select()

The select() function lets a process instruct the kernel to wait for any of several events to occur, and to wake the process only when one or more of those events occur or when a specified amount of time has passed.

The select() function

1.1 Required header files

#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

1.2 Declaration and return value

1. Declaration

int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);

2. Return value

Success: the number of ready descriptors, or 0 if the timeout expired before any descriptor became ready.

Error: -1.

1.3 Features

Monitors multiple file descriptors and waits for their state to change (readable, writable, or exceptional condition). select() monitors descriptors in three categories: readfds, writefds, and exceptfds. After the call, select() blocks until a descriptor is ready (data can be read, data can be written, or an exception is pending) or until the timeout expires (timeout specifies the wait time), and then returns. When select() returns, you find the ready descriptors by iterating over the fd_set sets.

1.4 Parameters

1. nfds: the range of file descriptors to monitor, normally the highest monitored descriptor plus 1. For example, a value of 10 means that descriptors 0, 1, 2 ... 9 are checked. On Linux the upper bound is typically 1024.

2. readfds: the set of descriptors monitored for readability. Add a descriptor to this set if you want to know when it can be read.

3. writefds: the set of descriptors monitored for writability.

4. exceptfds: the set of descriptors monitored for exceptional conditions.

5. timeout: tells the kernel how long to wait for any of the specified descriptors to become ready. The timeval structure specifies this period in seconds and microseconds.

struct timeval {
    long tv_sec;   /* seconds */
    long tv_usec;  /* microseconds */
};

timeout can be set in three ways:

1. A null pointer. select() waits forever and returns only when a descriptor is ready for I/O.

2. A timeval structure with a nonzero number of seconds and/or microseconds. select() waits for at most that long and returns when the time expires, even if no descriptor is ready for I/O.

3. A timeval structure in which both the seconds and the microseconds are 0. select() checks the descriptors and returns immediately without waiting. This is called polling.
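The three timeout settings can be sketched with a small wrapper. This is only an illustration: wait_readable is a hypothetical helper name, not part of any library, and it watches a single descriptor for readability.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper: waits for fd to become readable.
 *   timeout == NULL      -> wait forever
 *   timeout nonzero      -> wait at most that long
 *   timeout {0, 0}       -> check and return immediately (polling)
 * Returns 1 if fd is readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, const struct timeval *timeout)
{
    fd_set readfds;
    struct timeval tv;
    struct timeval *tvp = NULL;

    if (timeout != NULL) {
        tv = *timeout;      /* Linux may modify the struct, so pass a copy */
        tvp = &tv;
    }

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    return select(fd + 1, &readfds, NULL, NULL, tvp);
}
```

Because the helper copies the structure, the caller's timeval can be reused for the next call without being reinitialized.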

1.5 fd_set

fd_set can be understood as a set that stores file descriptors. It is manipulated with the following four macros:

1. void FD_ZERO(fd_set *fdset);          Clears the set.
2. void FD_SET(int fd, fd_set *fdset);   Adds the given file descriptor to the set.
3. void FD_CLR(int fd, fd_set *fdset);   Removes the given file descriptor from the set.
4. int FD_ISSET(int fd, fd_set *fdset);  Checks whether the given file descriptor is in the set; after select() returns, this tells you whether it is ready.
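As a minimal sketch of how the macros and select() fit together, the function below watches several descriptors for readability. The name first_readable and its millisecond timeout parameter are assumptions made for this example.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper: waits up to timeout_ms for any descriptor in fds[]
 * to become readable. Returns the first ready descriptor, or -1 on
 * timeout or error. Every descriptor must be below FD_SETSIZE. */
int first_readable(const int *fds, int count, int timeout_ms)
{
    fd_set readfds;
    struct timeval tv;
    int maxfd = -1;

    FD_ZERO(&readfds);                      /* start from an empty set */
    for (int i = 0; i < count; i++) {
        if (fds[i] < 0 || fds[i] >= FD_SETSIZE)
            return -1;
        FD_SET(fds[i], &readfds);           /* add each descriptor */
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* nfds is the highest descriptor plus 1 */
    if (select(maxfd + 1, &readfds, NULL, NULL, &tv) <= 0)
        return -1;

    /* select() left only the ready descriptors in the set */
    for (int i = 0; i < count; i++)
        if (FD_ISSET(fds[i], &readfds))
            return fds[i];
    return -1;
}
```

Note that the set has to be rebuilt before every call, because select() overwrites it with the result.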

Pros and cons of select()

2.1 Advantages

select() is currently supported on almost all platforms, and this good cross-platform support is one of its advantages.

2.2 Disadvantages

1. Every call to select() copies the fd sets from user space to kernel space, which is expensive when there are many descriptors. Every call also makes the kernel traverse all of the descriptors passed in, which is likewise expensive when there are many.

2. There is an upper limit on the number of file descriptors a single process can monitor, generally 1024 on Linux. The limit can be raised by changing the macro definition, or even by recompiling the kernel, but doing so also reduces efficiency.

I/O multiplexing with poll()

The select() and poll() system calls are essentially the same. The former was introduced in BSD Unix, the latter in System V. The mechanism of poll() is similar to that of select() and does not differ from it in substance: it also manages multiple descriptors by polling them and acting on the state of each one. However, poll() has no limit on the maximum number of file descriptors (although performance still degrades if the number is too large). The shared disadvantage of poll() and select() is that an array containing a large number of file descriptors is copied in its entirety between user space and the kernel's address space, regardless of whether the descriptors are ready, so the overhead grows linearly with the number of file descriptors.

The poll() function

1.1 Required header file

#include <poll.h>

1.2 Declaration and return value

1. Declaration

int poll(struct pollfd *fds, nfds_t nfds, int timeout);

2. Return value

On success, poll() returns the number of structures whose revents field is nonzero, or 0 if the timeout expires before any event occurs.

On failure, poll() returns -1 and sets errno to one of the following values:

EBADF: an invalid file descriptor was specified in one or more of the structures.

EFAULT: the fds pointer points to an address outside the address space of the process.

EINTR: a signal occurred before any requested event; the call can be reissued.

EINVAL: the nfds parameter exceeds the RLIMIT_NOFILE value.

ENOMEM: there is not enough memory available to complete the request.
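Since EINTR only means that a signal interrupted the wait, one common way to handle it is simply to call poll() again. A minimal sketch, assuming a hypothetical wrapper named poll_retry:

```c
#include <errno.h>
#include <poll.h>

/* Hypothetical helper: calls poll() and reissues the call whenever it is
 * interrupted by a signal (EINTR). Other errors are returned to the caller.
 * Note that each retry restarts the full timeout. */
int poll_retry(struct pollfd *fds, nfds_t nfds, int timeout_ms)
{
    int n;

    do {
        n = poll(fds, nfds, timeout_ms);
    } while (n == -1 && errno == EINTR);

    return n;
}
```

A caller that needs an exact overall deadline would have to measure the elapsed time and shrink the timeout on each retry; that is left out here to keep the sketch short.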

1.3 Features

Monitors multiple file descriptors and waits for their state to change.

1.4 Parameters

1. fds: unlike select(), which uses three bitmaps to represent three fd_set sets, poll() uses a single pointer to an array of pollfd structures. Each structure holds a file descriptor and the events to test for, which are given in the events field; the events that actually occurred are filled in the revents field when the call returns.

struct pollfd {
    int fd;          /* file descriptor */
    short events;    /* events to wait for */
    short revents;   /* events that actually occurred */
};

fd: each pollfd structure specifies one monitored file descriptor. Passing several structures makes poll() monitor several file descriptors.

events: the events field of each structure is the mask of events to monitor on that file descriptor. It is set by the user. The possible mask values are as follows:

Input events:

POLLIN       Normal or priority band data can be read

POLLRDNORM   Normal data can be read

POLLRDBAND   Priority band data can be read

POLLPRI      High-priority data can be read

Output events:

POLLOUT      Normal or priority band data can be written

POLLWRNORM   Normal data can be written

POLLWRBAND   Priority band data can be written

Error events:

POLLERR      An error occurred

POLLHUP      A hang-up occurred

POLLNVAL     The descriptor does not refer to an open file

poll() distinguishes three classes of data: normal, priority band, and high priority. These distinctions come from the STREAMS implementation.

POLLIN | POLLPRI is equivalent to the read event of select().

POLLOUT | POLLWRBAND is equivalent to the write event of select().

POLLIN is equivalent to POLLRDNORM | POLLRDBAND.

POLLOUT is equivalent to POLLWRNORM.

For example, to monitor whether a file descriptor is both readable and writable, set events to POLLIN | POLLOUT.

revents: the mask of events that actually occurred on the file descriptor. It is set by the kernel when the call returns. Any event requested in the events field may be returned in the revents field. In short, the events field of each structure is set by the user and tells the kernel what we are interested in, while the revents field is filled in on return and tells us what happened to the descriptor.

2. nfds: the number of elements in the array passed as the first parameter.

3. timeout: the number of milliseconds to wait.

If timeout is positive, poll() returns after at most that many milliseconds, whether or not any I/O is ready.

If timeout is 0, poll() returns immediately.

If timeout is -1, poll() blocks until one of the specified events occurs.
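The parameters above can be put together as follows. This is a sketch only: poll_pair is a hypothetical helper that watches one descriptor for reading and another for writing in a single poll() call.

```c
#include <poll.h>

/* Hypothetical helper: monitors rfd for POLLIN and wfd for POLLOUT.
 * Returns what poll() returns (number of ready descriptors, 0 on
 * timeout, -1 on error) and stores the kernel-filled revents masks. */
int poll_pair(int rfd, int wfd, int timeout_ms,
              short *r_revents, short *w_revents)
{
    struct pollfd fds[2];

    fds[0].fd = rfd;
    fds[0].events = POLLIN;     /* set by the user: what we care about */
    fds[0].revents = 0;

    fds[1].fd = wfd;
    fds[1].events = POLLOUT;
    fds[1].revents = 0;

    int n = poll(fds, 2, timeout_ms);
    if (n >= 0) {
        *r_revents = fds[0].revents;   /* set by the kernel: what happened */
        *w_revents = fds[1].revents;
    }
    return n;
}
```

Unlike with select(), the events fields are not overwritten, so the same array can be passed to poll() again without being rebuilt.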

I/O multiplexing with epoll

epoll was introduced in the 2.6 kernel and is an enhanced version of the earlier select() and poll(). Compared with select() and poll(), epoll is more flexible and has no limit on the number of descriptors. epoll uses one file descriptor to manage many descriptors: the events for the file descriptors the user cares about are stored in an event table inside the kernel, so each descriptor needs to be copied from user space to kernel space only once.

Required header file

#include <sys/epoll.h>

Declarations

int epoll_create(int size);
int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);
int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout);

The epoll_create() function

int epoll_create(int size);
3.1 Features

This function creates a file descriptor dedicated to epoll (an epoll handle).
3.2 Parameters

size tells the kernel roughly how many descriptors will be monitored. It does not limit the maximum number of descriptors that epoll can monitor; it is only a hint for the kernel's initial allocation of internal data structures.

Since Linux 2.6.8 the size parameter has been ignored, so any value greater than 0 can be passed. Note that the epoll handle itself occupies a file descriptor once it is created; on Linux you can see it by looking in /proc/<pid>/fd/. You must therefore call close() when you have finished using epoll, or file descriptors may eventually be exhausted.

3.3 Return value

Success: the epoll file descriptor

Failure: -1
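A minimal sketch of the create-and-close life cycle described above. The function name epoll_create_and_close is hypothetical and exists only for this example.

```c
#include <stdio.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: creates an epoll instance, prints the descriptor
 * number it occupies, and releases it again.
 * Returns 0 on success, -1 on error. */
int epoll_create_and_close(void)
{
    int epfd = epoll_create(1);   /* size is ignored, but must be > 0 */
    if (epfd == -1) {
        perror("epoll_create");
        return -1;
    }

    printf("epoll instance occupies fd %d\n", epfd);

    /* Without this close(), every call would leak one descriptor */
    return close(epfd);
}
```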

The epoll_ctl() function

int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);

4.1 Features

This is epoll's event registration function. Unlike select(), where you tell the kernel which events to watch at the moment you start waiting, epoll requires you to register the event types to watch in advance.

4.2 Parameters

1. epfd: the epoll file descriptor, that is, the return value of epoll_create().

2. op: the action to perform, given by one of three macros:

EPOLL_CTL_ADD: registers a new fd with epfd;
EPOLL_CTL_MOD: modifies the events monitored for an already registered fd;
EPOLL_CTL_DEL: removes an fd from epfd.

3. fd: the file descriptor to monitor.

4. event: tells the kernel which events to monitor. The struct epoll_event structure is as follows:

/* Data associated with the file descriptor that triggered the event (its meaning depends on how it is used) */
typedef union epoll_data {
    void *ptr;
    int fd;
    __uint32_t u32;
    __uint64_t u64;
} epoll_data_t;

/* Events of interest and events that were triggered */
struct epoll_event {
    __uint32_t events;   /* Epoll events */
    epoll_data_t data;   /* User data variable */
};

events can be a combination of the following macros:

EPOLLIN: the file descriptor can be read (this includes an orderly shutdown of the peer socket);

EPOLLOUT: the file descriptor can be written;

EPOLLPRI: the file descriptor has urgent data to read (this normally indicates the arrival of out-of-band data);

EPOLLERR: an error occurred on the file descriptor;

EPOLLHUP: the file descriptor was hung up;

EPOLLET: uses edge-triggered mode for this descriptor, as opposed to the default level-triggered mode;

EPOLLONESHOT: reports only one event. After that event has been delivered the descriptor is no longer monitored; to keep monitoring the socket you must re-arm it in the epoll instance.
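Re-arming a descriptor registered with EPOLLONESHOT can be sketched like this. The helper name rearm_oneshot is hypothetical; the sketch assumes the descriptor is still registered, so it uses EPOLL_CTL_MOD rather than EPOLL_CTL_ADD.

```c
#include <stdint.h>
#include <sys/epoll.h>

/* Hypothetical helper: re-enables a descriptor that was registered with
 * EPOLLONESHOT and has already delivered its single event.
 * Returns 0 on success, -1 on error. */
int rearm_oneshot(int epfd, int fd, uint32_t events)
{
    struct epoll_event ev;

    ev.events = events | EPOLLONESHOT;   /* one-shot again for the next event */
    ev.data.fd = fd;

    /* The fd is still in the epoll instance, only disabled, so modify it */
    return epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev);
}
```

A typical place to call it is at the end of the code that handled the event, once the handler is ready to receive the next notification for that socket.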

4.3 Return value

Success: 0

Failure: -1
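The three operations can be wrapped as shown below. This is an illustrative sketch: watch_fd, modify_fd, and unwatch_fd are hypothetical names, and the descriptor itself is stored in the data field so that it can be recovered later from a returned event.

```c
#include <stdint.h>
#include <sys/epoll.h>

/* Hypothetical helper: registers fd with the given event mask. */
int watch_fd(int epfd, int fd, uint32_t events)
{
    struct epoll_event ev;

    ev.events = events;
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Hypothetical helper: changes the event mask of a registered fd. */
int modify_fd(int epfd, int fd, uint32_t events)
{
    struct epoll_event ev;

    ev.events = events;
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev);
}

/* Hypothetical helper: removes fd from the epoll instance. */
int unwatch_fd(int epfd, int fd)
{
    /* The event argument is ignored for EPOLL_CTL_DEL, but passing a
     * non-NULL pointer keeps the call portable to very old kernels. */
    struct epoll_event ev = { 0 };

    return epoll_ctl(epfd, EPOLL_CTL_DEL, fd, &ev);
}
```

Each helper returns 0 on success and -1 on failure, exactly as epoll_ctl() does. Registering the same descriptor twice or removing one that is not registered both fail.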

The epoll_wait() function

int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout);

5.1 Features

Waits for events to occur and collects the events that have occurred on the descriptors monitored by epoll, similar to a select() call.

5.2 Parameters

1. epfd: the epoll file descriptor, that is, the return value of epoll_create().

2. events: an array of epoll_event structures allocated by the caller, into which epoll writes the events that occurred. events must not be a null pointer: the kernel only copies data into the array and does not allocate user-space memory for us.

3. maxevents: tells the kernel how many entries the events array can hold.

4. timeout: the timeout in milliseconds.

If timeout is positive, the function returns after at most that many milliseconds, whether or not any I/O is ready.

If timeout is 0, the function returns immediately.

If timeout is -1, the function blocks until one of the specified events occurs.

5.3 Return value

Success: the number of events that need to be processed; a return value of 0 indicates a timeout.

Failure: -1
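A sketch of one pass of an event loop built on epoll_wait(). The name drain_ready and the MAX_EVENTS constant are assumptions for this example, and it relies on each descriptor having been registered with data.fd set to the descriptor itself.

```c
#include <sys/epoll.h>
#include <unistd.h>

#define MAX_EVENTS 16   /* capacity of the caller-allocated events array */

/* Hypothetical helper: waits once on epfd and performs one read() on every
 * descriptor reported as readable. Returns the total number of bytes read,
 * 0 on timeout, or -1 if epoll_wait() fails. */
long drain_ready(int epfd, int timeout_ms)
{
    struct epoll_event events[MAX_EVENTS];
    char buf[256];
    long total = 0;

    int n = epoll_wait(epfd, events, MAX_EVENTS, timeout_ms);
    if (n == -1)
        return -1;

    /* Only the first n entries are valid, and all of them are ready */
    for (int i = 0; i < n; i++) {
        if (events[i].events & EPOLLIN) {
            ssize_t r = read(events[i].data.fd, buf, sizeof buf);
            if (r > 0)
                total += r;
        }
    }
    return total;
}
```

The loop touches only the descriptors that the kernel reported, which is the main practical difference from scanning a whole fd_set or pollfd array.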

LT mode and ET mode

epoll operates on file descriptors in one of two modes: LT (level-triggered) and ET (edge-triggered). LT mode is the default.

6.1 LT mode

When epoll_wait() detects an event on a descriptor and reports it to the application, the application does not have to handle the event immediately. The next call to epoll_wait() reports the same event to the application again.

6.2 ET mode

When epoll_wait() detects an event on a descriptor and reports it to the application, the application must handle the event immediately. If it does not, the next call to epoll_wait() does not report the event again.
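The difference between the two modes can be observed with a small experiment. The sketch below uses a pipe as the monitored descriptor; count_notifications is a hypothetical helper that writes one byte, deliberately leaves it unread, and counts how many of two consecutive epoll_wait() calls report the event.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: pass 0 for LT mode or EPOLLET for ET mode.
 * Returns how many of two epoll_wait() calls reported the unread byte,
 * or -1 if the setup failed. */
int count_notifications(uint32_t extra_flags)
{
    int p[2], epfd, count = 0;
    struct epoll_event ev, out;

    if (pipe(p) == -1)
        return -1;
    epfd = epoll_create(1);
    if (epfd == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    ev.events = EPOLLIN | extra_flags;
    ev.data.fd = p[0];

    if (epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == 0 &&
        write(p[1], "x", 1) == 1) {
        /* Wait twice without reading the byte in between */
        for (int i = 0; i < 2; i++)
            if (epoll_wait(epfd, &out, 1, 0) == 1)
                count++;
    } else {
        count = -1;
    }

    close(epfd);
    close(p[0]);
    close(p[1]);
    return count;
}
```

On Linux, count_notifications(0) should report the event on both calls, while count_notifications(EPOLLET) should report it only once, because the state of the descriptor does not change between the two waits.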

6.3 Comparison between LT mode and ET mode

ET mode greatly reduces the number of times the same epoll event is reported, so it can be more efficient than LT mode. When epoll works in ET mode, non-blocking sockets must be used; otherwise a blocking read or write on one file descriptor could starve the task that is responsible for handling all the other file descriptors.
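In practice, handling an ET notification usually means switching the descriptor to non-blocking mode and reading until the kernel reports that nothing is left. A minimal sketch, with set_nonblocking and drain_fd as hypothetical helper names:

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: puts fd into non-blocking mode.
 * Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Hypothetical helper: reads from a non-blocking fd until it would block
 * or reaches end of file. Returns the number of bytes consumed, or -1 on
 * a real error. The data is discarded here; a real handler would process it. */
long drain_fd(int fd)
{
    char buf[64];
    long total = 0;

    for (;;) {
        ssize_t r = read(fd, buf, sizeof buf);
        if (r > 0) {
            total += r;
            continue;
        }
        if (r == 0)
            break;                              /* end of file */
        if (errno == EINTR)
            continue;                           /* interrupted, try again */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            break;                              /* nothing left to read */
        return -1;
    }
    return total;
}
```

Because the descriptor is non-blocking, the final read() returns immediately with EAGAIN instead of stalling the thread, so the loop can safely run until the buffer is empty.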

Advantages of epoll

1. With select() and poll(), the kernel scans all the monitored file descriptors only after the process makes the call. With epoll, file descriptors are registered in advance through epoll_ctl(); once a file descriptor becomes ready, the kernel uses a callback-style mechanism to mark it as ready quickly, and the process is notified when it calls epoll_wait().

2. The number of monitored descriptors is not restricted by epoll itself. The upper bound is the maximum number of files that can be opened, which is generally far greater than 2048; on a machine with 1 GB of memory, for example, it is about 100,000. The exact figure can be read with cat /proc/sys/fs/file-max, and in general it depends heavily on system memory. The biggest disadvantage of select() is the limit on the number of descriptors a process can monitor, which is not enough for servers with a large number of connections. A multi-process design (as used by Apache) is one way around this, but although creating a process on Linux is relatively cheap, the cost is still not negligible, and data synchronization between processes is far less efficient than synchronization between threads, so it is not a perfect solution.

3. I/O efficiency does not decrease as the number of monitored descriptors grows. select() and poll() have to poll the entire descriptor set repeatedly until a device is ready, possibly sleeping and waking several times along the way. epoll also has to call epoll_wait() to check the ready list repeatedly, and may likewise sleep and wake several times, but when a device becomes ready a callback puts the ready descriptor on the ready list and wakes the process sleeping in epoll_wait(). Both approaches alternate between sleeping and waking, but when awake, select() and poll() traverse the entire descriptor set, whereas epoll only has to check whether the ready list is empty, which saves a great deal of CPU time. This is the performance gain that the callback mechanism brings.

4. select() and poll() copy the descriptor set from user space to kernel space on every call, whereas epoll copies each descriptor only once, which saves a lot of overhead.

