UNPv1 Chapter 6: I/O Multiplexing with select and poll

Last Update:2016-04-29 Source: Internet

Author: User


Some processes need a way to tell the kernel in advance which I/O conditions they care about, so that the kernel notifies the process as soon as one or more of those conditions is ready (for example, input is ready to be read, or a descriptor can accept more output). This capability is called I/O multiplexing.

1 I/O Models

There are five basic I/O models: blocking I/O, non-blocking I/O, I/O multiplexing (select and poll), signal-driven I/O (SIGIO), and asynchronous I/O.

An input operation normally has two distinct phases:
(1) Waiting for the data to be ready
(2) Copying the data from the kernel to the process
For an input operation on a socket, the first phase usually means waiting for data to arrive from the network. When the packet arrives, it is copied into a buffer in the kernel. The second phase copies the data from the kernel buffer into the application buffer.
(1) Blocking I/O
The most common I/O model is blocking I/O. By default, all sockets are blocking.

The process is blocked for the entire time from when it calls recvfrom until recvfrom returns. Once recvfrom returns successfully, the application process begins to process the datagram.
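
The blocking behavior can be observed with a small sketch. It assumes a UNIX-domain datagram socketpair as a stand-in for a network socket and a child process as the remote peer; the function name blocking_recv_demo is hypothetical and does not come from the book.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: the parent blocks in recvfrom() until a child
 * process sends a datagram. Returns the byte count, or -1 on error. */
ssize_t blocking_recv_demo(char *buf, size_t len)
{
    int sv[2];

    assert(buf != NULL && len > 0);
    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {                      /* child plays the remote peer */
        close(sv[0]);
        usleep(100000);                  /* no data is ready for 100 ms */
        send(sv[1], "hello", 5, 0);
        _exit(0);
    }

    close(sv[1]);
    /* The parent sleeps here through both phases: waiting for the
     * datagram, then the copy from the kernel to buf. */
    ssize_t n = recvfrom(sv[0], buf, len, 0, NULL, NULL);
    close(sv[0]);
    waitpid(pid, NULL, 0);
    return n;
}
```

Calling blocking_recv_demo with a 16-byte buffer should return only after the child has sent its 5-byte datagram.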
(2) Non-blocking I/O
When a process sets a socket to non-blocking, it is telling the kernel: when a requested I/O operation cannot be completed without putting the process to sleep, do not put the process to sleep, but return an error instead.
In the example, the first three calls to recvfrom find no data to return, so the kernel immediately returns an EWOULDBLOCK error each time. By the fourth call to recvfrom a datagram is ready; it is copied into the application buffer and recvfrom returns successfully.

An application that uses a non-blocking socket must sit in a loop, repeatedly testing whether the descriptor has data to read. This is called polling: the application keeps asking the kernel whether an I/O operation is ready. Polling usually wastes CPU time, so this model is not commonly used.
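
A minimal sketch of the sequence described above, assuming a UNIX-domain datagram socketpair instead of a network socket. The helper names set_nonblocking and nonblocking_demo are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: put a descriptor into non-blocking mode. */
int set_nonblocking(int fd)
{
    assert(fd >= 0);
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Calls recvfrom() three times on an empty non-blocking socket, then
 * once more after a datagram has been queued. Returns how many calls
 * failed with EWOULDBLOCK/EAGAIN, or -1 if anything else went wrong. */
int nonblocking_demo(void)
{
    int sv[2], would_block = 0;
    char buf[16];

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;
    if (set_nonblocking(sv[0]) < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    for (int i = 0; i < 3; i++) {
        ssize_t r = recvfrom(sv[0], buf, sizeof buf, 0, NULL, NULL);
        if (r < 0 && (errno == EWOULDBLOCK || errno == EAGAIN))
            would_block++;               /* no data yet: returns at once */
    }
    send(sv[1], "data", 4, 0);           /* now a datagram is ready */
    ssize_t n = recvfrom(sv[0], buf, sizeof buf, 0, NULL, NULL);

    close(sv[0]);
    close(sv[1]);
    return n == 4 ? would_block : -1;
}
```

The check covers both EWOULDBLOCK and EAGAIN because the two names may or may not share a value, depending on the system.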
(3) I/O multiplexing
With I/O multiplexing, we call select or poll and block in one of these two system calls, rather than blocking in the actual I/O system call.
We block in the call to select, waiting for the datagram socket to become readable. When select returns to report that the socket is readable, we call recvfrom to copy the datagram into the application buffer. The advantage of select is that we can wait for more than one descriptor to become ready.
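
The pattern above (block in select, then call recvfrom) might look like the following sketch. The function name recv_from_any is hypothetical; it waits on several descriptors at once to show the advantage mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>

/* Hypothetical helper: block in select() until one of fds[0..n-1] is
 * readable, then read one datagram from the first ready descriptor.
 * Returns the byte count, or -1 on error. */
ssize_t recv_from_any(const int *fds, int n, char *buf, size_t len)
{
    fd_set rset;
    int maxfd = -1;

    FD_ZERO(&rset);
    for (int i = 0; i < n; i++) {
        assert(fds[i] >= 0 && fds[i] < FD_SETSIZE);
        FD_SET(fds[i], &rset);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }

    /* Null timeout: sleep in select(), not in recvfrom(). */
    if (select(maxfd + 1, &rset, NULL, NULL, NULL) <= 0)
        return -1;

    for (int i = 0; i < n; i++)
        if (FD_ISSET(fds[i], &rset))     /* data is ready: will not block */
            return recvfrom(fds[i], buf, len, 0, NULL, NULL);
    return -1;
}
```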

(4) Signal-driven I/O
We can also use signals, telling the kernel to notify us with the SIGIO signal when the descriptor is ready.

Regardless of how the SIGIO signal is handled, the advantage of this model is that the process is not blocked while waiting for the datagram to arrive. The main loop can keep executing and simply wait for the signal handler to tell it either that the data is ready to be processed or that the datagram is ready to be read.
(5) Asynchronous I/O
The difference between asynchronous I/O and signal-driven I/O is:
a) With signal-driven I/O, the kernel sends SIGIO to tell our application when an I/O operation can be started.
b) With asynchronous I/O, the kernel notifies our application only after the entire operation, including the copy of the data into our buffer, has been completed by the kernel.

2 select function

This function lets the process tell the kernel to wait for any one of multiple events to occur and to wake the process only when one or more of those events occurs or a specified amount of time has passed. In other words, we call select to tell the kernel which descriptors we are interested in (for reading, writing, or an exception condition) and how long to wait. The descriptors of interest are not limited to sockets; any descriptor can be tested with select.

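#include <sys/select.h>
#include <sys/time.h>

int select(int maxfdp1, fd_set *readset, fd_set *writeset, fd_set *exceptset, const struct timeval *timeout);
// Returns: positive count of ready descriptors, 0 on timeout, -1 on error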

We start with the last argument, timeout, which tells the kernel how long to wait for one of the specified descriptors to become ready. A timeval structure specifies the number of seconds and microseconds:

struct timeval {
    long tv_sec;       /* seconds */
    long tv_usec;      /* microseconds */
};

This argument allows three possibilities:
A. Wait forever: return only when one of the specified descriptors is ready for I/O. For this, we pass timeout as a null pointer.
B. Wait up to a fixed amount of time: return when one of the specified descriptors is ready for I/O, but do not wait beyond the number of seconds and microseconds specified in the timeval structure that timeout points to.
C. Do not wait at all: return immediately after checking the descriptors. This is called polling. For this, timeout must point to a timeval structure whose timer value (the seconds and microseconds it specifies) is 0.
In the first two cases, the wait is normally interrupted if the process catches a signal and returns from the signal handler.
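
The three timeout modes can be wrapped in one helper, as in the sketch below. The names wait_readable and timeout_demo, and the convention of encoding the mode in a millisecond argument, are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: wait for fd to become readable.
 *   timeout_ms <  0 : wait forever (null timeout pointer)
 *   timeout_ms == 0 : poll, return immediately
 *   timeout_ms >  0 : wait at most that many milliseconds
 * Returns what select() returns: >0 ready, 0 timed out, -1 error. */
int wait_readable(int fd, long timeout_ms)
{
    fd_set rset;
    struct timeval tv, *tvp = NULL;

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&rset);
    FD_SET(fd, &rset);
    if (timeout_ms >= 0) {
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        tvp = &tv;
    }
    return select(fd + 1, &rset, NULL, NULL, tvp);
}

/* Exercises the three modes on a pipe. Returns 0 if each mode
 * behaves as described, -1 otherwise. */
int timeout_demo(void)
{
    int p[2], ok;

    if (pipe(p) < 0)
        return -1;
    ok = wait_readable(p[0], 0) == 0        /* poll: nothing is ready   */
      && wait_readable(p[0], 50) == 0;      /* 50 ms pass, still empty  */
    if (write(p[1], "x", 1) != 1)
        ok = 0;
    ok = ok && wait_readable(p[0], -1) == 1; /* data is there: no wait  */
    close(p[0]);
    close(p[1]);
    return ok ? 0 : -1;
}
```

The sketch does not retry after a signal interrupts the wait; a caller that installs signal handlers would likely want to check for EINTR.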

The three middle arguments, readset, writeset, and exceptset, specify the descriptors that we want the kernel to test for reading, writing, and exception conditions. The maxfdp1 argument specifies the number of descriptors to be tested; its value is the largest descriptor to be tested plus 1.

A descriptor set is a variable of type fd_set. Four macros initialize the set and turn on, turn off, or test its individual bits:
void FD_ZERO(fd_set *fdset);          /* clear all bits in fdset */
void FD_SET(int fd, fd_set *fdset);   /* turn on the bit for fd in fdset */
void FD_CLR(int fd, fd_set *fdset);   /* turn off the bit for fd in fdset */
int FD_ISSET(int fd, fd_set *fdset);  /* is the bit for fd on in fdset? */
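
As a short illustration of these macros and of how maxfdp1 is computed, consider the sketch below. The helper names build_fdset and count_fdset are hypothetical.

```c
#include <assert.h>
#include <sys/select.h>

/* Hypothetical helper: clear *set, turn on the bit for every descriptor
 * in fds[0..n-1], and return maxfdp1 (largest descriptor plus 1). */
int build_fdset(fd_set *set, const int *fds, int n)
{
    int maxfd = -1;

    FD_ZERO(set);                        /* always initialize first */
    for (int i = 0; i < n; i++) {
        assert(fds[i] >= 0 && fds[i] < FD_SETSIZE);
        FD_SET(fds[i], set);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    return maxfd + 1;
}

/* Hypothetical helper: count the descriptors below maxfdp1 whose bit is on. */
int count_fdset(fd_set *set, int maxfdp1)
{
    int count = 0;

    for (int fd = 0; fd < maxfdp1; fd++)
        if (FD_ISSET(fd, set))
            count++;
    return count;
}
```

For the descriptors 3, 7, and 5, build_fdset returns 8, which is the value to pass as maxfdp1.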

Descriptor readiness conditions
1). A socket is ready for reading if any of the following four conditions is true:
A. The number of bytes of data in the socket receive buffer is greater than or equal to the current low-water mark for the socket receive buffer. The low-water mark can be set with the SO_RCVLOWAT socket option; it defaults to 1 for TCP and UDP sockets.
B. The read half of the connection is closed (a TCP connection that has received a FIN). A read on such a socket does not block and returns 0 (EOF).
C. The socket is a listening socket and the number of completed connections is nonzero. An accept on such a socket normally does not block.
D. A socket error is pending. A read on such a socket does not block and returns -1 (error) with errno set to the specific error condition. The pending error can also be fetched and cleared by calling getsockopt with the SO_ERROR socket option.
2). A socket is ready for writing if any of the following four conditions is true:
A. The number of bytes of available space in the socket send buffer is greater than or equal to the current low-water mark for the socket send buffer, and either the socket is connected or it does not require a connection (for example, UDP). If we set such a socket to non-blocking, a write will not block and will return a positive value. The low-water mark can be set with the SO_SNDLOWAT socket option; it normally defaults to 2048 for TCP and UDP sockets.
B. The write half of the connection is closed. A write on such a socket generates the SIGPIPE signal.
C. A socket using a non-blocking connect has completed the connection, or the connect has failed.
D. A socket error is pending. A write on such a socket does not block and returns -1 with errno set to the specific error condition. The pending error can be fetched and cleared by calling getsockopt with the SO_ERROR socket option.
3). A socket has an exception condition pending if there is out-of-band data for the socket or the socket is still at the out-of-band mark.
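
Two of these rules can be observed directly: a connected socket with free send-buffer space is writable, and a socket whose peer has closed becomes readable with read returning 0. The sketch below assumes a UNIX-domain stream socketpair as a stand-in for a TCP connection; probe_ready, readiness_demo, and the READY_* constants are hypothetical names.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

#define READY_READ  1
#define READY_WRITE 2

/* Hypothetical helper: test fd with a zero timeout and report its
 * readiness as a bitmask of READY_* values, or -1 on error. */
int probe_ready(int fd)
{
    fd_set rset, wset;
    struct timeval tv = { 0, 0 };        /* do not wait at all */

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&rset);
    FD_ZERO(&wset);
    FD_SET(fd, &rset);
    FD_SET(fd, &wset);
    if (select(fd + 1, &rset, &wset, NULL, &tv) < 0)
        return -1;
    return (FD_ISSET(fd, &rset) ? READY_READ : 0)
         | (FD_ISSET(fd, &wset) ? READY_WRITE : 0);
}

/* Returns 0 if the observations match the rules, -1 otherwise. */
int readiness_demo(void)
{
    int sv[2], before, after;
    char c;
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    before = probe_ready(sv[0]);   /* nothing to read, room to write */
    close(sv[1]);                  /* the peer closes its end */
    after = probe_ready(sv[0]);    /* now readable: EOF is pending */
    n = read(sv[0], &c, 1);        /* does not block, returns 0 */
    close(sv[0]);
    return (before == READY_WRITE && (after & READY_READ) && n == 0) ? 0 : -1;
}
```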

3 shutdown function

The normal way to terminate a network connection is to call close, but close has two limitations that shutdown avoids:
1). close decrements the descriptor's reference count and closes the socket only when the count reaches 0. With shutdown we can initiate TCP's normal connection termination sequence regardless of the reference count.
2). close terminates both directions of data transfer, reading and writing. Because a TCP connection is full-duplex, there are times when we want to tell the other end that we have finished sending, even though that end may still have data to send to us.

#include <sys/socket.h>

int shutdown(int sockfd, int howto);
// Returns: 0 on success, -1 on error

The behavior of the function depends on the value of the howto argument:
SHUT_RD: closes the read half of the connection
SHUT_WR: closes the write half of the connection
SHUT_RDWR: closes both the read half and the write half of the connection
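
A half-close with SHUT_WR might be used as in the following sketch, which assumes a UNIX-domain stream socketpair in place of a TCP connection. The names half_close and half_close_demo are hypothetical.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: send a final message on fd, then close only the
 * write half so that the peer sees EOF. Returns 0 on success, -1 on error. */
int half_close(int fd, const void *msg, size_t len)
{
    assert(fd >= 0);
    if (write(fd, msg, len) != (ssize_t)len)
        return -1;
    return shutdown(fd, SHUT_WR);
}

/* Returns 0 if the half-closed connection behaves as described. */
int half_close_demo(void)
{
    int sv[2];
    char buf[16];
    ssize_t n_req, n_eof, n_reply = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    if (half_close(sv[0], "req", 3) < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    /* Peer side: reads the request, then sees EOF... */
    n_req = read(sv[1], buf, sizeof buf);
    n_eof = read(sv[1], buf, sizeof buf);
    /* ...but can still reply, because the other direction is open. */
    if (write(sv[1], "reply", 5) == 5)
        n_reply = read(sv[0], buf, sizeof buf);

    close(sv[0]);
    close(sv[1]);
    return (n_req == 3 && n_eof == 0 && n_reply == 5) ? 0 : -1;
}
```

Calling close on sv[0] instead of shutdown would have made it impossible to read the reply on that descriptor.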

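4 pselect function

#include <sys/select.h>
#include <signal.h>
#include <time.h>

int pselect(int maxfdp1, fd_set *readset, fd_set *writeset, fd_set *exceptset, const struct timespec *timeout, const sigset_t *sigmask);
// Returns: count of ready descriptors, 0 on timeout, -1 on error

pselect differs from select in two ways:
1). pselect uses the timespec structure, which specifies nanoseconds instead of microseconds:

struct timespec {
    time_t tv_sec;    // seconds
    long   tv_nsec;   // nanoseconds
};

2). pselect adds a sixth argument: a pointer to a signal mask. The mask is installed for the duration of the call, so a program can keep certain signals blocked and allow them to be delivered only while it is waiting in pselect.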

5 poll function

poll provides functionality similar to select, but it provides additional information when dealing with STREAMS devices.

#include <poll.h>

int poll(struct pollfd *fdarray, unsigned long nfds, int timeout);
// Returns: count of ready descriptors, 0 on timeout, -1 on error

The first argument is a pointer to the first element of an array of structures. Each element is a pollfd structure that specifies the conditions to be tested for a given descriptor, fd.

struct pollfd {
    int   fd;        /* descriptor to check */
    short events;    /* events of interest on fd */
    short revents;   /* events that occurred on fd */
};

The nfds argument specifies the number of elements in the array, that is, the number of descriptors we care about. The timeout argument specifies how long the function waits, in milliseconds.
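
A minimal poll sketch, assuming two pipes as the descriptors of interest. The names poll_two and poll_demo are hypothetical.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms for either descriptor to
 * become readable. Returns the index (0 or 1) of the first ready one,
 * or -1 on timeout or error. */
int poll_two(int fd_a, int fd_b, int timeout_ms)
{
    struct pollfd fds[2];

    assert(fd_a >= 0 && fd_b >= 0);
    fds[0].fd = fd_a;
    fds[0].events = POLLIN;      /* condition we are interested in */
    fds[0].revents = 0;
    fds[1].fd = fd_b;
    fds[1].events = POLLIN;
    fds[1].revents = 0;

    if (poll(fds, 2, timeout_ms) <= 0)
        return -1;
    for (int i = 0; i < 2; i++)
        if (fds[i].revents & POLLIN)     /* condition that occurred */
            return i;
    return -1;
}

/* Writes to the second of two pipes and reports which one poll() flags. */
int poll_demo(void)
{
    int a[2], b[2], ready = -1;

    if (pipe(a) < 0)
        return -1;
    if (pipe(b) < 0) {
        close(a[0]);
        close(a[1]);
        return -1;
    }
    if (write(b[1], "x", 1) == 1)
        ready = poll_two(a[0], b[0], 1000);
    close(a[0]);
    close(a[1]);
    close(b[0]);
    close(b[1]);
    return ready;
}
```

Unlike select, poll reports results in the separate revents field, so the events field does not have to be reset before the next call.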
