Talking about network I/O multiplexing models: select, poll and epoll


The first thing to understand about select, poll, and epoll is that they are all mechanisms for I/O multiplexing. I/O multiplexing lets a program monitor multiple descriptors at once; as soon as one of them becomes ready (usually read-ready or write-ready), the program is notified so it can perform the corresponding read or write. However, select, poll, and epoll are all essentially synchronous I/O: after an event is reported, the process itself is still responsible for performing the read or write, and that read or write blocks the process while it copies data.

Basic usage of select: http://blog.csdn.net/nk_test/article/details/49256129

Basic usage of poll: http://blog.csdn.net/nk_test/article/details/49283325

Basic usage of Epoll: http://blog.csdn.net/nk_test/article/details/49331717

Next we discuss how to properly use non-blocking I/O multiplexing + poll/epoll.

Let's start with a few common questions:

1. Generation and handling of the SIGPIPE signal

If the client closes the socket with close(2) and the server then calls write(2), the client's TCP layer responds with an RST segment. If the server calls write(2) again after the RST has arrived, the kernel delivers a SIGPIPE signal; its default action terminates the process, which is obviously unacceptable for a highly available server. The fix is to ignore the signal: signal(SIGPIPE, SIG_IGN).

2. Impact of the TIME_WAIT state on the server

The side that calls close(2) first enters the TIME_WAIT state, during which the kernel keeps holding some resources for the connection; if these accumulate on the server, its concurrency drops significantly. Workaround: design the protocol so that the client disconnects first, dispersing the TIME_WAIT states across a large number of clients. But if clients stay inactive, or malicious clients keep connecting, server-side connection resources are consumed anyway, so the server also needs a mechanism to kick off inactive connections.

3. The new accept4(2) system call

accept4(2) adds a flags parameter on top of accept(2), which can carry the following two flags:

int accept4(int sockfd, struct sockaddr *addr, socklen_t *addrlen, int flags);

SOCK_NONBLOCK: set the O_NONBLOCK file status flag on the new open file description. Using this flag saves extra calls to fcntl(2) to achieve the same result.
SOCK_CLOEXEC: set the close-on-exec (FD_CLOEXEC) flag on the new file descriptor. See the description of the O_CLOEXEC flag in open(2) for reasons why this may be useful.

SOCK_CLOEXEC makes the descriptor close automatically when the process image is replaced by exec(2); it is applied to the returned connected socket. The same effect can be achieved afterwards with fcntl(2), but slightly less efficiently (extra system calls).

4. Handling EMFILE returned by accept(2) (file descriptors exhausted)

(1) Raise the process's file-descriptor limit.
(2) Simply wait and retry later.
(3) Exit the program.
(4) Close the listening socket. But when should it be reopened?
(5) With epoll, switch to edge trigger. The problem is that if one accept(2) is missed, the program will never be notified of new connections again (the level stays high, so no further state change occurs).
(6) Reserve a spare file descriptor. When EMFILE occurs, first close the spare file to free one descriptor slot; call accept(2) to obtain the pending socket's file descriptor; then close(2) it immediately, which disconnects the client gracefully; finally reopen the spare file to refill the "slot" in case this happens again.
int idlefd = open("/dev/null", O_RDONLY | O_CLOEXEC);
...
connfd = accept4(listenfd, (struct sockaddr *)&peeraddr,
                 &peerlen, SOCK_NONBLOCK | SOCK_CLOEXEC);
if (connfd == -1)
{
    if (errno == EMFILE)
    {
        close(idlefd);                          /* free the reserved slot        */
        idlefd = accept(listenfd, NULL, NULL);  /* accept the pending connection */
        close(idlefd);                          /* close it: graceful disconnect */
        idlefd = open("/dev/null", O_RDONLY | O_CLOEXEC);  /* re-reserve the slot */
        continue;
    }
    else
        err_exit("accept4");
}

(i) poll process and issues needing attention


Problems to be aware of and how to deal with them:

(1) Message-boundary ("sticky packet") problem: a single read(2) may not drain all of the data in connfd's kernel receive buffer, in which case connfd remains active on the next poll. The data already read should be kept in an application-layer receive buffer for connfd (e.g. char buf[1024]), and message boundaries must be handled there.

(2) When the response is large, write(2) may not be able to copy all of it into the kernel send buffer at once, so there should also be an application-layer send buffer: data that could not be sent is appended to it.

(3) Pay attention to when connfd's POLLOUT event should be watched. When POLLOUT arrives, write(2) data out of the application-layer send buffer; once that buffer is drained, stop watching POLLOUT. POLLOUT trigger condition: connfd's kernel send buffer is not full (it can hold more data).

Note: data in connfd's kernel receive buffer is removed once it has been read, and data in the kernel send buffer is removed once the segment has been sent and the corresponding ACK received. write(2) merely copies data from the application-layer send buffer into connfd's kernel send buffer, and read(2) merely copies data from connfd's kernel receive buffer into the application-layer receive buffer.

(ii) epoll workflow and issues needing attention

Level-trigger (LT) mode: the basic processing flow is very similar to poll. Note that epoll_wait(2) returns only the active descriptors, so they can be processed directly without traversing the whole set. As with poll, a successful write(2) merely means the data was copied into the kernel buffer.

EPOLLIN event: an empty kernel receive buffer for a socket corresponds to a low level; a non-empty receive buffer corresponds to a high level.
EPOLLOUT event: a full kernel send buffer for a socket corresponds to a low level; a send buffer with free space corresponds to a high level.

Note: once a write(2) fails to send everything, subsequent writes should append their data directly to the application-layer output buffer and wait for the EPOLLOUT event.

Edge-trigger (ET) mode. Disadvantages:

It may cause the missed-accept bug described above (a connection is leaked if one accept(2) is skipped), which is hard to handle;

When the number of file descriptors reaches the limit, the level simply stays high and no further edge is triggered, which is also troublesome to handle.

Recommended reasons for using epoll in LT mode: first, it is compatible with poll; second, LT mode cannot miss events. But do not watch EPOLLOUT from the start, or a busy loop results (right after a connection is established there is nothing to write, yet the kernel send buffer is empty, so EPOLLOUT keeps firing). Instead, watch EPOLLOUT only when write(2) cannot copy everything into the kernel buffer, append the unwritten data to the application-layer output buffer, and stop watching EPOLLOUT once that buffer is drained. In LT mode, read(2) and write(2) do not have to loop until EAGAIN, which saves system calls and reduces latency. (Note: in ET mode, read(2) must loop until EAGAIN, and write(2) until the output buffer is drained or EAGAIN is returned.)

Note: ET mode can be used more rigorously by setting listenfd to non-blocking. When accept(2) returns, besides establishing the current connection, one must not return to epoll_wait(2) immediately; instead, keep calling accept(2) in a loop until it returns -1 with errno == EAGAIN. A code example follows:

if (ev.events & EPOLLIN)
{
    do
    {
        struct sockaddr_in stSockAddr;
        socklen_t iSockAddrSize = sizeof(struct sockaddr_in);
        int iRetCode = accept(listenfd, (struct sockaddr *)&stSockAddr,
                              &iSockAddrSize);
        if (iRetCode > 0)
        {
            /* ... establish the connection    */
            /* ... register events of interest */
        }
        else
        {
            /* stop only when EAGAIN occurs */
            if (errno == EAGAIN)
                break;
        }
    } while (true);
    /* ... other EPOLLIN events */
}
(iii) Comparison of all aspects of the three

select: fd_set imposes a fixed limit (FD_SETSIZE) on the number of file descriptors. The set is copied into kernel space on every call and scanned with O(n) complexity: all descriptors must be traversed to determine whether an event occurred.

poll: also copies and polls. The pollfd array is copied into the kernel on every call, but there is no hard limit on the maximum number of connections.

epoll: the set of sockets of interest is maintained entirely inside the kernel. Each descriptor is registered once with epoll_ctl(2) instead of being copied on every call, which greatly reduces copying overhead, and readiness is collected through a callback-based event-notification mechanism.




Note: epoll is not the most efficient choice under all conditions; which I/O multiplexing mechanism to use depends on the actual workload.

If the number of connected sockets is small and most of them are active, constantly invoking the callbacks may be less efficient than a single traversal; in that case epoll can actually be slower than select and poll.




