A Linux socket server that must handle a large number of client connection requests needs non-blocking I/O and I/O multiplexing. select, poll, and epoll are the I/O multiplexing interfaces provided by the Linux API. Since epoll was added in Linux 2.6, it has been widely used in high-performance servers. The well-known nginx uses epoll to implement I/O multiplexing and support high concurrency, and it is increasingly popular in high-concurrency scenarios. See the referenced article, "Nginx becomes the most popular Web server for the global Top 1000 websites".
According to W3Techs statistics from July 3, 34.9% of the world's top 1000 websites use nginx, which puts nginx ahead of Apache as the web server most trusted by high-traffic sites.
Select
The following is the function interface for select:
int select(int n, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);
The select function monitors three categories of file descriptors: readfds, writefds, and exceptfds. After the call, select blocks until a descriptor is ready (readable, writable, or in an exceptional condition) or the timeout expires, and then returns. The timeout argument specifies the maximum wait: a NULL pointer blocks indefinitely, and a timeval set to zero makes select return immediately. When select returns, you find the ready descriptors by traversing the fd_set.
select is supported on almost all platforms, and this portability is one of its advantages. Its disadvantage is that the number of file descriptors a single process can monitor is capped, at 1024 on Linux (the FD_SETSIZE macro). The limit can be raised by modifying the macro definition or even recompiling the kernel, but this also reduces efficiency.
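The calling pattern described above can be sketched as follows. This is a minimal illustration, not a production server: the helper names select_readable and select_demo are hypothetical, and pipes stand in for client sockets. It shows the fd_set being rebuilt before the call, the FD_SETSIZE restriction, and the traversal needed afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: report which of fds[0..nfds-1] are readable.
 * Writes the ready descriptors to ready[] and returns their count,
 * 0 on timeout, or -1 on error. */
int select_readable(const int *fds, int nfds, int timeout_ms, int *ready)
{
    fd_set readfds;
    int maxfd = -1;

    /* The set must be rebuilt before every call because select
     * overwrites it with the result. */
    FD_ZERO(&readfds);
    for (int i = 0; i < nfds; i++) {
        /* An fd_set cannot hold descriptors at or above FD_SETSIZE. */
        assert(fds[i] >= 0 && fds[i] < FD_SETSIZE);
        FD_SET(fds[i], &readfds);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }

    struct timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* The first argument is the highest descriptor plus one. */
    int n = select(maxfd + 1, &readfds, NULL, NULL, &tv);
    if (n <= 0)
        return n;

    /* Traverse every monitored descriptor to find the ready ones. */
    int count = 0;
    for (int i = 0; i < nfds; i++) {
        if (FD_ISSET(fds[i], &readfds))
            ready[count++] = fds[i];
    }
    return count;
}

/* Usage sketch: two pipes, data written to the second one only.
 * Returns 1 when exactly the second pipe is reported readable. */
int select_demo(void)
{
    int a[2], b[2];
    if (pipe(a) < 0)
        return -1;
    if (pipe(b) < 0) {
        close(a[0]); close(a[1]);
        return -1;
    }

    int ok = 0;
    if (write(b[1], "x", 1) == 1) {
        int fds[2] = { a[0], b[0] };
        int ready[2];
        int n = select_readable(fds, 2, 100, ready);
        ok = (n == 1 && ready[0] == b[0]);
    }

    close(a[0]); close(a[1]); close(b[0]); close(b[1]);
    return ok;
}
```

Both loops run over every monitored descriptor on each call, regardless of how many are actually ready.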
Poll
int poll(struct pollfd *fds, nfds_t nfds, int timeout);
Unlike select, which uses three bitmaps to represent three fd_sets, poll takes a pointer to an array of pollfd structures.
struct pollfd {
    int fd;          /* file descriptor */
    short events;    /* requested events to watch */
    short revents;   /* returned events witnessed */
};
The pollfd structure holds both the events to monitor (events) and the events that occurred (revents), so poll no longer uses select's value-result argument passing. pollfd also has no maximum count (although performance still degrades when the number is large). As with select, after poll returns you must traverse the pollfd array to find the ready descriptors.
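A minimal sketch of the same task written with poll is shown below. The names poll_readable, poll_demo, and MAX_WATCHED are hypothetical and exist only for this illustration; the cap is a convenience of the sketch, not a limit of poll itself.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

#define MAX_WATCHED 64  /* hypothetical cap for this sketch only */

/* Hypothetical helper: report which of fds[0..nfds-1] are readable.
 * Writes the readable descriptors to ready[] and returns their count,
 * 0 on timeout, or -1 on error. */
int poll_readable(const int *fds, int nfds, int timeout_ms, int *ready)
{
    struct pollfd pfds[MAX_WATCHED];
    assert(nfds >= 0 && nfds <= MAX_WATCHED);

    for (int i = 0; i < nfds; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;  /* input: what to watch */
        pfds[i].revents = 0;      /* output: filled in by poll */
    }

    int n = poll(pfds, (nfds_t)nfds, timeout_ms);
    if (n <= 0)
        return n;

    /* The whole array is still traversed to find the ready entries. */
    int count = 0;
    for (int i = 0; i < nfds; i++) {
        if (pfds[i].revents & POLLIN)
            ready[count++] = pfds[i].fd;
    }
    return count;
}

/* Usage sketch: two pipes, data written to the second one only.
 * Returns 1 when exactly the second pipe is reported readable. */
int poll_demo(void)
{
    int a[2], b[2];
    if (pipe(a) < 0)
        return -1;
    if (pipe(b) < 0) {
        close(a[0]); close(a[1]);
        return -1;
    }

    int ok = 0;
    if (write(b[1], "x", 1) == 1) {
        int fds[2] = { a[0], b[0] };
        int ready[2];
        int n = poll_readable(fds, 2, 100, ready);
        ok = (n == 1 && ready[0] == b[0]);
    }

    close(a[0]); close(a[1]); close(b[0]); close(b[1]);
    return ok;
}
```

Because events and revents are separate fields, a real server could keep the pollfd array across calls instead of rebuilding it each time, which is not possible with an fd_set.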
As shown above, both select and poll require traversing the file descriptors after returning in order to find the ready sockets. In practice, of a large number of simultaneously connected clients, only a few may be ready at any given moment, so efficiency decreases linearly as the number of monitored descriptors grows.
Epoll
The epoll interface is as follows:
int epoll_create(int size);
int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);
typedef union epoll_data {
    void *ptr;
    int fd;
    __uint32_t u32;
    __uint64_t u64;
} epoll_data_t;
struct epoll_event {
    __uint32_t events;   /* Epoll events */
    epoll_data_t data;   /* User data variable */
};
int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout);
The three main functions are epoll_create, epoll_ctl, and epoll_wait. epoll_create creates an epoll file descriptor. The size parameter does not limit the number of descriptors epoll can monitor; it is only a hint to the kernel for the initial allocation of internal data structures. The function returns an epoll descriptor, or -1 if creation fails. epoll_ctl performs the operation op on the specified descriptor fd, where event describes the events to monitor for that fd. There are three operations: EPOLL_CTL_ADD, EPOLL_CTL_DEL, and EPOLL_CTL_MOD, which add, delete, and modify the monitored events for fd, respectively. epoll_wait waits for I/O events on epfd and returns at most maxevents events.
With select/poll, the kernel scans all monitored file descriptors only when the process calls the function. epoll instead registers each file descriptor in advance with epoll_ctl(). Once a descriptor becomes ready, the kernel uses a callback-like mechanism to activate it quickly, and the process is notified when it calls epoll_wait().
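The register-once, wait-many pattern can be sketched as follows. The helper names and MAX_EVENTS are hypothetical, and the demo uses epoll_create1(0), a later variant of epoll_create that takes flags instead of the size hint. In a typical server, epoll_watch_readable would be called once per connection and epoll_collect_ready repeatedly in the event loop.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

#define MAX_EVENTS 16  /* hypothetical batch size for this sketch */

/* Register fd with the epoll instance for readability, one time only.
 * Returns 0 on success, -1 on error. */
int epoll_watch_readable(int epfd, int fd)
{
    struct epoll_event ev = {0};
    ev.events = EPOLLIN;
    ev.data.fd = fd;  /* user data handed back by epoll_wait */
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Wait for events and copy the ready descriptors into ready[].
 * Returns the number of ready descriptors, 0 on timeout, -1 on error. */
int epoll_collect_ready(int epfd, int timeout_ms, int *ready, int max)
{
    struct epoll_event events[MAX_EVENTS];
    assert(max > 0);
    if (max > MAX_EVENTS)
        max = MAX_EVENTS;

    int n = epoll_wait(epfd, events, max, timeout_ms);

    /* Only ready descriptors are returned, so this loop is
     * proportional to the ready count, not the registered count. */
    for (int i = 0; i < n; i++)
        ready[i] = events[i].data.fd;
    return n;
}

/* Usage sketch: two pipes registered, data written to the second only.
 * Returns 1 when exactly the second pipe is reported ready. */
int epoll_demo(void)
{
    int a[2], b[2];
    if (pipe(a) < 0)
        return -1;
    if (pipe(b) < 0) {
        close(a[0]); close(a[1]);
        return -1;
    }

    int ok = 0;
    int epfd = epoll_create1(0);
    if (epfd >= 0) {
        int ready[MAX_EVENTS];
        if (epoll_watch_readable(epfd, a[0]) == 0 &&
            epoll_watch_readable(epfd, b[0]) == 0 &&
            write(b[1], "x", 1) == 1) {
            int n = epoll_collect_ready(epfd, 100, ready, MAX_EVENTS);
            ok = (n == 1 && ready[0] == b[0]);
        }
        close(epfd);
    }

    close(a[0]); close(a[1]); close(b[0]); close(b[1]);
    return ok;
}
```

Unlike the select and poll patterns, nothing is rebuilt per call: the interest list lives in the kernel, and the caller receives only the descriptors that are ready.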
The main advantages of epoll are as follows:
1. The number of monitored descriptors is not restricted. epoll supports up to the maximum number of open files, which is generally far greater than 2048; for example, it is about 100,000 on a machine with 1 GB of memory. The exact number can be checked with cat /proc/sys/fs/file-max and is closely tied to system memory. The biggest disadvantage of select is the limit on the number of fds a process can monitor, which is insufficient for servers with many connections. A multi-process design (as Apache uses) is one workaround, but although creating a process on Linux is relatively cheap, the cost is still not negligible, and data synchronization between processes is far less efficient than synchronization between threads, so it is not a perfect solution.
2. I/O efficiency does not decrease as the number of monitored fds grows. Unlike select and poll, epoll does not poll; it is implemented with a callback attached to each fd, and only ready fds execute the callback.
3. Both level triggering and edge triggering are supported. With edge triggering, the process is told only once that a file descriptor has just become ready; if it takes no action, it is not told again. In theory edge triggering performs better, but the code is considerably more complex to write.
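4. Readiness information is passed between the kernel and user space at low cost. It is often stated that epoll maps the same block of memory into the kernel and user space with mmap to avoid needless memory copies. The mainline Linux implementation does not do this: epoll_wait copies the ready events into the caller's buffer. The saving comes from copying only the ready events, rather than passing the whole descriptor set across the boundary on every call.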
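The extra complexity of edge triggering mentioned in point 3 comes mainly from the need to use non-blocking descriptors and to drain each one until read reports EAGAIN, since no further notification arrives for data that is already pending. A minimal sketch, with hypothetical helper names and a pipe standing in for a socket:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Put fd into non-blocking mode. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Register fd for edge-triggered read notifications. */
int add_edge_triggered(int epfd, int fd)
{
    struct epoll_event ev = {0};
    ev.events = EPOLLIN | EPOLLET;
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Read until the descriptor is empty (EAGAIN) or at end of file.
 * Returns the total number of bytes read, or -1 on error. */
long drain(int fd)
{
    char buf[4096];
    long total = 0;
    assert(fd >= 0);

    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) {
            total += n;
            continue;
        }
        if (n == 0)
            return total;  /* end of file */
        if (errno == EINTR)
            continue;      /* interrupted, retry */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return total;  /* nothing left for now */
        return -1;
    }
}

/* Usage sketch: the event is reported once; a second wait without
 * new data reports nothing even though 5 bytes are still unread.
 * Returns 1 when that behaviour is observed. */
int edge_demo(void)
{
    int p[2];
    if (pipe(p) < 0)
        return -1;

    int ok = 0;
    int epfd = epoll_create1(0);
    if (epfd >= 0) {
        struct epoll_event ev;
        if (set_nonblocking(p[0]) == 0 &&
            add_edge_triggered(epfd, p[0]) == 0 &&
            write(p[1], "hello", 5) == 5) {
            int first = epoll_wait(epfd, &ev, 1, 100);
            int second = epoll_wait(epfd, &ev, 1, 0);
            long got = drain(p[0]);
            ok = (first == 1 && second == 0 && got == 5);
        }
        close(epfd);
    }

    close(p[0]); close(p[1]);
    return ok;
}
```

With level triggering (the default, without EPOLLET), the second epoll_wait would report the descriptor again for as long as unread data remained.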
The difference between select, poll, and epoll in Linux