When writing high-load server programs that must handle a very large number of connections, the classic multithreaded and select-based approaches no longer suffice. They should be set aside in favor of epoll/kqueue//dev/poll to capture I/O events. AIO is also briefly introduced at the end.
When handling tens of thousands of client connections, network services often become inefficient or even grind to a halt entirely; this is known as the c10k problem.
With the rapid development of the Internet, more and more network services are running into the c10k problem, so developers of large web sites need some understanding of it. The main reference for this article is http://www.kegel.com/c10k.html.

The essence of the c10k problem is this: with a poorly designed server, the relationship between performance, connection count, and machine power is often non-linear. For example, a classic select-based program that ignores the c10k problem may handle 1,000 concurrent connections well on an old server, yet fail to handle 2,000 concurrent connections on a new server with twice the performance. This is because, with such a strategy, the cost of many operations is proportional to the current number of connections, so the resource cost of a single task grows as O(n) in the connection count. When the server must perform I/O on tens of thousands of sockets, this accumulated cost becomes considerable, and system throughput can no longer keep up with machine performance. To solve this problem, the strategy for serving connections has to change. There are two main dimensions: 1. how the application cooperates with the operating system to obtain I/O events and schedule I/O on many sockets; 2. how the application maps tasks onto threads/processes. For the former there are three approaches: blocking I/O, non-blocking I/O, and asynchronous I/O; for the latter the main options are one process per task, one thread per task, single-threaded, a thread pool shared by many tasks, and some more complex variants. The commonly used classic strategies are as follows:
1. Serve one client with each thread/process, and use blocking I/O. This is the usual strategy for small programs and Java, and is also a popular choice for interactive, long-connection applications (such as BBS). It rarely meets the requirements of high-performance programs, but its advantage is that it is extremely simple and makes it easy to embed complex interaction logic. Apache, ftpd, and the like all work in this mode. (A minimal sketch of this model appears after this list.)
2. Serve many clients with a single thread, and use non-blocking I/O and readiness notification. This is the classic model; datapipe and similar programs are implemented this way. Its advantages are simplicity, easy portability, and adequate performance; its drawback is that it cannot make full use of multiple CPUs, especially when the program itself contains no complex business logic.
3. Serve many clients with each thread, and use non-blocking I/O and readiness notification. A simple improvement on classic model 2; the drawback is that it is easy to introduce bugs under multithreaded concurrency, and some OSes do not even support multithreaded use of readiness notification.
4. Serve many clients with each thread, and use asynchronous I/O. On an OS with AIO support this can deliver considerable performance. However, the AIO programming model differs significantly from the classic model, which makes it hard to write a framework that supports both AIO and the classic model, and reduces program portability. On Windows this is essentially the only option available.
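As a concrete illustration of model 1, here is a minimal sketch of a thread-per-client echo server using blocking I/O. The port number, buffer size, and echo behaviour are arbitrary choices for illustration, not something prescribed by the article.

C Code
/* Model 1 sketch: one thread per client, blocking I/O. */
#include <netinet/in.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

static void *serve_client(void *arg)
{
    int client = (int)(long)arg;
    char buf[4096];
    ssize_t n;

    /* Blocking read/write loop: the thread simply sleeps inside read()
     * until data arrives, so concurrency = number of threads. */
    while ((n = read(client, buf, sizeof(buf))) > 0) {
        if (write(client, buf, (size_t)n) < 0)   /* naive echo */
            break;
    }
    close(client);
    return NULL;
}

int main(void)
{
    int listener = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(8080);             /* arbitrary port for the sketch */

    if (bind(listener, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(listener, 128) < 0) {
        perror("bind/listen");
        return 1;
    }

    for (;;) {
        int client = accept(listener, NULL, NULL);
        pthread_t tid;

        if (client < 0)
            continue;
        /* One thread per connection: simple, but tens of thousands of
         * connections mean tens of thousands of threads, which is exactly
         * what the c10k problem punishes. */
        if (pthread_create(&tid, NULL, serve_client, (void *)(long)client) == 0)
            pthread_detach(tid);
        else
            close(client);
    }
}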
This article mainly discusses the details of model 2, that is, how the application handles socket I/O under that model.
Select and poll
The typical flow of the most primitive model, synchronous blocking I/O, is as follows:

(Figure: typical flow of the synchronous blocking I/O model)

From the application's point of view, a read call blocks for a long time, so the application needs a large number of threads to handle concurrent access. Synchronous non-blocking I/O improves on this; a classic single-threaded server program often has the following structure:
C Code
do {
    get readiness notification of all sockets
    dispatch ready handles to the corresponding handlers
    if (readable) {
        read the socket
        if (read done)
            handler processes the request
    }
    if (writable)
        write response
    if (nothing to do)
        close socket
} while (true)
(Figure: typical flow of the asynchronous blocking I/O model)
The key part is readiness notification: finding out which sockets have I/O events pending. The approach generally learned first from textbooks and example programs is select, defined as follows:

int select(int n, fd_set *rd_fds, fd_set *wr_fds, fd_set *ex_fds, struct timeval *timeout);

select uses the fd_set structure, and the man page tells us that the number of handles an fd_set can hold is tied to FD_SETSIZE. In fact, on *nix an fd_set is an array of bit flags, where each bit indicates whether the fd with the corresponding index is in the set. An fd_set can only hold handles whose numeric value is less than FD_SETSIZE, which defaults to 1024; if a handle that is too large is put into an fd_set, the program will crash when the array overflows. By default the system limits the largest handle number in a process to no more than 1024, but this limit can be raised with the ulimit -n command or the setrlimit function. If a program is unlucky enough to be compiled with FD_SETSIZE=1024 and then runs where ulimit -n > 1024, one can only pray that it does not blow up. In ACE, ACE_Select_Reactor guards against this explicitly, but functions such as recv_n use select indirectly, which still requires attention.

To address the fd_set problem, *nix provides the poll function as a replacement for select. Its interface is as follows:

int poll(struct pollfd *ufds, unsigned int nfds, int timeout);

The first parameter, ufds, is a pollfd array supplied by the user, whose size is decided by the user, thus avoiding the trouble caused by FD_SETSIZE. ufds is a complete replacement for fd_set, so porting from select to poll is easy. At this point, at least when facing c10k, we can write a program that works; a minimal poll-based sketch follows.
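Here is a rough illustration of such a single-threaded poll loop (model 2). It assumes a non-blocking listening socket created elsewhere; the MAX_CLIENTS limit and the echo behaviour are arbitrary illustrative choices, not part of the original article.

C Code
/* Sketch of a single-threaded poll() event loop.
 * listener is assumed to be a non-blocking, bound, listening TCP socket. */
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAX_CLIENTS 1024   /* arbitrary limit for this sketch */

void poll_loop(int listener)
{
    struct pollfd fds[MAX_CLIENTS];
    int nfds = 1;

    fds[0].fd = listener;
    fds[0].events = POLLIN;

    for (;;) {
        /* Block until at least one registered fd is ready. */
        if (poll(fds, (nfds_t)nfds, -1) < 0)
            continue;

        /* New connection on the listener? */
        if (fds[0].revents & POLLIN) {
            int client = accept(listener, NULL, NULL);
            if (client >= 0 && nfds < MAX_CLIENTS) {
                fds[nfds].fd = client;
                fds[nfds].events = POLLIN;   /* only watch readability */
                ++nfds;
            } else if (client >= 0) {
                close(client);
            }
        }

        /* Scan the client slots; this O(n) scan on every wakeup is exactly
         * the per-call cost discussed below for select/poll. */
        for (int i = 1; i < nfds; ++i) {
            if (!(fds[i].revents & (POLLIN | POLLERR | POLLHUP)))
                continue;
            char buf[4096];
            ssize_t n = read(fds[i].fd, buf, sizeof(buf));
            if (n <= 0) {                     /* closed or error: drop it */
                close(fds[i].fd);
                fds[i] = fds[--nfds];         /* compact the array */
                --i;
            } else {
                write(fds[i].fd, buf, (size_t)n);   /* naive echo */
            }
        }
    }
}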
However, the performance of select and poll drops dramatically as the number of connections grows. There are two reasons. First, on every select/poll call the operating system has to rebuild the list of events the current thread cares about and hang the thread on the fairly complex wait queues of all those handles, which is time-consuming. Second, after select/poll returns, the application has to scan the entire handle list to dispatch events, which is also time-consuming. Both costs grow with the number of connections, and the density of I/O events also grows with the number of connections, so CPU usage ends up growing as O(n²) in the number of concurrent connections.
epoll, kqueue, /dev/poll
For the reasons above, *nix hackers developed epoll, kqueue, and /dev/poll to help us out; let us kneel down for three minutes to thank these great gods. epoll is the Linux solution, kqueue is the FreeBSD solution, and /dev/poll is the oldest, Solaris solution; they are increasingly difficult to use in that order.
Simply put, these APIs do two things: 1. They avoid the overhead of building the event wait structure on every select/poll call: the kernel maintains a long-lived list of events of interest, and the application modifies this list and harvests I/O events through a handle. 2. They avoid the overhead of the application scanning the whole handle table after select/poll returns: the kernel directly returns a list of only the events that actually occurred.
Before getting into the specific APIs, let us look at the concepts of edge triggering (edge trigger) and level triggering (level trigger, also translated as condition trigger). Edge triggering reports an I/O event whenever the state changes; level triggering reports an I/O event whenever a condition holds. Take reading from a socket as an example: suppose that, after a long silence, 100 bytes arrive. At this point both the edge-triggered and the level-triggered API deliver a read-ready notification to the application. The application reads 50 bytes and then calls the API again to wait for I/O events. The level-triggered API will immediately return another read-ready notification, because 50 bytes are still waiting to be read. The edge-triggered API, however, will block in a long wait, because the readable state has not changed. So when using an edge-triggered API, make sure to read the socket until it returns EWOULDBLOCK before setting it aside; a drain loop like the one sketched below is typical. When using a level-triggered API, do not watch for writability unless the application actually needs to write, otherwise the API will keep returning write-ready notifications forever. The familiar select belongs to the level-triggered category; I once caused 100% CPU usage by keeping a long-term watch on a socket's write event.
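Below is a minimal sketch of such a drain loop for a non-blocking socket; handle_data() is a hypothetical callback standing in for whatever the application does with the bytes it reads, and is not part of the original article.

C Code
/* Drain a non-blocking socket until EWOULDBLOCK, as required when its
 * readiness is reported by an edge-triggered API (e.g. epoll with EPOLLET). */
#include <errno.h>
#include <unistd.h>

void handle_data(int fd, const char *buf, size_t len);   /* assumed to exist elsewhere */

/* Returns 0 when the kernel buffer is empty, -1 on peer close or real error. */
int drain_socket(int fd)
{
    char buf[4096];

    for (;;) {
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n > 0) {
            handle_data(fd, buf, (size_t)n);   /* consume what we got */
            continue;                          /* keep reading: more may follow */
        }
        if (n == 0)
            return -1;                         /* peer closed the connection */
        if (errno == EWOULDBLOCK || errno == EAGAIN)
            return 0;    /* buffer empty: safe to wait for the next edge */
        if (errno == EINTR)
            continue;                          /* interrupted, just retry */
        return -1;                             /* real error */
    }
}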
The epoll-related calls are as follows:
C Code
int epoll_create(int size);
int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);
int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout);
epoll_create creates the event-interest table inside the kernel, which is roughly equivalent to creating an fd_set.
epoll_ctl modifies this table, equivalent to operations such as FD_SET.
epoll_wait waits for I/O events to occur, equivalent to the select/poll functions.
epoll is a complete upgrade of select/poll, and the set of events it supports is exactly the same. epoll supports both edge triggering and level triggering; in general, edge triggering performs better. Here is a simple example:
C Code
struct epoll_event ev, *events;
int kdpfd = epoll_create(maxevents);   /* the size argument is only a hint; it must be > 0 */
ev.events = EPOLLIN | EPOLLET;         /* note EPOLLET: this requests edge triggering */
ev.data.fd = listener;
epoll_ctl(kdpfd, EPOLL_CTL_ADD, listener, &ev);
for (;;) {
    nfds = epoll_wait(kdpfd, events, maxevents, -1);
    for (n = 0; n < nfds; ++n) {
        if (events[n].data.fd == listener) {
            client = accept(listener, (struct sockaddr *) &local, &addrlen);
            if (client < 0) {
                perror("accept");
                continue;
            }
            setnonblocking(client);
            ev.events = EPOLLIN | EPOLLET;
            ev.data.fd = client;
            if (epoll_ctl(kdpfd, EPOLL_CTL_ADD, client, &ev) < 0) {
                fprintf(stderr, "epoll set insertion error: fd=%d\n", client);
                return -1;
            }
        } else