How to handle massive numbers of connections with epoll on Linux


Recently, I've been reading about Linux C server programming, which involves a lot of connection-handling problems. On Linux we generally use the TCP/IP stack to write simple client/server (C/S) code, and the select() function comes up constantly. select() determines the status of one or more sockets: for each socket, the caller can query its readability, writability, and error conditions. The fd_set structure represents the set of descriptors waiting to be checked. select() can be used to implement a multiplexed I/O model in the so-called non-blocking style: when a process or thread calls such a function, it does not have to sleep until an event occurs. The call returns either way, and different return values report what happened. If the event has occurred, it is handled just as in the blocking style; if it has not, a return code says so and the process or thread simply continues executing, which is why this style is efficient. I won't go into further detail here.

The biggest difference between epoll and select is that the fd_set used by select is limited in size, defined by the kernel's FD_SETSIZE constant (1024 or 2048 by default, depending on the system). To change that size, you need to recompile the Linux kernel. Epoll is not like that: its ceiling on descriptors is the maximum number of files that can be opened, which is generally far greater than 2048. You can check your machine's value with `cat /proc/sys/fs/file-max`. Other advantages include kernel-side fine-tuning that speeds up passing event information between the kernel and user space. These involve Linux kernel mechanisms I am not very familiar with.

The following describes how to use epoll.

Epoll first creates an epoll handle with epoll_create(int size) (after, of course, the basic series of socket operations). The size argument is a hint for the number of descriptors epoll is expected to monitor; it must be greater than zero, and modern kernels ignore its value. Then, in your network main loop (the while (1) { ... } we often write), call epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout) on each iteration to query all monitored descriptors and check their readability and writability.

Here, epfd (called kdpfd in the code below) is the handle created with epoll_create. events is a pointer to an array of struct epoll_event; when epoll_wait returns, this array holds all the ready read/write events. maxevents is the capacity of that array, at most the number of socket handles being monitored. The last parameter, timeout, is the epoll_wait timeout in milliseconds: 0 means return immediately, and -1 means wait until an event occurs. Generally, if the network main loop runs as a separate thread, you can use -1 for efficiency; if it shares a thread with the main logic, use 0 so the main loop stays responsive. After epoll_wait() returns, you traverse the events in a loop.

    for (n = 0; n < nfds; ++n) {
        if (events[n].data.fd == listener) {
            /* A new connection is arriving; accept it. */
            client = accept(listener, (struct sockaddr *)&local, &addrlen);
            if (client < 0) {
                perror("accept");
                continue;
            }
            /* Put the new connection in non-blocking mode, implemented
               with fcntl(client, F_SETFL, fcntl(client, F_GETFL, 0) | O_NONBLOCK). */
            setnonblocking(client);
            /* Note: EPOLLIN | EPOLLET does not watch the socket for writability,
               so epoll will not return write events. To listen for writes too,
               use EPOLLIN | EPOLLOUT | EPOLLET. */
            ev.events = EPOLLIN | EPOLLET;
            ev.data.fd = client;
            /* After filling in the event, add it to epoll's interest list with
               epoll_ctl: EPOLL_CTL_ADD adds a new event, EPOLL_CTL_DEL removes
               one, and EPOLL_CTL_MOD modifies how an event is monitored. */
            if (epoll_ctl(kdpfd, EPOLL_CTL_ADD, client, &ev) < 0) {
                fprintf(stderr, "epoll set insertion error: fd=%d\n", client);
                return -1;
            }
        } else {
            /* Not the listening socket, so a client socket is ready; handle
               the event, e.g. exchange messages with send()/recv(). */
            handle_message(events[n].data.fd);
        }
    }
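The setnonblocking() helper used above is not defined in the original. A plausible implementation follows the fcntl() recipe mentioned in the comments; note that it must use F_GETFL (file status flags), not F_GETFD (descriptor flags), which is a common mistake.

```c
#include <fcntl.h>
#include <unistd.h>

/* Switch a descriptor to non-blocking mode; returns 0 on success, -1 on error. */
int setnonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);   /* F_GETFL, not F_GETFD */
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}
```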

Finally, you close the epoll handle with close(). Only four functions are involved in total: epoll_create, epoll_ctl, epoll_wait, and close.

Generally, to improve server efficiency, you can use separate threads for event monitoring and event handling.


A comparison of epoll with other models, collected from around the web:

PPC/TPC Model

These two models (process-per-connection and thread-per-connection) share the same idea: let every incoming connection take care of its own business and stop bothering the main loop. PPC forks a process for each connection, while TPC spawns a thread. But not being bothered has a price in both time and space: when there are many connections, all those processes/threads start context-switching and the overhead piles up. The maximum number of connections this kind of model can accept is therefore not high, generally a few hundred.

Select model

Maximum concurrency limit: the number of FDs (file descriptors) a process can put in an fd_set is restricted, set by FD_SETSIZE, whose default is 1024/2048, so the select model's maximum concurrency is limited accordingly. Modify FD_SETSIZE yourself? The idea sounds good, but look at the next point...

Efficiency problem: every select call linearly scans the entire FD set, so efficiency declines linearly as the set grows. The consequence of raising FD_SETSIZE is that everything gets slower for everyone. What, they all timed out?!

Kernel/user-space memory copies: how should the kernel notify user space of which FDs are ready? On this question, select's answer is a full memory copy of the descriptor sets on every call.

Poll Model

Basically the same efficiency as select. Poll removes the first drawback (the FD_SETSIZE limit), but the second and third drawbacks of select remain unaddressed.

Epoll Improvement

Having criticized the other models one by one, let's look at epoll's improvements. In fact, select's disadvantages are exactly epoll's advantages, point for point.

Epoll has no hard limit on the maximum number of concurrent connections: the ceiling is the maximum number of files that can be opened, which is generally far greater than 2048 and largely determined by system memory. You can see the exact number with `cat /proc/sys/fs/file-max`.

Epoll's biggest efficiency advantage is that it only deals with your "active" connections, regardless of the total number of connections. Therefore, in a real network environment, epoll is much more efficient than select and poll.

Memory copies: on this point epoll is commonly described as using "shared memory" between kernel and user space, so this copy is eliminated as well.
