Event Poll (epoll) in Detail
Because of the limitations of poll() and select(), the Linux 2.6 kernel introduced the event poll (epoll) mechanism. Although somewhat more complicated, epoll solves the basic performance problem the two share and adds some new features.
Every call to poll() or select() must pass in the complete list of file descriptors to be monitored, and the kernel must traverse that entire list. When the list grows large (hundreds of file descriptors or more), the traversal performed on each call becomes a significant bottleneck.
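To make the per-call cost concrete, here is a minimal sketch using poll(). The names count_ready_with_poll and poll_demo are hypothetical and exist only for this illustration. The point to notice is that the whole array goes to the kernel on every call and is scanned again afterwards, even though only one descriptor is ready.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: polls every descriptor in fds once, without
 * blocking, and returns how many are readable (or -1 on error). */
int count_ready_with_poll(struct pollfd *fds, nfds_t nfds)
{
    assert(fds != NULL);

    /* The kernel receives, and must walk, all nfds entries on every call. */
    int n = poll(fds, nfds, 0);
    if (n <= 0)
        return n;

    /* User space then scans the whole array again to find the ready ones. */
    int ready = 0;
    for (nfds_t i = 0; i < nfds; i++) {
        if (fds[i].revents & POLLIN)
            ready++;
    }
    return ready;
}

/* Usage sketch: two pipes are watched, but only the second has data. */
int poll_demo(void)
{
    int a[2], b[2];
    if (pipe(a) < 0 || pipe(b) < 0)
        return -1;

    struct pollfd fds[2] = {
        { .fd = a[0], .events = POLLIN },
        { .fd = b[0], .events = POLLIN },
    };
    if (write(b[1], "x", 1) != 1)
        return -1;

    int ready = count_ready_with_poll(fds, 2);

    close(a[0]); close(a[1]); close(b[0]); close(b[1]);
    return ready;
}
```

With two descriptors the double scan is negligible; the cost described above appears when the array holds hundreds of entries and only a handful are ever ready.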
1. Creating a New epoll Instance
Use epoll_create() or epoll_create1() to create an epoll context. epoll_create1() is an extended version of epoll_create().
    #include <sys/epoll.h>
    int epoll_create(int size);
On success, epoll_create() creates an epoll instance and returns a file descriptor associated with that instance. This file descriptor has no relation to a real file; it exists only to be passed to later epoll calls. The size parameter tells the kernel roughly how many file descriptors will be monitored; it is a hint, not a maximum. On early kernels a good approximation improved performance, although an exact number was never required (since Linux 2.6.8 the value is ignored, but it must still be greater than zero). On error, -1 is returned and errno is set to one of the following values:
EINVAL  size is not a positive number.
ENFILE  The system-wide limit on the number of open files has been reached.
ENOMEM  There is not enough memory to complete the operation.
A typical call looks like this:
    int epfd;

    epfd = epoll_create(100);
    if (epfd < 0)
        perror("epoll_create");
The file descriptor returned by epoll_create() must be closed with close() once it is no longer needed.
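The extended call mentioned earlier, epoll_create1(), takes a flags argument instead of a size. The sketch below, under the assumption that close-on-exec behavior is wanted, creates an instance with EPOLL_CLOEXEC and then releases it with close(). The function names open_epoll and open_epoll_demo are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: creates an epoll instance whose descriptor is
 * closed automatically across exec(). Returns the descriptor or -1. */
int open_epoll(void)
{
    int epfd = epoll_create1(EPOLL_CLOEXEC);
    if (epfd < 0)
        perror("epoll_create1");
    return epfd;
}

/* Usage sketch: create, check the close-on-exec flag, then close.
 * Returns 1 if the flag is set, 0 if it is not, -1 on error. */
int open_epoll_demo(void)
{
    int epfd = open_epoll();
    if (epfd < 0)
        return -1;

    int flags = fcntl(epfd, F_GETFD);
    assert(flags >= 0);   /* F_GETFD should not fail on a valid descriptor */
    int cloexec = (flags & FD_CLOEXEC) != 0;

    /* The epoll descriptor is released like any other descriptor. */
    if (close(epfd) < 0)
        return -1;
    return cloexec;
}
```

Passing 0 as the flags argument gives the same behavior as epoll_create() without the size hint.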
2. epoll Control
epoll_ctl() adds file descriptors to, and removes them from, a given epoll context:
    #include <sys/epoll.h>
    int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);
The epoll_event struct is defined in the header file <sys/epoll.h>:
    struct epoll_event {
        __u32 events;
        union {
            void *ptr;
            int fd;
            __u32 u32;
            __u64 u64;
        } data;
    };
epoll_ctl() operates on the epoll instance associated with epfd. The op parameter specifies the operation to perform on fd. The event parameter further describes the behavior of that operation.
The following are the valid values of the op parameter:
EPOLL_CTL_ADD  Add the file specified by fd to the set monitored by the epoll instance epfd, watching for the events defined in event.
EPOLL_CTL_DEL  Remove the file specified by fd from the set monitored by the epoll instance epfd.
EPOLL_CTL_MOD  Use event to change how the already-monitored fd is watched.
The events field of the epoll_event struct lists the events to monitor on the given file descriptor. Multiple events can be combined with bitwise OR. The valid values are:
EPOLLERR  An error condition occurred on the file. This event is always monitored, even if the flag is not set.
EPOLLET  Enables edge-triggered behavior. The default is level-triggered.
EPOLLHUP  A hangup occurred on the file. This event is always monitored, even if the flag is not set.
EPOLLIN  The file can be read without blocking.
EPOLLONESHOT  After an event is generated and read, the file is automatically no longer monitored. To watch it again, a new event mask must be specified with EPOLL_CTL_MOD.
EPOLLOUT  The file can be written without blocking.
EPOLLPRI  Urgent out-of-band data is available to read.
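The EPOLLONESHOT behavior in the list above is easiest to see in a small experiment. The following is a sketch under the assumption that a pipe stands in for a socket; rearm_oneshot and oneshot_demo are hypothetical names. It records what three non-blocking waits report: one after data arrives, one while the descriptor is disarmed, and one after re-arming with EPOLL_CTL_MOD.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: re-arms a one-shot descriptor for readability. */
int rearm_oneshot(int epfd, int fd)
{
    struct epoll_event ev;
    ev.events = EPOLLIN | EPOLLONESHOT;
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev);
}

/* Usage sketch: stores the return values of three epoll_wait() calls in
 * results[0..2]. Returns 0 on success, -1 if the setup fails. */
int oneshot_demo(int results[3])
{
    assert(results != NULL);

    int p[2];
    int epfd = epoll_create1(0);
    if (epfd < 0 || pipe(p) < 0)
        return -1;

    struct epoll_event ev, out;
    ev.events = EPOLLIN | EPOLLONESHOT;
    ev.data.fd = p[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) < 0 ||
        write(p[1], "x", 1) != 1)
        return -1;

    results[0] = epoll_wait(epfd, &out, 1, 0);  /* event is delivered once */
    results[1] = epoll_wait(epfd, &out, 1, 0);  /* fd is now disarmed      */
    if (rearm_oneshot(epfd, p[0]) < 0)
        return -1;
    results[2] = epoll_wait(epfd, &out, 1, 0);  /* data is still unread    */

    close(p[0]); close(p[1]); close(epfd);
    return 0;
}
```

The byte written to the pipe is never read, so the only thing that changes between the second and third wait is the EPOLL_CTL_MOD call.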
The data field of epoll_event is reserved for the user. Its contents are handed back to the user when the requested event is reported. The usual practice is to set event.data.fd to fd, which makes it easy to tell which file descriptor triggered the event.
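Because data is a union, it can carry a pointer instead of a file descriptor. The sketch below assumes an application-defined struct conn (hypothetical, as are watch_conn, next_ready_conn and data_ptr_demo) and shows the pointer coming back from epoll_wait() unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical per-descriptor state the application wants back later. */
struct conn {
    int fd;
    const char *name;
};

/* Registers c->fd and stores the struct's address in the user data.
 * data is a union, so set either data.fd or data.ptr, not both. */
int watch_conn(int epfd, struct conn *c)
{
    assert(c != NULL);

    struct epoll_event ev;
    ev.events = EPOLLIN;
    ev.data.ptr = c;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, c->fd, &ev);
}

/* Waits up to timeout_ms and returns the conn whose event fired, or NULL. */
struct conn *next_ready_conn(int epfd, int timeout_ms)
{
    struct epoll_event out;
    if (epoll_wait(epfd, &out, 1, timeout_ms) != 1)
        return NULL;
    return out.data.ptr;   /* handed back exactly as it was registered */
}

/* Usage sketch: returns 1 if the pointer that comes back is the one that
 * was registered, 0 if not, -1 if the setup fails. */
int data_ptr_demo(void)
{
    int p[2];
    int epfd = epoll_create1(0);
    if (epfd < 0 || pipe(p) < 0)
        return -1;

    struct conn c = { .fd = p[0], .name = "pipe reader" };
    if (watch_conn(epfd, &c) < 0 || write(p[1], "x", 1) != 1)
        return -1;

    int same = (next_ready_conn(epfd, 0) == &c);

    close(p[0]); close(p[1]); close(epfd);
    return same;
}
```

Storing a pointer saves a lookup from descriptor to application state, at the price of having to keep the pointed-to object alive for as long as the descriptor is registered.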
On success, epoll_ctl() returns 0. On failure, -1 is returned and errno is set to one of the following values:
EBADF  epfd is not a valid epoll instance, or fd is not a valid file descriptor.
EEXIST  op is EPOLL_CTL_ADD, but fd is already associated with epfd.
EINVAL  epfd is not an epoll instance, epfd is the same as fd, or op is invalid.
ENOENT  op is EPOLL_CTL_MOD or EPOLL_CTL_DEL, but fd is not associated with epfd.
ENOMEM  There is not enough memory to process the request.
EPERM  fd does not support epoll.
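Two of these error codes are often handled rather than just reported. As a sketch, assuming the caller does not track which descriptors are already registered, the hypothetical helper add_or_modify below falls back to EPOLL_CTL_MOD when EPOLL_CTL_ADD reports EEXIST. The demo function, also hypothetical, additionally checks the ENOENT case.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: registers fd, or updates its event mask if it is
 * already registered (EEXIST). Returns 0 on success, -1 with errno set. */
int add_or_modify(int epfd, int fd, uint32_t events)
{
    assert(epfd != fd);   /* epoll_ctl() would reject this with EINVAL */

    struct epoll_event ev;
    ev.events = events;
    ev.data.fd = fd;

    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == 0)
        return 0;
    if (errno != EEXIST)
        return -1;        /* EBADF, EPERM, ENOMEM, ... */
    return epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev);
}

/* Usage sketch: returns 1 if both registrations succeed and deleting an
 * unregistered fd fails with ENOENT, 0 otherwise, -1 if the setup fails. */
int add_or_modify_demo(void)
{
    int p[2];
    int epfd = epoll_create1(0);
    if (epfd < 0 || pipe(p) < 0)
        return -1;

    int ok = add_or_modify(epfd, p[0], EPOLLIN) == 0            /* ADD path */
          && add_or_modify(epfd, p[0], EPOLLIN | EPOLLET) == 0; /* MOD path */

    /* p[1] was never added, so deleting it reports ENOENT. */
    struct epoll_event unused = { 0 };
    ok = ok && epoll_ctl(epfd, EPOLL_CTL_DEL, p[1], &unused) == -1
            && errno == ENOENT;

    close(p[0]); close(p[1]); close(epfd);
    return ok;
}
```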
To add the file specified by fd to the set monitored by the epfd instance, use the following code:
    struct epoll_event event;
    int ret;

    event.data.fd = fd;   /* return the fd to us later */
    event.events = EPOLLIN | EPOLLOUT;

    ret = epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &event);
    if (ret)
        perror("epoll_ctl");
To modify the events monitored on an fd that is already registered with the epfd instance, use the following code:
    struct epoll_event event;
    int ret;

    event.data.fd = fd;   /* return the fd to us later */
    event.events = EPOLLIN;

    ret = epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &event);
    if (ret)
        perror("epoll_ctl");
To stop monitoring an fd, use the following code:
    struct epoll_event event;
    int ret;

    event.data.fd = fd;
    event.events = EPOLLIN;

    ret = epoll_ctl(epfd, EPOLL_CTL_DEL, fd, &event);
    if (ret)
        perror("epoll_ctl");
3. Waiting for epoll Events
epoll_wait():
    #include <sys/epoll.h>
    int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout);
A call to epoll_wait() waits up to timeout milliseconds for events on the files associated with the epoll instance epfd. On success, events points to memory containing epoll_event structs, each describing one event, up to a maximum of maxevents events. The return value is the number of events. On error, -1 is returned and errno is set to one of the following values:
EBADF  epfd is an invalid file descriptor.
EFAULT  The process does not have write access to the memory pointed to by events.
EINTR  The system call was interrupted by a signal before it could complete.
EINVAL  epfd is not a valid epoll instance, or maxevents is less than or equal to 0.
If timeout is 0, the call returns immediately even if no events are available, in which case it returns 0. If timeout is -1, the call does not return until an event is available.
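The timeout rules and the EINTR case can be combined into a small wrapper. This is a sketch only: wait_retry and timeout_demo are hypothetical names, and the wrapper simplifies by reusing the full timeout on each retry instead of subtracting the time already spent waiting.

```c
#include <assert.h>
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical wrapper: like epoll_wait(), but restarts the call when a
 * signal interrupts it. Simplification: the full timeout is reused on each
 * retry rather than reduced by the time already waited. */
int wait_retry(int epfd, struct epoll_event *events, int maxevents,
               int timeout_ms)
{
    assert(maxevents > 0);   /* epoll_wait() rejects maxevents <= 0 */

    int n;
    do {
        n = epoll_wait(epfd, events, maxevents, timeout_ms);
    } while (n < 0 && errno == EINTR);
    return n;
}

/* Usage sketch: an instance with nothing registered never reports events,
 * so a zero timeout and a short timeout should both come back with 0.
 * Returns 1 if they do, 0 if not, -1 if the setup fails. */
int timeout_demo(void)
{
    struct epoll_event out[4];
    int epfd = epoll_create1(0);
    if (epfd < 0)
        return -1;

    int immediate = wait_retry(epfd, out, 4, 0);    /* returns at once   */
    int brief     = wait_retry(epfd, out, 4, 20);   /* waits about 20 ms */

    close(epfd);
    return immediate == 0 && brief == 0;
}
```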
When epoll_wait() returns, the events array holds up to maxevents epoll_event structs, each describing one ready event. The data field contains whatever the user set before calling epoll_ctl(), for example the file descriptor, which makes it possible to tell which file the event occurred on.
A complete epoll_wait() example looks like this:
    #define MAX_EVENTS 64

    struct epoll_event *events = NULL;
    int nr_events, i, epfd;   /* epfd: an epoll instance created earlier */

    events = malloc(sizeof(struct epoll_event) * MAX_EVENTS);
    if (!events) {
        perror("malloc");
        exit(-1);
    }

    nr_events = epoll_wait(epfd, events, MAX_EVENTS, -1);
    if (nr_events < 0) {
        perror("epoll_wait");
        free(events);
        exit(-1);
    }

    for (i = 0; i < nr_events; i++)
        printf("event = %u on fd = %d\n", events[i].events, events[i].data.fd);

    free(events);
4. Edge-Triggered and Level-Triggered Events
epoll events come in two kinds, level-triggered (LT) and edge-triggered (ET):
LT (level-triggered) is the default mode and supports both blocking and non-blocking sockets. In this mode the kernel tells you whether a file descriptor is ready, and you can then perform I/O on the ready fd. If you do nothing, the kernel keeps notifying you, so this mode is less prone to programming errors.
ET (edge-triggered) is the high-speed mode and supports only non-blocking sockets. In this mode the kernel notifies you through epoll when a descriptor changes from not ready to ready. It then assumes you know the descriptor is ready and sends no further readiness notifications for it until new data arrives.
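The difference between the two modes can be observed directly. The sketch below assumes a pipe in place of a socket; drain and second_wait_result are hypothetical names. It writes once, calls epoll_wait() twice without reading in between, and returns what the second call reports. It also shows the usual edge-triggered habit of reading until EAGAIN, since no further notification arrives for data that is already waiting.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Reads from a non-blocking fd until the kernel reports EAGAIN, which is
 * what an edge-triggered handler needs to do. Returns bytes read, or -1. */
long drain(int fd)
{
    assert(fd >= 0);

    char buf[256];
    long total = 0;
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) { total += n; continue; }
        if (n == 0) return total;                 /* end of file */
        if (errno == EINTR) continue;
        if (errno == EAGAIN || errno == EWOULDBLOCK) return total;
        return -1;
    }
}

/* Usage sketch: mode_flag is 0 for level-triggered or EPOLLET for
 * edge-triggered. Returns the result of a second epoll_wait() made while
 * the data is still unread, or -1 if anything unexpected happens. */
int second_wait_result(uint32_t mode_flag)
{
    int p[2];
    int epfd = epoll_create1(0);
    if (epfd < 0 || pipe(p) < 0)
        return -1;
    /* Non-blocking, as edge-triggered mode requires. */
    if (fcntl(p[0], F_SETFL, fcntl(p[0], F_GETFL) | O_NONBLOCK) < 0)
        return -1;

    struct epoll_event ev, out;
    ev.events = EPOLLIN | mode_flag;
    ev.data.fd = p[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) < 0 ||
        write(p[1], "hello", 5) != 5)
        return -1;

    int first  = epoll_wait(epfd, &out, 1, 0);  /* both modes report this */
    int second = epoll_wait(epfd, &out, 1, 0);  /* here the modes differ  */
    long got   = drain(p[0]);                   /* consume until EAGAIN   */

    close(p[0]); close(p[1]); close(epfd);
    if (first != 1 || got != 5)
        return -1;
    return second;
}
```

Calling second_wait_result(0) exercises the level-triggered path and second_wait_result(EPOLLET) the edge-triggered one.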
5. The Example from man epoll
Because ET mode is used, the setnonblocking() function sets the socket to non-blocking. do_use_fd() processes the file, for example by reading from and writing to it.
    #define MAX_EVENTS 10
    struct epoll_event ev, events[MAX_EVENTS];
    int listen_sock, conn_sock, nfds, epollfd;
    /* n, local and addrlen are used below but not declared in the
       original fragment; these declarations are added here. */
    int n;
    struct sockaddr_storage local;
    socklen_t addrlen;

    /* Set up listening socket, 'listen_sock' (socket(), bind(), listen()) */

    epollfd = epoll_create(10);
    if (epollfd == -1) {
        perror("epoll_create");
        exit(EXIT_FAILURE);
    }

    ev.events = EPOLLIN;
    ev.data.fd = listen_sock;
    if (epoll_ctl(epollfd, EPOLL_CTL_ADD, listen_sock, &ev) == -1) {
        perror("epoll_ctl: listen_sock");
        exit(EXIT_FAILURE);
    }

    for (;;) {
        nfds = epoll_wait(epollfd, events, MAX_EVENTS, -1);
        if (nfds == -1) {
            perror("epoll_wait");
            exit(EXIT_FAILURE);
        }

        for (n = 0; n < nfds; ++n) {
            if (events[n].data.fd == listen_sock) {
                addrlen = sizeof(local);
                conn_sock = accept(listen_sock, (struct sockaddr *) &local, &addrlen);
                if (conn_sock == -1) {
                    perror("accept");
                    exit(EXIT_FAILURE);
                }
                setnonblocking(conn_sock);
                ev.events = EPOLLIN | EPOLLET;
                ev.data.fd = conn_sock;
                if (epoll_ctl(epollfd, EPOLL_CTL_ADD, conn_sock, &ev) == -1) {
                    perror("epoll_ctl: conn_sock");
                    exit(EXIT_FAILURE);
                }
            } else {
                do_use_fd(events[n].data.fd);
            }
        }
    }
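The example calls setnonblocking() without defining it. One common way to write such a helper, offered here as an assumption rather than as the man page's own code, is to add O_NONBLOCK to the descriptor's status flags with fcntl(). The demo function name is hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* One possible setnonblocking(): adds O_NONBLOCK to the descriptor's
 * existing status flags. Returns 0 on success, -1 on error (errno set). */
int setnonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Usage sketch: returns 1 if O_NONBLOCK is clear on a fresh pipe and set
 * after the call, 0 otherwise, -1 if the pipe cannot be created. */
int setnonblocking_demo(void)
{
    int p[2];
    if (pipe(p) < 0)
        return -1;
    assert(p[0] >= 0 && p[1] >= 0);

    int before = fcntl(p[0], F_GETFL) & O_NONBLOCK;
    int rc     = setnonblocking(p[0]);
    int after  = fcntl(p[0], F_GETFL) & O_NONBLOCK;

    close(p[0]); close(p[1]);
    return rc == 0 && before == 0 && after != 0;
}
```

Reading the current flags first matters: writing O_NONBLOCK alone with F_SETFL would clear any other status flags already set on the descriptor.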
Why is epoll so fast? How epoll works
Here is an explanation using an example from everyday life. Suppose you are at university and are expecting a friend to visit. The friend knows only that you live in Building A, not which room, so you agree to meet at the entrance of the building. With the blocking I/O model, all you can do is stand at the entrance of Building A and wait for your friend. You cannot do anything else in the meantime, and it is easy to see that this approach is inefficient.
Now consider the difference between the select and epoll models, pictured as two dorm supervisors. The select-style supervisor is not very clever: when a friend of student A arrives, she leads the friend from room to room asking who student A is. In actual code, the select-style supervisor does the following:
    /* maxfd (the highest descriptor in readset) and timeout (a struct
       timeval) are assumed to have been set up by the caller */
    int n = select(maxfd + 1, &readset, NULL, NULL, &timeout);
    for (int i = 0; n > 0; ++i) {
        if (FD_ISSET(fdarray[i], &readset)) {
            do_something(fdarray[i]);
            --n;
        }
    }
The epoll-style supervisor is more advanced. She writes down student A's details in advance, for example the room number. When a friend of student A arrives, she simply tells the friend which room student A is in, without walking anyone around the building herself. The following code shows what the epoll-style supervisor does:
    n = epoll_wait(epfd, events, 20, 500);
    for (i = 0; i < n; ++i) {
        do_something(events[i]);
    }
In epoll, the key data structure epoll_event is defined as follows:
    typedef union epoll_data {
        void *ptr;
        int fd;
        __uint32_t u32;
        __uint64_t u64;
    } epoll_data_t;

    struct epoll_event {
        __uint32_t events;   /* Epoll events */
        epoll_data_t data;   /* User data variable */
    };
As you can see, epoll_data is a union. It is what the epoll-style supervisor uses to record each student's details, and it can hold several kinds of information: an fd, a pointer, and so on. With this structure, epoll can locate student A with no effort. Do not underestimate these improvements: in a large-scale concurrent server, polling for I/O is one of the most time-consuming operations. Going back to the example, if the supervisor had to search the whole building every time a friend arrived, throughput would inevitably be low, and before long a crowd would build up at the entrance.
Compared with the original blocking I/O model, a program that uses multiplexed I/O is free to do its own work apart from I/O. It is notified only when the I/O status changes and then takes the corresponding action, instead of blocking until the status changes. The analysis above also shows that epoll's improvement over select is really a matter of trading space for time.
II. In-depth understanding of epoll implementation principles: when it comes to developing high-performance network programs, Windows developers invariably turn to IOCP and Linux developers to epoll. It is well known that epoll is an I/O multiplexing technique that can handle millions of socket handles efficiently, far more efficiently than the older select and poll. epoll is pleasant to use and genuinely fast, so why can it handle so many concurrent connections at such speed? Let us briefly review how to use the three epoll system calls wrapped by the C library. int epoll_create(int size); int epoll_ctl(int epfd, ......
int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event); the fourth parameter
The fourth parameter is clearly passed as a pointer, so the call receives the address of the caller's epoll_event struct and can work with that struct directly, rather than receiving a by-value copy of it.