Original address: http://www.cnblogs.com/haippy/archive/2012/01/09/2317269.html
------------------------
Epoll Introduction
Epoll is a scalable I/O event notification mechanism in the Linux kernel, first introduced in Linux 2.5.44. It can replace the POSIX select and poll system calls and performs better when an application monitors a very large number of file descriptors. select and poll have to scan the whole descriptor set on every call, which costs O(n). Epoll instead keeps the details of each registered I/O request in a dedicated structure (epoll_data) and, when I/O becomes active, uses that stored information to locate the event in O(1) time without scanning the entire fd set, which is why it performs so well. Epoll is similar to FreeBSD's kqueue, and it likewise gives user space a file descriptor of its own to operate on.
int epoll_create (int size);
Creates an epoll handle. size tells the kernel roughly how many descriptors will be monitored; since Linux 2.6.8 the value is ignored, but it must still be greater than zero. The epoll handle itself occupies a file descriptor, which you can see under /proc/<pid>/fd/ on Linux, so you must call close() when you have finished with epoll; otherwise the process may run out of file descriptors.
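The lifecycle described above can be sketched as follows. This is a minimal illustration that assumes a Linux system; the helper name open_and_close_epoll is hypothetical and not part of the original article.

```c
#include <stdio.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: create an epoll instance, report its descriptor,
   and close it again. Returns 0 on success, -1 on failure. */
static int open_and_close_epoll (void)
{
    int epfd = epoll_create (1);   /* size is only a hint; it must be > 0 */
    if (epfd == -1) {
        perror ("epoll_create");
        return -1;
    }
    /* The instance is an ordinary descriptor, visible under /proc/<pid>/fd/ */
    printf ("epoll instance uses fd %d\n", epfd);
    if (close (epfd) == -1) {
        perror ("close");
        return -1;
    }
    return 0;
}
```

Calling open_and_close_epoll() in a loop should not consume descriptors, because each instance is closed before the next one is created.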
int epoll_ctl (int epfd, int op, int fd, struct epoll_event *event);
The epoll event registration function. The first argument is the return value of epoll_create(). The second argument specifies the operation, which is one of the following three macros:
EPOLL_CTL_ADD // register a new fd with epfd;
EPOLL_CTL_MOD // modify the events monitored for an already registered fd;
EPOLL_CTL_DEL // remove an fd from epfd;
The third argument is the fd to monitor, and the fourth tells the kernel which events to monitor for. The struct epoll_event structure is as follows:
typedef union epoll_data
{
    void *ptr;
    int fd;
    __uint32_t u32;
    __uint64_t u64;
} epoll_data_t;
struct epoll_event {
    __uint32_t events;  /* Epoll events */
    epoll_data_t data;  /* User data variable */
};
events is a bitwise OR of the following macros:
EPOLLIN // the corresponding file descriptor is readable (this includes an orderly shutdown by the peer socket);
EPOLLOUT // the corresponding file descriptor is writable;
EPOLLPRI // the corresponding file descriptor has urgent data to read (this indicates that out-of-band data has arrived);
EPOLLERR // an error occurred on the corresponding file descriptor;
EPOLLHUP // the corresponding file descriptor was hung up;
EPOLLET // puts this descriptor in edge-triggered mode, as opposed to the default level-triggered mode;
EPOLLONESHOT // report only one event; to keep monitoring the socket after that event, you must add it to the epoll queue again (by re-arming it with EPOLL_CTL_MOD).
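The three operations and the event flags listed above can be sketched together as follows. This is an illustration only: it uses the read end of a pipe as a stand-in for a socket, and the helper name register_modify_remove is hypothetical.

```c
#include <stdio.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: register, modify, and remove the read end of a pipe.
   Returns 0 if all three epoll_ctl() calls succeed, -1 otherwise. */
static int register_modify_remove (void)
{
    int pipefd[2];
    struct epoll_event ev;
    int rc = -1;

    if (pipe (pipefd) == -1)
        return -1;
    int epfd = epoll_create1 (0);
    if (epfd == -1)
        goto out_pipe;

    ev.events = EPOLLIN;        /* which events to monitor */
    ev.data.fd = pipefd[0];     /* user data handed back by epoll_wait() */
    if (epoll_ctl (epfd, EPOLL_CTL_ADD, pipefd[0], &ev) == -1)
        goto out;

    ev.events = EPOLLIN | EPOLLET;   /* switch the same fd to edge-triggered */
    if (epoll_ctl (epfd, EPOLL_CTL_MOD, pipefd[0], &ev) == -1)
        goto out;

    /* The event argument is ignored for EPOLL_CTL_DEL; passing a valid
       pointer also keeps very old kernels (before 2.6.9) happy. */
    if (epoll_ctl (epfd, EPOLL_CTL_DEL, pipefd[0], &ev) == -1)
        goto out;
    rc = 0;
out:
    close (epfd);
out_pipe:
    close (pipefd[0]);
    close (pipefd[1]);
    return rc;
}
```

Storing the descriptor in ev.data.fd is a convention rather than a requirement; data is a union, so a program that needs more context per descriptor can store a pointer in ev.data.ptr instead.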
When the peer closes the connection (FIN) or EPOLLERR is raised, both cases can be handled as an EPOLLIN event: the subsequent read returns 0 and -1, respectively.
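The first of those two cases can be observed with a small sketch. It assumes a local AF_UNIX socket pair in place of a TCP connection, so closing one end stands in for the peer sending FIN; the helper name read_after_peer_close is hypothetical.

```c
#include <stdio.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: close one end of a socket pair and observe what the
   other end sees. Returns the value of read(), or -2 if setup failed or
   EPOLLIN was not reported. */
static long read_after_peer_close (void)
{
    int sv[2];
    struct epoll_event ev, got;
    char buf[16];
    long n = -2;

    if (socketpair (AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -2;
    int epfd = epoll_create1 (0);
    if (epfd == -1)
        goto out_sock;

    ev.events = EPOLLIN;
    ev.data.fd = sv[0];
    if (epoll_ctl (epfd, EPOLL_CTL_ADD, sv[0], &ev) == -1)
        goto out;

    close (sv[1]);   /* the "peer" closes its end */
    sv[1] = -1;

    /* The close is reported as readability; read() then signals end of file */
    if (epoll_wait (epfd, &got, 1, 1000) == 1 && (got.events & EPOLLIN))
        n = (long) read (sv[0], buf, sizeof buf);
out:
    close (epfd);
out_sock:
    close (sv[0]);
    if (sv[1] != -1)
        close (sv[1]);
    return n;
}
```

A return value of 0 from this helper corresponds to the "read returns 0" case in the note above.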
int epoll_wait (int epfd, struct epoll_event *events, int maxevents, int timeout);
The events parameter receives the set of ready events from the kernel, and maxevents tells the kernel how many entries that array can hold. maxevents must be greater than zero; the often-repeated rule that it may not exceed the size passed to epoll_create() no longer applies on kernels since 2.6.8, where size is ignored. The timeout parameter is in milliseconds: 0 returns immediately, and -1 blocks indefinitely. The function returns the number of events that need to be handled; a return value of 0 means the timeout expired, and -1 indicates an error.
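The return values described above can be checked with a short sketch. It again uses a pipe rather than a socket, and the names wait_on_pipe and MAX_READY are hypothetical.

```c
#include <sys/epoll.h>
#include <unistd.h>

#define MAX_READY 8   /* hypothetical capacity of the result array */

/* Hypothetical helper: monitor the read end of a pipe, optionally write one
   byte into it first, and return whatever epoll_wait() returns. */
static int wait_on_pipe (int write_first, int timeout_ms)
{
    int pipefd[2];
    struct epoll_event ev, ready[MAX_READY];
    int n = -1;

    if (pipe (pipefd) == -1)
        return -1;
    int epfd = epoll_create1 (0);
    if (epfd == -1)
        goto out_pipe;

    ev.events = EPOLLIN;
    ev.data.fd = pipefd[0];
    if (epoll_ctl (epfd, EPOLL_CTL_ADD, pipefd[0], &ev) == -1)
        goto out;

    if (write_first && write (pipefd[1], "x", 1) != 1)
        goto out;

    /* maxevents is the capacity of ready[], not the number of watched fds */
    n = epoll_wait (epfd, ready, MAX_READY, timeout_ms);
out:
    close (epfd);
out_pipe:
    close (pipefd[0]);
    close (pipefd[1]);
    return n;
}
```

With nothing written, wait_on_pipe(0, 50) waits about 50 ms and returns 0 (timeout), and wait_on_pipe(0, 0) returns 0 immediately; wait_on_pipe(1, 50) returns 1 because one descriptor is readable.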
Epoll has two event models, level-triggered (LT) and edge-triggered (ET):
LT (level-triggered) is the default mode and supports both blocking and non-blocking sockets. In this mode the kernel tells you whether a file descriptor is ready, and you can then perform I/O on the ready fd. If you do nothing, the kernel keeps notifying you, so programs written in this mode are less error-prone.
ET (edge-triggered) is the high-speed mode and supports only non-blocking sockets. In this mode the kernel notifies you through epoll when a descriptor changes from not ready to ready. It then assumes you know the descriptor is ready and sends no further readiness notifications for it until new data arrives and triggers the ready event again. For this reason an ET handler should keep reading or writing until the call fails with EAGAIN.
Epoll Example
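As a warm-up before the server, the difference between the two trigger modes described above can be observed on a pipe. The sketch below writes one byte, never reads it, and counts how many of three consecutive epoll_wait() calls report the descriptor. The helper name count_notifications is hypothetical.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: returns how many of three epoll_wait() calls report
   the pipe as readable when the pending byte is never consumed.
   Returns -1 on setup failure. */
static int count_notifications (int edge_triggered)
{
    int pipefd[2];
    struct epoll_event ev, got;
    int count = -1;

    if (pipe (pipefd) == -1)
        return -1;
    int epfd = epoll_create1 (0);
    if (epfd == -1)
        goto out_pipe;

    ev.events = EPOLLIN;
    if (edge_triggered)
        ev.events |= EPOLLET;
    ev.data.fd = pipefd[0];
    if (epoll_ctl (epfd, EPOLL_CTL_ADD, pipefd[0], &ev) == -1)
        goto out;

    if (write (pipefd[1], "x", 1) != 1)
        goto out;

    count = 0;
    for (int i = 0; i < 3; i++) {
        /* Timeout 0: just poll the current state, never block */
        if (epoll_wait (epfd, &got, 1, 0) == 1)
            count++;
    }
out:
    close (epfd);
out_pipe:
    close (pipefd[0]);
    close (pipefd[1]);
    return count;
}
```

count_notifications(0) returns 3, because in LT mode the unread byte keeps the descriptor ready; count_notifications(1) returns 1, because in ET mode only the transition to ready is reported.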
We will implement a simple TCP server that prints the data sent by clients to standard output. First we create and bind a TCP socket:
static int create_and_bind (char *port) {
    struct addrinfo hints;
    struct addrinfo *result, *rp;
    int s, sfd;
    memset (&hints, 0, sizeof (struct addrinfo));
    hints.ai_family = AF_UNSPEC;     /* return IPv4 and IPv6 choices */
    hints.ai_socktype = SOCK_STREAM; /* we want a TCP socket */
    hints.ai_flags = AI_PASSIVE;     /* all interfaces */
    s = getaddrinfo (NULL, port, &hints, &result);
    if (s != 0) {
        fprintf (stderr, "getaddrinfo: %s\n", gai_strerror (s));
        return -1;
    }
    for (rp = result; rp != NULL; rp = rp->ai_next) {
        sfd = socket (rp->ai_family, rp->ai_socktype, rp->ai_protocol);
        if (sfd == -1)
            continue;
        s = bind (sfd, rp->ai_addr, rp->ai_addrlen);
        if (s == 0)
            break; /* we managed to bind successfully! */
        close (sfd);
    }
    if (rp == NULL) {
        fprintf (stderr, "could not bind\n");
        freeaddrinfo (result);
        return -1;
    }
    freeaddrinfo (result);
    return sfd;
}
create_and_bind() contains a standard block of code for obtaining an IPv4 or IPv6 socket in a portable way. It takes the port as a string, and getaddrinfo() returns a list of addrinfo structures in result. The addrinfo structure looks like this:
struct addrinfo
{
    int ai_flags;
    int ai_family;
    int ai_socktype;
    int ai_protocol;
    socklen_t ai_addrlen;
    struct sockaddr *ai_addr;
    char *ai_canonname;
    struct addrinfo *ai_next;
};
create_and_bind() returns the socket descriptor if it succeeds, or -1 if it fails.
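To see what getaddrinfo() hands back for the hints used in create_and_bind(), the list can be walked without creating any sockets. This is a sketch only; the helper name count_passive_addresses is hypothetical, and the number of entries it finds depends on the system.

```c
/* Feature-test macro, needed for getaddrinfo() under strict -std=c99/c11 */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: print each candidate address for a passive TCP
   socket on the given port and return how many there are (-1 on error). */
static int count_passive_addresses (const char *port)
{
    struct addrinfo hints, *result, *rp;
    int count = 0;

    memset (&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;     /* IPv4 and IPv6 */
    hints.ai_socktype = SOCK_STREAM; /* TCP */
    hints.ai_flags = AI_PASSIVE;     /* wildcard address, suitable for bind() */

    int s = getaddrinfo (NULL, port, &hints, &result);
    if (s != 0) {
        fprintf (stderr, "getaddrinfo: %s\n", gai_strerror (s));
        return -1;
    }
    for (rp = result; rp != NULL; rp = rp->ai_next) {
        const char *family = rp->ai_family == AF_INET ? "AF_INET"
                           : rp->ai_family == AF_INET6 ? "AF_INET6"
                           : "other";
        printf ("family=%s addrlen=%u\n", family, (unsigned) rp->ai_addrlen);
        count++;
    }
    freeaddrinfo (result);   /* the list is heap-allocated by getaddrinfo() */
    return count;
}
```

create_and_bind() walks the same list but stops at the first entry for which both socket() and bind() succeed.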
Next, we make the socket non-blocking with the following function:
static int
make_socket_non_blocking (int sfd)
{
    int flags, s;
    flags = fcntl (sfd, F_GETFL, 0);
    if (flags == -1)
    {
        perror ("fcntl");
        return -1;
    }
    flags |= O_NONBLOCK;
    s = fcntl (sfd, F_SETFL, flags);
    if (s == -1)
    {
        perror ("fcntl");
        return -1;
    }
    return 0;
}
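The effect of O_NONBLOCK can be checked on a pipe: a read from an empty non-blocking descriptor fails immediately with EAGAIN instead of blocking. The sketch below repeats the fcntl() logic of make_socket_non_blocking() so that it compiles on its own; the names set_nonblocking and errno_from_empty_read are hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Same idea as make_socket_non_blocking() above, repeated here so the
   sketch is self-contained. */
static int set_nonblocking (int fd)
{
    int flags = fcntl (fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl (fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: read from an empty non-blocking pipe and return the
   resulting errno (0 if the read unexpectedly succeeded, -1 on setup
   failure). */
static int errno_from_empty_read (void)
{
    int pipefd[2];
    char buf[16];
    int result = -1;

    if (pipe (pipefd) == -1)
        return -1;
    if (set_nonblocking (pipefd[0]) == 0) {
        ssize_t n = read (pipefd[0], buf, sizeof buf);
        result = (n == -1) ? errno : 0;
    }
    close (pipefd[0]);
    close (pipefd[1]);
    return result;
}
```

This is the condition an edge-triggered handler relies on: it keeps calling read() until the call fails with EAGAIN, which means everything currently available has been consumed.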
Next, the main function code, mainly for the event loop:
#define MAXEVENTS
int
main (int argc, char *argv[])
{
    int sfd, s;
    int efd;
    struct epoll_event event;
    struct epoll_event *events;
    if (argc != 2)
    {
        fprintf (stderr, "Usage: %s [port]\n", argv[0]);
        exit (EXIT_FAILURE);
    }
    sfd = create_and_bind (argv[1]);
    if (sfd == -1)
        abort ();
    s = make_socket_non_blocking (sfd);
    if (s == -1)
        abort ();
    s = listen (sfd, SOMAXCONN);
    if (s == -1)
    {
        perror ("listen");
        abort ();
    }
    efd = epoll_create1 (0);
    if (efd == -1)
    {
        perror ("epoll_create");
        abort ();
    }
    event.data.fd = sfd;
    event.events = EPOLLIN | EPOLLET;
    s = epoll_ctl (efd, EPOLL_CTL_ADD, sfd, &