First, let us define the concept of a stream: a stream is a kernel object on which I/O operations can be performed, such as a file, a socket, or a pipe.
Whether it is a file, a socket, or a pipe, we can think of each of them as a stream.
Next, consider the I/O operations: through read we can read data from a stream, and through write we can write data to a stream. Now suppose we need to read data from a stream, but the stream has no data yet (a typical example is a client that wants to read from a socket while the server has not sent the data back).
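The following is a minimal sketch of that situation in C, assuming sockfd is a connected socket left in its default blocking mode (read_blocking is just an illustrative name):

#include <unistd.h>

/* Hypothetical helper: in the socket's default blocking mode, this read()
 * does not return until the peer sends data (n > 0), closes the connection
 * (n == 0), or an error occurs (n == -1). */
ssize_t read_blocking(int sockfd, char *buf, size_t len)
{
    return read(sockfd, buf, len);   /* sleeps here while the stream is empty */
}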
Blocking: what does blocking mean? Suppose you are waiting for a courier at some point, but you do not know when the courier will arrive, and you have nothing else to do (or rather, the next thing you need to do depends on the courier arriving). You can simply go to sleep, because you know the courier will call you when the delivery arrives (assuming the call can wake you up).
Non-blocking busy polling: continuing the courier example, if you used busy polling instead, you would need the courier's mobile number and would have to call him every minute to ask: "Have you arrived yet?"
Obviously, no ordinary person would use the second approach: it is not only mindless and a waste of phone calls, it also takes up a great deal of the courier's time.
Most programs also avoid the second approach, because the first one is economical and simple. Economical means it consumes very little CPU time: once a thread goes to sleep, it drops out of the system's scheduling queue and is temporarily not allocated any of the CPU's precious time slices.
To understand how blocking works, let us first talk about buffers, including kernel buffers, and then explain I/O events. Buffers were introduced to reduce frequent system calls and frequent I/O operations (which, as you know, are slow): when you operate on a stream, you mostly work in units of a buffer in user space, and the kernel needs buffers of its own as well.
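As a minimal sketch of why user-space buffering matters, the two functions below (hypothetical names, assuming fd is an open descriptor) copy the same data, but the first issues one system call per byte while the second stages bytes in a user-space buffer and issues far fewer:

#include <unistd.h>

void write_unbuffered(int fd, const char *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        (void)write(fd, &src[i], 1);    /* one system call per byte: slow */
}

void write_buffered(int fd, const char *src, size_t n)
{
    char buf[4096];
    size_t used = 0;
    for (size_t i = 0; i < n; i++) {
        buf[used++] = src[i];           /* cheap user-space copy */
        if (used == sizeof(buf)) {      /* flush only when the buffer fills */
            (void)write(fd, buf, used);
            used = 0;
        }
    }
    if (used > 0)
        (void)write(fd, buf, used);     /* flush the remainder */
}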
Suppose there is a pipe: process A is the writer of the pipe, and process B is the reader.
Assume the kernel buffer is empty at first, so B, as the reader, is blocked. Then A writes to the pipe; when the kernel buffer goes from the empty state to a non-empty state, the kernel generates an event that tells B to wake up. Let us call this event "buffer non-empty".
However, even after the "buffer non-empty" event has notified B, B may still not have read the data. Since the kernel promises not to discard data written into the pipe, the data A writes piles up in the kernel buffer. If B still does not start reading, the kernel buffer eventually fills up; at that point another I/O event is generated, telling process A that it should wait (block). Let us call this event "buffer full".
Suppose B finally starts reading the data, so the kernel buffer frees up some space. The kernel then tells A that there is room in the buffer again, and that A can wake up from its sleep and continue writing. Let us call this event "buffer not full".
Perhaps the "buffer not full" event has already notified A, but A has no more data to write, while B keeps reading until the kernel buffer is empty. At this point the kernel tells B that it needs to block. Let us call this event "buffer empty".
These four scenarios cover the four I/O events: buffer full, buffer empty, buffer non-empty, and buffer not full (note that these refer to the kernel buffer, and the four terms are my own invention, used only to explain the principle). These four I/O events are the root of blocking synchronization. (If you do not understand what "synchronization" is, read up on the operating system's locks, semaphores, condition variables, and other task-synchronization mechanisms.)
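A minimal sketch of the pipe scenario above, assuming a parent/child pair where the parent plays A (the writer) and the child plays B (the reader); the child's read() blocks while the kernel buffer is empty, and a writer that kept going without a reader would eventually block once the buffer filled:

#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
    int fds[2];
    if (pipe(fds) == -1)                 /* fds[0] = read end (B), fds[1] = write end (A) */
        return 1;

    if (fork() == 0) {                   /* child plays process B, the reader */
        char buf[128];
        close(fds[1]);
        /* "buffer empty": this read() blocks until A writes something */
        ssize_t n = read(fds[0], buf, sizeof(buf));
        printf("B read %zd bytes\n", n);
        return 0;
    }

    close(fds[0]);                       /* parent plays process A, the writer */
    sleep(1);                            /* while A delays, B sits blocked in read() */
    /* "buffer non-empty": this write wakes B up; if A kept writing without B
     * reading, the kernel buffer would fill and write() itself would block */
    write(fds[1], "hello", 5);
    close(fds[1]);
    wait(NULL);
    return 0;
}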
Now let us talk about the drawbacks of blocking I/O. In blocking I/O mode, one thread can handle the I/O events of only one stream. If you want to handle multiple streams at the same time, you need either multiple processes (fork) or multiple threads (pthread_create); unfortunately, neither method is very efficient.
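A minimal sketch of the thread-per-stream approach, assuming handle_stream and serve_all are placeholder names and fds holds already-open descriptors; each blocked thread costs its own stack and scheduler bookkeeping, which is why this does not scale to many streams:

#include <pthread.h>
#include <unistd.h>
#include <stdint.h>

/* Per-stream handler: blocks in read() on its own thread. */
static void *handle_stream(void *arg)
{
    int fd = (int)(intptr_t)arg;
    char buf[4096];
    ssize_t n;
    while ((n = read(fd, buf, sizeof(buf))) > 0) {
        /* process n bytes ... */
    }
    return NULL;
}

/* One thread per stream: simple, but memory use and context-switch cost
 * grow linearly with the number of streams. */
void serve_all(int *fds, int count)
{
    for (int i = 0; i < count; i++) {
        pthread_t tid;
        pthread_create(&tid, NULL, handle_stream, (void *)(intptr_t)fds[i]);
        pthread_detach(tid);
    }
}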
Next, consider the non-blocking busy-polling I/O mode. We find that we can now handle multiple streams at the same time (how a stream is switched from blocking mode to non-blocking mode will not be discussed here):
while true {
    for i in stream[] {
        if i has data
            read until unavailable
    }
}
All we do is sweep every stream from beginning to end, and then start over. This lets us handle multiple streams, but it is clearly not a good idea: if none of the streams has any data, we only waste CPU. To add a point: in blocking mode, the kernel handles I/O events by blocking and waking threads, whereas non-blocking mode hands the I/O events over to some other object (the select and epoll described later) or even ignores them outright.
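A minimal sketch of that busy-polling loop in C, assuming streams holds open descriptors; fcntl() with O_NONBLOCK is how a stream is actually switched out of blocking mode (the step the text skips), after which read() returns -1 with errno set to EAGAIN instead of blocking, so the loop never sleeps and the CPU spins:

#include <fcntl.h>
#include <errno.h>
#include <unistd.h>

/* Switch one stream from blocking to non-blocking mode. */
static void set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

void busy_poll(int *streams, int count)
{
    char buf[4096];
    for (int i = 0; i < count; i++)
        set_nonblocking(streams[i]);    /* every stream must be non-blocking first */

    for (;;) {                          /* sweep all streams forever */
        for (int i = 0; i < count; i++) {
            ssize_t n = read(streams[i], buf, sizeof(buf));
            if (n > 0) {
                /* got data: keep reading this stream until it is drained */
            } else if (n == -1 && errno == EAGAIN) {
                /* no data right now: move on to the next stream */
            }
        }
    }
}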
To avoid spinning the CPU idly, we can introduce an agent (at first there was an agent called select, and later one called poll, but the two are essentially the same). This agent is quite capable: it can observe the I/O events of many streams at once. When the streams are idle it blocks the current thread, and when one or more streams have I/O events it wakes up from the blocked state, after which our program polls all the streams (so we can finally drop the word "busy"). The code looks like this:
while true {
    select(streams[])
    for i in streams[] {
        if i has data
            read until unavailable
    }
}
Thus, if there is no I/O event, our program blocks at select. But there is still a problem: from select we only learn that some I/O event happened, not which streams it happened on (it could be one, several, or even all of them), so we can only poll all the streams indiscriminately, find the ones that can be read from or written to, and operate on them.
But with select we are left with an O(n) indiscriminate polling cost: the more streams we handle, the longer each indiscriminate polling pass takes.
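A minimal sketch of the select()-based version in C, assuming streams holds open descriptors and select_loop is just an illustrative name; select() sleeps until at least one stream is readable, but afterwards we still have to test every descriptor with FD_ISSET, which is exactly the O(n) scan described above:

#include <sys/select.h>
#include <unistd.h>

void select_loop(int *streams, int count)
{
    char buf[4096];
    for (;;) {
        fd_set readfds;
        int maxfd = -1;
        FD_ZERO(&readfds);
        for (int i = 0; i < count; i++) {        /* re-register every stream, every iteration */
            FD_SET(streams[i], &readfds);
            if (streams[i] > maxfd)
                maxfd = streams[i];
        }

        /* Block here until at least one stream has an I/O event. */
        if (select(maxfd + 1, &readfds, NULL, NULL, NULL) <= 0)
            continue;

        /* select() does not say which stream is ready, so scan them all: O(n). */
        for (int i = 0; i < count; i++) {
            if (FD_ISSET(streams[i], &readfds)) {
                ssize_t n = read(streams[i], buf, sizeof(buf));
                (void)n;                         /* process the data ... */
            }
        }
    }
}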
Having said so much, I can finally explain epoll.
epoll can be understood as "event poll". Unlike busy polling and indiscriminate polling, epoll tells us which I/O events occurred on which streams, so every operation we then perform on those streams is meaningful. (The complexity is reduced to O(1).)
Before discussing the implementation details of epoll, let us list the epoll-related operations:
epoll_create: create an epoll object; generally epollfd = epoll_create()
epoll_ctl (EPOLL_CTL_ADD / EPOLL_CTL_DEL): add an event on a stream to an epoll object, or remove it
For example:
epoll_ctl(epollfd, EPOLL_CTL_ADD, socket, EPOLLIN); // register the "buffer non-empty" event, i.e. data has flowed in
epoll_ctl(epollfd, EPOLL_CTL_DEL, socket, EPOLLOUT); // remove the "buffer not full" event, i.e. stop watching for the stream becoming writable
epoll_wait(epollfd, ...): wait until a registered event occurs
(Note: when a non-blocking stream is read from or written to while its buffer is empty or full, read/write returns -1 and sets errno = EAGAIN. epoll only cares about the "buffer not full" and "buffer non-empty" events.)
Code using the epoll pattern might look like this:
while true {
    active_stream[] = epoll_wait(epollfd)
    for i in active_stream[] {
        read or write till unavailable
    }
}
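A minimal sketch of the same loop using the real epoll API, assuming streams holds descriptors that have already been made non-blocking and epoll_loop is just an illustrative name; note that the real epoll_ctl() takes a struct epoll_event rather than a bare flag as in the simplified calls above, and epoll_wait() returns only the streams that actually have events:

#include <sys/epoll.h>
#include <errno.h>
#include <unistd.h>

#define MAX_EVENTS 64

void epoll_loop(int *streams, int count)
{
    int epollfd = epoll_create1(0);
    struct epoll_event ev, events[MAX_EVENTS];
    char buf[4096];

    for (int i = 0; i < count; i++) {            /* register "buffer non-empty" interest */
        ev.events = EPOLLIN;
        ev.data.fd = streams[i];
        epoll_ctl(epollfd, EPOLL_CTL_ADD, streams[i], &ev);
    }

    for (;;) {
        /* Blocks until at least one registered event occurs, then reports
         * only the ready streams: no indiscriminate scan. */
        int nready = epoll_wait(epollfd, events, MAX_EVENTS, -1);
        for (int i = 0; i < nready; i++) {
            int fd = events[i].data.fd;
            ssize_t n;
            while ((n = read(fd, buf, sizeof(buf))) > 0) {
                /* process n bytes ... */
            }
            if (n == -1 && errno == EAGAIN) {
                /* drained: the kernel buffer is empty again */
            }
        }
    }
}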
Limited by space, I have said only enough here to reveal the underlying principles. For details on how to use epoll, please refer to man and Google; for the implementation details, see the Linux kernel source.