What is the principle of epoll or kqueue?

Source: Internet
Author: User
Tags: epoll

First, let's define the concept of a stream. A stream is a kernel object on which I/O operations can be performed; files, sockets, and pipes can all be regarded as streams.
Next, the I/O operations themselves: through read we read data from a stream, and through write we write data to it. Now suppose we need to read data from a stream, but the stream has no data yet (a typical case: a client needs to read from a socket, but the server has not sent anything back). What should we do?

  • Blocking. What is blocking? Suppose you are waiting for a package, but you don't know when the courier will arrive and there is nothing else you can do (or the next thing you have to do depends on the package). You can simply go to sleep, because you know the courier will call you when he arrives (and we assume the call is guaranteed to wake you up).
  • Non-blocking busy polling. Staying with the courier example: with this method you need the courier's mobile number, and you call him every minute to ask, "Have you arrived yet?"

Obviously, hardly anyone uses the second method: it is not only mindless, it also wastes phone calls and the courier's time.
Most programs don't use it either, because the first method is economical and simple. "Economical" here means it consumes very little CPU time: while a thread is blocked (sleeping), it is removed from the scheduler's run queue and is not handed any of the precious CPU time slices.

To understand how blocking works, we need to talk about buffers, in particular the kernel buffer, and from there the I/O events become clear. Buffers exist to avoid the frequent system calls that frequent small I/O operations would otherwise cause (and system calls, as you know, are slow). When you operate on a stream, most of the work actually happens in a buffer; that is true on the user-space side, and the kernel needs a buffer of its own as well.
Assume there is a pipe: process A is the writer of the pipe, and process B is the reader.

  1. Assume the kernel buffer is empty at the beginning, so B, the reader, is blocked. Then A writes data into the pipe. The kernel buffer goes from empty to non-empty, and the kernel generates an event to wake B up. Call this event "buffer not empty".
  2. However, suppose that even after the "buffer not empty" event has notified B, B still hasn't read the data. Since the kernel promises not to discard data written into the pipe, whatever A writes piles up in the kernel buffer. If B keeps not reading, the kernel buffer eventually fills up, and another I/O event is generated telling A that it must wait (block). Call this event "buffer full".
  3. Suppose B finally starts reading, so the kernel buffer is no longer full. The kernel then tells A that there is room in the buffer again and it can wake up from its sleep and continue writing. Call this event "buffer not full".
  4. Perhaps the "buffer not full" event has notified A, but A has nothing more to write, while B keeps reading until the kernel buffer is empty. At that point the kernel tells B that it has to block. Call this event "buffer empty".

These four situations cover the four basic I/O events: buffer full, buffer empty, buffer not empty, and buffer not full (the four names are ones I made up to explain the principle). These four I/O events are the foundation of blocking synchronization. (If the word "synchronization" is unfamiliar, read up on operating-system synchronization primitives such as locks, semaphores, and condition variables.)
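To make these events concrete, here is a minimal pipe sketch of my own (not part of the original answer): the child's read() blocks while the kernel buffer is empty, and the parent's write() makes the buffer non-empty, which is exactly the "buffer not empty" wake-up described above.

/* Minimal sketch: one writer (parent, process A) and one reader (child, process B).
 * The child's read() blocks until the parent writes. Error handling is kept minimal. */
#include <stdio.h>
#include <unistd.h>

int main(void) {
    int fds[2];
    if (pipe(fds) == -1) { perror("pipe"); return 1; }

    if (fork() == 0) {                              /* child: the reader (process B) */
        char buf[64];
        ssize_t n = read(fds[0], buf, sizeof buf);  /* blocks while the kernel buffer is empty */
        printf("reader woke up with %zd bytes\n", n);
        return 0;
    }

    sleep(1);                                       /* parent: the writer (process A) */
    write(fds[1], "hello", 5);                      /* buffer becomes non-empty, reader wakes up */
    return 0;
}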

Now for the drawback of blocking I/O: in blocking mode, one thread can only handle the I/O events of one stream. If you want to handle many streams at once, you have to use multiple processes (fork) or multiple threads (pthread_create), and unfortunately neither approach is very efficient.
So let's look at non-blocking busy polling instead. With it, a single thread can handle multiple streams at the same time (how to switch a stream from blocking to non-blocking mode is not covered here):
while true {
    for i in streams[] {
        if i has data
            read until unavailable
    }
}
We just need to keep asking all the streams from start to end and start from the beginning. In this way, you can process multiple streams, but this is obviously not good, because if all the streams have no data, it will only waste the CPU. In blocking mode, the kernel can block or wake up the I/O events, in non-blocking mode, I/O events are handed over to other objects (select and epoll described later) for processing or even directly ignoring.
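As a rough illustration (not from the original text), the busy-polling idea maps onto non-blocking descriptors roughly as in the sketch below; the function name busy_poll and the fds[] array are assumptions made for the example, and the descriptors are assumed to be in O_NONBLOCK mode already.

#include <errno.h>
#include <unistd.h>

/* Sketch of non-blocking busy polling over several descriptors. */
void busy_poll(int *fds, int nfds) {
    char buf[4096];
    for (;;) {                                /* while true */
        for (int i = 0; i < nfds; i++) {      /* ask every stream on every round */
            ssize_t n;
            while ((n = read(fds[i], buf, sizeof buf)) > 0)
                ;                             /* "read until unavailable" */
            if (n == -1 && errno == EAGAIN)
                continue;                     /* no data yet: keep spinning, burning CPU */
        }
    }
}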

To avoid this CPU spinning, we can bring in an agent (at first there was an agent called select, and later another called poll; the two are essentially the same). This agent is quite capable: it can watch the I/O events of many streams at once. When none of the streams has an event, it blocks the current thread; when one or more streams do have an I/O event, it wakes up from the blocked state, and our program then polls all the streams (so the word "busy" can be dropped from "busy polling"). The code looks roughly like this:
while true {
    select(streams[])
    for i in streams[] {
        if i has data
            read until unavailable
    }
}
So if no I/O event occurs, our program blocks inside select. But there is still a problem: select only tells us that some I/O event has occurred; it does not tell us which streams (it may be one, several, or even all of them). We can only poll all the streams indiscriminately, find the ones that are readable or writable, and operate on those.
In other words, with select we are still doing an indiscriminate O(n) scan: the more streams we handle at the same time, the longer each round of polling takes.
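For reference, a minimal select()-based version of the loop might look like the sketch below (the function name select_loop and the fds[] array are assumptions for the example). Note how, after every wake-up, every descriptor still has to be rescanned: that is the O(n) cost just described.

#include <sys/select.h>
#include <unistd.h>

/* Sketch of the select() "agent": block until at least one stream is ready,
 * then scan all of them, because select() does not say which ones fired. */
void select_loop(int *fds, int nfds) {
    char buf[4096];
    for (;;) {
        fd_set readset;
        FD_ZERO(&readset);
        int maxfd = -1;
        for (int i = 0; i < nfds; i++) {      /* the set must be rebuilt every round */
            FD_SET(fds[i], &readset);
            if (fds[i] > maxfd) maxfd = fds[i];
        }
        select(maxfd + 1, &readset, NULL, NULL, NULL);   /* sleep until some stream is ready */
        for (int i = 0; i < nfds; i++) {      /* O(n): still check every stream */
            if (FD_ISSET(fds[i], &readset))
                while (read(fds[i], buf, sizeof buf) > 0)
                    ;                         /* read until unavailable */
        }
    }
}
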
After all this build-up, we can finally explain epoll properly.
epoll can be understood as "event poll". Unlike busy polling and indiscriminate polling, epoll tells us which I/O event happened on which stream, so every operation we then perform on those streams is meaningful (the complexity drops to O(1)).
Before getting into implementation details, let's list the epoll-related operations:

  • epoll_create creates an epoll object. Typically: epollfd = epoll_create()
  • epoll_ctl (think of it as epoll_add/epoll_del rolled into one) adds or removes a particular event of a particular stream to or from the epoll object.
    For example (simplified signatures):
    epoll_ctl(epollfd, EPOLL_CTL_ADD, socket, EPOLLIN);  // register the "buffer not empty" event, i.e. data has arrived
    epoll_ctl(epollfd, EPOLL_CTL_DEL, socket, EPOLLOUT); // deregister the "buffer not full" event, i.e. the stream can be written
  • epoll_wait(epollfd, ...) waits until a registered event occurs.

(Note: when you write to a non-blocking stream whose buffer is full, or read from one whose buffer is empty, write/read returns -1 and sets errno = EAGAIN. epoll only cares about the "buffer not full" and "buffer not empty" events.)
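In the real API, epoll_ctl takes a pointer to a struct epoll_event rather than a bare flag. A minimal registration sketch (the function name register_socket and the descriptor sock are assumptions for the example) could look like this:

#include <stdio.h>
#include <sys/epoll.h>

/* Sketch: create an epoll object and register interest in "buffer not empty"
 * (EPOLLIN) on an existing socket descriptor `sock`. */
int register_socket(int sock) {
    int epollfd = epoll_create1(0);           /* newer form of epoll_create() */
    if (epollfd == -1) { perror("epoll_create1"); return -1; }

    struct epoll_event ev;
    ev.events = EPOLLIN;                      /* "buffer not empty": data can be read */
    ev.data.fd = sock;                        /* epoll_wait hands this back to us */
    if (epoll_ctl(epollfd, EPOLL_CTL_ADD, sock, &ev) == -1) {
        perror("epoll_ctl");
        return -1;
    }
    return epollfd;
}
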
In epoll mode, the loop looks like this:
while true {
    active_streams[] = epoll_wait(epollfd)
    for i in active_streams[] {
        read or write till unavailable
    }
}
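Fleshed out with the real calls, that loop could look roughly like the sketch below (epollfd is assumed to come from a registration step like the one above, and the registered descriptors are assumed to be non-blocking; the function name epoll_loop is my own).

#include <sys/epoll.h>
#include <unistd.h>

#define MAX_EVENTS 64

/* Sketch of an epoll event loop: sleep in epoll_wait, then touch only the
 * streams that actually have events. */
void epoll_loop(int epollfd) {
    struct epoll_event events[MAX_EVENTS];
    char buf[4096];
    for (;;) {
        int n = epoll_wait(epollfd, events, MAX_EVENTS, -1);  /* block until something happens */
        for (int i = 0; i < n; i++) {         /* only the active streams, not all of them */
            if (events[i].events & EPOLLIN) {
                int fd = events[i].data.fd;
                while (read(fd, buf, sizeof buf) > 0)
                    ;                         /* drain: read until the call reports EAGAIN */
            }
        }
    }
}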

For more details about epoll and how to use it, see the man pages and Google; for the implementation details, see the Linux kernel source.

To sum up, the epoll principle is:
  1. Hand the files you want monitored for reading/writing over to the kernel (epoll_ctl with EPOLL_CTL_ADD).
  2. Register the events you care about (also via epoll_ctl), for example read events.
  3. Then wait (epoll_wait). If no file has an event you care about, the call sleeps until some event wakes it up.
  4. The events that occurred are then returned to you.

To achieve concurrency you also need non-blocking reads and writes: that way you can fully service all the files (sockets) that are currently ready, instead of being held up because one slow file blocks you, and concurrency follows.
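Switching a descriptor to non-blocking mode is typically done with fcntl; a minimal sketch of my own (the helper name make_nonblocking is an assumption):

#include <fcntl.h>

/* Sketch: put an existing descriptor into non-blocking mode so that read/write
 * return -1 with errno == EAGAIN instead of putting the thread to sleep. */
int make_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);        /* read the current file status flags */
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);   /* add O_NONBLOCK */
}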


At the lowest layer the kernel is driven by interrupts, a mechanism that works like a callback rather than polling. With that in mind, see the answer by @blue parameter above.
