Analysis of the Five I/O Models

Source: Internet
Author: User
Tags: epoll


Directory:
1. Basics
2. I/O model
2.1 Blocking I/O model
2.2 Non-Blocking I/O model
2.3 I/O Multiplexing Model
2.4 Signal-driven I/O model
2.5 Asynchronous I/O model
2.6 Distinguishing synchronous vs. asynchronous I/O and blocking vs. non-blocking
3. select(), poll(), and epoll
3.1 select() & poll()
3.2 epoll

1. Basics

Before introducing the I/O models, let's first walk through what a piece of data "experiences" during an I/O wait.

When a program, or an existing process/thread (we will not distinguish between them here and will simply say "process"), needs some data, it can only access and modify memory in its own user space; call this memory the app buffer. If the required data is on disk, the process must first initiate a system call asking the kernel to load the file from disk. Normally, however, the data can only be loaded into a kernel-space buffer, called the kernel buffer. After the data is loaded into the kernel buffer, it must still be copied into the app buffer; only then can the process access and modify it.

There are several questions to address here.

(1) Why can't data be loaded directly into the app buffer?

Actually, some programs and hardware can bypass the kernel to improve efficiency and performance, transmitting data directly between the storage device and the app buffer without kernel involvement. RDMA, for example, relies on exactly this kind of kernel bypass.

In most cases, however, data must first be written into the kernel buffer and then copied into the app buffer, for security and stability: this prevents processes from reaching directly into kernel space.

(2) How many data copies are involved above? Is each copy performed the same way?

No, the copies differ. Current storage devices (including NICs) generally support DMA (direct memory access). Simply put, DMA lets data move between memory and a device without involving the computer's CPU; the transfer is driven by a chip on the hardware itself (which can be thought of as a small CPU).

If a storage device did not support DMA, every transfer between memory and the device would require the computer's CPU to work out the source address, the destination address, and the number and location of the data blocks to move, which is a lot of CPU work for a single transfer. DMA frees the computer's CPU to handle other tasks.

In contrast, copying between the kernel buffer and the app buffer is a transfer between two regions of memory, which can only be driven by the CPU.

Therefore, loading disk data into the kernel buffer is a DMA copy, while moving data from the kernel buffer to the app buffer is a CPU copy.

(3) What happens when data is transmitted over a TCP connection?

For example, a web service's response data must be transmitted to the client over a TCP connection.

The TCP/IP protocol stack maintains two buffers: the send buffer and the recv buffer, collectively called the socket buffers. Data to be transmitted over a TCP connection must first be copied into the send buffer and is then transmitted over the network via the NIC. Data received over a TCP connection first enters the recv buffer through the NIC and is then copied into the user space's app buffer.

Similarly, copying data into the send buffer, or from the recv buffer into the app buffer, is done by the CPU, while copying from the send buffer to the NIC, or from the NIC to the recv buffer, uses DMA.

The figure (omitted here) shows the process of data transmission through a TCP connection.

(4) Does network data have to be copied from the kernel buffer to the app buffer and then to the send buffer?

Not necessarily. If the process does not need to modify the data before sending it to the other end of the TCP connection, the data can be copied from the kernel buffer directly into the send buffer instead of going through the app buffer first. This is zero-copy technology.

For example, when httpd does not need to read or modify any of the content, it would normally copy the original data into the app buffer and then into the send buffer before transmitting it. With zero-copy, the copy into the app buffer is skipped, eliminating one copy and improving efficiency.

Of course, there are several ways to implement zero-copy. For details, see my other article on zero-copy techniques.

The following is the complete data flow when an httpd process handles a file request.

Generally, the client requests a file over a TCP connection, and the request data enters the TCP recv buffer. The data is then read into the app buffer via recv(); the httpd worker process parses it and learns that a file is being requested, so it initiates a system call such as read() to fetch the file. The kernel loads the file, copying the data from disk into the kernel buffer and then into the app buffer. httpd now constructs the response, possibly modifying the data (for example, adding a field to the response header), copies the modified (or unmodified) data into the send buffer (for example via send()), and the data is transmitted to the client over the TCP connection.

2. I/O model

An I/O model describes the state of the process and the way data is handled while an I/O wait occurs: what happens while data is prepared into the kernel buffer and then moved into the app buffer. Getting the data into the kernel buffer is called the data preparation phase; copying the data from the kernel buffer into the app buffer is called the data copy phase. Remember these two terms; they are used throughout the model descriptions below.

This article uses an httpd process serving local files over TCP connections as its running example. Please ignore whether httpd actually implements things this way, and ignore the data-handling details of TCP connections; it is just a convenient illustration. Also, a local file is not an ideal subject for discussing I/O models, whose main concern is sockets. To see the I/O models applied to sockets handling TCP/UDP, read this article, combine it with my other article, "The unknown socket and TCP connection process", and revisit the models.

Once more: transferring data between a hardware device and memory does not require the CPU, but transferring data between two regions of memory does.

2.1 Blocking I/O model

If a client requests the file index.html, httpd needs to load the data of index.html from disk into its app buffer and then copy it into the send buffer for transmission.

When httpd wants to send index.html, it first checks whether its app buffer already holds the file's data and, if not, initiates a system call such as read() to ask the kernel to load it. The kernel first checks whether its kernel buffer contains the data for index.html; if not, it loads the data from disk into the kernel buffer (data preparation) and then copies it into the app buffer, where the httpd process can finally work on it.

If you use the Blocking I/O model:

(1) Under the blocking I/O model, httpd blocks from the moment it issues the call.
(2) httpd is woken to process the data in the app buffer only when the data has been copied into the app buffer or an error occurs.
(3) The CPU goes through two context switches: from user space to kernel space, then back to user space.
(4) Because the data preparation phase does not need the CPU, the CPU can run other processes' tasks while the data is being prepared.
(5) The data copy phase does require the CPU, so blocking httpd helps speed up the copy to some extent.
(6) This is the easiest and simplest I/O model.

(The figure illustrating the blocking I/O model is omitted here.)

2.2 Non-Blocking I/O model

(1) When the descriptor is set to non-blocking, read() immediately returns the error EWOULDBLOCK instead of putting httpd to sleep. (Whether read() on a regular file can actually return EWOULDBLOCK is beside the point; the I/O models are mainly about sockets, where the call is recv() rather than read().) This is exactly what UNP describes:

When we set a socket to be nonblocking, we are telling the kernel "when an I/O operation that I request cannot be completed without putting the process to sleep, do not put the process to sleep, but return an error instead."

(2) Although read() returns immediately, httpd must keep issuing read() to ask the kernel: has the data been prepared in the kernel buffer yet? This is called polling. On each poll, as long as the kernel has not prepared the data, read() returns EWOULDBLOCK.
(3) Once data preparation in the kernel buffer completes, the polling read() no longer returns EWOULDBLOCK; instead, httpd blocks while the data is copied into the app buffer.
(4) So httpd is not blocked during the data preparation phase; it keeps polling with read(). It blocks only during the data copy phase, when the CPU is handed to the kernel to copy the data into the app buffer.

(The figure illustrating the non-blocking I/O model is omitted here.)

2.3 I/O Multiplexing Model

Also called the multiplexed I/O model or I/O multiplexing, this model can check multiple I/O wait states at once. There are three multiplexing mechanisms: select, poll, and epoll. Each is essentially a function for monitoring whether the data of specified file descriptors is ready. Readiness means a system call would no longer block; for read, it means the data has been prepared. Readiness types include readable, writable, and exceptional conditions, and the readable condition includes whether the data has been prepared. When a descriptor is ready, the process is notified, and the process then issues the system call that operates on the data, such as read(). So these three functions only handle readiness detection and process notification. They can be combined with blocking or non-blocking modes: for example, when a monitored descriptor is set to non-blocking, select()/poll()/epoll will not block on that descriptor, and the calling process/thread is not blocked on it either.

select() and poll() are similar; their monitoring and notification mechanisms are the same, though poll() is a bit smarter. So here, select() monitoring a single file request is used to briefly describe I/O multiplexing. Monitoring multiple files, and epoll, are explained in more detail later in this article.

(1) Consider httpd about to load a file with a read() system call. Whether blocking or non-blocking, read() returns based on whether the data is prepared. Could httpd instead proactively monitor whether data has been prepared in the kernel buffer, or whether new data has arrived in the recv buffer? That is exactly what select()/poll()/epoll provide.
(2) When select() is used, httpd issues a select() system call and is then "blocked" by select(). Since only one file request is being monitored here, select() wakes the httpd process directly once the data has been prepared in the kernel buffer. "Blocked" is in quotes because select() has a timeout option controlling how long it blocks: set to 0, select() does not block but returns immediately and relies on repeated polling to check readiness; it can also be set to block forever.
(3) When select()'s monitored object becomes ready, the httpd process is notified (in polling mode) or woken (in blocking mode). httpd then issues the read() system call, the data is copied from the kernel buffer into the app buffer, and read() returns successfully.
(4) httpd blocks after issuing this second system call (the read()), and the CPU is handed entirely to the kernel to copy the data into the app buffer.
(5) When httpd handles only one connection, the multiplexing model is actually worse than the blocking I/O model, because it issues two system calls (select() and then read()) and, when polling, consumes CPU continuously. The strength of I/O multiplexing is the ability to monitor many file descriptors at once.

For more details, see the end of this article.

2.4 Signal-driven I/O model

That is, the signal-driven I/O model. When signal-driven I/O is enabled, the process first makes a system call to install a signal handler, such as sigaction(), and that call returns immediately. When the data is ready, the kernel sends the process a SIGIO signal; on receiving it, the process knows the data is ready and issues the system call that operates on the data, such as read().

After installing the signal handler, the process is not blocked. However, when read() copies the data from the kernel buffer to the app buffer, the process does block.

2.5 Asynchronous I/O model

That is, the asynchronous I/O model. Under this model, httpd first issues an asynchronous system call (such as aio_read() or aio_write()), which returns immediately. The asynchronous call tells the kernel not only to prepare the data but also to copy it all the way into the app buffer.

httpd is therefore not blocked at any point between the call returning and the data landing in the app buffer; when the copy completes, a signal is sent to the httpd process.

Asynchronous looks attractive, but note that copying from the kernel buffer to the app buffer requires the CPU, which means httpd competes with the asynchronous copy work for CPU time. Under high concurrency, the more connections httpd handles, the worse the CPU contention and the later the asynchronous completion signal arrives. If this is not handled well, the asynchronous I/O model is not necessarily better.

2.6 Distinguishing synchronous vs. asynchronous I/O and blocking vs. non-blocking

Blocking, non-blocking, I/O multiplexing, and signal-driven I/O are all synchronous I/O models, because the system call that actually operates on the data (read() in this article) blocks. Note that although the data preparation phase (loading data into the kernel buffer) may or may not block, the kernel buffer is the object read() operates on: synchronization here means keeping the kernel buffer and the app buffer in sync. The process obviously must block while the kernel buffer and app buffer are being synchronized; otherwise the read() would have become an asynchronous read.

Only the asynchronous I/O model is truly asynchronous, because the asynchronous system call (such as aio_read()) does not care when the kernel buffer becomes ready. It is like a read running in the background: aio_read() waits on the kernel buffer's data for the process and, once the data is prepared, naturally copies it into the app buffer itself.

3. select(), poll(), and epoll

As mentioned above, these three functions monitor file descriptor state: they watch a set of files for a set of events, and when an event fires, the descriptor is considered ready (or in error). There are roughly three event classes: readable events, writable events, and exception events. They are usually placed inside a loop structure for continuous monitoring.

select() and poll() work essentially the same way, though poll() is a little more advanced, and epoll is far more advanced than both. Of course, in some scenarios the more advanced mechanism does not necessarily outperform the older ones.

3.1 select() & poll()

First, the FD_SET macro is used to build the set of descriptors to be monitored, and this set is passed as a parameter to select(). The blocking interval of select() can also be specified. With that, select() has its monitoring object.

Besides ordinary file descriptors, sockets can be monitored too. Since a socket is also a file, select() can monitor socket file descriptors, for example whether the recv buffer has received data (socket readability) or whether the send buffer is full (socket writability). By default, select() can monitor at most 1024 file descriptors; poll() has no such limit.

select()'s timeout parameter supports three modes:
(1) Block for a specified interval, unless a ready event occurs first.
(2) Block forever, until a ready event occurs.
(3) Do not block at all, i.e. return immediately. Since select() usually sits inside a loop structure, this amounts to polling.

Once the monitoring object is created, the kernel monitors the descriptor set while the process that called select() blocks (or polls). When a readiness condition is met (a monitored event occurs), select() is woken (or the poll succeeds) and returns the number of descriptors that meet the readiness condition. It is a count rather than a single descriptor because several descriptors may become ready at the same time. Since only the count is returned, the FD_ISSET macro must then be used (typically in an if statement inside the loop) to traverse the set until all ready descriptors are found. Finally, the descriptor set is copied back to user space for the process to handle.

The general flow of monitoring a descriptor set is shown in the figure (omitted here); select() is only one of the steps:

The cyclic monitoring process works as follows:

(1) First, initialize the descriptor set with the FD_ZERO macro. (In the figure, each small square represents a file descriptor.)
(2) Build the descriptor set with the FD_SET macro; the file descriptors now set in it are the objects select() will monitor.
(3) Monitor the set with select(). When one or more file descriptors meet the ready condition, select() returns the number of ready descriptors in the set. (The yellow squares in the figure are the descriptors meeting the ready condition.)
(4) Traverse the whole set with the FD_ISSET macro and hand the descriptors meeting the ready condition to the process. At the same time, remove those descriptors from the set with the FD_CLR macro.
(5) Enter the next loop iteration, add new descriptors to the set with FD_SET, and repeat steps (3) and (4).

In simple pseudocode:

FD_ZERO()
for (;;) {
    FD_SET()
    select()
    if (ready) {
        FD_ISSET()
        FD_CLR()
    }
    writen()
}

This is just one example of the monitoring loop; the exact approach varies, but the overall sequence of the process should now be clear.

3.2 epoll

epoll is more advanced than poll() and select(). Its advantages follow naturally from the points below:

(1) The epoll instance created by epoll_create() lets you add and remove file descriptors of interest at any time via epoll_ctl(), instead of having to rebuild the descriptor set's data structure with FD_SET on every loop iteration.
(2) When epoll_create() creates the epoll instance, it also creates an epoll ready list. Each time epoll_ctl() adds a descriptor to the epoll instance, it registers a callback function for that descriptor; when the descriptor meets a ready condition, the callback fires and moves it onto the ready list.
(3) When epoll_wait() is called to monitor, it only needs to check whether the ready list contains anything. If it does, the ready descriptors are copied to user space for the process to handle; if not, epoll_wait() blocks. (Of course, if it is set to non-blocking mode, it does not block and simply keeps checking.)

In other words, epoll does not need to traverse the descriptor set at all.

