UNIX Network Programming, Chapter 6: I/O Models

Source: Internet
Author: User
Tags: connect, socket, POSIX, set socket, signal handler, socket error

The premise: receiving or reading data involves two steps.

1. Wait for the data to be ready.

2. Copy the data from the kernel to the process.


For a network I/O operation on a socket (using read as an example), two parties are involved: the process (or thread) that calls the I/O function, and the system kernel. When a read operation occurs, it goes through two phases:
1. Waiting for the data to be ready
2. Copying the data from the kernel to the process
It is important to remember these two points, because the differences between the I/O models lie in how each behaves in these two phases.

There are five I/O models:

  • Blocking I/O
  • Non-blocking I/O
  • I/O multiplexing (select and poll)
  • Signal-driven I/O (SIGIO)
  • Asynchronous I/O (the POSIX aio_ functions)

 

1 Blocking I/O

In Linux, all sockets are blocking by default. A typical read operation proceeds like this:

When the user process calls the recvfrom system call, the kernel starts the first phase of I/O: preparing the data. For network I/O, the data often has not arrived yet (for example, a complete UDP datagram has not been received), so the kernel must wait for enough data to arrive. On the user side, the whole process is blocked. When the kernel has the data ready, it copies the data from kernel space into user memory and returns the result; only then does the user process unblock and resume running.
Blocking I/O is therefore characterized by the process being blocked in both phases of the I/O operation.

In other words, recvfrom does not return until the data is ready and has been copied from the kernel to the process, or an error occurs (a common one being interruption by a signal); recvfrom is a slow system call.

Almost everyone's first exposure to network programming starts with interfaces such as listen(), send(), and recv(). Using these interfaces, it is easy to build a server/client model.

Suppose we want to build a simple server program that provides a single client with a "one question, one answer" service.

Figure 1. Simple Server/client Model

Notice that most socket interfaces are blocking. A blocking interface is a system call (generally an I/O interface) that does not return until it has a result, keeping the current thread blocked; it returns only when the call completes, times out, or fails.

In fact, almost all I/O interfaces (including socket interfaces) are blocking unless otherwise specified. This is a big problem for network programming: while send() is being called, for example, the thread is blocked, and during that time it can neither perform other work nor respond to other network requests. This poses a challenge for servers with multiple clients and multiple pieces of business logic. At this point, many programmers turn to multithreading.

A simple improvement is to use multiple threads (or multiple processes) on the server side. The idea is to give each connection its own thread (or process), so that a blocked connection does not affect the others. There is no fixed rule for choosing between processes and threads. Traditionally, a process costs far more than a thread, so multi-process is not recommended when serving many clients at once; but if a single service task consumes a large amount of CPU, for example large-scale or long-running computation or file access, a process is safer. Typically, pthread_create() is used to create a new thread and fork() to create a new process.
Suppose we place a higher requirement on the server/client model above: the server should provide the Q&A service to multiple clients at the same time. We then arrive at the following model.

Figure 3 multi-threaded Server Model

In the thread/time diagram above, the main thread keeps waiting for client connection requests; when a connection arrives, it creates a new thread that provides the same Q&A service for that connection.
Many beginners wonder why a socket can be accepted multiple times. In fact, the socket designers anticipated multiple clients by having accept() return a new socket. The accept interface has the following prototype:
int accept(int s, struct sockaddr *addr, socklen_t *addrlen);
The input parameter s is the socket descriptor carried over from socket(), bind(), and listen(). Once bind() and listen() have executed, the operating system starts listening for connection requests on the specified port; each incoming request is added to a request queue. The accept() call extracts the first connection from the request queue of socket s and creates a new socket, similar to s, whose descriptor it returns. This new descriptor is what the subsequent read() and recv() calls use. If the request queue is empty, accept() blocks until a request enters the queue.
The multi-threaded server model above seems to satisfy the requirement of serving multiple clients perfectly, but not entirely. If the server must respond to hundreds or thousands of simultaneous connection requests, both threads and processes consume significant system resources and reduce the system's responsiveness to the outside world, and the threads or processes themselves become more likely to hang.
Many programmers then consider a "thread pool" or "connection pool". A thread pool reduces the frequency of thread creation and destruction: it maintains a reasonable number of threads and lets idle threads take on new tasks. A connection pool maintains a cache of connections, reusing existing connections as much as possible and reducing the frequency of opening and closing them. Both techniques effectively lower system overhead and are widely used in many large systems, such as WebSphere, Tomcat, and various databases. However, thread pools and connection pools only relieve the resource cost of frequent I/O interface calls. Moreover, a "pool" always has an upper limit; when requests exceed that limit, a pooled system responds to the outside world little better than one without a pool. So when using a pool, the expected response scale must be considered and the pool sized accordingly.
For the thousands or even tens of thousands of simultaneous client requests in the example above, a thread pool or connection pool may relieve some of the pressure, but it cannot solve everything. In short, the multi-threaded model handles small-scale service requests simply and efficiently, but in the face of large-scale requests it too hits a bottleneck. One can then try to solve the problem with non-blocking interfaces.

 

2 Non-blocking I/O

In Linux, a socket can be set to non-blocking. A read operation on a non-blocking socket looks like this:

When a user process issues a read operation and the kernel's data is not ready, the kernel does not block the process but immediately returns an error. From the user's perspective, the read returns a result right away; when that result is an error, the process knows the data is not ready yet and can issue the read again. Once the data in the kernel is ready and the process's system call arrives again, the kernel immediately copies the data into user memory and returns.
The user process therefore has to keep actively asking the kernel whether the data is ready. This is polling, and it wastes a great deal of CPU.

Non-blocking I/O is thus characterized by: the process is not blocked while waiting for data, but is still blocked while the data is being copied from the kernel.

 

The problems above stem, to some extent, from the blocking nature of the I/O interfaces. Multithreading is one solution; using non-blocking interfaces is another.

The significant difference between non-blocking and blocking interfaces is that non-blocking calls return immediately. A descriptor fd can be set to the non-blocking state with the following call:

fcntl( fd, F_SETFL, O_NONBLOCK );

The following shows how a single thread can check whether data has arrived on multiple connections at once and receive it.

Figure 3. Use a non-blocking receiving data model

In the non-blocking state, recv() returns immediately after being called, and its return value has different meanings. In this example:

  • recv() returns a value greater than 0: data was received, and the value is the number of bytes received;
  • recv() returns 0: the connection has been closed normally;
  • recv() returns -1 with errno equal to EAGAIN: the recv operation has not completed yet;
  • recv() returns -1 with errno not equal to EAGAIN: the recv operation hit a system error, identified by errno.

The server thread can thus call recv() in a loop and receive data from all connections in a single thread.

However, this model is never recommended, because calling recv() in a loop drives CPU usage way up. In this scheme recv() mostly serves to probe whether an operation has completed, and operating systems provide more efficient interfaces for that purpose, such as select().

3 I/O multiplexing

The basic idea is that the select/epoll function continuously polls all the sockets it is responsible for; when some socket has data, it notifies the user process. The flow is as follows.

When a user process calls select, the entire process blocks. At the same time, the kernel "monitors" all the sockets select is responsible for; when the data in any of them is ready, select returns. The user process then calls a read operation to copy the data from the kernel into the process. This picture is not very different from the blocking I/O one; in fact it is worse, because two system calls (select and recvfrom) are needed, whereas blocking I/O needs only one (recvfrom). The advantage of select is that it can handle many connections at once. (So if the number of connections to handle is not very high, a web server using select/epoll does not necessarily perform better than one using multithreading plus blocking I/O, and may have higher latency. The strength of select/epoll is not handling a single connection faster, but handling more connections.)
In the I/O multiplexing model, each socket is in practice generally set to non-blocking. However, as the figure shows, the whole user process is still blocked the entire time; it is blocked by the select function rather than by socket I/O.

The I/O multiplexing model is therefore characterized by blocking in both phases, but waiting for data blocks on select, while copying the data blocks on recvfrom.

 


Most Unix/Linux systems support the select function, which detects state changes on multiple file descriptors. The select interface has the following prototype:
void FD_ZERO(fd_set *fds);
void FD_SET(int fd, fd_set *fds);
int FD_ISSET(int fd, fd_set *fds);
void FD_CLR(int fd, fd_set *fds);
int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds,
           struct timeval *timeout);
Here the fd_set type can be understood simply as a bit array marking descriptors. For example, to mark descriptor 16 in an fd_set, bit 16 of the fd_set is set to 1. Macros such as FD_SET and FD_ISSET set and test specific bits. In select(), readfds, writefds, and exceptfds are both input and output parameters. If the input readfds has descriptor 16 marked, select() checks whether descriptor 16 is readable; after select() returns, checking whether readfds still has descriptor 16 marked tells you whether a "readable" event occurred on it. In addition, a timeout can be set.
Next, we re-model the earlier example of receiving data from multiple clients, this time using select().

Figure 7 receiving data model using select ()

This model only describes receiving data from multiple clients simultaneously through the select() interface. Since select() can detect the read, write, and error status of multiple descriptors at once, it is easy to build a server that provides an independent Q&A service to each of multiple clients, as shown below.

Figure 8 event-driven server model using the select () Interface

Note that a connect() operation on the client triggers a "readable event" on the server, so select() can also detect connect() behavior from the client.
In the model above, the key is how to dynamically maintain select()'s three parameters readfds, writefds, and exceptfds. As input parameters, readfds should mark all descriptors on which a "readable event" is to be detected, always including the "parent" descriptor that detects connect(); at the same time, writefds and exceptfds should mark all descriptors on which "writable events" and "error events" are to be detected (marked using FD_SET).
As output parameters, readfds, writefds, and exceptfds hold the descriptors on which select() caught events. The programmer checks every flag bit (with FD_ISSET()) to determine which descriptors had events.
The model above mainly simulates a "one question, one answer" service. So when select() finds that a descriptor caught a "readable event", the server program should promptly recv(), prepare the data to send based on what it received, add that descriptor to writefds, and get ready for the next round of select()'s "writable event" detection. Likewise, when select() finds that a descriptor caught a "writable event", the program should promptly send() and prepare for the next "readable event" detection. The figure below describes one execution cycle of this model.

Figure 9 Execution cycle of a multiplexing Model

This model's characteristic is that each execution cycle detects one or more events, and a specific event triggers a specific response. We can classify it as an "event-driven model".
Compared with the other models, the select()-based event-driven model executes with only a single thread (process), consumes few resources, does not hog the CPU, and can serve multiple clients. If you are trying to build a simple event-driven server program, this model has some reference value.
But the model still has many problems. First, the select() interface is not the best choice for implementing "event driven": when the number of descriptors to test is large, select() itself wastes a lot of time polling each one. Many operating systems provide more efficient interfaces, such as epoll on Linux, kqueue on BSD, and /dev/poll on Solaris. To implement a more efficient server program, an epoll-like interface is recommended. Unfortunately, the epoll-like interfaces offered by different operating systems differ considerably, so using them to build a server with good cross-platform portability is difficult.
Second, this model mixes event detection and event response together; once the body of an event response grows large, it is disastrous for the whole model. As the example below shows, the large response body 1 directly delays the response to event 2, greatly reducing the timeliness of event detection.

Figure 10 impact of a large execution body on the event-driven model using select ()

Fortunately, many efficient event-driven libraries shield these difficulties. Common ones include the libevent library and, as a libevent replacement, the libev library. These libraries choose the most suitable event-detection interface for the operating system's characteristics and use techniques such as signals to support asynchronous response, which makes them the best choice for building event-driven models. A later chapter describes how to use the libev library in place of the select or epoll interface to build an efficient, stable server model.

In fact, since kernel 2.6 Linux has provided I/O operations that support asynchronous response, such as aio_read and aio_write. This is asynchronous I/O.

 

4 Signal-driven I/O (SIGIO)

We can also use signals, telling the kernel to notify us with the SIGIO signal when the descriptor is ready. We call this signal-driven I/O and show a summary of it in Figure 6.4.

We first enable the socket for signal-driven I/O (as described in Section 25.2) and install a signal handler using the sigaction system call. The return from this system call is immediate and our process continues; it is not blocked. When the data is ready to be read, the SIGIO signal is generated for our process. We can either read the data from the signal handler by calling recvfrom and then notify the main loop that the data is ready to be processed (this is what Section 25.3 does), or we can notify the main loop and let it read the data.

Regardless of how we handle the signal, the advantage of this model is that we are not blocked while waiting for the data to arrive. The main loop can continue executing and just wait to be notified by the signal handler that either the data is ready to process or the data is ready to be read.

So the feature of signal-driven I/O is that the process is not blocked in the first phase; when the data is ready, SIGIO notifies the process, and the data copy still blocks on recvfrom. The advantage is that the process can keep executing in the meantime.

Note: signal-driven I/O resembles the I/O multiplexing model, but where multiplexing blocks in a passive wait, signal-driven I/O is actively notified when the data is ready.

5 Asynchronous I/O

Asynchronous I/O is rarely used on Linux. Its flow looks like this:

After the user process initiates the read operation, it can immediately go do other things. From the kernel's perspective, when it receives an asynchronous read it returns immediately, so the user process is never blocked. The kernel then waits for the data to be ready and copies it into user memory; when all of this is done, the kernel sends a signal to the user process telling it that the read has completed.

 

Differences between blocking and non-blocking I/O

A blocking I/O call blocks the corresponding process until the operation completes, while a non-blocking I/O call returns immediately when the kernel is still preparing the data. Both, however, block while copying data from the kernel to the application.

Differences between synchronous Io and asynchronous Io

Before describing the difference between synchronous and asynchronous I/O, the two must be defined. Stevens' definitions (actually the POSIX definitions) are:
A synchronous I/O operation causes the requesting process to be blocked until that I/O operation completes;
An asynchronous I/O operation does not cause the requesting process to be blocked;
 
The difference is that synchronous I/O blocks the process during the "I/O operation", which consists of two parts: waiting for the data plus copying the data. The data copy in both blocking and non-blocking I/O necessarily blocks. By this definition, the blocking I/O, non-blocking I/O, and I/O multiplexing described above are all synchronous I/O. Some will object that non-blocking I/O is not blocked; here is the "tricky" part. The "I/O operation" in the definition refers to the real I/O operation, i.e. the recvfrom system call in the example. When non-blocking I/O executes recvfrom and the kernel data is not ready, the process is not blocked. But when the kernel data is ready, recvfrom copies it from the kernel into user memory, and during that time the process is blocked. Asynchronous I/O is different: when the process initiates an I/O operation, the call returns directly and the process ignores it until the kernel sends a signal saying the I/O is complete. Throughout the operation, the process is never blocked.

Comparison of I/O models:

After the introduction above, the difference between non-blocking I/O and asynchronous I/O should be quite obvious. In non-blocking I/O, although the process is unblocked most of the time, it must actively check, and once the data is prepared it must actively call recvfrom again to copy the data into user memory. Asynchronous I/O is entirely different: it is as if the user process hands the whole I/O operation over to someone else (the kernel) to finish, and that someone sends a signal when done. During this period the user process neither needs to check the I/O status nor actively copy the data.

References:
I/O — synchronous, asynchronous, blocking, non-blocking: http://blog.csdn.net/historyasamirror/article/details/5778378

Using event-driven models for efficient and stable network server programs: http://www.ibm.com/developerworks/cn/linux/l-cn-edntwk/

 

 

Select Function

1) Function:

It allows a process to instruct the kernel to wait for any one of multiple events, and to wake the process only when one or more of those events occurs or when a specified amount of time has elapsed. (This covers the case mentioned above of handling socket descriptors while also waiting for user input.) (The Berkeley implementation allows I/O multiplexing on any descriptor.)

2) Function definition

#include <sys/time.h>
#include <sys/select.h>

int select(int maxfdp1, fd_set *readset, fd_set *writeset,
           fd_set *exceptset, const struct timeval *timeout);

struct timeval {
    long tv_sec;    /* seconds */
    long tv_usec;   /* microseconds */
};

 

Parameter introduction:

Timeout: tells the kernel how long to wait for any of the specified descriptors to become ready. There are three possibilities:

    • Wait forever: pass a null pointer;
    • Wait a fixed period: wait no longer than the time specified in the timeval;
    • Do not wait at all (this is called polling): set both the seconds and microseconds members of the timeval structure to 0.

Exceptset: currently only two exception conditions are supported:

    • Arrival of out-of-band data on a socket (discussed in Chapter 24);
    • A pseudo-terminal that has been put into packet mode has control status information readable from its master (not covered in this book).

Readset, writeset: the descriptors the kernel should check for readability and writability;

Maxfdp1: the number of descriptors to test — descriptors 0 through maxfdp1-1 are tested (FD_SETSIZE is the maximum number of descriptors an fd_set can hold).

Return value: the number of ready descriptors if there are any; 0 if the timeout expired; -1 on error.

Select can be used as a timer by setting all three descriptor sets to null. This timer is more precise than sleep: sleep is measured in seconds, while select's timeout has microsecond resolution.

3) fd_set operations

fd_set rset;  /* a newly defined fd_set must be initialized with FD_ZERO;
                 an automatic variable's initial contents are unpredictable
                 and may cause unexpected behavior */
void FD_ZERO(fd_set *fdset);          /* initialize the set: all bits off */
void FD_SET(int fd, fd_set *fdset);   /* turn on the bit for fd in fdset */
void FD_CLR(int fd, fd_set *fdset);   /* turn off the bit for fd in fdset */
int FD_ISSET(int fd, fd_set *fdset);  /* is the bit for fd on in fdset? */

4) Socket readiness conditions

Conditions for a socket to be ready for reading:

a) The number of bytes of data in the socket receive buffer is greater than or equal to the current low-water mark of the receive buffer. A read on such a socket will not block and will return a value greater than 0 (the data available to be read). The low-water mark can be set with the SO_RCVLOWAT socket option; for TCP and UDP sockets its default is 1.

b) The read half of the connection is closed (i.e., a TCP connection that has received a FIN). A read on such a socket will not block and will return 0 (i.e., EOF).

c) The socket is a listening socket (i.e., listen has been called on it; after the listen call, a socket changes from the default actively connecting socket into a listening one) and the number of completed connections is nonzero. An accept on such a socket normally does not block. This is the basis of "select() can also detect connect() behavior from the client" mentioned above. (A later post could describe a timing condition under which accept can still block.)

d) A socket error is pending. A read on such a socket will not block and will return -1 (i.e., an error) with errno set to the specific error condition. Such pending errors can also be fetched and cleared by calling getsockopt with the SO_ERROR socket option.

Conditions for a socket to be ready for writing:

a) The number of bytes of available space in the socket send buffer is greater than or equal to the current low-water mark of the send buffer, and either the socket is connected or it does not require a connection (e.g., a UDP socket). This means that if such a socket is set to non-blocking, a write will not block and will return a positive value (e.g., the number of bytes accepted by the transport layer). The low-water mark can be set with the SO_SNDLOWAT socket option; for TCP and UDP its default is usually 2048.

b) The write half of the connection is closed. A write on such a socket generates a SIGPIPE signal. (With an established connection, if one end closes it and the other end keeps writing: the first write elicits an RST from the peer, and a subsequent write causes the kernel to send SIGPIPE to the process, notifying it that the connection is broken. The default disposition of SIGPIPE terminates the program.)

c) A non-blocking connect on the socket has completed, either by establishing the connection or by failing. (For a blocking socket, calling connect starts the TCP three-way handshake and returns only when the connection is established or an error occurs. For a non-blocking socket, connect returns -1 with errno set to EINPROGRESS, meaning the connection setup has started but not yet finished; a return of 0 means the connection is already established, which usually happens when client and server are on the same host.)

d) A socket error is pending. A write on such a socket will not block and will return -1 (i.e., an error) with errno set to the specific error condition. These pending errors can also be fetched and cleared by calling getsockopt with SO_ERROR.

Note: when an error occurs on a socket, select marks it as both readable and writable.

 

