I. Conceptual understanding
There are four IO-related concepts in Linux: synchronous (sync) and asynchronous (async), blocking and non-blocking.
Synchronous: when a call is made, it does not return until the result is available.
Asynchronous: when an asynchronous call is made, the caller does not get the result immediately. After the operation completes, the caller is informed through a notification mechanism or a callback function.
Blocking: the current thread is suspended until the result of the call is returned (the thread enters a non-runnable state in which the CPU gives it no time slices, so it stops running). The function returns only once the result has been obtained.
Note: synchronous and blocking are not the same thing. In a synchronous call the caller waits for the result, but the thread itself can stay active (for example, it can check for the result periodically and handle other requests in between), whereas a blocked thread is suspended and processes nothing else until it is woken.
Non-blocking: the call returns immediately, without blocking the current thread, even if the result is not yet available.
The difference between synchronous IO and asynchronous IO is whether the process is blocked while the data is being copied.
The difference between blocking IO and non-blocking IO is whether the application's call returns immediately.
II. Five types of I/O models under Linux
1. Blocking I/O
2. Non-blocking I/O
3. I/O multiplexing
4. Signal-driven I/O (SIGIO)
5. Asynchronous I/O
The first four are synchronous; only the last one is asynchronous IO.
Blocking IO model
The process blocks until the data copy is complete.
The application calls an IO function, which blocks the application while it waits for the data to be ready. Once the data is ready, the kernel copies it into user space and the IO function returns success. The blocking IO model is shown below:
[Figure: blocking IO model (Blockingio.jpg)]
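As a minimal sketch of the blocking model, the snippet below uses Python's standard socket module (Python is an assumption here, since the article names no language). The names blocking_read, produce, and delay are hypothetical and exist only for illustration. A helper thread makes the data "ready" after a short delay, and the main thread is suspended inside recv until then.

```python
import socket
import threading
import time


def blocking_read(delay=0.2):
    """Return (data, seconds spent inside the blocking recv call)."""
    # socketpair() gives two connected sockets; both are blocking by default.
    reader, writer = socket.socketpair()

    def produce():
        time.sleep(delay)           # the data is "not ready" for a while
        writer.sendall(b"noodles")

    producer = threading.Thread(target=produce)
    producer.start()

    start = time.monotonic()
    data = reader.recv(1024)        # the calling thread is suspended here
    waited = time.monotonic() - start

    producer.join()
    reader.close()
    writer.close()
    return data, waited


if __name__ == "__main__":
    print(blocking_read())
```

The time reported by the function is roughly the producer's delay, because recv does not return until the data has arrived and been copied.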
Non-blocking IO model
The process calls the IO function repeatedly (polling); while the data is not ready, each call returns immediately. The process is blocked only while the data is being copied. The model is shown below:
[Figure: non-blocking IO model (Nonblockingio.jpg)]
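The polling loop described above might look like the following sketch, again in Python with the standard socket module (an assumption, not something the article specifies). The names polling_read, delay, and interval are hypothetical. The socket is switched to non-blocking mode, so recv raises BlockingIOError instead of suspending the thread when no data is ready.

```python
import socket
import threading
import time


def polling_read(delay=0.2, interval=0.05):
    """Return (data, number of recv calls that found no data)."""
    reader, writer = socket.socketpair()
    reader.setblocking(False)       # recv now returns immediately

    # Hypothetical producer: sends the data after `delay` seconds.
    timer = threading.Timer(delay, writer.sendall, args=(b"noodles",))
    timer.start()

    empty_polls = 0
    while True:
        try:
            data = reader.recv(1024)
            break
        except BlockingIOError:     # EAGAIN/EWOULDBLOCK: data not ready yet
            empty_polls += 1
            time.sleep(interval)    # pause, then ask again

    timer.join()
    reader.close()
    writer.close()
    return data, empty_polls


if __name__ == "__main__":
    print(polling_read())
```

Every failed attempt is a wasted system call, which is the cost of this model.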
IO multiplexing model
The main mechanisms are select and epoll. For a single IO descriptor this costs two calls and two returns (one to wait for readiness, one to read); the key benefit is that many IO descriptors can be monitored at the same time. The model is shown below:
[Figure: IO multiplexing model (Iomultiplexing.jpg)]
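The "two calls, many descriptors" idea can be sketched with Python's select.select, which wraps the select() system call. The function name wait_for_ready and its parameters are hypothetical. Several socket pairs are created, data is written to some of them, and one select call reports which readers are ready before recv is called on each.

```python
import select
import socket


def wait_for_ready(n=3, active=(0, 2)):
    """Return {reader index: data} for the readers that select reports."""
    pairs = [socket.socketpair() for _ in range(n)]
    readers = [reader for reader, _ in pairs]

    # Make only some of the descriptors readable.
    for i in active:
        pairs[i][1].sendall(b"msg %d" % i)

    # First call: wait (up to 1 second) until some descriptor is readable.
    ready, _, _ = select.select(readers, [], [], 1.0)

    # Second call: read from each descriptor that was reported ready.
    result = {readers.index(r): r.recv(1024) for r in ready}

    for reader, writer in pairs:
        reader.close()
        writer.close()
    return result


if __name__ == "__main__":
    print(wait_for_ready())
```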
Signal-driven IO
Two calls, two returns.
First, signal-driven IO is enabled on the socket and a signal handler is installed; the process keeps running and is not blocked. When the data is ready, the process receives a SIGIO signal, and the handler can then call the IO function to process the data, as shown in the following model:
[Figure: signal-driven IO model (Signal-drivenio.jpg)]
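A rough sketch of these steps on Linux, using Python's signal and fcntl modules (an assumption; the article shows no code). The names sigio_read and on_sigio are hypothetical. The sketch assumes it runs in the main thread, because Python only allows signal handlers to be installed there.

```python
import fcntl
import os
import signal
import socket
import time


def sigio_read(timeout=2.0):
    """Return the list of chunks read from inside the SIGIO handler."""
    reader, writer = socket.socketpair()
    reader.setblocking(False)
    received = []

    def on_sigio(signum, frame):
        try:
            received.append(reader.recv(1024))  # the data is already there
        except BlockingIOError:
            pass    # SIGIO can also fire for events other than "readable"

    # Install the handler first, then ask the kernel to send SIGIO to us.
    previous = signal.signal(signal.SIGIO, on_sigio)
    fcntl.fcntl(reader, fcntl.F_SETOWN, os.getpid())
    flags = fcntl.fcntl(reader, fcntl.F_GETFL)
    fcntl.fcntl(reader, fcntl.F_SETFL, flags | os.O_ASYNC)

    writer.sendall(b"noodles")      # makes the data ready and triggers SIGIO

    # The process is free to keep running; here it just idles until notified.
    deadline = time.monotonic() + timeout
    while not received and time.monotonic() < deadline:
        time.sleep(0.01)

    # Stop signal delivery before restoring the old handler.
    fcntl.fcntl(reader, fcntl.F_SETFL, flags)
    reader.close()
    writer.close()
    if previous is not None:
        signal.signal(signal.SIGIO, previous)
    return received


if __name__ == "__main__":
    print(sigio_read())
```

The handler is installed before O_ASYNC is set, and O_ASYNC is cleared before the handler is restored, because the default action for SIGIO is to terminate the process.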
Asynchronous IO model
The process is not blocked even while the data is being copied. The model is as follows:
[Figure: asynchronous IO model (Asynchronousio.jpg)]
Comparison of 5 IO models
[Figure: comparison of the five IO models (io.jpg)]
If these models are hard to follow, here is an analogy based on going to a restaurant for a bowl of noodles. Please bear with any places where the analogy does not fit exactly:
Blocking IO: after ordering at the restaurant, I have to wait there until the noodles are done.
Non-blocking IO: after ordering the noodles, I can step out, but I do not know when they will be ready, so I have to come back and check every minute. This is busy waiting, and nothing else gets done.
IO multiplexing: this is like the restaurant adding a waiter. I deal with the waiter rather than the boss. After ordering, I wait in the restaurant for the waiter to tell me the noodles are ready, and during that time the waiter can also serve other customers. When the noodles are done, the waiter tells me.
Signal-driven IO: after ordering the noodles, I can go out. When they are ready the boss phones me, but I still have to carry the noodles to my seat myself.
Asynchronous IO: after ordering the noodles, I can go out, but before leaving I say where I will be sitting. When the noodles are ready, the boss carries them to that seat and then phones to tell me.
III. Introduction to select, poll, and epoll
epoll is specific to Linux, whereas select is specified by POSIX and is implemented by most operating systems.
select: polling
In essence, select works by setting and checking a data structure that holds the FD flag bits, and then carrying out the next step of processing. Its disadvantages are:
1. The number of FDs a single process can monitor is limited, which caps the number of connections it can listen on.
The limit is the compile-time constant FD_SETSIZE, which is 1024 on Linux with glibc. It should not be confused with the system-wide limit on open files, which generally depends on system memory and can be viewed with cat /proc/sys/fs/file-max.
2. Sockets are scanned linearly, that is, by polling, which is inefficient.
When there are many sockets, every call to select() walks up to FD_SETSIZE sockets to dispatch events, whether or not a given socket is active, which wastes a lot of CPU time. If a callback is instead registered for each socket and runs automatically when the socket becomes active, the polling is avoided; this is what epoll and kqueue do.
3. A large FD data structure has to be maintained, and copying it between user space and kernel space on every call is expensive.
poll
In essence, poll works the same way as select: it copies the array passed in by the user into kernel space and then queries the device state for each FD. If a device is ready, an entry is added to the device wait queue and the traversal continues. If no ready device is found after all FDs have been traversed, the current process is suspended until a device becomes ready or the timeout expires; once woken, it traverses the FDs again.
poll has no limit on the maximum number of connections, because the FDs are stored in a linked list, but it has these disadvantages:
1. The whole array of FDs is copied between user space and kernel space on every call, whether or not the copy is needed.
2. poll is level-triggered: if an FD is reported as ready and is not handled, the next poll call reports that FD again.
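The level-triggered behaviour in point 2 can be observed with a short sketch using Python's select.poll, which wraps the poll() system call. The function name poll_counts is hypothetical. One byte is written, poll is called twice without reading it, and then once more after the byte has been consumed.

```python
import select
import socket


def poll_counts():
    """Return how many ready FDs three successive poll calls report."""
    reader, writer = socket.socketpair()
    poller = select.poll()
    poller.register(reader, select.POLLIN)

    writer.sendall(b"x")

    first = poller.poll(100)    # timeout in milliseconds
    second = poller.poll(100)   # data still unread, so it is reported again
    reader.recv(1024)           # handle the event
    third = poller.poll(0)      # nothing left to report

    reader.close()
    writer.close()
    return len(first), len(second), len(third)


if __name__ == "__main__":
    print(poll_counts())
```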
epoll
epoll supports both level-triggered and edge-triggered modes. Its most distinctive feature is edge triggering, which reports only the FDs that have just become ready, and reports each of them only once. Another feature is that epoll uses event-based readiness notification: an FD is registered through epoll_ctl, and once that FD is ready, the kernel uses a callback-like mechanism to activate it so that epoll_wait is notified.
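The difference between the two trigger modes can be sketched with Python's select.epoll (Linux only), which wraps epoll_create, epoll_ctl, and epoll_wait. The function name epoll_counts is hypothetical. One byte is written and never read, and epoll is then asked three times in a row which FDs are ready.

```python
import select
import socket


def epoll_counts(edge_triggered):
    """Return the number of events from three successive epoll waits."""
    reader, writer = socket.socketpair()
    reader.setblocking(False)

    ep = select.epoll()
    mask = select.EPOLLIN
    if edge_triggered:
        mask |= select.EPOLLET
    ep.register(reader.fileno(), mask)   # epoll_ctl(EPOLL_CTL_ADD, ...)

    writer.sendall(b"x")                 # the FD becomes ready exactly once

    # The data is deliberately left unread between the waits.
    counts = [len(ep.poll(0)) for _ in range(3)]

    ep.close()
    reader.close()
    writer.close()
    return counts


if __name__ == "__main__":
    print("level-triggered:", epoll_counts(False))
    print("edge-triggered: ", epoll_counts(True))
```

In edge-triggered mode the readiness is reported once and then stays silent, so a program using it normally has to read until recv would block before waiting again.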
Advantages of epoll:
1. No fixed limit on the maximum number of concurrent connections.
2. Better efficiency: only active FDs invoke the callback function.
3. Memory copy, use Mmap () file to map memory to accelerate message passing with kernel space.
Summary of the differences between select, poll, and epoll:
1. Number of connections a single process can open
select: limited by FD_SETSIZE, which is 1024 on Linux with glibc
poll: no limit, because the FDs are stored in a linked list
epoll: there is a limit, but it is very large: roughly 200,000 connections on a machine with 2 GB of memory
2. IO efficiency
select: low IO efficiency
poll: low IO efficiency
epoll: only active sockets invoke the callback, so IO efficiency is high
3. Message delivery method
select: the kernel has to pass the results to user space, which requires a kernel copy
poll: same as select
epoll: FDs are registered in the kernel once through epoll_ctl, and each epoll_wait copies only the ready events back to user space
This article is from the "snail" blog. Please keep this source attribution: http://linuxkingdom.blog.51cto.com/6334977/1654813
Performance analysis of Linux five IO models