I/O models: the five models
1.1 The five I/O models
1) blocking I/O
2) non-blocking I/O
3) I/O multiplexing
4) signal-driven (event-driven) I/O
5) asynchronous I/O
1.2 Why do we need to initiate a system call?
Because the process wants data that lives on disk, but only the kernel is allowed to operate on the disk, the process must notify the kernel that it wants that data.
This notification is the system call.
1.3 Steps for completing one I/O operation
When a process initiates a system call, execution switches into kernel mode and the I/O operation begins.
I/O operations are divided into two steps:
1) the disk loads data into the memory space of the kernel.
2) data in the kernel's memory space is copied to the user's memory space (this process is where the real I/O occurs)
Note: most I/O system calls are blocking.
Process Analysis
The whole flow: when a process needs data on the disk, it initiates a system call to the kernel and is switched off the CPU. The process is suspended (put to sleep) and cannot continue, because the data is not yet available; only once the system call has produced its result is the process woken up to carry on. From the start of the system call to its end, the steps are:
① The process initiates a system call to the kernel.
② The kernel receives the system call, sees that it is a file request, and tells the disk to read the file.
③ The disk receives the kernel's command and loads the file into the kernel's memory space.
④ Once the data is in the kernel's memory space, the kernel copies it into the process's user memory space (this is where the I/O happens).
⑤ When the process's memory space has received the data, a completion notification is raised to the kernel.
⑥ The kernel passes the notification on to the process; this is the wake-up. The process now has its data and continues with the next step.
2.1 Blocking I/O
Blocking: until the result is returned, the calling thread is suspended (it goes to sleep); execution continues only after the result comes back.
How does the system notify the process in blocking I/O?
When the I/O completes, the kernel notifies the process directly and the process is woken up.
The first stage is to load data to the memory space of the kernel.
The second stage is to copy the data in the kernel's memory space to the user's memory space (this is the real I/O operation)
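A minimal sketch of what blocking I/O looks like from user space, assuming an ordinary read() on an example file (/etc/hostname is just a placeholder path): the call does not return until both stages have finished, and the process sleeps in the meantime.

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    char buf[4096];
    int fd = open("/etc/hostname", O_RDONLY);   /* placeholder file */
    if (fd < 0) {
        perror("open");
        return 1;
    }
    /* The process sleeps here until the kernel has finished both stages:
       disk -> kernel buffer, then kernel buffer -> buf. */
    ssize_t n = read(fd, buf, sizeof(buf));
    if (n > 0)
        fwrite(buf, 1, (size_t)n, stdout);
    close(fd);
    return 0;
}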
2.2 Non-blocking I/O
Non-blocking: the process initiates an I/O call; if the operation cannot complete immediately, the kernel returns right away instead of putting the process to sleep, so the process can go do other things. This is non-blocking I/O.
How does the process learn the result with non-blocking I/O?
It has to keep asking: the process repeatedly re-issues the call, and only when a later call finds the data ready does it obtain the data and continue (this polling is also called busy-waiting, or "blind waiting").
Disadvantage: it cannot handle multiple I/O operations well. For example, if the user has opened a file and presses Ctrl+C to terminate the operation, the request cannot be stopped.
The first stage is to load data to the memory space of the kernel.
The second stage is to copy the data in the kernel's memory space to the user's memory space (this is the real I/O operation)
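A minimal sketch of non-blocking I/O, using standard input as the example descriptor: with O_NONBLOCK set, read() returns -1 with errno set to EAGAIN instead of sleeping, so the process has to keep asking, which is exactly the "blind wait" described above.

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    int fd = STDIN_FILENO;                      /* example descriptor */
    /* Switch the descriptor into non-blocking mode. */
    fcntl(fd, F_SETFL, fcntl(fd, F_GETFL) | O_NONBLOCK);

    char buf[1024];
    for (;;) {
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n > 0) {                            /* data finally arrived */
            fwrite(buf, 1, (size_t)n, stdout);
            break;
        }
        if (n == -1 && errno == EAGAIN) {       /* not ready: ask again */
            usleep(100000);                     /* the "blind wait" */
            continue;
        }
        break;                                  /* EOF or a real error */
    }
    return 0;
}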
2.3 I/O multiplexing (select)
Why is I/O multiplexing needed?
Suppose a process is blocked on several I/O operations at once: it is waiting for input from the keyboard and, at the same time, waiting for data to be loaded from the disk. A read call is issued for each. Now one I/O completes and the other does not: the disk I/O finishes, but the keyboard I/O is still pending.
The process still cannot respond, because it is blocked on the unfinished keyboard I/O and remains asleep. What can be done in this situation?
This is where I/O multiplexing comes in.
Execution Process
With multiplexing, the process no longer calls the I/O functions directly. The kernel provides an additional system call that can monitor many I/O operations on the process's behalf.
When the process needs I/O, it issues this special system call and blocks inside it, inside the multiplexing call, rather than inside an individual I/O call.
The multiplexing function watches all of the registered I/O operations, and as soon as any one of them completes it tells the process which one is ready; if that is the operation the process was waiting on,
the process can then continue with its subsequent work. The facilities that monitor a group of I/O operations for a process in this way are called I/O multiplexers.
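A minimal sketch of the multiplexing call itself, using select() with standard input as the only monitored descriptor (a real server would register many sockets): the process blocks inside select(), and once a descriptor is reported ready it performs the actual read, which is the still-blocking second stage.

#include <stdio.h>
#include <sys/select.h>
#include <unistd.h>

int main(void) {
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(STDIN_FILENO, &readfds);             /* descriptors to watch */

    /* The process blocks here, inside select(), not inside read(). */
    int ready = select(STDIN_FILENO + 1, &readfds, NULL, NULL, NULL);
    if (ready > 0 && FD_ISSET(STDIN_FILENO, &readfds)) {
        char buf[1024];
        ssize_t n = read(STDIN_FILENO, buf, sizeof(buf));   /* stage two */
        if (n > 0)
            fwrite(buf, 1, (size_t)n, stdout);
    }
    return 0;
}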
I/O multiplexing in Linux
select: one implementation. When a process needs I/O, it hands the request to select; many descriptors can be monitored at once, but at most 1024. This is a built-in limit.
poll: no hard limit on the number of descriptors, but performance degrades once the count grows past roughly 1024.
This is why the early Apache prefork MPM model ran into trouble: when the master process had to handle more than 1024 concurrent connections, it could not cope.
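For comparison, a minimal poll() sketch with a single descriptor (again standard input as a stand-in): the descriptor set is an ordinary array, so it is not capped at 1024, although the kernel still scans the whole array on every call.

#include <poll.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    struct pollfd fds[1];
    fds[0].fd = STDIN_FILENO;        /* one entry here; could be thousands */
    fds[0].events = POLLIN;

    int ready = poll(fds, 1, -1);    /* -1 timeout: block until something is ready */
    if (ready > 0 && (fds[0].revents & POLLIN)) {
        char buf[1024];
        ssize_t n = read(STDIN_FILENO, buf, sizeof(buf));
        if (n > 0)
            fwrite(buf, 1, (size_t)n, stdout);
    }
    return 0;
}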
So is I/O multiplexing actually better than the first two models?
Originally the process talked to the kernel directly; now an extra layer, the select multiplexer, sits in the middle and relays requests and completion notices between the two.
Multiplexing solves the problem of issuing and watching many I/O operations with a single call, but the process is still blocked: it blocks on select rather than on the individual system call,
and the second stage is still blocking. Because every call now has to scan multiple I/O operations through an extra layer, performance does not necessarily improve; for a single I/O it may change very little.
The first stage is to load data to the memory space of the kernel.
The second stage is to copy the data in the kernel's memory space to the user's memory space (this is the real I/O operation)
2.4 Event-driven (signal-driven) I/O
The process initiates the call and registers a notification (a callback/signal) with the kernel, so the kernel remembers which process made the request. As soon as the first stage is complete, the kernel sends the process a notification.
In this way the first stage is non-blocking and the process no longer has to busy-wait, but the second stage is still blocking.
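A minimal sketch of signal-driven I/O on Linux, using standard input as the example descriptor: the process asks the kernel to send it SIGIO (via F_SETOWN and O_ASYNC) when data arrives, is free to do other work in the meantime, and only the final read(), the second-stage copy, still blocks.

#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static volatile sig_atomic_t data_ready = 0;

static void on_sigio(int sig) {
    (void)sig;
    data_ready = 1;                  /* stage one is done: data is waiting */
}

int main(void) {
    signal(SIGIO, on_sigio);
    fcntl(STDIN_FILENO, F_SETOWN, getpid());        /* deliver SIGIO to us */
    fcntl(STDIN_FILENO, F_SETFL,
          fcntl(STDIN_FILENO, F_GETFL) | O_ASYNC);  /* enable SIGIO on the fd */

    while (!data_ready)
        pause();                     /* the process could do other work here */

    char buf[1024];
    ssize_t n = read(STDIN_FILENO, buf, sizeof(buf));   /* stage two: still blocks */
    if (n > 0)
        fwrite(buf, 1, (size_t)n, stdout);
    return 0;
}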
Event-driven
It is precisely this event-driven mechanism that lets multiple requests be handled at the same time; for example, a web server in which one process responds to many user requests.
Drawback: the second stage is still blocking.
Two mechanisms
What if the kernel's notification arrives while the process is busy and the process misses it?
Level-trigger mechanism: the kernel notifies the process that there is data to read; if the process does not read it, the kernel keeps notifying it again and again.
Edge-trigger mechanism: the kernel notifies the process only once that data is ready; within the allowed time the process can come and fetch the data whenever it likes. The event's status information is handed over to the process, much like sending it a text message.
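On Linux the mechanism that offers both trigger modes is epoll; here is a minimal sketch, again with standard input standing in for a real connection. EPOLLET selects edge triggering; leaving it out gives the default level-triggered behaviour.

#include <stdio.h>
#include <sys/epoll.h>
#include <unistd.h>

int main(void) {
    int epfd = epoll_create1(0);
    struct epoll_event ev = {0}, events[8];

    ev.events = EPOLLIN | EPOLLET;   /* drop EPOLLET for level-triggered mode */
    ev.data.fd = STDIN_FILENO;
    epoll_ctl(epfd, EPOLL_CTL_ADD, STDIN_FILENO, &ev);

    int n = epoll_wait(epfd, events, 8, -1);    /* block until an event fires */
    for (int i = 0; i < n; i++) {
        char buf[1024];
        /* In edge-triggered mode the process should drain the descriptor
           (read until EAGAIN); otherwise it will not be notified again
           for this data. */
        ssize_t r = read(events[i].data.fd, buf, sizeof(buf));
        if (r > 0)
            fwrite(buf, 1, (size_t)r, stdout);
    }
    close(epfd);
    return 0;
}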
Nginx
Nginx uses edge-triggered event notification by default.
The first stage is to load data to the memory space of the kernel.
The second stage is to copy the data in the kernel's memory space to the user's memory space (this is the real I/O operation)
2.5 Asynchronous I/O (AIO)
In asynchronous I/O, neither the first stage nor the second stage reports back to the process while it is in progress; only after the data has been completely copied into the service process's memory does the kernel return an "OK" to the process. The rest of the time
the process is free to do whatever it wants, until the kernel delivers that OK notification.
Note: AIO here is implemented only for file I/O; asynchronous network I/O is not implemented.
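A minimal sketch of POSIX AIO on a file (the path is only a placeholder, and depending on the libc version you may need to link with -lrt): aio_read() returns immediately, the process is free until it checks for completion, and only then does it collect the data.

#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    char buf[4096];
    int fd = open("/etc/hostname", O_RDONLY);   /* placeholder file */
    if (fd < 0) { perror("open"); return 1; }

    struct aiocb cb;
    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_buf    = buf;
    cb.aio_nbytes = sizeof(buf);
    cb.aio_offset = 0;

    aio_read(&cb);                   /* returns at once; nothing blocks here */

    while (aio_error(&cb) == EINPROGRESS)
        ;                            /* the process could do real work here */

    ssize_t n = aio_return(&cb);     /* bytes read, already copied into buf */
    if (n > 0)
        fwrite(buf, 1, (size_t)n, stdout);
    close(fd);
    return 0;
}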
Nginx:
Nginx supports asynchronous file I/O requests
One process responds to N requests
Static file serving: supports sendfile (see the sendfile sketch after this list)
Avoids wasted copy time: supports mmap memory mapping, so instead of copying data from kernel memory into process memory, the data is mapped directly into the process's address space.
Supports edge triggering.
Supports asynchronous I/O.
Solves the C10K problem.
C10K: 10,000 concurrent connections at the same time.
C100K: you get the idea.
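A minimal sketch of the sendfile(2) call mentioned in the list above (the input and output paths are placeholders; in nginx the output descriptor would be the client socket, and writing to a regular file needs a reasonably recent Linux kernel): the kernel moves the file data to the output descriptor itself, without copying it through a user-space buffer.

#include <fcntl.h>
#include <stdio.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void) {
    int in_fd = open("/etc/hostname", O_RDONLY);        /* placeholder input  */
    int out_fd = open("/tmp/hostname.copy",             /* placeholder output */
                      O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (in_fd < 0 || out_fd < 0) { perror("open"); return 1; }

    struct stat st;
    fstat(in_fd, &st);

    /* The kernel moves the data from in_fd to out_fd itself; no user-space
       buffer and no extra copy are involved. */
    off_t offset = 0;
    if (sendfile(out_fd, in_fd, &offset, (size_t)st.st_size) < 0)
        perror("sendfile");

    close(in_fd);
    close(out_fd);
    return 0;
}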
The first stage is to load data to the memory space of the kernel.
The second stage is to copy the data in the kernel's memory space to the user's memory space (this is the real I/O operation)
The first four I/O models are synchronous operations, and the last AIO is asynchronous.
2.6 Comparison of the five models
Synchronous blocking
Both stages block; the call returns only after all of the data has been prepared and copied.
Synchronous non-blocking
While the data is being copied from the disk into kernel memory, the process keeps asking the kernel whether the data is ready yet.
Performance may actually be worse: the process looks free to do other things, but in practice it spends its time polling over and over.
Still, it gains some flexibility.
Disadvantage: it cannot handle multiple I/O operations; for example, if the user has opened a file and presses Ctrl+C to terminate the operation, it cannot be stopped.
Synchronous I/O
If the second stage blocks, the model is synchronous.
Blocking, non-blocking, I/O multiplexing, and event-driven (signal-driven) I/O are all synchronous.
Asynchronous I/O
The kernel handles everything automatically in the background, leaving the process plenty of time to handle user requests.