1. Kernel buffering of file I/O
Many beginners assume that the read() and write() system calls access the file on disk directly. In fact, the two calls only copy data between a user-space buffer and the kernel's buffer cache.
write(fd, "12345", 5);
For example, when the write() call above returns, the data has only been copied into the kernel buffer. The kernel writes (flushes) it to disk at some later time, so the system call is not synchronized with the disk operation. If another process tries to read these bytes in the meantime, the kernel supplies them directly from its buffer cache.
Similarly, for input, the kernel reads data from the disk into a kernel buffer, and the read() call copies the data from there. For sequential file access, the kernel typically performs read-ahead so that the next blocks of the file are already in the buffer cache before they are needed. This design makes read() and write() fast, because they do not have to wait for a slow disk operation.
The Linux kernel imposes no fixed upper limit on the size of the buffer cache. The kernel allocates as much memory for it as needed, limited only by two factors: the amount of physical memory available and the demand for memory for other purposes.
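The behavior described above can be observed with a small sketch. The function name write_then_read and the file path used in the usage note are assumptions made for illustration. The function writes a few bytes, never asks for a flush to disk, and immediately reads the same bytes back through a second descriptor; the read succeeds because the kernel serves it from its buffer cache.

```c
#define _POSIX_C_SOURCE 200809L   /* expose POSIX declarations */
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Write msg to path, then immediately read it back through a second,
 * independent descriptor. Returns the number of bytes read, or -1 on error.
 * No fsync() is issued, so the kernel is free to delay the disk write. */
ssize_t write_then_read(const char *path, const char *msg,
                        char *out, size_t outsz)
{
    int wfd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (wfd == -1)
        return -1;

    size_t len = strlen(msg);
    if (write(wfd, msg, len) != (ssize_t)len) {  /* copy into the kernel buffer */
        close(wfd);
        return -1;
    }

    int rfd = open(path, O_RDONLY);
    if (rfd == -1) {
        close(wfd);
        return -1;
    }
    ssize_t n = read(rfd, out, outsz);           /* the new bytes are already visible */

    close(rfd);
    close(wfd);
    return n;
}
```

Calling write_then_read("/tmp/kbuf_demo.txt", "12345", buf, 15) with a zero-filled 16-byte buf should return 5 and leave "12345" in buf, even though nothing has forced the data to disk.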
2. Impact of buffer size on I/O system calls
Without buffering, reading or writing 1000 bytes one byte at a time would force the kernel to access the disk 1000 times, and disk access is well known to be very slow. Even with kernel buffering, every read() or write() is still a system call with its own overhead. So how much does the size of the buffer passed to each call affect the running time?
The following table, taken from the Linux System Programming Manual, compares the time needed to copy a 100 MB file using different buffer sizes.
buffer_size (bytes) | total elapsed time (s)
1 | 107.43
2 | 54.16
4 | 31.72
8 | 15.59
16 | 9.{
32 | 3.76
64 | 2.19
128 | 2.16
256 | 2.06
512 | 2.06
1024 | 2.05
4096 | 2.05
65536 | 2.06
The table shows that the buffer size has a significant effect on the file copy time. A buffer size of 4096 bytes is already close to optimal; larger values bring no significant improvement.
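The reason small buffers are slow is the number of system calls they require. The following is a minimal sketch, not the benchmark program behind the table: the function names copy_count_reads and make_test_file are hypothetical, and the code counts read() calls instead of measuring time so that the result is deterministic.

```c
#define _POSIX_C_SOURCE 200809L   /* expose POSIX declarations */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a file of `size` bytes filled with 'x'. */
int make_test_file(const char *path, size_t size)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    for (size_t i = 0; i < size; i++)
        fputc('x', fp);
    return fclose(fp) == 0 ? 0 : -1;
}

/* Copy src to dst using a buffer of buf_size bytes.
 * Returns the number of read() calls issued (including the final one
 * that reports end of file), or -1 on error. */
long copy_count_reads(const char *src, const char *dst, size_t buf_size)
{
    if (buf_size == 0)
        return -1;
    int in = open(src, O_RDONLY);
    if (in == -1)
        return -1;
    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out == -1) {
        close(in);
        return -1;
    }
    char *buf = malloc(buf_size);
    long calls = 0;
    int ok = (buf != NULL);

    while (ok) {
        ssize_t n = read(in, buf, buf_size);
        calls++;
        if (n == 0)
            break;                       /* end of file */
        if (n == -1) {
            ok = 0;
            break;
        }
        for (ssize_t off = 0; off < n; ) {   /* handle partial writes */
            ssize_t w = write(out, buf + off, (size_t)(n - off));
            if (w == -1) {
                ok = 0;
                break;
            }
            off += w;
        }
    }
    free(buf);
    close(in);
    close(out);
    return ok ? calls : -1;
}
```

For a 1000-byte regular file, a 1-byte buffer needs 1001 read() calls, a 100-byte buffer needs 11, and a 4096-byte buffer needs only 2, which mirrors the trend in the table.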
3. Buffering in the stdio library
When working with disk files, buffering data in large chunks reduces the number of system calls, and this is exactly what the C library's stdio functions do. When using the C standard I/O library, the programmer therefore does not have to manage the buffering by hand.
(1) Setting a buffer mode for a stdio stream
#include <stdio.h>
int setvbuf(FILE *stream, char *buf, int mode, size_t size);
The setvbuf() function controls the buffering mode used by the stdio library.
- The parameter stream identifies the file stream whose buffering is to be modified.
- When the parameter buf is not NULL, it points to a block of memory of size bytes that is used as the stream's buffer. This buffer should be statically allocated or allocated on the heap, not a local variable on the stack.
- When buf is NULL, the stdio library allocates the buffer automatically.
- The parameter mode specifies the type of buffering.
Mode value | Behavior
_IONBF | Unbuffered I/O
_IOLBF | Line-buffered I/O
_IOFBF | Fully buffered I/O
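The effect of the buffering mode can be made visible with a short sketch. The function name size_before_flush is hypothetical, and the expected sizes in the usage note assume a typical implementation such as glibc. The function writes "hello" to a stream and checks how many bytes have reached the file before the stream is flushed or closed.

```c
#define _POSIX_C_SOURCE 200809L   /* expose POSIX declarations */
#include <stdio.h>
#include <sys/stat.h>

/* Write "hello" to path through a stream configured with the given mode
 * (_IOFBF or _IONBF) and return the file size observed BEFORE the stream
 * is flushed. Returns -1 on error. */
long size_before_flush(const char *path, int mode)
{
    static char buf[1024];      /* static storage, not a stack variable */
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    /* setvbuf() must be called before any other operation on the stream. */
    if (setvbuf(fp, mode == _IONBF ? NULL : buf, mode, sizeof buf) != 0) {
        fclose(fp);
        return -1;
    }
    if (fputs("hello", fp) == EOF) {
        fclose(fp);
        return -1;
    }

    struct stat st;
    long size = (stat(path, &st) == 0) ? (long)st.st_size : -1;
    fclose(fp);                 /* flushes any data still in the stdio buffer */
    return size;
}
```

With _IOFBF the function is expected to return 0, because the five bytes are still sitting in the user-space buffer; with _IONBF it is expected to return 5, because each output call is passed straight to write().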
#include <stdio.h>
void setbuf(FILE *stream, char *buf);
- When buf is NULL, the stream is unbuffered; otherwise buf is used as a fully buffered buffer of BUFSIZ bytes.
(2) Flushing the stdio buffer
#include <stdio.h>
int fflush(FILE *stream);
If the parameter stream is NULL, fflush() flushes all stdio output buffers.
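One situation where fflush() matters is when stdio calls and write() are mixed on the same file. The sketch below uses a hypothetical function name, mixed_output, and assumes the stream is fully buffered, which is the usual default for regular files. It writes "A" through stdio and "B" through write(), with or without an fflush() in between.

```c
#define _POSIX_C_SOURCE 200809L   /* expose fdopen() and POSIX declarations */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Write "A" via stdio and "B" via write() to the same descriptor.
 * If do_flush is nonzero, fflush() is called between the two.
 * The resulting file contents are stored in out (NUL-terminated).
 * Returns 0 on success, -1 on error. outsz must be at least 1. */
int mixed_output(const char *path, int do_flush, char *out, size_t outsz)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return -1;
    FILE *fp = fdopen(fd, "w");
    if (fp == NULL) {
        close(fd);
        return -1;
    }

    fputs("A", fp);                 /* stays in the stdio buffer */
    if (do_flush)
        fflush(fp);                 /* pass "A" to the kernel now */
    if (write(fd, "B", 1) != 1) {   /* goes to the kernel immediately */
        fclose(fp);
        return -1;
    }
    fclose(fp);                     /* flushes what is still buffered, closes fd */

    FILE *in = fopen(path, "r");
    if (in == NULL)
        return -1;
    size_t n = fread(out, 1, outsz - 1, in);
    out[n] = '\0';
    fclose(in);
    return 0;
}
```

Without the flush the file is expected to contain "BA", since "A" leaves the stdio buffer only when the stream is closed; with the flush it contains "AB".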
4. Controlling the kernel buffering of file I/O
Sometimes it is necessary to force the kernel buffers to be flushed to the output file. A database logging process, for example, must ensure that its output has actually been written to disk before it continues.
(1) fsync()
#include <unistd.h>
int fsync(int fd);
fsync() returns only after the transfer to the disk device has completed; until then the caller is blocked.
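A minimal sketch of the logging use case mentioned above, assuming a hypothetical function name append_durable and a caller-chosen log path: the record is appended with write(), and the function does not report success until fsync() has returned.

```c
#define _POSIX_C_SOURCE 200809L   /* expose fsync() and POSIX declarations */
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Append one record to a log file and return only after fsync() reports
 * that the data has been transferred to the storage device.
 * Returns 0 on success, -1 on error. */
int append_durable(const char *path, const char *record)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd == -1)
        return -1;

    size_t len = strlen(record);
    int rc = 0;
    if (write(fd, record, len) != (ssize_t)len)
        rc = -1;                    /* data did not reach the kernel buffer */
    else if (fsync(fd) == -1)
        rc = -1;                    /* data did not reach the disk */

    if (close(fd) == -1)
        rc = -1;
    return rc;
}
```

A caller would proceed to the next step of a transaction only when append_durable() returns 0.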
(2) sync()
#include <unistd.h>
void sync(void);
The sync() call flushes all kernel buffers that contain updated file information (data and metadata) to disk.
(3) The O_SYNC flag for open()
If the O_SYNC flag is specified when open() is called, every subsequent write() on that file descriptor automatically flushes the file data and metadata to disk before returning.
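The O_SYNC flag has a huge impact on performance. The table below is from the Linux System Programming Manual. Both timing columns appear to have been measured with O_SYNC set (elapsed time and total CPU time); the corresponding figures without O_SYNC are not reproduced here.
buf_size (bytes) | with O_SYNC: elapsed time (s) | with O_SYNC: total CPU time (s)
1 | 1030 | 98.8
16 | 65.0 | 0.40
256 | 4.07 | 0.03
4096 | 0.34 | 0.03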
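The use of O_SYNC described in this subsection can be sketched as follows. The function name write_chunks and its parameters are assumptions made for illustration. The function writes a given number of bytes in chunks of a chosen size, optionally with O_SYNC, and returns the number of write() calls; with O_SYNC set, each of those calls has to wait for the disk, which is why small chunk sizes are so expensive.

```c
#define _POSIX_C_SOURCE 200809L   /* expose O_SYNC and POSIX declarations */
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Write `total` bytes to path in chunks of at most `chunk` bytes
 * (1..4096). If use_osync is nonzero the file is opened with O_SYNC,
 * so every write() is flushed to disk before it returns.
 * Returns the number of write() calls issued, or -1 on error. */
long write_chunks(const char *path, size_t total, size_t chunk, int use_osync)
{
    char buf[4096];
    if (chunk == 0 || chunk > sizeof buf)
        return -1;
    memset(buf, 'x', sizeof buf);

    int flags = O_WRONLY | O_CREAT | O_TRUNC;
    if (use_osync)
        flags |= O_SYNC;
    int fd = open(path, flags, 0644);
    if (fd == -1)
        return -1;

    long calls = 0;
    size_t done = 0;
    while (done < total) {
        size_t want = (total - done < chunk) ? total - done : chunk;
        ssize_t w = write(fd, buf, want);
        if (w <= 0) {
            close(fd);
            return -1;
        }
        done += (size_t)w;
        calls++;
    }
    return close(fd) == 0 ? calls : -1;
}
```

Writing 1000 bytes in 16-byte chunks takes 63 write() calls, and therefore 63 synchronous disk flushes under O_SYNC, while 256-byte chunks take only 4.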
5. Summary
With the stdio library, user data is first copied into a stdio buffer, which resides in user-space memory. When that buffer fills up, stdio calls the write() system call, which transfers the data to the kernel buffer cache. At some later time, the kernel initiates the disk operation and writes the data to disk.
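The three stages in this summary can be forced one after another instead of waiting for the buffers to fill. The following is a sketch under stated assumptions: save_line_durably is a hypothetical function name, and fileno() is the POSIX call that returns the descriptor behind a stream.

```c
#define _POSIX_C_SOURCE 200809L   /* expose fileno(), fsync() */
#include <stdio.h>
#include <unistd.h>

/* Append a line to path and push it through every buffering layer.
 * Returns 0 on success, -1 on error. */
int save_line_durably(const char *path, const char *line)
{
    FILE *fp = fopen(path, "a");
    if (fp == NULL)
        return -1;

    int rc = 0;
    if (fputs(line, fp) == EOF)                 /* user data -> stdio buffer */
        rc = -1;
    if (rc == 0 && fflush(fp) == EOF)           /* stdio buffer -> kernel buffer */
        rc = -1;
    if (rc == 0 && fsync(fileno(fp)) == -1)     /* kernel buffer -> disk */
        rc = -1;

    if (fclose(fp) == EOF)
        rc = -1;
    return rc;
}
```

Skipping the fflush() step would make the fsync() ineffective for this line, because the data would still be in the user-space buffer when the kernel is asked to flush.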
Copyright notice: This is an original article by the blog author and may not be reproduced without the author's permission.
Understanding File I/O Buffering