Understanding File I/O Buffering

Source: Internet
Author: User

1. Kernel buffering of file I/O

Many beginners assume that the read() and write() system calls access the files on disk directly. In fact, the two calls only copy data between a user-space buffer and the kernel's buffer cache.

write(fd, "12345", 5);

For example, after the write() call above returns, the kernel writes (flushes) the data to disk at some later time, so the system call is not synchronized with the disk operation. If another process tries to read these bytes in the meantime, the kernel supplies them directly from its buffer cache.

Reads work the same way: the kernel reads data from disk into a kernel buffer, and read() copies the data from that buffer. For sequential file access, the kernel typically reads ahead so that the next blocks of the file are already in the buffer cache before they are needed. This design lets read() and write() run quickly, because they do not have to wait for a slow disk operation.
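The behavior described above can be observed with a small sketch. The helper name write_then_read_back and the 256-byte limit are illustrative assumptions, not part of the original article. The function writes some bytes and immediately reads them back through a second descriptor, without any explicit flush in between.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write `len` bytes of `data` to `path`, then read
 * them back through a second descriptor without any explicit flush.
 * Returns 0 if the bytes read match the bytes written, -1 otherwise. */
int write_then_read_back(const char *path, const char *data, size_t len)
{
    char buf[256];
    if (len > sizeof buf)
        return -1;

    int wfd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (wfd == -1)
        return -1;

    /* write() returns once the data has been copied into the kernel's
     * buffer; the disk write itself may happen later. */
    if (write(wfd, data, len) != (ssize_t)len) {
        close(wfd);
        return -1;
    }

    int rfd = open(path, O_RDONLY);
    if (rfd == -1) {
        close(wfd);
        return -1;
    }

    /* The reader sees the new bytes even though they may not be on disk yet. */
    ssize_t n = read(rfd, buf, sizeof buf);
    close(rfd);
    close(wfd);

    if (n != (ssize_t)len || memcmp(buf, data, len) != 0)
        return -1;
    return 0;
}
```

A call such as write_then_read_back("/tmp/demo.txt", "12345", 5) should return 0, since the kernel serves the read from the same cached data the write just updated.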

The Linux kernel imposes no fixed upper limit on the size of the buffer cache. The kernel allocates as much memory for it as needed, limited only by two factors: the amount of physical memory available and the demand for memory for other purposes.

2. Impact of buffer size on I/O system calls

Suppose there were no buffering at all: reading or writing 1000 bytes one byte at a time would force the kernel to access the disk 1000 times, and disk access is well known to be very slow. So how much does the buffer size affect the running time?

The following table is taken from the Linux System Programming Manual. It compares the time needed to copy a 100 MB file using different buffer sizes.

buffer_size (bytes)    total elapsed time (s)
1 107.43
2 54.16
4 31.72
8 15.59
16 9.{
32 3.76
64 2.19
128 2.16
256 2.06
512 2.06
1024 2.05
4096 2.05
65536 2.06

The buffer size has a significant impact on the file copy time. A buffer size of 4096 bytes is already close to optimal; beyond that value, the improvement is negligible.
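One reason for this trend is the number of system calls issued. The following is a minimal sketch, not the benchmark program behind the table: the names copy_file and write_all are assumptions made for illustration. It copies a file with a caller-chosen buffer size and reports how many read() calls returned data.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Write all `len` bytes, retrying after partial writes. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t w = write(fd, buf, len);
        if (w == -1)
            return -1;
        buf += w;
        len -= (size_t)w;
    }
    return 0;
}

/* Hypothetical helper: copy `src` to `dst` using a buffer of `buf_size`
 * bytes. Returns the number of read() calls that returned data, or -1
 * on error. */
long copy_file(const char *src, const char *dst, size_t buf_size)
{
    long calls = -1;
    char *buf = NULL;
    int in = -1, out = -1;
    ssize_t n;

    if (buf_size == 0 || (buf = malloc(buf_size)) == NULL)
        return -1;
    if ((in = open(src, O_RDONLY)) == -1)
        goto done;
    if ((out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644)) == -1)
        goto done;

    calls = 0;
    while ((n = read(in, buf, buf_size)) > 0) {
        calls++;                      /* one system call per buffer-full */
        if (write_all(out, buf, (size_t)n) == -1) {
            calls = -1;
            break;
        }
    }
    if (n == -1)
        calls = -1;

done:
    if (in != -1)
        close(in);
    if (out != -1)
        close(out);
    free(buf);
    return calls;
}
```

For a 1000-byte regular file, a 1-byte buffer needs 1000 data-returning read() calls, while a 4096-byte buffer needs only one. The data still passes through the kernel buffer cache in both cases; what changes is the per-call overhead.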

3. Buffering in the stdio library

When working with disk files, buffering data in large chunks reduces the number of system calls, and this is exactly what the C library's stdio functions do. With C standard library I/O, the programmer does not have to manage the buffers at all.

(1) Setting a buffer mode for a stdio stream
#include <stdio.h>
int setvbuf(FILE *stream, char *buf, int mode, size_t size);

Call the setvbuf() function to control the buffering mode used by the stdio library.
. The stream parameter identifies the file stream whose buffering mode is to be modified.
. If buf is not NULL, it points to a block of memory of size bytes that is used as the stream's buffer. This buffer should be statically allocated or allocated on the heap.
. If buf is NULL, stdio allocates the buffer for the stream automatically.
. The mode parameter specifies the buffering type.

Mode value    Behavior
_IONBF        I/O is not buffered
_IOLBF        I/O is line-buffered
_IOFBF        I/O is fully buffered
#include <stdio.h>
void setbuf(FILE *stream, char *buf);

. If buf is NULL, the stream is unbuffered; otherwise the stream is fully buffered using buf, which must be at least BUFSIZ bytes long.
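The effect of the three modes can be made visible with a short sketch. The helper name size_before_flush and the 1024-byte buffer are assumptions chosen for illustration. The function writes one line through a stream configured with setvbuf() and then asks the kernel, via stat(), how many bytes it has received before the stream is flushed or closed.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: write "hello\n" (6 bytes) to a new file through a
 * stdio stream set to `mode` (_IONBF, _IOLBF or _IOFBF), and return the
 * file size reported by the kernel BEFORE the stream is flushed or
 * closed. Returns -1 on error. */
long size_before_flush(const char *path, int mode)
{
    static char buf[1024];          /* static, so it outlives the stream */
    struct stat st;
    long size = -1;

    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    /* setvbuf() must be called before any other operation on the stream. */
    if (setvbuf(fp, buf, mode, sizeof buf) == 0 &&
        fputs("hello\n", fp) != EOF &&
        stat(path, &st) == 0)
        size = (long)st.st_size;

    fclose(fp);
    return size;
}
```

On a typical Linux system with glibc, the expected results are 6 for _IONBF (every write goes straight to the kernel), 6 for _IOLBF (the newline triggers a flush), and 0 for _IOFBF (the 6 bytes are still sitting in the 1024-byte stdio buffer).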

(2) Flushing the stdio buffer
#include <stdio.h>
int fflush(FILE *stream);

If the stream parameter is NULL, fflush() flushes the buffers of all stdio output streams.

4. Controlling the kernel buffering of file I/O

Sometimes we need to force the kernel buffers to be flushed to the output file. A database logging process, for example, must make sure that its output has actually been written to disk before it proceeds.

(1) fsync()
#include <unistd.h>
int fsync(int fd);

fsync() returns only after the transfer to the disk device has completed; until then, the caller is blocked.

(2) sync()
#include <unistd.h>
void sync(void);

The sync() call flushes all kernel buffers that contain updated file information (both data and metadata) to disk.

(3) The O_SYNC flag for the open() call

If the O_SYNC flag is specified when open() is called, every subsequent write() on that file descriptor automatically flushes the file data and metadata to disk before it returns.
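A minimal sketch of this usage follows; the function name write_synchronously is an assumption introduced for illustration. The only difference from an ordinary write is the extra flag passed to open().

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write `record` to `path` through a descriptor
 * opened with O_SYNC, so that write() itself does not return until the
 * data has been flushed. Returns the number of bytes written, or -1 on
 * error. */
ssize_t write_synchronously(const char *path, const char *record)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_SYNC, 0644);
    if (fd == -1)
        return -1;

    /* No separate fsync() call is needed after this write(). */
    ssize_t n = write(fd, record, strlen(record));

    if (close(fd) == -1)
        return -1;
    return n;
}
```

Because every write() now waits for the device, it is usually worth writing in large chunks when this flag is set.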

The O_SYNC flag has a huge impact on performance. The table below is taken from the Linux System Programming Manual.

buf_size (bytes)    elapsed time with O_SYNC (s)    CPU time with O_SYNC (s)
1 1030 98.8
16 65.0 0.40
256 4.07 0.03
4096 0.34 0.03

5. Summary

With the stdio library, user data first goes into the stdio buffer, which lives in user-space memory. When that buffer fills up, stdio calls the write() system call to transfer the data to the kernel buffer cache. At some later time, the kernel initiates the disk operation and writes the data to disk.

Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
