Linux I/O Flush Mechanisms: fsync and fdatasync in Detail

Source: Internet
Author: User

Objective:

Linux and UNIX kernels maintain a buffer cache or page cache, and most disk I/O passes through it using delayed writes.
sync: Queues all modified cache buffers for writing and returns without waiting for the actual disk writes to finish. (POSIX permits this behavior; on Linux, sync() does wait for the writes to complete.)
fsync: Applies only to the single file referred to by a file descriptor, and returns only after the disk write for that file has completed.
fdatasync: Similar to fsync, but affects only the data portion of the file. fsync additionally synchronizes the file's updated attributes.
fflush: Standard I/O functions (such as fread and fwrite) buffer data in user-space memory. fflush flushes that buffer by writing its contents to the kernel buffer; it does not call fsync itself. To get the data onto disk, call fsync after calling fflush.


When write() was described earlier, we assumed that once the function returns, the data has been written to the file. That is only true at a high level. When the operating system implements file I/O (for a disk file, say), the kernel typically uses a dedicated area (memory or a separate I/O address space) as an I/O data buffer to keep I/O efficient. The application can view this kernel area as a fast transit point for I/O data (Figure 3-5). When write() is called, it returns as soon as the data has been copied into the buffer. At that point the data can be read back with read(), or read by other processes, but this does not mean it has reached external persistent storage, even if close() has been called on the file. The operating system transfers the data in the kernel I/O buffer to the peripheral at a time of its choosing, and the actual transfer is carried out independently of the CPU by the peripheral controller or the peripheral itself (which Linux calls the DMA engine). So, in terms of what is actually on disk, file data written with write() is not fully synchronized with the external storage device. On modern systems this unsynchronized window is short, typically from a few seconds to a little over ten seconds, depending on the amount of data written and the state of the I/O buffer. Short as it is, a power failure or system crash during this window can lose data that had been written but had not yet reached the disk.

Because modern computers are usually very stable and power failures or system crashes are rare, most applications can ignore this brief lack of synchronization when writing files. Some applications, however, have synchronization points at which the data written is critical, or at which file consistency must be guaranteed promptly. To guard against failure, these applications need to ensure that all written data has been transferred to external persistent storage. UNIX provides two ways to achieve this. One is to set the O_SYNC flag (Table 3-1) on the file so that every write goes straight to disk. With this flag set, write() does not return until the data has been safely written to disk (not just to the system's I/O buffers). However, synchronizing on every write is less efficient.
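The O_SYNC approach described above might look roughly like the following sketch. The helper name write_synchronously and the file mode 0644 are illustrative assumptions, not part of the original text; only open(), write(), and close() with the O_SYNC flag are the actual interfaces being shown.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create/truncate `path` and write `len` bytes
 * with O_SYNC set, so each write() returns only after the data has
 * been handed to the storage device, not merely copied into the
 * kernel I/O buffer. Returns 0 on success, -1 on error. */
int write_synchronously(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_SYNC, 0644);
    if (fd < 0)
        return -1;

    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);   /* blocks until synced */
        if (n < 0) {
            if (errno == EINTR)
                continue;                /* interrupted: retry */
            close(fd);
            return -1;
        }
        p += n;                          /* handle short writes */
        len -= (size_t)n;
    }
    return close(fd);
}
```

Because every write() call waits for the device, this style suits small, infrequent critical writes better than bulk output.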


Another method is to call fsync() or fdatasync() only when needed.
#include <unistd.h>
int fsync(int fildes);
int fdatasync(int fildes);

fsync() forces all modified data for the file associated with fildes (including data in the in-core I/O buffers) to be transferred to the external persistent medium; that is, all information about the file identified by fildes is flushed. The process calling fsync() blocks until the device reports that the transfer has completed. "All modified data" here includes both the data written by the user and the file's own attribute data (section 4.1.1 and Table 4-1), such as the file's access time, modification time, and owner.

fdatasync() is similar to fsync(), except that it only forces the data written by the user to the physical storage device and leaves out the file's own attribute data. Metadata that is needed to read the data back later, such as a changed file size, is still written. This can reduce the amount of data transferred when a file is flushed. Some systems do not support fdatasync(); on such systems, fdatasync() is equivalent to fsync().

If a program must be sure that the data it has written is on disk before it continues with further processing, it should call fsync() after writing. For example, a database application typically calls fsync() right after calling write() to save critical transaction data.

We discussed standard I/O stream buffers and the function fflush() in section 2.7. So what is the difference between these two kinds of buffer? The kernel I/O buffers are space managed by the operating system, while the stream buffers live in user space and are managed by the standard I/O library. fflush() flushes only the stream buffers in user space. After fflush() returns, the only guarantee is that the data has left the stream buffer; there is no guarantee that it has been written to disk. At that point the flushed data may already be on disk, or it may still be sitting in the kernel I/O buffer. To make sure that data written through stream I/O reaches the disk, call fsync() after calling fflush().
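The two-step flush for stream I/O can be sketched like this. The helper name flush_stream_to_disk is hypothetical; it relies on the POSIX function fileno() to obtain the file descriptor behind a FILE stream so that fsync() can be applied to it.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: push a stdio stream's data all the way to disk.
 * Step 1: fflush() moves data from the user-space stream buffer
 *         into the kernel I/O buffer.
 * Step 2: fsync() forces the kernel buffer out to the device.
 * Returns 0 on success, -1 on error. */
int flush_stream_to_disk(FILE *fp)
{
    if (fflush(fp) != 0)
        return -1;

    int fd = fileno(fp);   /* descriptor underlying the stream */
    if (fd < 0)
        return -1;

    return fsync(fd);
}
```

Calling fsync() without the preceding fflush() would miss any data still held in the stream buffer, which is why the order matters.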
