13.6 Bypassing the Buffer Cache: Direct I/O
Starting with kernel 2.4, Linux allows an application to bypass the buffer cache when performing disk I/O, transferring data directly from user space to a file or disk device. This is sometimes termed direct I/O or raw I/O.
The details described here are Linux-specific and are not standardized by SUSv3. Nevertheless, most UNIX implementations provide some form of direct I/O access to devices and files.
Direct I/O is sometimes misunderstood as a means of obtaining fast I/O performance. For most applications, however, using direct I/O can considerably degrade performance. This is because the kernel applies a number of optimizations to improve the performance of I/O done via the buffer cache, including sequential read-ahead, performing I/O in clusters of disk blocks, and allowing processes that access the same file to share buffers in the cache. Applications that use direct I/O get none of these optimizations. Direct I/O is intended only for applications with specialized I/O requirements. For example, a database system performs its own caching and I/O optimizations, so it does not need the kernel to spend CPU time and memory on the same tasks.
Direct I/O can be performed either on an individual file or on a block device (for example, a disk). To do this, specify the O_DIRECT flag when opening the file or device with open().
The O_DIRECT flag is available as of kernel 2.4.10, but not all Linux file systems and kernel versions support it. Most native file systems support O_DIRECT, but many non-UNIX file systems (such as VFAT) do not. To find out whether a particular file system supports the flag, either test it (if the file system does not support O_DIRECT, open() fails with the error EINVAL) or read the kernel source code.
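The test just described can be sketched in a few lines of C. This is a minimal sketch, not a definitive check: the helper name supports_o_direct() is hypothetical, and the code assumes glibc, where O_DIRECT is exposed by defining _GNU_SOURCE before any header is included.

```c
#define _GNU_SOURCE          /* assumption: glibc, needed to expose O_DIRECT */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper (not part of any library).
   Returns  1 if path can be opened with O_DIRECT,
            0 if open() fails with EINVAL (flag not supported here),
           -1 on any other error (errno is set by open()). */
int supports_o_direct(const char *path)
{
    int fd;

    assert(path != NULL);

    fd = open(path, O_RDONLY | O_DIRECT);
    if (fd == -1)
        return (errno == EINVAL) ? 0 : -1;

    close(fd);
    return 1;
}
```

A caller would pass the path of an existing file on the file system of interest. A return value of -1 says nothing about O_DIRECT support; it only means that open() failed for some other reason, such as a missing file.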
If one process opens a file with the O_DIRECT flag and another process opens the same file in the normal way (that is, so that the buffer cache is used), there is no coherency between the contents of the buffer cache and the data read and written via direct I/O. Such scenarios should be avoided.
The raw(8) manual page describes an older (now obsolete) technique for obtaining raw access to a disk device.
Alignment restrictions for direct I/O
Because direct I/O (on both disk devices and files) involves direct access to the disk, the following restrictions must be observed when performing I/O:
The data buffer being transferred must be aligned on a memory boundary that is a multiple of the block size.
The file or device offset at which the data transfer begins must be a multiple of the block size.
The length of the data to be transferred must be a multiple of the block size.
Failure to observe any of these restrictions results in the error EINVAL. In the list above, block size means the physical block size of the device (typically 512 bytes).
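The three restrictions above can be expressed as a simple pre-check that an application might run before issuing a direct I/O request. The following is only a sketch: the helper name direct_io_args_ok() is hypothetical, and the caller is assumed to know the relevant block size already.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical helper (not part of any library).
   Returns 1 if the buffer address, file offset, and transfer length
   are all multiples of block_size, and 0 otherwise. */
int direct_io_args_ok(const void *buf, off_t offset, size_t length,
                      size_t block_size)
{
    assert(block_size > 0);              /* avoid division by zero */

    if ((uintptr_t) buf % block_size != 0)
        return 0;                        /* buffer is misaligned */
    if (offset < 0 || (uintmax_t) offset % block_size != 0)
        return 0;                        /* offset is misaligned */
    if (length % block_size != 0)
        return 0;                        /* length is not a whole number of blocks */
    return 1;
}
```

Such a check only mirrors the rules listed above. The kernel remains the final authority, so a program should still handle an EINVAL error from read() or write().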
When performing direct I/O, Linux 2.4 is more restrictive than Linux 2.6: the alignment, length, and offset must be multiples of the logical block size of the underlying file system. (Typical file system logical block sizes are 1024, 2048, or 4096 bytes.)
Example program
Listing 13-1 provides a simple example of using the O_DIRECT flag when opening a file for reading. The program takes up to four command-line arguments specifying, in order, the file to be read, the number of bytes to read from the file, the offset to which the program should seek before reading, and the alignment of the data buffer passed to read(). The last two arguments are optional and default to an offset of 0 and an alignment of 4096 bytes, respectively. Here are some examples of running this program:
The program in Listing 13-1 uses the memalign() function to allocate a block of memory aligned on a multiple of its first argument. Section 7.1.4 describes memalign().
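As a small illustration of aligned allocation, separate from Listing 13-1, the following sketch wraps memalign() in a hypothetical helper named alloc_aligned(). It assumes glibc, where memalign() is declared in <malloc.h> and the returned block can be released with free().

```c
#include <assert.h>
#include <malloc.h>     /* memalign() (assumption: glibc) */
#include <stdint.h>
#include <stdlib.h>     /* free() */

/* Hypothetical helper (not part of any library).
   Allocates length bytes whose address is a multiple of alignment.
   alignment must be a nonzero power of two.
   Returns NULL on failure; release the block with free(). */
void *alloc_aligned(size_t alignment, size_t length)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return memalign(alignment, length);
}
```

A buffer obtained with, for example, alloc_aligned(4096, 8192) would satisfy the buffer alignment restriction for any block size that divides 4096.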
Listing 13-1: Using O_DIRECT to bypass the buffer cache