Notes: A study of IO amplification in DirectIO

Source: Internet
Author: User


IO test for reading a file:

Symptom 1: iostat reports more than 100 IOPS, while the test program measures 50 IOPS, about half of the iostat figure. The IOPS reported by iostat minus the IOPS measured by the test program is called extra IO. Why does it occur? Reading a file means reading both the file's data blocks and its index blocks. iostat counts the IO for both kinds of block, whereas the test program counts only the IO that reads data blocks.

Symptom 2: the device (disk) statistics read with cat show 77292 KB of data read, while the test program measures 40000 KB. The amount reported for the device minus the amount measured by the test program is again extra IO, and the reason is the same: the device statistics include the IO for both data blocks and index blocks, whereas the test program counts only the IO that reads data blocks.
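The extra IO in both symptoms is just the difference between what the device reports and what the test program counts. A minimal sketch of that arithmetic follows. The function name `extra_io` is hypothetical, and 100 stands in for the "more than 100" IOPS reported by iostat.

```python
def extra_io(device_total, program_total):
    """Return (extra IO, amplification factor) for one measurement.

    device_total  -- what the device-level tool reports (iostat or cat)
    program_total -- what the test program itself counts
    """
    extra = device_total - program_total
    factor = device_total / program_total
    return extra, factor


# Symptom 1: IOPS (100 is an approximation of "more than 100").
iops_extra, iops_factor = extra_io(100, 50)
print(iops_extra, iops_factor)

# Symptom 2: amount of data read, in KB.
kb_extra, kb_factor = extra_io(77292, 40000)
print(kb_extra, round(kb_factor, 2))  # 37292 1.93
```

In both symptoms the device sees roughly twice the IO that the program asked for.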

===================================

At this point we had not only run the tests but also analyzed the source code to explain the results, which led to the following conclusions:

EXT3 uses a file index to find where a file's data is located on disk. The location is an offset relative to the start of the partition that holds the file, in bytes or sectors. Additional IO is generated when reading at a large offset within a file.

The reason is that reading the contents of a file first requires knowing where those contents are on disk. Ext3 finds that location through the file index, so it has to read an index block. An entry in an index block gives the physical address on disk of a block of the file. When the file is read, the operating system already knows which file and which offset are being read, and it uses the index block to find the corresponding physical address. The extra IO generated when reading at a large offset is therefore the IO that reads index blocks from disk. If the index block is found in the buffer cache in memory, no additional IO is generated. The larger the offset, the more index blocks have to be read and the more IOs are issued.
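Why a larger offset costs more IO can be sketched from the classic ext3 block map: an inode holds 12 direct block pointers, then one single-indirect, one double-indirect, and one triple-indirect pointer, with 4-byte block pointers. The function below is a hypothetical illustration, not code from the kernel. It counts how many index (indirect) blocks have to be read to locate the data block at a given offset, assuming none of them is already cached.

```python
DIRECT_POINTERS = 12  # direct block pointers held in an ext3 inode
POINTER_SIZE = 4      # bytes per block pointer in ext3


def index_blocks_to_read(offset, block_size=4096):
    """Number of index blocks read to map a byte offset, cold cache."""
    pointers_per_block = block_size // POINTER_SIZE
    logical_block = offset // block_size

    if logical_block < DIRECT_POINTERS:
        return 0  # the address is in the inode itself
    logical_block -= DIRECT_POINTERS

    # Level 1 is single-indirect, 2 is double, 3 is triple.
    for level in (1, 2, 3):
        covered = pointers_per_block ** level
        if logical_block < covered:
            return level
        logical_block -= covered

    raise ValueError("offset is beyond what the block map can address")


for offset in (0, 48 * 1024, 8 * 1024 * 1024, 8 * 1024 ** 3):
    print(offset, index_blocks_to_read(offset))
```

With 4 KB blocks, offsets below 48 KB need no index block, offsets up to about 4 MB need one, offsets up to about 4 GB need two, and anything beyond needs three.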



Linux uses the buffer cache to cache ext3 index blocks, that is, the blocks that record the physical addresses on disk of a file's blocks. To find the location of a data block (a block that holds the file's data), Linux first looks for the index block in the cache. On a cache miss, the index block has to be read from disk. Linux uses the page cache to buffer the data blocks of files in ext3.
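The lookup order described here (cache first, disk only on a miss) can be illustrated with a toy model. This is not how the kernel's buffer cache is implemented. The function name and the unbounded set standing in for the cache are assumptions made purely for illustration.

```python
def count_index_block_disk_reads(lookups, cache=None):
    """Count disk reads for a sequence of index-block lookups.

    lookups -- index block numbers in the order they are needed
    cache   -- set of block numbers already cached (toy model, never evicts)
    """
    cache = set() if cache is None else cache
    disk_reads = 0
    for block in lookups:
        if block not in cache:
            disk_reads += 1   # miss: read the index block from disk
            cache.add(block)  # later lookups of this block are hits
    return disk_reads


# Four data blocks mapped by index block 7, then one mapped by block 8.
print(count_index_block_disk_reads([7, 7, 7, 7, 8]))  # 2
```

Only the first lookup of each index block costs an extra IO, which is why the extra IO disappears once the index blocks are cached.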


The number of IO requests that the Linux kernel issues for DirectIO is related to the following factors:
a) The contiguity of the logical blocks on the physical disk.

IO starts after the mapping completes (after each mapping completes). The IO path is also interesting, and corresponds to the third key point in the code: submit_page_section(). Generally speaking, a file system performs IO mainly by calling the block device layer's submit_bio interface, and only needs to set up the parameters. For the sake of efficiency, submit_page_section() implements deferred IO. The core idea is not to issue the IO immediately, but to hold it back and see whether a subsequent IO request arrives whose data is contiguous with the data of the current request. If it is, the subsequent request is merged with the current one into a single IO request. If it is not, the current request is sent first (by calling submit_bio). Contiguous IO here means IO with consecutive physical block numbers within the same page.

In our example, the first IO request covers Page1 block 0 and block 1 (physical block numbers 100 and 101), the second IO request is Page0 block 2 (physical block 300), and the third IO request is Page1 block 3, resulting in 3 IO requests.

Note: a logical block here refers to a page in memory. The blocks contained in a page (the blocks that make up the page) are file system blocks.
b) The alignment granularity of the application buffer. When programming, align buffers to PAGE_SIZE wherever possible.
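Factor a) can be modeled in a few lines. The sketch below applies the merge rule described for submit_page_section(): a block joins the pending request only if it is in the same page and its physical block number follows directly. It is a simplified model rather than the kernel code. The function name is hypothetical, all four blocks are placed in one page for simplicity, and physical block 102 for the last block is an assumed value that the notes do not give.

```python
def count_io_requests(blocks):
    """Count IO requests after merging contiguous blocks.

    blocks -- list of (page, block_in_page, physical_block) tuples,
              in the order the blocks are submitted
    """
    requests = 0
    previous = None
    for page, index, physical in blocks:
        contiguous = (
            previous is not None
            and page == previous[0]
            and index == previous[1] + 1
            and physical == previous[2] + 1
        )
        if not contiguous:
            requests += 1  # the pending request is sent, a new one starts
        previous = (page, index, physical)
    return requests


fragmented = [(0, 0, 100), (0, 1, 101), (0, 2, 300), (0, 3, 102)]
contiguous = [(0, 0, 100), (0, 1, 101), (0, 2, 102), (0, 3, 103)]
print(count_io_requests(fragmented))  # 3
print(count_io_requests(contiguous))  # 1
```

The same four blocks cost one request when they are physically contiguous and three when one of them sits elsewhere on disk.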


Comments:

How is the application buffer set up? Does each call to a function such as read() allocate an application buffer of the size given by its parameter, and destroy that buffer when read() returns? No. If the application buffer were destroyed after every read, the next read of the file could not be served from it. That would defeat the purpose of an application buffer, which is to let a later read be satisfied from the buffer instead of going to disk.

DirectIO has no file system cache, only application buffers.
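The two points about application buffers (the application allocates and keeps the buffer, and the buffer should be page aligned) can be sketched as follows. This is an illustration under assumptions, not a reference implementation. The helper names are hypothetical, and the demo reads a temporary file through the normal cached path so that it runs anywhere. On Linux, passing `extra_flags=os.O_DIRECT` would request direct IO, which typically also requires the read length and file offset to be aligned and is not supported by every file system.

```python
import ctypes
import mmap
import os
import tempfile


def alloc_page_aligned(size):
    """Allocate an anonymous mapping of at least `size` bytes.

    Anonymous mappings start on a page boundary, so the buffer is
    aligned to mmap.PAGESIZE.
    """
    pages = -(-size // mmap.PAGESIZE)  # round up to whole pages
    return mmap.mmap(-1, pages * mmap.PAGESIZE)


def buffer_address(buf):
    """Memory address of the first byte of a writable buffer."""
    return ctypes.addressof(ctypes.c_char.from_buffer(buf))


def read_file_reusing_buffer(path, buf, extra_flags=0):
    """Read a whole file through one buffer owned by the caller."""
    fd = os.open(path, os.O_RDONLY | extra_flags)
    total = calls = 0
    with os.fdopen(fd, "rb", buffering=0) as f:
        while True:
            n = f.readinto(buf)  # fills the caller's buffer in place
            if not n:
                break
            total += n
            calls += 1
    return total, calls


buf = alloc_page_aligned(mmap.PAGESIZE)
print(buffer_address(buf) % mmap.PAGESIZE)  # 0

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * (3 * mmap.PAGESIZE))
try:
    total, calls = read_file_reusing_buffer(tmp.name, buf)
    print(total, calls)
finally:
    os.unlink(tmp.name)
```

The buffer is created once by the application and handed to every read call. The read call itself neither allocates nor frees it.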
===============================================
Page size: the size of each page of the memory cache.
File system block size: the size of each file system block (called a cluster in Windows) on disk.

The page size and the file system block size are set independently and are not tied to each other.
