Disk I/O performance monitoring metrics and tuning methods

Source: Internet
Author: User
Tags: disk usage

Before introducing the disk I/O monitoring commands, we need to understand the metrics used to monitor disk I/O performance and which aspect of disk performance each indicator reveals. The main indicators for disk I/O performance monitoring are:
Indicator 1: Number of I/Os per second (IOPS or TPS)
For a disk, one contiguous read or one contiguous write counts as one disk I/O. The IOPS of a disk is the number of such reads plus the number of such writes completed per second. This indicator is an important reference when transferring small blocks of non-contiguous data.
Indicator 2: Throughput
Throughput is the rate at which the disk transfers data, that is, the amount of data read and written per unit of time. It is usually expressed in KB/s, MB/s, and so on. This indicator is an important reference when transferring large blocks of data, such as large sequential reads and writes.
Indicator 3: Average I/O data size
The average I/O data size is the throughput divided by the number of I/Os, and it is useful for revealing how the disk is being used. In general, if the average I/O data size is less than 32 KB, the disk usage pattern is considered to be mainly random access; if it is greater than 32 KB, the usage pattern is considered to be mainly sequential access.
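The calculation above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample figures are hypothetical, and the text does not say how to classify a size of exactly 32 KB, so the choice made here is arbitrary.

```python
def average_io_size_kb(throughput_kb_per_s: float, iops: float) -> float:
    """Average I/O size in KB: throughput divided by I/Os per second."""
    if iops <= 0:
        raise ValueError("iops must be positive")
    return throughput_kb_per_s / iops


def access_pattern(avg_size_kb: float, threshold_kb: float = 32.0) -> str:
    """Apply the 32 KB rule of thumb from the text."""
    # Assumption: a size of exactly 32 KB is treated as sequential.
    return "random" if avg_size_kb < threshold_kb else "sequential"


# Hypothetical sample: 4,000 KB/s of throughput at 500 IOPS.
size_kb = average_io_size_kb(4000, 500)
print(size_kb, access_pattern(size_kb))
```

Both inputs must cover the same sampling interval, otherwise the quotient is meaningless.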
Indicator 4: Percentage of disk active time (utilization)
This is the percentage of time during which the disk is active, that is, disk utilization. The disk is active while it is transferring data or processing commands (such as a seek). Disk utilization is positively correlated with resource contention and inversely related to performance: the higher the utilization, the more severe the contention, the worse the performance, and the longer the response time. In general, if disk utilization exceeds 70%, application processes spend a long time waiting for I/O to complete, because most processes block or sleep while they wait.
Indicator 5: Service time
Service time is the time the disk takes to carry out a read or write, including seek time, rotational delay, and data transfer time. It depends mainly on the performance of the disk itself, although CPU and memory load also affect it, and too many requests can indirectly increase it. If the value stays above 20 ms, it can generally be assumed to affect the applications running above the disk.
Indicator 6: I/O wait queue length (queue length)
Queue length is the number of I/O requests waiting to be processed. It grows if the I/O request load continues to exceed what the disk can process. If the queue length of a single disk stays above 2, the disk is generally considered to have an I/O performance problem. Note that if the disk is a virtual logical drive backed by a disk array, you need to divide the value by the number of physical disks that make up the logical drive to obtain the average I/O wait queue length per physical disk.
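As a minimal sketch of the adjustment for logical drives, assuming the queue length has already been measured for the logical drive and the number of member disks is known (both function names and the sample values are hypothetical):

```python
def per_disk_queue_length(logical_queue_length: float, physical_disks: int) -> float:
    """Average queue length per physical disk behind a logical drive."""
    if physical_disks < 1:
        raise ValueError("physical_disks must be at least 1")
    return logical_queue_length / physical_disks


def has_queue_problem(per_disk_length: float, threshold: float = 2.0) -> bool:
    """Rule of thumb from the text: a sustained per-disk queue above 2."""
    return per_disk_length > threshold


# Hypothetical sample: a logical drive built from 8 disks reports a queue of 12.
per_disk = per_disk_queue_length(12, 8)
print(per_disk, has_queue_problem(per_disk))
```

A single reading says little; the rule of thumb applies to values that stay above the threshold over time.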
Indicator 7: Latency (wait time)
Latency is the time a disk read or write request waits before it is executed, that is, the time it spends in the queue. If I/O requests keep arriving faster than the disk can process them, the requests that cannot be handled in time have to wait longer in the queue.
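To show how these indicators relate to one another, here is a rough sketch that derives several of them from two samples of cumulative per-disk counters. It assumes counters similar to those Linux exposes in /proc/diskstats and 512-byte sectors; the `DiskSample` layout, the field names, and the sample numbers are hypothetical. Counters of this kind do not separate service time from queue wait, so the sketch reports only their sum as a response time.

```python
from dataclasses import dataclass

SECTOR_BYTES = 512  # assumption: counters are reported in 512-byte sectors


@dataclass
class DiskSample:
    """Cumulative counters for one disk (hypothetical layout)."""
    ios: int          # reads + writes completed
    sectors: int      # sectors read + written
    io_ms: int        # total ms requests spent between submission and completion
    busy_ms: int      # ms during which at least one I/O was in flight
    weighted_ms: int  # busy time weighted by the number of I/Os in flight


def derive_metrics(prev: DiskSample, curr: DiskSample, interval_s: float) -> dict:
    """Turn the difference between two samples into per-interval metrics."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    interval_ms = interval_s * 1000.0
    d_ios = curr.ios - prev.ios
    iops = d_ios / interval_s
    kb = (curr.sectors - prev.sectors) * SECTOR_BYTES / 1024
    throughput_kb_s = kb / interval_s
    return {
        "iops": iops,
        "throughput_kb_s": throughput_kb_s,
        "avg_io_kb": throughput_kb_s / iops if d_ios else 0.0,
        "utilization_pct": 100.0 * (curr.busy_ms - prev.busy_ms) / interval_ms,
        "avg_queue_length": (curr.weighted_ms - prev.weighted_ms) / interval_ms,
        # Queue wait plus service time, averaged over completed I/Os.
        "avg_response_ms": (curr.io_ms - prev.io_ms) / d_ios if d_ios else 0.0,
    }


def flag_issues(metrics: dict) -> list:
    """Names of metrics above the rule-of-thumb thresholds quoted in the text."""
    flagged = []
    if metrics["utilization_pct"] > 70.0:
        flagged.append("utilization_pct")
    if metrics["avg_queue_length"] > 2.0:
        flagged.append("avg_queue_length")
    return flagged


# Hypothetical samples taken one second apart.
before = DiskSample(ios=1000, sectors=80000, io_ms=5000, busy_ms=2000, weighted_ms=6000)
after = DiskSample(ios=1200, sectors=83200, io_ms=10000, busy_ms=2800, weighted_ms=11000)
metrics = derive_metrics(before, after, 1.0)
print(metrics)
print(flag_issues(metrics))
```

In the sample data the average queue length equals IOPS multiplied by the average response time in seconds, which is the relationship one would expect between these three indicators over a steady interval.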


By monitoring the metrics above and comparing them with historical data, empirical values, and the disk's rated specifications, while also taking CPU, memory, and swap usage into account where necessary, it is not difficult to identify potential or existing disk I/O problems. But how can these problems be avoided and solved? That requires knowledge of disk I/O performance optimization techniques. Given the topic and length of this article, only a few commonly used optimization methods are listed here for reference:

1. Adjust the data layout so that I/O requests are spread across all physical disks as evenly as possible.
2. For a RAID disk array, try to make the application's I/O size equal to the stripe size or a multiple of it, and select an appropriate RAID level, such as RAID 10 or RAID 5.
3. Increase the queue depth of the disk driver, but do not exceed what the disk can handle; otherwise some I/O requests will be dropped and reissued, which degrades performance.
4. Use caching to reduce the number of times an application accesses the disk. Caching can be applied at the file system level or at the application level.
5. Because most databases already include their own optimized caching, database I/O should go directly to a raw disk partition (raw partition) or use direct I/O (DIO), which bypasses the file system cache.
6. Memory read/write bandwidth is far higher than the performance of direct disk I/O, so keep frequently accessed files or data in memory.
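As an illustration of application-level caching (methods 4 and 6 above), the following minimal sketch keeps the contents of recently read files in memory using Python's standard `functools.lru_cache`. The function name `read_cached`, the cache size, and the sample file are hypothetical, and the sketch assumes the cached files do not change while they are cached.

```python
import os
import tempfile
from functools import lru_cache
from pathlib import Path


@lru_cache(maxsize=128)
def read_cached(path: str) -> bytes:
    """Read a file once; later calls with the same path are served from memory."""
    # Assumption: the file is not modified while cached. After a write,
    # call read_cached.cache_clear() to avoid returning stale data.
    return Path(path).read_bytes()


# Demonstration with a temporary file standing in for frequently accessed data.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "hot.dat")
    Path(sample).write_bytes(b"x" * 4096)
    read_cached(sample)  # first call reads from disk
    read_cached(sample)  # second call is answered from the in-memory cache
    print(read_cached.cache_info())
```

An in-process cache like this trades memory for fewer disk accesses, so the cache size should be chosen with the memory available to the application in mind.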
