In [Virtualization Practice] part five of the storage design series, on IOPS, we discussed the three key indicators for evaluating storage performance: throughput, IOPS, and latency, and the relationship among them. This article takes an in-depth look at the causes of high latency and offers recommendations.
High latency directly reduces the performance of the virtual machines and applications running on the storage. Users may complain that programs will not open, run slowly, or take a long time to respond.
1. How to measure latency?
Latency, or response time, refers to the time required to complete an I/O request. It is usually measured in milliseconds.
An I/O request issued by an application passes through several layers (the guest operating system, the VMkernel, the HBA, and the storage array) before it finally reaches the storage device.
Use esxtop to obtain the following data:
| Column | Description |
|---|---|
| CMDS/s | In most cases this value equals IOPS: the number of I/O requests issued per second. |
| DAVG/cmd (Device average latency) | Average time each request spends passing through the physical hardware, HBA, and storage device, in milliseconds. 20-30 ms is generally acceptable. |
| KAVG/cmd (Kernel average latency) | Average time each request spends being processed by the VMkernel layer. It is normally close to 0; values above 2 ms may indicate a performance problem. |
| QAVG/cmd (Queue average latency) | Average time each request waits in the vSphere storage stack queue. The longer the queue, the longer each request waits. |
| GAVG/cmd (Guest average latency) | The total average response time seen by the virtual machine's guest operating system: GAVG = DAVG + KAVG. 20-30 ms is generally acceptable, but latency-sensitive applications need this value to be as low as possible; some critical database operations, for example, cannot complete successfully once latency exceeds 5 ms. |
2. Analyzing the causes of high latency:
Storage design cannot meet requirements
For more information, see my previous article (TBD).
A common misunderstanding is to consider only the required capacity and to ignore IOPS, latency, throughput, and the other factors that affect performance. For example, an application may require only 10 TB of capacity but need 20 TB or even more storage to meet its performance requirements. The solution and its details should be discussed thoroughly with the storage vendor: for example, which RAID level the array uses, how many disk spindles the array contains, and what type of disk is used.
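As a rough illustration of why capacity alone is not enough, the back-of-the-envelope calculation below estimates how many disk spindles are needed to deliver a given IOPS load. The per-disk IOPS figure and the RAID 5 write penalty are common rules of thumb, not vendor specifications, so treat the result only as a starting point for the discussion with the vendor.

```python
def spindles_needed(required_iops, read_ratio, raid_write_penalty, iops_per_disk):
    """Back-end IOPS = reads + writes * RAID write penalty; divide by per-spindle IOPS."""
    reads = required_iops * read_ratio
    writes = required_iops * (1 - read_ratio)
    backend_iops = reads + writes * raid_write_penalty
    return int(-(-backend_iops // iops_per_disk))  # ceiling division

# Illustrative workload: 5,000 IOPS, 70% reads, RAID 5 (write penalty ~4),
# 10k rpm disks at roughly 140 IOPS each -- all assumed numbers.
print(spindles_needed(5000, 0.7, 4, 140), "spindles")  # ~68 spindles, far more than capacity alone suggests
```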
Fully consider the applications the storage will support. Applications differ in characteristics such as I/O size and read/write ratio, and the storage solution should be designed around those characteristics.
There are many tools for collecting and analyzing data and for stress testing that can help you understand the current capabilities of the storage, for example VMware I/O Analyzer, Iometer, Login VSI, and SolarWinds.
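For a quick sanity check from inside a guest, before setting up any of those tools, a few lines of scripting can time individual reads. The Python sketch below measures random-read latency against a pre-created test file (the path is a placeholder); it reads through the operating system's page cache, so the numbers will be optimistic, and the dedicated tools above remain the right choice for real sizing work.

```python
import os
import random
import statistics
import time

def measure_read_latency(path, io_size=4096, samples=1000):
    """Time random reads against an existing test file and report latency percentiles."""
    file_size = os.path.getsize(path)
    latencies_ms = []
    # Note: this simple version reads through the page cache (no O_DIRECT),
    # so repeated runs will show much lower latency than the physical storage delivers.
    with open(path, "rb", buffering=0) as f:
        for _ in range(samples):
            offset = random.randrange(0, max(1, file_size - io_size))
            start = time.perf_counter()
            f.seek(offset)
            f.read(io_size)
            latencies_ms.append((time.perf_counter() - start) * 1000)
    latencies_ms.sort()
    print(f"avg {statistics.mean(latencies_ms):.2f} ms")
    print(f"p95 {latencies_ms[int(0.95 * samples)]:.2f} ms")
    print(f"p99 {latencies_ms[int(0.99 * samples)]:.2f} ms")

if __name__ == "__main__":
    measure_read_latency("/tmp/testfile.bin")  # hypothetical test file created beforehand
```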
I/O queue congestion
There are queues at every layer, from top to bottom. The more tasks waiting for execution in a queue, the longer the response time.
If the queue at the ESXi host layer is too long, the KAVG value becomes too high.
If the queue in the HBA or storage array is too long, the DAVG value becomes too high.
Taking the ESXi host layer as an example, the LUN queue depth determines how many active commands can be outstanding on a LUN at the same time. The default value in ESXi is 32. It is recommended that the total number of active commands issued by all virtual machines on a LUN not exceed the LUN queue depth. Although the LUN queue depth can be increased to a maximum of 64, we recommend keeping the default value.
For example, when several I/O-intensive virtual machines share the same LUN, consider migrating some of them to other LUNs to avoid the latency caused by the total number of active commands continuously exceeding the LUN queue depth.
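The impact of queuing on latency can be roughly estimated with Little's Law: average latency ≈ outstanding commands ÷ IOPS. The numbers in the sketch below are purely illustrative, but they show how latency climbs once several busy virtual machines on one LUN keep the number of outstanding commands high.

```python
def estimated_latency_ms(outstanding_ios, iops):
    """Little's Law: average latency = queue length / throughput."""
    return outstanding_ios / iops * 1000

# Illustrative only: a LUN that can sustain about 3,200 IOPS.
lun_iops = 3200
for outstanding in (8, 32, 64, 128):
    print(f"{outstanding:4d} outstanding I/Os -> "
          f"{estimated_latency_ms(outstanding, lun_iops):5.1f} ms average latency")
# 32 outstanding I/Os -> 10 ms; 128 (several busy VMs sharing the LUN) -> 40 ms
```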
The HBA layer also has a queue, usually 4,000 commands per port or more, so the bottleneck is generally not at the HBA layer.
Storage bandwidth saturation
Consider the bandwidth supported by the HBA card and use multipathing to distribute the load across paths, so that the average response time through the physical hardware, HBA, and storage device (DAVG) does not climb.
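Whether a path is approaching saturation can be estimated by multiplying IOPS by the I/O size and comparing the result with the usable bandwidth of a single HBA port. The workload figures and the roughly 800 MB/s usable bandwidth assumed for an 8 Gb/s Fibre Channel port below are illustrative assumptions, not measurements.

```python
def required_mb_per_s(iops, io_size_kb):
    """Throughput a workload needs: IOPS x I/O size."""
    return iops * io_size_kb / 1024

# Illustrative workload: 20,000 IOPS at 64 KB per I/O.
workload_mb_s = required_mb_per_s(20_000, 64)

# Rough usable bandwidth of one 8 Gb/s FC port (assumed ~800 MB/s after protocol overhead).
port_mb_s = 800

paths_needed = int(-(-workload_mb_s // port_mb_s))  # ceiling division
print(f"Workload needs ~{workload_mb_s:.0f} MB/s")
print(f"Spread the load over at least {paths_needed} path(s) to avoid saturating a single port")
```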
References:
Performance links
VMUG presentation - Troubleshooting Storage Performance
Troubleshooting Storage Performance in vSphere - Part 1 - The Basics
http://www.vmware.com/files/pdf/techpaper/VMW-Tuning-Latency-Sensitive-Workloads.pdf