Track high IO waits in Linux systems


The first indication of a high I/O wait problem is usually the system load average. The load average is calculated from the number of processes that are running on or waiting for a CPU, and on Linux it also includes processes in uninterruptible sleep. The baseline is one unit of load per fully utilized CPU core. So for a 4-core machine, a load average of 4 means the machine has just enough resources to handle its work, but only barely. On the same 4-core system, a load average of 8 means the work would require 8 cores, but only 4 are available, so the system is overloaded.
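As a rough first check (a minimal sketch; "nproc" and /proc/loadavg are standard on modern Linux, and the one-core-per-unit threshold is a rule of thumb rather than a hard limit), the load average can be compared against the core count like this:

```shell
#!/bin/sh
# Compare the 1-minute load average with the number of CPU cores.
# A load average persistently above the core count suggests the
# machine has more runnable (or I/O-blocked) tasks than CPUs.
cores=$(nproc)
load1=$(cut -d ' ' -f 1 /proc/loadavg)
echo "cores=$cores 1-min load=$load1"
# awk does the floating-point comparison that plain sh cannot
awk -v l="$load1" -v c="$cores" \
    'BEGIN { if (l > c) print "possibly overloaded"; else print "within capacity" }'
```

Note that a one-off reading can be misleading; the 5- and 15-minute fields in /proc/loadavg tell you whether the load is sustained.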

If the system shows a high load average but low system and user CPU utilization, then you need to look at I/O wait (iowait). On Linux, I/O wait has a significant impact on the load average, mainly because one or more tasks may be blocked on disk or network I/O, and these tasks (i.e. processes) cannot proceed until the disk or network I/O completes. In "ps aux" output, these processes all show state "D", the uninterruptible sleep state.
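To see which tasks are currently blocked this way, the "D"-state processes can be listed with standard "ps" options (a small sketch; the header line is kept for readability):

```shell
#!/bin/sh
# List processes in uninterruptible sleep ("D" state); these are the
# tasks blocked on disk or network I/O that inflate the load average.
ps -eo state,pid,comm | awk 'NR == 1 || $1 ~ /^D/'
```

On a healthy system this usually prints only the header; a steady stream of "D" entries points at an I/O bottleneck.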

Discovering that processes are waiting for I/O to finish is one thing; verifying the cause of the high I/O wait is another. Use "iostat -x 1" to display the I/O status of the physical storage devices in use:

$ iostat -x 1

Device:         rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/s avgrq-sz avgqu-sz  await  svctm  %util
cciss/c0d0        0.08   5.94  1.28  2.75   17.34   69.52    21.60     0.11  26.82   4.12   1.66
cciss/c0d0p1      0.00   0.00  0.00  0.00    0.00    0.00     5.30     0.00   8.76   5.98   0.00
cciss/c0d0p2      0.00   0.00  0.00  0.00    0.00    0.00    58.45     0.00   7.79   3.21   0.00
cciss/c0d0p3      0.08   5.94  1.28  2.75   17.34   69.52    21.60     0.11  26.82   4.12   1.66

From the above, it is obvious that the await time of the device /dev/cciss/c0d0p3 is very long. However, we did not mount that device directly; it actually backs an LVM device. If you use LVM for storage, you will find iostat a bit confusing: LVM uses the device-mapper subsystem to map file systems to physical devices, so iostat may display devices such as /dev/dm-0 and /dev/dm-1. The output of "df -h" does not show the device-mapper paths, but the LVM paths instead. The simplest method is to add the "-N" option to iostat.

$ iostat -xN 1

Device:         rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/s avgrq-sz avgqu-sz  await  svctm  %util
vg1-root          0.00   0.00  0.09  3.01    0.85   24.08     8.05     0.08  24.69   1.79   0.55
vg1-home          0.00   0.00  0.05  1.46    0.97   11.69     8.36     0.03  19.89   3.76   0.57
vg1-opt           0.00   0.00  0.03  1.56    0.46   12.48     8.12     0.05  29.89   3.53   0.56
vg1-tmp           0.00   0.00  0.00  0.06    0.00    0.45     8.00     0.00  24.85   4.90   0.03
vg1-usr           0.00   0.00  0.63  1.41    5.85   11.28     8.38     0.07  32.48   3.11   0.63
vg1-var           0.00   0.00  0.55  1.19    9.21    9.54    10.74     0.04  24.10   4.24   0.74
vg1-swaplv        0.00   0.00  0.00  0.00    0.00    0.00     8.00     0.00   3.98   1.88   0.00

For simplicity, the iostat output above has been trimmed. The I/O wait shown for each listed file system is unacceptably high; look at the await value in the tenth column. Compared with most of the others, the /opt file system has a longer await time. Let's analyze that file system first: the command "fuser -vm /opt" shows which processes are accessing it, and the process list is as follows.

$ fuser -vm /opt

                     USER      PID ACCESS COMMAND
/opt:                db2fenc1 1067 ....m  db2fmp
                     db2fenc1 1071 ....m  db2fmp
                     db2fenc1 2560 ....m  db2fmp
                     db2fenc1 5221 ....m  db2fmp

There are 112 DB2 processes on this server accessing the /opt file system; only four are listed for brevity. It seems the cause of the problem has been found: the database on this server is supposed to use the faster SAN storage, while the operating system uses the local disk. You can call the DBA (database administrator) and ask how this is configured.
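To count such processes yourself, a guarded sketch like the following can be used (/opt is the mount point from the example above; "fuser" comes from the psmisc package and may not be installed everywhere, hence the fallback message):

```shell
#!/bin/sh
# Count the processes holding files open under a mount point.
# /opt is the example mount point from the text; adjust as needed.
MNT=/opt
if command -v fuser >/dev/null 2>&1; then
    # fuser -m prints one PID per word on stdout (labels go to stderr)
    count=$(fuser -m "$MNT" 2>/dev/null | wc -w)
    echo "$count process(es) are using $MNT"
else
    echo "fuser (psmisc) not installed; 'lsof +f -- $MNT' is an alternative"
fi
```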

One last thing to note about LVM and the device mapper: the output of "iostat -xN" displays logical volume names, but the mapping back to the dm devices can be traced with "ls -lrt /dev/mapper". The minor device number in the sixth column of that output corresponds to the dm-N device name in iostat.
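On a live system, that mapping can be inspected like this (a sketch assuming the standard /dev/mapper layout; "lsblk" from util-linux is an optional extra):

```shell
#!/bin/sh
# Map dm-N device names (as printed by iostat) back to LVM volume names.
# Each entry in /dev/mapper corresponds to a /dev/dm-N node.
if [ -d /dev/mapper ]; then
    ls -l /dev/mapper
else
    echo "no device-mapper devices present on this system"
fi
# lsblk shows the same relationship as a tree, if available:
command -v lsblk >/dev/null 2>&1 && lsblk -o NAME,KNAME,TYPE,MOUNTPOINT || true
```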

Sometimes there is nothing more you can do at the operating system or application layer, and the only remaining option is to choose faster disks. Fortunately, the price of fast disk storage, such as SAN or SSDs, is steadily declining.

    • This article is from: Linux Learning Network
