[Repost] Linux system monitoring and diagnostic tools: IO wait

Source: Internet
Author: User

1. The problem:

Recently I was working on real-time log synchronization. Before going live we stress-tested a single type of online log, and the message queue, the clients, and the machines all held up fine. But as soon as the second log was brought online, a problem appeared:

On one machine in the cluster, top showed a high load. All machines in the cluster have the same hardware configuration and the same software deployed, yet only this one had a load problem, so my initial guess was a hardware fault.

At the same time, we also needed to track down the culprit behind the abnormal load and then look for a solution at both the software and the hardware level.

2. Troubleshooting:

From top you can see that the load average is high, %wa is high, and %us is low:

From this we can roughly infer that IO has hit a bottleneck. Next we can use the relevant IO diagnostic tools to verify this and troubleshoot further.
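
The %us and %wa figures that top reports are derived from the cumulative counters on the aggregate "cpu" line of /proc/stat. The following is a minimal sketch of that calculation, not top's actual implementation. The two sample lines are hypothetical values chosen to resemble an IO-bound machine, and the function names are made up for illustration.

```python
def parse_cpu_line(line):
    """Return the counters from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in fields[1:]]


def cpu_percentages(before, after):
    """Percentage of CPU time per category between two samples."""
    # Counter order: user nice system idle iowait irq softirq steal ...
    # Only the first eight are summed; guest time is already part of user.
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(deltas)
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    names = ("us", "ni", "sy", "id", "wa")
    return {name: 100.0 * delta / total for name, delta in zip(names, deltas)}


# Hypothetical samples (assumption), taken a few seconds apart.
BEFORE = "cpu  1000 0 500 8000 500 0 0 0 0 0"
AFTER = "cpu  1050 0 550 8100 1300 0 0 0 0 0"

if __name__ == "__main__":
    result = cpu_percentages(parse_cpu_line(BEFORE), parse_cpu_line(AFTER))
    for name, value in result.items():
        print(f"%{name}: {value:.1f}")
```

On a real machine the two samples would be the first line of /proc/stat read twice with a short sleep in between. A low %us combined with a high %wa is the pattern described above.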

PS: If you are not familiar with top, please refer to a blog post I wrote last year:

Linux system monitoring and diagnostic tools: top

Several tool combinations are commonly used:

    • vmstat, sar, and iostat to detect CPU bottlenecks

    • free and vmstat to detect memory bottlenecks

    • iostat and dmesg to detect disk I/O bottlenecks

    • netstat to detect network bandwidth bottlenecks

2.1 vmstat

The name of the vmstat command stands for "Virtual Memory Statistics", but it reports on the overall operating state of the system, including processes, memory, and I/O.


Its related fields are described below:

procs (processes)

    • r: The number of processes in the run queue, that is, tasks that are running or waiting for CPU. When this value stays above the number of CPUs for a long time, the CPU has become a bottleneck, so it can be used to judge whether more CPU capacity is needed.
    • b: The number of processes waiting for IO, that is, processes in uninterruptible sleep.

memory

    • swpd: The amount of virtual memory (swap) in use. If swpd is not 0 but si and so stay at 0 over time, system performance is not affected.
    • free: The amount of free physical memory.
    • buff: The amount of memory used as buffers.
    • cache: The amount of memory used as cache. A large cache value means many files are cached; if frequently accessed files are all cached, the disk read IO (bi) will be very small.

swap

    • si: The amount of memory swapped in from disk per second.
    • so: The amount of memory swapped out to disk per second.

    Note: When memory is sufficient, both values are 0. If they stay above 0 for a long time, system performance suffers, and both disk IO and CPU resources are consumed. Some people see that free memory (free) is very low or close to 0 and conclude that memory is insufficient. That alone is not enough; you also have to look at si and so. If free is very low but si and so are also very low (0 most of the time), there is no need to worry: system performance is not affected.

io (on current Linux versions the block size is 1 KB)

    • bi: The number of blocks read per second.
    • bo: The number of blocks written per second.

    Note: With random disk reads and writes, the larger these two values are (for example, above 1024k), the larger the CPU's IO-wait value will be.

system

    • in: The number of interrupts per second, including clock interrupts.
    • cs: The number of context switches per second.

    Note: The larger these two values are, the more CPU time is consumed by the kernel.

cpu (expressed as percentages)

    • us: Percentage of time spent running user processes (user time). A high us value means user processes consume a lot of CPU time; if it stays high for a long time, consider optimizing the program's algorithms or otherwise speeding it up.
    • sy: Percentage of time spent running kernel code (system time). A high sy value means the kernel is consuming a lot of CPU resources, which is not a healthy sign; the cause should be investigated.
    • wa: Percentage of time spent waiting for IO. A high wa value means IO wait is severe, which may be caused by a large amount of random disk access or by a disk bottleneck (block operations).
    • id: Percentage of idle time.

As the vmstat output shows, the CPU spends most of its time waiting for IO, possibly because of heavy random disk access or limited disk bandwidth. bi and bo also exceed 1024k, so we have most likely hit an IO bottleneck.
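
The reasoning above (high wa together with processes blocked on IO) can be sketched as a small parser over vmstat's text output. This is only an illustration: the sample output is hypothetical, the function names are invented, and the 30% threshold is an arbitrary assumption rather than a recommended value.

```python
def parse_vmstat(text):
    """Turn vmstat output into a list of {column: value} dicts."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    # Line 0 groups the columns (procs, memory, ...); line 1 names them.
    header = lines[1].split()
    return [dict(zip(header, map(int, ln.split()))) for ln in lines[2:]]


def looks_io_bound(sample, wa_threshold=30):
    """Heuristic: high IO wait plus at least one process blocked on IO."""
    return sample["wa"] >= wa_threshold and sample["b"] > 0


# Hypothetical vmstat output (assumption), shaped like an IO-bound machine.
SAMPLE = """
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  3      0 102400  20480 819200    0    0  2048  4096  900 1500  3  2 15 80  0
"""

if __name__ == "__main__":
    for row in parse_vmstat(SAMPLE):
        print("wa=%d b=%d bi=%d bo=%d io_bound=%s"
              % (row["wa"], row["b"], row["bi"], row["bo"], looks_io_bound(row)))
```

In practice the text would come from a command such as `vmstat 1 5`. The first data row of a real run reports averages since boot, so it is usually better to judge by the later rows.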

2.2 iostat

Next, let's use a more specialized disk IO diagnostic tool to look at the relevant statistics.


Its related fields are described below:

    • rrqm/s: Number of read requests merged per second. That is, delta(rmerge)/s
    • wrqm/s: Number of write requests merged per second. That is, delta(wmerge)/s
    • r/s: Number of read I/O requests completed per second. That is, delta(rio)/s
    • w/s: Number of write I/O requests completed per second. That is, delta(wio)/s
    • rsec/s: Number of sectors read per second. That is, delta(rsect)/s
    • wsec/s: Number of sectors written per second. That is, delta(wsect)/s
    • rkB/s: Kilobytes read per second. It is half of rsec/s, because each sector is 512 bytes. (Needs to be calculated.)
    • wkB/s: Kilobytes written per second. It is half of wsec/s. (Needs to be calculated.)
    • avgrq-sz: Average size (in sectors) of each device I/O operation. That is, delta(rsect+wsect)/delta(rio+wio)
    • avgqu-sz: Average I/O queue length. That is, delta(aveq)/s/1000 (because aveq is in milliseconds)
    • await: Average wait time (in milliseconds) of each device I/O operation. That is, delta(ruse+wuse)/delta(rio+wio)
    • svctm: Average service time (in milliseconds) of each device I/O operation. That is, delta(use)/delta(rio+wio)
    • %util: The fraction of each second spent on I/O operations, or in other words the fraction of each second during which the I/O queue is non-empty. That is, delta(use)/s/1000 (because use is in milliseconds)
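
The formulas above can be applied directly to two samples of a device's line in /proc/diskstats, whose first eleven counters after the device name correspond to rio, rmerge, rsect, ruse, wio, wmerge, wsect, wuse, running, use, and aveq. The following is a sketch under assumptions: both sample lines are hypothetical, the function names are invented, and %util is multiplied by 100 to express the fraction as a percentage.

```python
FIELDS = ("rio", "rmerge", "rsect", "ruse",
          "wio", "wmerge", "wsect", "wuse",
          "running", "use", "aveq")


def parse_diskstats_line(line):
    """Return (device_name, counters) for one line of /proc/diskstats."""
    parts = line.split()
    values = [int(v) for v in parts[3:3 + len(FIELDS)]]
    return parts[2], dict(zip(FIELDS, values))


def iostat_metrics(prev, curr, interval_s):
    """Apply the delta formulas to two samples taken interval_s seconds apart."""
    d = {key: curr[key] - prev[key] for key in FIELDS if key != "running"}
    ios = d["rio"] + d["wio"]  # completed read + write requests
    return {
        "rrqm/s": d["rmerge"] / interval_s,
        "wrqm/s": d["wmerge"] / interval_s,
        "r/s": d["rio"] / interval_s,
        "w/s": d["wio"] / interval_s,
        "rkB/s": d["rsect"] / interval_s / 2,  # 512-byte sectors
        "wkB/s": d["wsect"] / interval_s / 2,
        "avgrq-sz": (d["rsect"] + d["wsect"]) / ios if ios else 0.0,
        "avgqu-sz": d["aveq"] / interval_s / 1000,  # aveq is in ms
        "await": (d["ruse"] + d["wuse"]) / ios if ios else 0.0,
        "svctm": d["use"] / ios if ios else 0.0,
        "%util": 100 * d["use"] / interval_s / 1000,  # use is in ms
    }


# Hypothetical samples (assumption) for one disk, taken one second apart.
PREV = "   8      16 sdb 1000 100 80000 5000 2000 200 160000 20000 0 9000 25000"
CURR = "   8      16 sdb 1100 110 88000 7000 2300 230 184000 38000 5 10000 45000"

if __name__ == "__main__":
    name, prev = parse_diskstats_line(PREV)
    _, curr = parse_diskstats_line(CURR)
    for key, value in iostat_metrics(prev, curr, 1.0).items():
        print(f"{name} {key}: {value:.2f}")
```

With these made-up numbers the device completes 400 requests in the one-second interval and is busy for all 1000 ms of it, which is what a %util of 100 means.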

You can see that, of the two disks, sdb is already at 100% utilization, so there is a serious IO bottleneck. The next step is to find out which process is reading and writing data on this disk.

2.3 iotop

Based on the iotop output, we quickly located the problem: the flume process was causing the heavy IO wait.
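
When iotop is not installed, a rough per-process ranking can be approximated from the read_bytes and write_bytes counters in /proc/<pid>/io. The sketch below is an approximation under assumptions, not how iotop itself gathers its data; the function names are invented, and reading other users' /proc/<pid>/io files normally requires root.

```python
import os


def parse_proc_io(text):
    """Parse the 'key: value' lines of /proc/<pid>/io into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            counters[key.strip()] = int(value)
    return counters


def snapshot(proc_root="/proc"):
    """Collect IO counters for every process we are allowed to read."""
    result = {}
    if not os.path.isdir(proc_root):
        return result
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "io")) as fh:
                result[int(entry)] = parse_proc_io(fh.read())
        except OSError:
            continue  # process exited, or permission denied
    return result


def top_io(before, after, interval_s, limit=5):
    """Rank PIDs by bytes read plus written per second between snapshots."""
    rows = []
    for pid, now in after.items():
        old = before.get(pid)
        if old is None:
            continue  # process started after the first snapshot
        read_rate = (now["read_bytes"] - old["read_bytes"]) / interval_s
        write_rate = (now["write_bytes"] - old["write_bytes"]) / interval_s
        rows.append((read_rate + write_rate, pid, read_rate, write_rate))
    rows.sort(reverse=True)
    return [(pid, r, w) for _, pid, r, w in rows[:limit]]
```

A typical use is to call snapshot(), sleep for a few seconds, call snapshot() again, and pass both results to top_io() together with the elapsed time.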

But as I said at the beginning, the machines in the cluster all have the same configuration, and the deployed programs were all copied over with rsync, so they are identical. Could the hard drive be failing?

This had to be verified by the operations team, and their final conclusion was:

sdb is a two-disk RAID1 array on an "LSI Logic / Symbios Logic SAS1068E" RAID card, which has no cache. An IOPS load of nearly 400 had already reached the hardware limit. The other machines use an "LSI Logic / Symbios Logic MegaRAID SAS 1078" RAID card with 256MB of cache and have not hit a hardware bottleneck. The solution is to move to a machine that can deliver more IOPS; in the end we switched to a machine with an integrated PERC6/i RAID controller card. Note that RAID information is stored both on the RAID card and in the disk firmware, and the RAID information on the disk must match the format used by the RAID card. Otherwise the RAID card cannot recognize the disk and the disk has to be reformatted.

IOPS essentially depends on the disk itself, but there are many ways to raise it; adding a hardware cache and using RAID arrays are common ones. For high-IOPS scenarios such as databases, it is now popular to replace traditional mechanical hard drives with SSDs.
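
To see why a mechanical disk without a cache tops out at a few hundred IOPS, a common back-of-the-envelope estimate divides one second by the average time per random request: average seek time plus average rotational latency (half a revolution). The sketch below uses that textbook approximation; the drive figures are illustrative assumptions, not measurements of the machines discussed here, and controller cache, queueing, and transfer time are ignored.

```python
def rotational_latency_ms(rpm):
    """Average rotational latency: half a revolution, in milliseconds."""
    return 60_000.0 / rpm / 2


def estimated_iops(avg_seek_ms, rpm):
    """Rough random-IO ceiling for a single mechanical disk."""
    return 1000.0 / (avg_seek_ms + rotational_latency_ms(rpm))


if __name__ == "__main__":
    # Illustrative drive parameters (assumptions), not vendor data.
    for label, seek_ms, rpm in (("7200 rpm", 8.5, 7200),
                                ("15000 rpm", 3.0, 15000)):
        print(f"{label}: about {estimated_iops(seek_ms, rpm):.0f} IOPS")
```

Under this estimate a single disk delivers on the order of one to two hundred random IOPS, which is why caches, RAID arrays, and SSDs are the usual ways to go beyond that.
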

However, as said earlier, the point of investigating from both the hardware and the software side is to find the least expensive solution:

Now that we know the hardware cause, we can try moving the read and write operations to another disk and then look at the effect:

3. Final words: another approach

In fact, besides using the specialized tools above to locate the problem, we can also use the process state directly to find the relevant processes.

We know that a process can be in the following states:

    D    uninterruptible sleep (usually IO)
    R    running or runnable (on run queue)
    S    interruptible sleep (waiting for an event to complete)
    T    stopped, either by a job control signal or because it is being traced
    W    paging (not valid since the 2.6.xx kernel)
    X    dead (should never be seen)
    Z    defunct ("zombie") process, terminated but not reaped by its parent

A process in state D is usually in so-called "uninterruptible sleep" because it is waiting for IO. We can start from this point and locate the problem step by step:

    for x in `seq 10`; do ps -eo state,pid,cmd | grep "^D"; echo "----"; sleep 5; done
    D   248 [jbd2/dm-0-8]
    D 16528 bonnie++ -n 0 -u 0 -r 239 -s 478 -f -b -d /tmp
    ----
    D     … [kdmflush]
    D 16528 bonnie++ -n 0 -u 0 -r 239 -s 478 -f -b -d /tmp
    ----

    # or:
    while true; do date; ps auxf | awk '{if($8=="D") print $0;}'; sleep 1; done
    Tue … CLT …
    root       302  0.0  0.0      0     0 ?        D    May22   2:… \_ [kdmflush]
    root       321  0.0  0.0      0     0 ?        D    May22   4:… \_ [jbd2/dm-0-8]
    Tue … CLT …
    Tue … CLT …

    cat /proc/16528/io
    rchar: 48752567
    wchar: 549961789
    syscr: 5967
    syscw: 67138
    read_bytes: 49020928
    write_bytes: 549961728
    cancelled_write_bytes: 0

    lsof -p 16528
    COMMAND    PID USER   FD   TYPE DEVICE  SIZE/OFF   NODE NAME
    bonnie++ 16528 root  cwd    DIR  252,0      4096 130597 /tmp
    <truncated>
    bonnie++ 16528 root    8u   REG  252,0 501219328 131869 /tmp/bonnie.16528
    bonnie++ 16528 root    9u   REG  252,0 501219328 131869 /tmp/bonnie.16528
    bonnie++ 16528 root   10u   REG  252,0 501219328 131869 /tmp/bonnie.16528
    bonnie++ 16528 root   11u   REG  252,0 501219328 131869 /tmp/bonnie.16528
    bonnie++ 16528 root   12u   REG  252,0 501219328 131869 /tmp/bonnie.16528

    df /tmp
    Filesystem                   1K-blocks    Used Available Use% Mounted on
    /dev/mapper/workstation-root   7667140 2628608   4653920  37% /

    fuser -vm /tmp
            USER      PID ACCESS COMMAND
    /tmp:   db2fenc1 1067 ....m db2fmp
            db2fenc1 1071 ....m db2fmp
            db2fenc1 2560 ....m db2fmp
            db2fenc1 5221 ....m db2fmp
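
The same D-state check can be done without ps by reading the state field of /proc/<pid>/stat. The following is a minimal sketch of that idea, with invented function names. It assumes a Linux /proc filesystem and simply returns an empty list elsewhere.

```python
import os


def parse_stat(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split around the first '(' and the last ')'.
    open_idx = text.index("(")
    close_idx = text.rindex(")")
    pid = int(text[:open_idx])
    comm = text[open_idx + 1:close_idx]
    state = text[close_idx + 1:].split()[0]
    return pid, comm, state


def d_state_processes(proc_root="/proc"):
    """List (pid, comm) for every process currently in state D."""
    found = []
    if not os.path.isdir(proc_root):
        return found
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited while we were reading it
        if state == "D":
            found.append((pid, comm))
    return sorted(found)


if __name__ == "__main__":
    for pid, comm in d_state_processes():
        print(pid, comm)
```

Because D state is often short-lived, a single scan can easily miss it. As with the ps loop above, it is more useful to repeat the scan every few seconds and look for PIDs that keep reappearing.
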

4. References:

[1] Troubleshooting high I/O wait in Linux: a walkthrough on how to find processes that are causing high I/O wait on Linux systems
http://bencane.com/2012/08/06/troubleshooting-high-io-wait-in-linux/

[2] Understanding Linux system load
http://www.ruanyifeng.com/blog/2011/07/linux_load_average_explained.html

[3] iostat, vmstat and mpstat examples for Linux performance monitoring
http://www.thegeekstuff.com/2011/07/iostat-vmstat-mpstat-examples/

[4] vmstat command
http://man.linuxde.net/vmstat

[5] Linux vmstat command explained in practice
http://www.cnblogs.com/ggjucheng/archive/2012/01/05/2312625.html

[6] Factors affecting the performance of Linux servers
http://www.rocklv.net/2004/news/article_284.html

[7] Viewing Linux disk IO with iostat and vmstat
http://blog.csdn.net/qiudakun/article/details/4699587

[8] What process is using all of my disk IO
http://stackoverflow.com/questions/488826/what-process-is-using-all-of-my-disk-io

[9] Linux wait IO problem
http://www.chileoffshore.com/en/interesting-articles/126-linux-wait-io-problem

[10] Tracking down high IO wait in Linux
http://ostatic.com/blog/tracking-down-high-io-wait-in-linux

[11] Disk IOPS calculation and measurement
http://blog.csdn.net/liuaigui/article/details/6168186

[12] Disk performance indicator: IOPS (Huawei)
http://www.huawei.com/ecommunity/3msimage/download-10053641-10023111-fbcd1a056196d26a1a30032a222a5ec3.bin?type=bbs

[13] RAID card
http://baike.baidu.com/view/95439.htm

[14] Some I/O statistics tools under Linux
http://blogread.cn/it/article/5716?f=wb

[15] An optimization: TPS from 400+ to 4k+
http://bit.ly/29WaL5F

Reposted from: https://my.oschina.net/leejun2005/blog/355915
