vmstat and iostat are two important performance monitoring tools for Linux.
vmstat: brief information on memory, processes, and paging
procs
r: number of processes waiting for CPU time
b: number of processes in uninterruptible sleep (usually waiting for I/O such as disk, network, or user input)
memory
swpd: amount of memory swapped out to disk
free: amount of idle memory
buff: amount of memory used as buffers
cache: amount of memory used as the operating system's cache
swap (paging activity)
si: amount swapped in from disk per second
so: amount swapped out to disk per second
io
bi: blocks read from block devices per second
bo: blocks written to block devices per second
system
in: number of interrupts per second
cs: number of context switches per second
cpu (percentage of total CPU time spent in each kind of activity)
us: running user code (non-kernel)
sy: running system code (kernel)
id: idle
wa: waiting for I/O
st: stolen (time given to other domU instances)
iostat: CPU statistics plus I/O statistics for devices and partitions
avg-cpu shows the average percentage of CPU time spent in each mode; it is not covered in detail here.
The device I/O columns are:
tps: number of transfers (I/O requests) issued to the device per second
kB_read/s: amount of data read from the device per second
kB_wrtn/s: amount of data written to the device per second
kB_read: total amount of data read
kB_wrtn: total amount of data written; all of these are in kilobytes.
The following was shared by another user online:
Viewing Linux disk I/O (iostat)
##############
#
# Operation
#
##############
# iostat -x 1 10
Linux 2.6.18-92.el5xen    02/03/2009
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.10    0.00    4.82   39.54    0.07   54.46
Device:    rrqm/s   wrqm/s      r/s      w/s   rsec/s   wsec/s avgrq-sz avgqu-sz    await    svctm    %util
sda          0.00     3.50     0.40     2.50     5.60    48.00    18.48     0.00     0.97     0.97     0.28
sdb          0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
sdc          0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
sdd          0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
sde          0.00     0.10     0.30     0.20     2.40     2.40     9.60     0.00     1.60     1.60     0.08
sdf         17.40     0.50   102.00     0.20 12095.20     5.60   118.40     0.70     6.81     2.09    21.36
sdg        232.40     1.90   379.70     0.50 76451.20    19.20   201.13     4.94    13.78     2.45    93.16
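When there are many devices, it can help to load this report into a small data structure and sort by a column. The following is a minimal sketch, assuming the report has been captured as text with lowercase column and device names; `parse_iostat` is a hypothetical helper written for this illustration, not part of iostat, and only four of the rows above are included.

```python
# Four device rows copied from the iostat -x report above
# (names normalized to lowercase; whitespace is not significant).
SAMPLE = """\
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 3.50 0.40 2.50 5.60 48.00 18.48 0.00 0.97 0.97 0.28
sde 0.00 0.10 0.30 0.20 2.40 2.40 9.60 0.00 1.60 1.60 0.08
sdf 17.40 0.50 102.00 0.20 12095.20 5.60 118.40 0.70 6.81 2.09 21.36
sdg 232.40 1.90 379.70 0.50 76451.20 19.20 201.13 4.94 13.78 2.45 93.16
"""


def parse_iostat(text):
    """Hypothetical helper: map device name -> {column name: value}."""
    lines = [line.split() for line in text.strip().splitlines()]
    header = lines[0][1:]  # column names after the "Device:" label
    return {row[0]: dict(zip(header, map(float, row[1:]))) for row in lines[1:]}


stats = parse_iostat(SAMPLE)
# Pick the device with the highest %util.
busiest = max(stats, key=lambda dev: stats[dev]["%util"])
print(busiest, stats[busiest]["%util"])
```

In this sample the busiest device is sdg, which is the one the analysis further down would single out.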
##############
#
# Comments
#
##############
rrqm/s: number of read requests merged per second. delta(rmerge)/s
wrqm/s: number of write requests merged per second. delta(wmerge)/s
r/s: number of read I/O operations completed by the device per second. delta(rio)/s
w/s: number of write I/O operations completed by the device per second. delta(wio)/s
rsec/s: number of sectors read per second. delta(rsect)/s
wsec/s: number of sectors written per second. delta(wsect)/s
rkB/s: number of kilobytes read per second. This is half of rsec/s, because each sector is 512 bytes. (Must be calculated.)
wkB/s: number of kilobytes written per second. This is half of wsec/s. (Must be calculated.)
avgrq-sz: average size (in sectors) of each device I/O operation. delta(rsect+wsect)/delta(rio+wio)
avgqu-sz: average I/O queue length. delta(aveq)/s/1000 (because aveq is in milliseconds)
await: average wait time (in milliseconds) of each device I/O operation. delta(ruse+wuse)/delta(rio+wio)
svctm: average service time (in milliseconds) of each device I/O operation. delta(use)/delta(rio+wio)
%util: the fraction of each second spent on I/O operations, that is, the fraction of each second during which the I/O queue is non-empty. delta(use)/s/1000 (because use is in milliseconds)
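The delta formulas above can be sketched in code. This is a minimal illustration, assuming two snapshots of cumulative per-device counters taken `seconds` apart; the `Counters` class and `extended_stats` function are hypothetical names, the field names simply mirror the formulas, and the sample numbers are made up for the arithmetic.

```python
from dataclasses import dataclass, fields


@dataclass
class Counters:
    """Cumulative per-device counters, named after the formulas above."""
    rmerge: int  # merged read requests
    wmerge: int  # merged write requests
    rio: int     # completed reads
    wio: int     # completed writes
    rsect: int   # sectors read
    wsect: int   # sectors written
    ruse: int    # milliseconds spent on reads
    wuse: int    # milliseconds spent on writes
    use: int     # milliseconds the device was busy
    aveq: int    # weighted milliseconds spent in the queue


def extended_stats(prev, cur, seconds):
    """Apply the delta formulas to two snapshots taken `seconds` apart."""
    d = {f.name: getattr(cur, f.name) - getattr(prev, f.name)
         for f in fields(Counters)}
    n_io = d["rio"] + d["wio"]

    def per_io(x):
        # Avoid dividing by zero when the device was idle.
        return x / n_io if n_io else 0.0

    return {
        "rrqm/s": d["rmerge"] / seconds,
        "wrqm/s": d["wmerge"] / seconds,
        "r/s": d["rio"] / seconds,
        "w/s": d["wio"] / seconds,
        "rsec/s": d["rsect"] / seconds,
        "wsec/s": d["wsect"] / seconds,
        "rkB/s": d["rsect"] / seconds / 2,  # 512-byte sectors
        "wkB/s": d["wsect"] / seconds / 2,
        "avgrq-sz": per_io(d["rsect"] + d["wsect"]),
        "avgqu-sz": d["aveq"] / (seconds * 1000.0),
        "await": per_io(d["ruse"] + d["wuse"]),
        "svctm": per_io(d["use"]),
        # Expressed as a percentage, as iostat prints it.
        "%util": 100.0 * d["use"] / (seconds * 1000.0),
    }


# Made-up counters for a one-second interval.
prev = Counters(0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
cur = Counters(5, 15, 10, 30, 80, 240, 100, 300, 200, 400)
sample = extended_stats(prev, cur, 1.0)
print(sample["await"], sample["svctm"], sample["%util"])
```

With these made-up numbers, 40 operations complete in one second, so await is 400 ms / 40 and svctm is 200 ms / 40.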
##############
#
# Analysis
#
##############
1. If %util is close to 100%, too many I/O requests are being generated, the I/O system is running at full load, and the disk may be a bottleneck.
2. If %idle is below 70%, I/O pressure is fairly high and reads generally spend a lot of time waiting.
3. You can also use vmstat to check the b column (number of processes waiting for resources) and the wa column (percentage of CPU time spent waiting for I/O; above 30% indicates high I/O pressure).
4. You can also consider the following:
svctm is generally smaller than await, because the waiting time of requests that wait at the same time is counted repeatedly. svctm generally depends on disk performance, though CPU and memory load also affect it, and too many requests indirectly increase svctm. await generally depends on the service time (svctm), the length of the I/O queue, and the pattern in which I/O requests are issued. If svctm is close to await, I/O requests spend almost no time waiting; if await is much larger than svctm, the I/O queue is too long and response times are slow. If the response time exceeds what users can tolerate, consider switching to faster disks, tuning the kernel's elevator algorithm, optimizing the application, or upgrading the CPU.
The queue length (avgqu-sz) can also serve as an indicator of system I/O load, but because avgqu-sz is an average over the sampling interval, it does not reflect momentary I/O bursts.
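These rules of thumb can be collected into a small check. The sketch below is only an illustration: `diagnose` is a hypothetical function, the 70% idle threshold comes from the text, and the other two cut-offs (90% for "close to 100%" and a ratio of 2 for "await much larger than svctm") are assumptions chosen for the example.

```python
def diagnose(util_pct, await_ms, svctm_ms, idle_pct,
             util_limit=90.0, idle_limit=70.0, wait_ratio=2.0):
    """Return a list of findings for one device row plus the avg-cpu %idle.

    util_limit and wait_ratio are assumed cut-offs, not values from iostat.
    """
    findings = []
    if util_pct >= util_limit:
        findings.append("device near saturation")
    if idle_pct < idle_limit:
        findings.append("high I/O pressure")
    if svctm_ms > 0 and await_ms / svctm_ms > wait_ratio:
        findings.append("long I/O queue")
    return findings


# Values taken from the sdg and sda rows of the report above (%idle = 54.46).
sdg = diagnose(util_pct=93.16, await_ms=13.78, svctm_ms=2.45, idle_pct=54.46)
sda = diagnose(util_pct=0.28, await_ms=0.97, svctm_ms=0.97, idle_pct=54.46)
print("sdg:", sdg)
print("sda:", sda)
```

Because %idle is a system-wide figure, it flags both devices here; only sdg also trips the per-device checks.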
###############
#
# Example Explanation
#
###############
A good example from someone else (the I/O system vs. supermarket queues):
For example, when we line up to check out at a supermarket, how do we decide which register to go to? The first thing to look at is the number of people in each line; 5 people is always faster than 20, right? Besides counting heads, we often look at how much the people ahead of us are buying; if there is a shopper ahead with a week's worth of groceries, it may be worth switching lines. Then there is the cashier's speed; if you get a novice who cannot even count the money properly, you are in for a wait. Timing matters too: a register that was overcrowded 5 minutes ago may be empty now, and paying then is very pleasant, provided, of course, that what you did during those 5 minutes was more worthwhile than standing in line (although I have yet to find anything more boring than queuing).
The I/O system has many similarities with supermarket queues:
r/s + w/s is like the total number of people who have paid
The average queue length (avgqu-sz) is like the average number of people in line per unit of time
The average service time (svctm) is like the cashier's speed
The average wait time (await) is like the average time each person waits
The average I/O size (avgrq-sz) is like the average number of items each person buys
The I/O utilization (%util) is like the fraction of time that someone is standing at the register.
From this data we can analyze the pattern of I/O requests, as well as I/O speed and response time.
###############
#
# Case Study
#
###############
The following analysis of this command's output was written by someone else.
# iostat -x 1
avg-cpu:  %user   %nice    %sys   %idle
           16.2    0.00    4.31   79.44
Device:              rrqm/s   wrqm/s      r/s      w/s   rsec/s   wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz    await    svctm    %util
/dev/cciss/c0d0        0.00    44.90     1.02    27.55     8.16   579.59     4.08   289.80    20.57    22.35    78.21     5.00    14.29
/dev/cciss/c0d0p1      0.00    44.90     1.02    27.55     8.16   579.59     4.08   289.80    20.57    22.35    78.21     5.00    14.29
/dev/cciss/c0d0p2      0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
The iostat output above shows 28.57 device I/O operations per second: total I/O per second = r/s (reads) + w/s (writes) = 1.02 + 27.55 = 28.57 operations per second, with writes dominating (w:r = 27:1).
Each device I/O operation takes only 5 ms on average to complete, yet each I/O request waits 78 ms. This is because too many I/O requests are issued (about 29 per second). Assuming these requests are all issued at the same time, the average wait time can be computed as follows:
Average wait time = single I/O service time * (1 + 2 + ... + (total requests - 1)) / total requests
Applied to the example above: average wait time = 5 ms * (1 + 2 + ... + 28) / 29 = 70 ms, which is very close to the 78 ms average wait time reported by iostat. This in turn suggests that the I/O requests are issued at the same time.
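The average-wait formula is easy to check numerically. The following is a minimal sketch under the same assumption the text makes, namely that all requests arrive at once and are served one at a time; `average_wait_ms` is a hypothetical helper name.

```python
def average_wait_ms(service_ms, total_requests):
    """Average wait when all requests arrive together and are served in turn.

    The request in position k (counting from 0) waits for the k requests
    ahead of it, that is, k * service_ms.
    """
    waits = [k * service_ms for k in range(total_requests)]
    return sum(waits) / total_requests


# 5 ms service time and about 29 requests, as in the example.
print(average_wait_ms(5, 29))  # 70.0
```

The sum 0 + 1 + ... + 28 is 406, so the result is 5 * 406 / 29 = 70 ms, matching the figure in the text.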
There are many I/O requests per second (about 29), yet the average queue is short (only about 2), which indicates that these 29 requests arrive unevenly and that the I/O system is idle most of the time.
The I/O queue contains requests for 14.29% of each second; in other words, the I/O system has nothing to do 85.71% of the time, and all 29 I/O requests are processed within about 142 milliseconds.
delta(ruse+wuse)/delta(io) = await = 78.21, so delta(ruse+wuse)/s = 78.21 * delta(io)/s = 78.21 * 28.57 = 2234.5, meaning that the I/O requests issued each second wait a total of about 2234.5 ms. The average queue length should therefore be 2234.5 ms / 1000 ms = 2.23, yet the average queue length (avgqu-sz) reported by iostat is 22.35. Why? Because of a bug in iostat: the avgqu-sz value should be 2.23, not 22.35.
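The arithmetic in this case study can be reproduced directly from the reported columns. This is a small sketch using only the numbers from the c0d0 row; the variable names are chosen for the illustration. Evaluated directly, the product 78.21 * 28.57 comes to roughly 2234 ms, and the derived queue length rounds to 2.23.

```python
# Values from the /dev/cciss/c0d0 row of the report above.
r_per_s, w_per_s = 1.02, 27.55
await_ms = 78.21
util_pct = 14.29

iops = r_per_s + w_per_s              # total I/O operations per second
total_wait_ms = await_ms * iops       # wait time accumulated per second
queue_length = total_wait_ms / 1000   # average number of requests in flight
busy_ms = util_pct / 100 * 1000       # milliseconds per second the device is busy

print(f"iops={iops:.2f} wait={total_wait_ms:.1f}ms "
      f"queue={queue_length:.2f} busy={busy_ms:.1f}ms")
```

The derived queue length is a factor of ten smaller than the reported avgqu-sz of 22.35, which is the discrepancy the text attributes to an iostat bug.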
Monitoring memory usage with vmstat
vmstat is short for Virtual Memory Statistics; it monitors virtual memory, processes, and CPU activity on the operating system. It reports statistics for the system as a whole; its drawback is that it cannot analyze an individual process in depth.
The syntax of vmstat is as follows:
vmstat [-V] [-n] [delay [count]]
-V prints version information;
-n displays the header only once during periodic output;
delay is the time between two outputs, in seconds;
count is the number of outputs to produce at that interval.
[root@yufei ~]# vmstat
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0   4116  38560  24456  90224    5   11   595    71   96  134  2  2 88  8  0
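For scripting, the last two lines of this output can be turned into a dictionary keyed by column name. The sketch below assumes the output has been captured as text with lowercase column names and a final st value of 0; `parse_vmstat` is a hypothetical helper, not part of vmstat.

```python
# Column header and one data row, as in the vmstat output above.
SAMPLE = """\
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0   4116  38560  24456  90224    5   11   595    71   96  134  2  2 88  8  0
"""


def parse_vmstat(text):
    """Hypothetical helper: pair the column names with the last data row."""
    names, values = (line.split() for line in text.strip().splitlines()[-2:])
    return dict(zip(names, map(int, values)))


row = parse_vmstat(SAMPLE)
print(row["free"], row["wa"])
```

Using the last two lines means the same helper also works if the leading `procs ... cpu` banner line is included in the captured text.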
Description of the output fields:
procs
r: number of processes waiting to run
b: number of processes in uninterruptible sleep
r shows the number of tasks that are running or waiting for CPU resources. When this value exceeds the number of CPUs, there is a CPU bottleneck.
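memory
swpd: amount of swap space in use, in KB
free: amount of idle memory
buff: amount of memory used as buffers, which buffer reads and writes to block devices
cache: amount of memory used as cache (the file system cache)
inact: amount of inactive memory (shown with vmstat -a)
active: amount of active memory (shown with vmstat -a)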
swap
si: amount of memory swapped in from disk per second
so: amount of memory swapped out to disk per second
io
bi: amount of data read from block devices (disk reads), in KB/s
bo: amount of data written to block devices (disk writes), in KB/s
system
in: number of interrupts per second
cs: number of context switches per second
The larger these two values are, the more CPU time the kernel consumes.
cpu
us: percentage of CPU time consumed by user processes
A high us value means user processes are consuming a lot of CPU time; if it stays above 50% over a long period, consider optimizing the program's algorithms or otherwise speeding it up.
sy: percentage of CPU time consumed by the kernel
A high sy value means the kernel is consuming a lot of CPU; this is not healthy, and the cause should be investigated.
id: percentage of CPU time spent idle
wa: percentage of CPU time spent waiting for I/O
A high wa value means I/O wait is severe, possibly because of heavy random disk access or because disk bandwidth has become a bottleneck (block operations).
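The vmstat rules of thumb in this section can be expressed as a short check. This is a hedged sketch: `check_vmstat` is a hypothetical function, the 50% limit for us and the "r exceeds the number of CPUs" rule come from this section, the 30% limit for wa is taken from the iostat analysis earlier, and the "busy" row is made up. A single sample does not show sustained load, so a real check would look at many samples.

```python
def check_vmstat(row, cpu_count, us_limit=50, wa_limit=30):
    """Return warnings for one vmstat sample given as a dict of columns."""
    warnings = []
    if row["r"] > cpu_count:
        warnings.append("CPU bottleneck: r exceeds the number of CPUs")
    if row["us"] > us_limit:
        warnings.append("user processes are consuming a lot of CPU time")
    if row["wa"] > wa_limit:
        warnings.append("I/O wait is high")
    return warnings


# The sample from the vmstat output above, reduced to the columns used here.
quiet = {"r": 0, "b": 0, "us": 2, "sy": 2, "id": 88, "wa": 8}
# A made-up busy sample on a hypothetical 4-CPU machine.
busy = {"r": 6, "b": 3, "us": 55, "sy": 5, "id": 5, "wa": 35}

print(check_vmstat(quiet, cpu_count=4))
print(check_vmstat(busy, cpu_count=4))
```

The limits are parameters so that they can be adjusted to the workload rather than treated as fixed rules.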