Partially reprinted from article 51942281
The iostat command is used to monitor system I/O and CPU usage. It reports disk activity statistics as well as CPU utilization. Like vmstat, iostat has a weakness: it cannot analyze an individual process in depth, only the overall state of the system.
iostat: monitoring the I/O subsystem
iostat is short for I/O statistics (input/output statistics). It is used to monitor the system's disk activity dynamically.
1. Command format
iostat [parameters] [interval] [count]
2. Command function
iostat makes it convenient to view activity and load information for devices such as the CPU, network interfaces, TTY devices, disks, and CD-ROM drives.
3. Command parameters
- -c Display the CPU usage report
- -d Display the device (disk) utilization report
- -k Display statistics in kilobytes
- -m Display statistics in megabytes
- -N Display device mapper (LVM) device names
- -n Display NFS usage statistics
- -p [device] Display statistics for the device and its partitions
- -t Print the time of each report
- -x Display extended statistics
- -V Display version information
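As a quick illustration of how these parameters combine, one possible invocation (the interval and count are arbitrary choices for illustration) produces extended, kilobyte-based disk statistics with a timestamp on each report:

```bash
# Extended (-x) disk statistics (-d) in kilobytes (-k), with the time of each
# report printed (-t), sampled every 2 seconds, 3 reports in total.
iostat -d -x -k -t 2 3
```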
4. Usage examples
Example 1: Show the load on all devices
/root$ iostat
Linux 2.6.32-279.el6.x86_64 (Colin)   07/16/2014   _x86_64_   (4 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          10.81    0.00   14.11    0.18    0.00   74.90

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               1.95         1.48        70.88    9145160  437100644
dm-0              3.08         0.55        24.34    3392770  150087080
dm-1              5.83         0.93        46.49    5714522  286724168
dm-2              0.01         0.00         0.05      23930     289288
Description of the CPU-related fields:
- %user: percentage of CPU time spent in user mode.
- %nice: percentage of CPU time spent in user mode with a nice (adjusted) priority.
- %system: percentage of CPU time spent in system (kernel) mode.
- %iowait: percentage of time the CPU was idle while waiting for outstanding I/O to complete.
- %steal: percentage of time the virtual CPU spent in involuntary wait because the hypervisor was servicing another virtual processor.
- %idle: percentage of time the CPU was idle.
Note: a high %iowait indicates an I/O bottleneck on the disk, and a high %idle means the CPU is largely idle. If %idle is high but the system still responds slowly, the CPU may be waiting for memory to be allocated, and you should consider increasing memory. If %idle stays below 10%, the system is short of CPU processing power, and the CPU is the resource that most urgently needs attention.
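If you only care about the CPU line, a rough way to keep an eye on %iowait is to filter the -c report. The 20% threshold and the column position (4th value of each avg-cpu data line) are assumptions chosen only for illustration:

```bash
# Sample the CPU report every 2 seconds (10 reports) and flag intervals where
# %iowait, the 4th value of each avg-cpu data line, exceeds an assumed 20%.
iostat -c 2 10 | awk '/^ +[0-9]/ { if ($4 + 0 > 20) print "high iowait:", $4 "%" }'
```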
Description of the disk-related fields:
- rrqm/s: number of read requests merged per second, i.e. rmerge/s.
- wrqm/s: number of write requests merged per second, i.e. wmerge/s.
- r/s: number of read I/O requests completed per second, i.e. rio/s.
- w/s: number of write I/O requests completed per second, i.e. wio/s.
- rsec/s: number of sectors read per second, i.e. rsect/s.
- wsec/s: number of sectors written per second, i.e. wsect/s.
- rkB/s: kilobytes read per second; this is half of rsec/s because each sector is 512 bytes.
- wkB/s: kilobytes written per second; half of wsec/s.
- avgrq-sz: average size (in sectors) of each device I/O request.
- avgqu-sz: average I/O queue length.
- await: average time (in milliseconds) each device I/O request spends queueing plus being serviced.
- svctm: average service time (in milliseconds) of each device I/O request.
- %util: percentage of each second that was spent doing I/O, i.e. how busy the device was.
Note: if %util is close to 100%, too many I/O requests are being issued, the I/O system is saturated, and the disk may be a bottleneck. If svctm is close to await, I/O requests spend almost no time waiting in the queue; if await is much larger than svctm, the I/O queue is too long and I/O responses are slow, so optimization is needed. A large avgqu-sz likewise indicates that a considerable amount of I/O is waiting.
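To continuously watch just await and %util for a single device, one rough sketch is the following one-liner. The device name sda is an assumption, and the field positions (10th and 12th) follow the extended report shown in Example 4 below; they may differ in newer sysstat releases:

```bash
# Every 2 seconds, print await (10th field) and %util (12th field) for the
# assumed device sda from the extended disk report.
iostat -d -x -k 2 | awk '$1 == "sda" { printf "await=%s ms  util=%s%%\n", $10, $12 }'
```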
Example 2: Display all statistics at a regular interval
/root$ iostat 2 3
Linux 2.6.32-279.el6.x86_64 (Colin)   07/16/2014   _x86_64_   (4 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          10.81    0.00   14.11    0.18    0.00   74.90

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               1.95         1.48        70.88    9145160  437106156
dm-0              3.08         0.55        24.34    3392770  150088376
dm-1              5.83         0.93        46.49    5714522  286728384
dm-2              0.01         0.00         0.05      23930     289288

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          22.62    0.00   19.67    0.26    0.00   57.46

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               2.00         0.00        28.00          0         56
dm-0              0.00         0.00         0.00          0          0
dm-1              3.50         0.00        28.00          0         56
dm-2              0.00         0.00         0.00          0          0

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          22.69    0.00   19.62    0.00    0.00   57.69

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               0.00         0.00         0.00          0          0
dm-0              0.00         0.00         0.00          0          0
dm-1              0.00         0.00         0.00          0          0
dm-2              0.00         0.00         0.00          0          0
Note: the display refreshes every 2 seconds and is shown 3 times. The first report covers the period since the system booted; each subsequent report covers the interval since the previous one.
Example 3: View TPS and throughput
/root$ iostat -d -k 1 1
Linux 2.6.32-279.el6.x86_64 (Colin)   07/16/2014   _x86_64_   (4 CPU)

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda               1.95         0.74        35.44    4572712  218559410
dm-0              3.08         0.28        12.17    1696513   75045968
dm-1              5.83         0.46        23.25    2857265  143368744
dm-2              0.01         0.00         0.02      11965     144644
- tps: the number of transfers per second issued to the device. A "transfer" is an I/O request to the device; multiple logical requests may be merged into a single I/O request, and a transfer is of indeterminate size.
- kB_read/s: the amount of data read from the device per second;
- kB_wrtn/s: the amount of data written to the device per second;
- kB_read: the total amount of data read; kB_wrtn: the total amount of data written;
All of these quantities are in kilobytes.
In the example above we can see the statistics for the disk sda and its device-mapper (LVM) volumes: the total tps for the disk is 1.95, followed by the tps of each individual volume. (Because these are instantaneous values, the total tps is not strictly equal to the sum of the individual values.)
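The dm-* rows are LVM volumes; if you would rather see the breakdown by physical partition, the -p option listed earlier can be used. sda is again just an assumed device name:

```bash
# Per-partition throughput in kB/s for the assumed disk sda (sda1, sda2, ...).
iostat -d -k -p sda 1 1
```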
Example 4: View device utilization (%util) and response time (await), i.e. disk performance statistics
/root$ iostat -d -x -k 1 1
Linux 2.6.32-279.el6.x86_64 (Colin)   07/16/2014   _x86_64_   (4 CPU)

Device:  rrqm/s  wrqm/s   r/s   w/s  rkB/s  wkB/s avgrq-sz avgqu-sz  await  svctm  %util
sda        0.02    7.25  0.04  1.90   0.74  35.47    37.15     0.04  19.13   5.58   1.09
dm-0       0.00    0.00  0.04  3.05   0.28  12.18     8.07     0.65 209.01   1.11   0.34
dm-1       0.00    0.00  0.02  5.82   0.46  23.26     8.13     0.43  74.33   1.30   0.76
dm-2       0.00    0.00  0.00  0.01   0.00   0.02     8.00     0.00   5.41   3.28   0.00
- rrqm/s: number of read requests merged per second, i.e. delta(rmerge)/s
- wrqm/s: number of write requests merged per second, i.e. delta(wmerge)/s
- r/s: number of read I/O requests completed per second, i.e. delta(rio)/s
- w/s: number of write I/O requests completed per second, i.e. delta(wio)/s
- rsec/s: number of sectors read per second, i.e. delta(rsect)/s
- wsec/s: number of sectors written per second, i.e. delta(wsect)/s
- rkB/s: kilobytes read per second; half of rsec/s, because each sector is 512 bytes (derived value)
- wkB/s: kilobytes written per second; half of wsec/s (derived value)
- avgrq-sz: average size (in sectors) of each device I/O request, i.e. delta(rsect + wsect) / delta(rio + wio)
- avgqu-sz: average I/O queue length, i.e. delta(aveq)/s/1000 (because aveq is in milliseconds)
- await: average time (in milliseconds) of each device I/O request, i.e. delta(ruse + wuse) / delta(rio + wio)
- svctm: average service time (in milliseconds) of each device I/O request, i.e. delta(use) / delta(rio + wio)
- %util: the percentage of each second spent on I/O, or equivalently the fraction of time the I/O queue is non-empty, i.e. delta(use)/s/1000 (because use is in milliseconds)
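These delta formulas can be reproduced by hand from /proc/diskstats, the counters iostat itself reads. The sketch below is only an illustration: it assumes the classic 14-field /proc/diskstats layout, and the device name and sampling interval are placeholders:

```bash
#!/bin/bash
# Rough sketch: compute await and %util for one device over one interval,
# using the same deltas described above. /proc/diskstats fields assumed:
#   $4  reads completed        $7  time spent reading (ms)   -> ruse
#   $8  writes completed       $11 time spent writing (ms)   -> wuse
#   $13 time spent doing I/O (ms)                            -> use
DEV=${1:-sda}        # assumed device name; pass your own as the first argument
INTERVAL=${2:-2}     # sampling interval in seconds

snap() { awk -v d="$DEV" '$3 == d { print $4, $8, $7, $11, $13 }' /proc/diskstats; }

read r1 w1 ruse1 wuse1 use1 < <(snap)
sleep "$INTERVAL"
read r2 w2 ruse2 wuse2 use2 < <(snap)

ios=$(( (r2 - r1) + (w2 - w1) ))                 # delta(rio + wio)
tms=$(( (ruse2 - ruse1) + (wuse2 - wuse1) ))     # delta(ruse + wuse), in ms

if [ "$ios" -gt 0 ]; then
  await=$(awk -v t="$tms" -v n="$ios" 'BEGIN { printf "%.2f", t / n }')
else
  await=0.00
fi
util=$(awk -v u=$(( use2 - use1 )) -v s="$INTERVAL" 'BEGIN { printf "%.2f", u / (s * 1000) * 100 }')

echo "$DEV: await=${await} ms  %util=${util}"
```

Run while a workload is active, the printed values should roughly track the await and %util columns of `iostat -d -x` over the same interval.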
If %util is close to 100%, too many I/O requests are being issued, the I/O system is saturated, and the disk may be a bottleneck. If %idle is below 70%, I/O pressure is already high and reads generally involve a fair amount of waiting. You can also cross-check with vmstat: look at the b column (processes blocked waiting for resources) and the wa column (the percentage of CPU time spent waiting for I/O; values above 30% indicate high I/O pressure).
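A rough way to do that vmstat cross-check is shown below; the column positions assume the usual vmstat layout, where b is the 2nd field and wa the 16th, so adjust if your version differs:

```bash
# Sample vmstat every 2 seconds (5 reports), skip the two header lines, and
# print the blocked-process count (b) and the I/O-wait CPU share (wa).
vmstat 2 5 | awk 'NR > 2 { printf "blocked=%s  wa=%s%%\n", $2, $16 }'
```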
In addition, await should always be read together with svctm: if await is much higher than svctm, there is almost certainly an I/O problem.
avgrq-sz also deserves attention when tuning I/O: it is the average amount of data carried by each operation. If there are many operations but each one moves little data, the actual I/O throughput will be low; if each operation moves a lot of data, throughput will be high. You can also use the relation avgrq-sz × (r/s + w/s) = rsec/s + wsec/s; in other words, the read and write rates follow from it.
svctm is generally smaller than await (because the waiting time of requests queued at the same time is counted repeatedly). The value of svctm is mainly related to disk performance; CPU and memory load also affect it, and an excessive number of requests will indirectly increase svctm. The value of await depends on the service time (svctm), the length of the I/O queue, and the pattern in which I/O requests are issued. If svctm is close to await, I/O spends almost no time waiting; if await is much larger than svctm, the I/O queue is too long and application response times degrade. If response times exceed what users can tolerate, consider replacing the disk with a faster one, adjusting the kernel elevator (I/O scheduler) algorithm, optimizing the application, or upgrading the CPU.
The queue length (avgqu-sz) can also serve as a metric of system I/O load, but because avgqu-sz is averaged over the sampling interval, it does not reflect instantaneous I/O bursts.
An intuitive analogy (a supermarket checkout):
- r/s + w/s is like the total number of customers
- the average queue length (avgqu-sz) is like the average number of people waiting in line at any given time
- the average service time (svctm) is like the cashier's checkout speed
- the average wait time (await) is like the average time each customer spends at the checkout
- the average I/O request size (avgrq-sz) is like the average number of items each customer buys
- the utilization (%util) is like the fraction of time during which someone is standing at the checkout
Device I/O operations: total I/O per second = r/s (reads) + w/s (writes)
Average wait time = single I/O service time × (1 + 2 + ... + (total requests - 1)) / total requests
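As a small worked example of this formula (the numbers are invented purely for illustration): with a service time of 5 ms and 4 requests arriving at the same time, the average wait is 5 × (1 + 2 + 3) / 4 = 7.5 ms. The same arithmetic as a one-liner:

```bash
# svctm = 5 ms, 4 simultaneous requests: average wait = 5 * (1 + 2 + 3) / 4 = 7.5 ms
awk 'BEGIN { svctm = 5; n = 4; s = 0; for (i = 1; i < n; i++) s += i; printf "average wait = %.2f ms\n", svctm * s / n }'
```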
For example, if there are many I/O requests per second but the average queue length is only 4, the requests are arriving fairly evenly and most of them are being handled promptly.