The vmstat command is one of the most common Linux/Unix monitoring tools. It reports server status at a given interval, including CPU usage, memory usage, swap activity, and I/O reads and writes. It is my favorite Linux/Unix command for two reasons: first, it is supported on both Linux and Unix; second, compared with top, it shows the CPU, memory, and I/O usage of the whole machine rather than just the CPU and memory usage of each process (the two have different use cases).
vmstat is generally run with two numeric arguments. The first is the sampling interval in seconds, and the second is the number of samples to take, for example:
root@ubuntu:~# vmstat 2 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd    free   buff   cache   si   so    bi    bo   in   cs us sy  id wa
 1  0      0 3498472 315836 3819540    0    0     0     1    2    0  0  0 100  0
Here, 2 means that the server status is sampled every two seconds, and 1 means that only one sample is taken.
In practice, we usually keep monitoring for a period of time rather than stopping vmstat after a single sample, for example:
root@ubuntu:~# vmstat 2
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd    free   buff   cache   si   so    bi    bo   in   cs us sy  id wa
 1  0      0 3499840 315836 3819660    0    0     0     1    2    0  0  0 100  0
 0  0      0 3499584 315836 3819660    0    0     0     0   88  158  0  0 100  0
 0  0      0 3499708 315836 3819660    0    0     0     2   86  162  0  0 100  0
 0  0      0 3499708 315836 3819660    0    0     0    10   81  151  0  0 100  0
 1  0      0 3499732 315836 3819660    0    0     0     2   83  154  0  0 100  0
This means that vmstat takes a sample every 2 seconds and keeps going until I stop the program. Here I stopped it after five samples.
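If you want to work with these numbers in a script, the output is easy to split into named columns. The following is only a minimal sketch: the sample text is copied from the run above (first three rows), and parse_vmstat is a helper name made up for this example, not part of any library.

```python
# Sample text copied from the run above. A real script might capture it with
# subprocess.run(["vmstat", "2", "5"], capture_output=True, text=True).stdout
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd    free   buff   cache   si   so    bi    bo   in   cs us sy  id wa
 1  0      0 3499840 315836 3819660    0    0     0     1    2    0  0  0 100  0
 0  0      0 3499584 315836 3819660    0    0     0     0   88  158  0  0 100  0
 0  0      0 3499708 315836 3819660    0    0     0     2   86  162  0  0 100  0
"""


def parse_vmstat(text):
    """Return one dict per sample row, keyed by the column names in the header."""
    lines = [line.split() for line in text.strip().splitlines()]
    # lines[0] is the group banner (procs, memory, ...); lines[1] holds the column names.
    columns = lines[1]
    return [dict(zip(columns, map(int, row))) for row in lines[2:]]


samples = parse_vmstat(SAMPLE)
# The first row reports averages since boot, so interval analysis usually skips it.
interval_samples = samples[1:]
print(interval_samples[0]["cs"])  # 158
```

Skipping the first row is a design choice: vmstat's first report is an average since the last reboot, so only the later rows describe what happened during each 2-second interval.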
That completes the introduction to the command. Next, we explain what each column means in practice.
r: The run queue, that is, the number of processes that are running or waiting for CPU time. The server I tested is currently idle, with almost nothing running. When this value exceeds the number of CPUs, a CPU bottleneck may occur. This is also related to the load shown by top. As a rule of thumb, a load above 3 is relatively high, above 5 is high, and above 10 is abnormal and means the server is in a very dangerous state. The load in top is roughly the run queue averaged over time. If the run queue is too long, your CPU is very busy, which usually shows up as high CPU usage.
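The comparison between r and the number of CPUs can be sketched as a small ratio. This is an illustration only: run_queue_pressure is a hypothetical helper, and the CPU counts passed in below are made-up values, not measurements from my machine.

```python
import os


def run_queue_pressure(r, cpu_count=None):
    """Return runnable processes per CPU; values above 1.0 suggest CPU contention."""
    if cpu_count is None:
        # os.cpu_count() can return None, so fall back to 1 in that case.
        cpu_count = os.cpu_count() or 1
    return r / cpu_count


print(run_queue_pressure(1, cpu_count=4))   # 0.25, an idle machine
print(run_queue_pressure(80, cpu_count=8))  # 10.0, a heavily overloaded one
```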
b: The number of blocked processes, that is, processes waiting for a resource such as I/O. This needs little further explanation.
swpd: The amount of virtual memory (swap) in use. If it is greater than 0, the machine has run short of physical memory. Unless a memory leak in a program is the cause, you should add memory or move the memory-hungry tasks to other machines.
free: The amount of idle physical memory. My machine has 8 GB in total, of which roughly 3.4 GB is free.
buff: Memory that Linux/Unix uses as buffers, for example for directory contents and permissions. On my machine it takes up a little over 300 MB.
cache: Memory used to cache the files we have opened. On my machine it takes up about 3.6 GB (3819540 KB in the output above). This is one of the clever parts of Linux/Unix: part of the idle physical memory is used to cache files and directories so that programs run faster, and when programs need the memory, buff/cache is quickly released to them.
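The memory columns are easier to read after converting them. The sketch below assumes vmstat's default unit of 1024-byte units (KiB), and uses the free, buff, and cache values from the `vmstat 2 1` output above; kib_to_mib is just a name chosen for this example.

```python
def kib_to_mib(kib):
    """Convert a vmstat memory column (assumed to be in KiB) to MiB."""
    return kib / 1024


# Values copied from the single-sample output earlier in this article.
memory_kib = {"free": 3498472, "buff": 315836, "cache": 3819540}

for name, kib in memory_kib.items():
    print(f"{name}: {kib_to_mib(kib):.0f} MiB")
# free: 3416 MiB
# buff: 308 MiB
# cache: 3730 MiB
```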
si: The amount of memory swapped in from disk per second. If the value is greater than 0, physical memory is insufficient or there is a memory leak; find the memory-hungry process and deal with it. My machine has plenty of memory, so everything is normal.
so: The amount of memory swapped out to disk per second. If the value is greater than 0, the same applies as for si.
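A check for swap activity over several samples might look like the sketch below. The helper name swapping_samples and the second sample are hypothetical; only the all-zero sample reflects the idle machine shown above.

```python
def swapping_samples(samples):
    """Return the samples in which si or so is non-zero."""
    return [s for s in samples if s["si"] > 0 or s["so"] > 0]


samples = [
    {"si": 0, "so": 0},      # as on the idle machine above
    {"si": 120, "so": 340},  # hypothetical sample from a machine short on memory
]

active = swapping_samples(samples)
print(len(active))  # 1
```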
bi: The number of blocks received from block devices per second, that is, blocks read. Block devices here means all disks and other block devices in the system, and the default block size is 1024 bytes. There is no I/O on my machine, so it stays at 0, but on a machine that processed a large amount of data (2-3 TB) I have seen it reach 140000/s, which at 1024 bytes per block is a disk read rate of roughly 140 MB per second.
bo: The number of blocks sent to block devices per second, that is, blocks written. For example, when we write files, bo will be greater than 0. bi and bo should generally be close to 0; otherwise I/O is too frequent and needs tuning.
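To turn a bi or bo value into a throughput figure, multiply by the block size. The following sketch uses the 1024-byte block size and the 140000 blocks per second mentioned above; blocks_to_mb_per_s is a hypothetical helper, and it reports decimal megabytes (1 MB = 1,000,000 bytes).

```python
BLOCK_SIZE_BYTES = 1024  # block size stated above


def blocks_to_mb_per_s(blocks_per_s, block_size=BLOCK_SIZE_BYTES):
    """Convert a bi/bo rate in blocks per second to megabytes per second."""
    return blocks_per_s * block_size / 1_000_000


print(blocks_to_mb_per_s(140000))  # 143.36
```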
in: The number of interrupts per second, including clock interrupts.
cs: The number of context switches per second. Switching between threads and switching between processes both count as context switches. The smaller this value, the better; if it is large, consider reducing the number of threads or processes. For example, with a web server such as apache or nginx, we generally run performance tests with thousands or even tens of thousands of concurrent requests. During those tests, the number of server processes or threads can be lowered step by step from its peak until cs drops to a relatively small value; that number of processes and threads is then a suitable setting. The same goes for system calls: every time we call a system function, our code enters kernel space, which involves a switch between user and kernel context. This is costly, so frequent system calls should be avoided where possible. Too many context switches mean that much of the CPU is wasted on switching, leaving less time for real work, so the CPU is not used effectively.
us: User CPU time. I once worked on a server that did very frequent encryption and decryption, where us was close to 100 and the r run queue reached 80 (the machine was under a stress test and performed poorly).
sy: System CPU time. If it is too high, a lot of time is being spent in system calls, for example because of frequent I/O operations.
id: Idle CPU time. Generally, us + sy + id + wa = 100, where us is the user share, sy the system share, id the idle share, and wa the share spent waiting for I/O.
wa: CPU time spent waiting for I/O.
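The CPU columns are percentages of total CPU time, so for the four columns shown in the output above they should add up to about 100 (rounding can make the sum off by one). A minimal sketch, with cpu_summary as a made-up helper name and the values taken from the idle machine above:

```python
def cpu_summary(sample):
    """Summarize the us, sy, id and wa columns of one vmstat sample."""
    busy = sample["us"] + sample["sy"]
    total = busy + sample["id"] + sample["wa"]
    return {"busy": busy, "idle": sample["id"], "iowait": sample["wa"], "total": total}


# CPU columns from the idle machine shown earlier.
summary = cpu_summary({"us": 0, "sy": 0, "id": 100, "wa": 0})
print(summary["busy"], summary["total"])  # 0 100
```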