Linux real-time monitoring commands

Source: Internet
Author: User
Tags disk usage high cpu usage

1. View disk IO iostat-x 1 10 View Device usage (%util), Response time (await)

AVG-CPU:%user%nice%system%iowait%steal%idle
27.13 0.00 21.90 3.71 0.00 47.26

device:rrqm/s wrqm/s r/s w/s rkb/s wkb/s avgrq-sz avgqu-sz await SVCTM%util
SDD 0.00 0.00 18.00 117.00 1524.00 12667.50 210.24 0.36 2.65 2.09 28.20
SDJ 0.00 0.00 15.00 209.00 536.00 14855.00 137.42 0.74 3.38 1.70 38.00

RRQM/S: How much of this device-dependent read request is merged per second (when the system call needs to read the data, the VFS sends the request to each FS, and if FS finds that different read requests read the same block data, FS merges the request into the merge); wrqm/s: How much of this device-related write request per second has been merge. Rsec/S: Number of sectors read per second; Wsec/: The number of sectors written per second. RKB/s:the number of read requests that were issued to the device per SECOND;WKB/s:the Number ofWriterequests that were issued to the device per SECOND;AVGRQ-SZ Average request sector size Avgqu-sz is the length of the average request queue.    There is no doubt that the shorter the queue, the better. Await: The average time (in milliseconds) of processing per IO request.         This can be understood as the response time of IO, generally the system IO response time should be less than 5ms, if greater than 10ms is relatively large. This time includes the queue time and service time, that is, in general, await is greater than SVCTM, their difference is smaller, then the shorter the queue time, conversely, the greater the difference, the longer the queue time, indicating that the system has a problem. SVCTM represents the average service time (in milliseconds) for each device I/O operation. if the value of SVCTM is close to await, indicating that there is little I/O waiting, disk performance is good, and if the value of await is much higher than the value of SVCTM, the I/O queue waits too long for the applications running on the system to become slower.  %util: All processing io time, divided by total statistic time, in the statistical time. For example, if the statistic interval is 1 seconds, the device has 0.8 seconds to process Io, and 0.2 seconds is idle, then the device's%util =0.8/1= the%, so this parameter implies the busy level of the device. Generally, if this parameter is 100% indicates that the device is already running close to full load (of course if it is a multi-disk, even if%util is 100% because of the concurrency of the disk, disk usage may not be the bottleneck)
2. View memory CPU, usage Vmstat 2 10

R means running the queue (that is, how many processes are really allocated to the CPU), the server I am testing is currently idle, there is no program running, when this value exceeds the number of CPUs, there will be a CPU bottleneck. This is also related to top of the load, the general load over 3 is relatively high, more than 5 is high, more than 10 is not normal, the state of the server is very dangerous. The load on top is similar to the run queue per second. If the running queue is too large, it means that your CPU is busy, which generally results in high CPU usage.

b represents the blocking process, which is not much to say, the process is blocked, you understand.

swpd Virtual memory has been used size, if greater than 0, indicates that your machine is out of physical memory, if not the cause of program memory leaks, then you should upgrade the memory or the memory-consuming task to other machines.

Free physical memory size, my machine memory total 8G, the remaining 3415M.

Buff Linux/unix system is used to store, directory inside what content, permissions, etc. of the cache, I machine about more than 300 m

the cache cache is used directly to memorize the files we open, to buffer the files, I have about 300 m of this machine (this is the smart place of Linux/unix, the spare part of the physical memory to do the file and directory cache, is to improve the performance of the program execution, When the program uses memory, buffer/cached is quickly used. )

Si reads the size of the virtual memory from disk every second, if this value is greater than 0, it means that the physical memory is not enough or the memory leaks, to find out the memory process. My machine has plenty of memory and everything is fine.

so per second the virtual memory is written to the size of the disk, if this value is greater than 0, ibid.

The number of blocks received per second by BI block devices, where the block device refers to all the disks and other block devices on the system, the default block size is 1024byte, I have no IO operation on this machine, so it has been 0, but I have been working on copying large amounts of data (2-3T) The machine has seen can reach 140000/s, disk write speed of almost 140M per second

The number of blocks that Bo block devices send per second, such as when we read a file, the Bo will be greater than 0. Bi and Bo are generally close to 0, otherwise the IO is too frequent and needs to be adjusted.

in CPU interrupts per second, including time interrupts

CS per second, such as the number of context switches, such as we call the system function, the context switch, the thread of the switch, but also the process context switch, the smaller the value of the better, too big, to consider the number of threads or processes, such as in Apache and Nginx Web server , we generally do performance testing will carry out thousands of concurrent or even tens of thousands of concurrent testing, the process of selecting a Web server can be the peak of the process or the thread has been down, pressure measurement, until CS to a relatively small value, the process and the number of threads is a more appropriate value. System calls are also, each time the system function is called, our code will enter the kernel space, resulting in context switching, this is very resource-intensive, but also try to avoid frequent calls to system functions. Too many context switches means that most of your CPU is wasted in context switching, resulting in less time for the CPU to do serious work, and the CPU not being fully utilized, is undesirable.

US user CPU time, I used to do encryption and decryption very frequently on the server, you can see us approaching 100,r running queue reached 80 (the machine is doing a stress test, poor performance).

sy System CPU time, if too high, indicates a long system call time, for example, the IO operation is frequent.

ID Idle CPU time, in general, ID + US + sy = 100, generally I think ID is idle CPU usage, US is the user CPU usage, SY is the system CPU utilization.

wt waits for IO CPU time.

3, network situation Ifstat/iftop/dstat

Linux real-time monitoring commands

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.