System Resource Monitoring

Last Update: 2016-05-21 Source: Internet

Author: User

Objective:

System resource monitoring generally covers the CPU, memory, disk, and network of a system. The tools differ between Windows and Linux.

I. Common commands and tools for monitoring system resources on Linux

top

The top command is a commonly used performance analysis tool on Linux. It shows the resource usage of each process in real time. Its use is described in detail below.

    Statistics area
    top - 01:06:48 up 1:22, 1 user, load average: 0.06, 0.60, 0.48
    Tasks: 29 total, 1 running, 28 sleeping, 0 stopped, 0 zombie
    Cpu(s): 0.3% us, 1.0% sy, 0.0% ni, 98.7% id, 0.0% wa, 0.0% hi, 0.0% si
    Mem: 191272k total, 173656k used, 17616k free, 22052k buffers
    Swap: 192772k total, 0k used, 192772k free, 123988k cached

    Process information area
      PID USER  PR  NI  VIRT   RES   SHR S %CPU %MEM    TIME+ COMMAND
     1379 root   -   0  7976  2456  1980 S  0.7  1.3  0:11.03 sshd
    14704 root   -   0  2128   980   796 R  0.7  0.5  0:02.72 top
        1 root   -   0  1992   632   544 S  0.0  0.3  0:00.90 init
        2 root   -   -     0     0     0 S  0.0  0.0  0:00.00 ksoftirqd/0
        3 root  RT   0     0     0     0 S  0.0  0.0  0:00.00 watchdog/0

Statistics Area:
The first five lines show statistics for the system as a whole. The first line is the task queue information, identical to the output of the uptime command. Its contents are as follows:
01:06:48    current time
up 1:22    system uptime, in hours:minutes format
1 user    number of users currently logged in
load average: 0.06, 0.60, 0.48    system load, that is, the average length of the task queue. The three values are the averages over the last 1 minute, 5 minutes, and 15 minutes.
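The three load figures can also be pulled out of the header line programmatically. The following is a minimal sketch, assuming the header has the layout reconstructed from the values listed above; the function name parse_load_average is hypothetical, and dividing by the CPU count is only a common rule of thumb for judging whether the load is high.

```python
import os
import re


def parse_load_average(header: str) -> tuple[float, float, float]:
    """Extract the 1-, 5-, and 15-minute load averages from a top/uptime header."""
    match = re.search(r"load average:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)", header)
    if match is None:
        raise ValueError("no load average found in header")
    return tuple(float(value) for value in match.groups())


# Header line assembled from the values described in the text (an assumption).
header = "top - 01:06:48 up 1:22, 1 user, load average: 0.06, 0.60, 0.48"
one, five, fifteen = parse_load_average(header)
print(one, five, fifteen)

# Rule of thumb: compare the load with the number of CPUs on the machine.
cpus = os.cpu_count() or 1
print(f"1-minute load per CPU: {one / cpus:.2f}")
```

On a live Linux system, os.getloadavg() returns the same three numbers without any parsing.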
The second and third lines show process and CPU information. When there are multiple CPUs, this may take more than two lines. The contents are as follows:
Tasks: 29 total    total number of processes
1 running    number of running processes
28 sleeping    number of sleeping processes
0 stopped    number of stopped processes
0 zombie    number of zombie processes
Cpu(s): 0.3% us    percentage of CPU used by user space
1.0% sy    percentage of CPU used by kernel space
0.0% ni    percentage of CPU used by user-space processes whose priority has been changed
98.7% id    percentage of CPU that is idle
0.0% wa    percentage of CPU time spent waiting for input/output
0.0% hi    percentage of CPU time spent servicing hardware interrupts
0.0% si    percentage of CPU time spent servicing software interrupts
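The Cpu(s) line is easy to turn into numbers for alerting or logging. This is a rough sketch, assuming the older "0.3% us" layout shown in this article (newer versions of top print the line differently); parse_cpu_line is a hypothetical helper name.

```python
import re


def parse_cpu_line(line: str) -> dict[str, float]:
    """Map each state label (us, sy, ni, id, wa, ...) to its percentage."""
    pairs = re.findall(r"([\d.]+)%\s*([A-Za-z]+)", line)
    return {label.lower(): float(value) for value, label in pairs}


cpu = parse_cpu_line(
    "Cpu(s): 0.3% us, 1.0% sy, 0.0% ni, 98.7% id, 0.0% wa, 0.0% hi, 0.0% si"
)
# Everything that is not idle counts as busy time.
busy = 100.0 - cpu["id"]
print(f"busy: {busy:.1f}%  iowait: {cpu['wa']:.1f}%")
```

A high wa value alongside a low us value usually points at the disks rather than the CPU, which is where iostat becomes useful.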
The last two lines show memory information. The contents are as follows:
Mem: 191272k total    total physical memory
173656k used    physical memory in use
17616k free    free physical memory
22052k buffers    memory used for kernel buffers
Swap: 192772k total    total swap space
0k used    swap space in use
192772k free    free swap space
123988k cached    memory used as page cache (shown on the Swap line, but it counts cached data held in physical memory)

Process Information Area:
The details of each process are shown below the statistics area. Let's start with the meaning of each column.
Column name    Meaning
PID    Process ID
USER    User name of the process owner
PR    Priority
NI    nice value. Negative values indicate high priority, positive values indicate low priority
VIRT    Total amount of virtual memory used by the process, in kilobytes. VIRT = SWAP + RES
RES    Physical memory used by the process that has not been swapped out, in kilobytes. RES = CODE + DATA
SHR    Shared memory size, in kilobytes
S    Process state:
    D = uninterruptible sleep
    R = running
    S = sleeping
    T = traced/stopped
    Z = zombie
%CPU    Percentage of CPU time used since the last update
%MEM    Percentage of physical memory used by the process
TIME+    Total CPU time used by the process, with a resolution of 1/100 second
COMMAND    Command name/command line
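To compare TIME+ values numerically, they can be converted to seconds. This is a minimal sketch, assuming the minutes:seconds.hundredths layout seen in the sample rows (for example 0:11.03); other layouts that top may use for very large values are not handled, and the function name is hypothetical.

```python
def time_plus_to_seconds(value: str) -> float:
    """Convert a TIME+ value such as '0:11.03' to seconds."""
    minutes, seconds = value.split(":")
    return int(minutes) * 60 + float(seconds)


# TIME+ values of sshd, top, and init from the sample output.
for sample_value in ("0:11.03", "0:02.72", "0:00.90"):
    print(sample_value, "->", time_plus_to_seconds(sample_value), "s")
```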

The following columns are not displayed by default:
PPID    Parent process ID
RUSER    Real user name
UID    User ID of the process owner
GROUP    Group name of the process owner
TTY    Name of the terminal that started the process. Processes not started from a terminal are shown as ?
P    Last used CPU; only meaningful in a multi-CPU environment
TIME    Total CPU time used by the process, in seconds
SWAP    Amount of the process's virtual memory that is swapped out, in kilobytes
CODE    Amount of physical memory used by executable code, in kilobytes
DATA    Amount of physical memory used by everything other than executable code (data segment + stack), in kilobytes
nFLT    Number of page faults
nDRT    Number of pages modified since they were last written out
WCHAN    If the process is sleeping, the name of the system function in which it is sleeping
Flags    Task flags; see sched.h

free

The free command displays the free and used physical memory and swap space on a Linux system, as well as the buffers used by the kernel.

    # free -m
                 total       used       free     shared    buffers     cached
    Mem:          1006        988         18          0         96         72
    -/+ buffers/cache:        819        186
    Swap:         2015          1       2014

The Mem row shows the data from the system's perspective, where used includes buffers and cached.
The -/+ buffers/cache row shows the data from the application's perspective, where used is the memory actually occupied by applications.
used in the Mem row = used in the -/+ buffers/cache row + buffers and cached in the Mem row.
The Swap row shows swap space information: the total amount, the amount used, and the amount free.
On an application server, you generally only need to look at the -/+ buffers/cache row, which shows the memory available to applications. If its free value is too low, it is time to consider optimizing the program or adding memory.
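The relationship between the two rows can be checked with a few lines of arithmetic. This is a sketch with a hypothetical helper name, using the Mem row of the example; the free figure of 18 MB is taken as total minus used (1006 - 988). Because free -m rounds every value to whole megabytes, the result can differ from the printed row by a megabyte.

```python
def application_view(used: int, free: int, buffers: int, cached: int) -> tuple[int, int]:
    """Return (used, free) as seen by applications, in the unit of the inputs."""
    reclaimable = buffers + cached  # memory the kernel can hand back on demand
    return used - reclaimable, free + reclaimable


# Mem row of the example, in megabytes; free = 1006 - 988 (assumed).
app_used, app_free = application_view(used=988, free=18, buffers=96, cached=72)
print(f"used by applications: {app_used} MB, available to applications: {app_free} MB")
```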

iostat

The iostat tool shows the number of I/O requests and the time the system takes to process them, which helps determine whether I/O is a bottleneck in the interaction between processes and the operating system.
The following examples show how to use iostat to view the status of I/O requests and the system's I/O processing capacity, and explain the fields in the command output.
1. Run iostat without options

    linux # iostat
    Linux 2.6.16.60-0.21-smp (linux)

    avg-cpu:  %user   %nice %system %iowait  %steal   %idle
               0.07    0.00    0.05    0.06    0.00   99.81

    Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
    sda               0.58         9.95        37.47    6737006   25377400
    sdb               0.00         0.00         0.00        824          0

Run without options, iostat displays statistics accumulated from system boot up to the time of execution. Apart from the first line, which shows the kernel version, host name, and date, the output above has two parts:

avg-cpu: overall CPU usage statistics; on a multi-core system this is the average across all CPUs
Device: I/O statistics for each disk device
In the CPU statistics row, the main value to watch is %iowait, the percentage of time the CPU spent waiting for I/O requests to complete. The columns in the Device section have the following meanings:
Device: device name, shown in the form sdX
tps: number of I/O requests (reads and writes) issued to the device per second
Blk_read/s: number of sectors read per second (one sector is 512 bytes)
Blk_wrtn/s: number of sectors written per second
Blk_read: total number of sectors read during the sampling interval
Blk_wrtn: total number of sectors written during the sampling interval
The -c option displays only the avg-cpu section, and the -d option displays only the Device section.
2. Specify the sampling time interval and the number of samples
    linux # iostat -d 1 2
    Linux 2.6.16.60-0.21-smp (linux)

    Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
    sda               0.55         8.93        36.27    6737086   27367728
    sdb               0.00         0.00         0.00        928          0

    Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
    sda               2.00         0.00        72.00          0         72
    sdb               0.00         0.00         0.00          0          0

The command above outputs device information with a sampling interval of 1 second, sampling 2 times. If the number of samples is not specified, iostat keeps printing samples until you press Ctrl+C to exit.
Note that the first sample is the same as the output of running iostat alone: it shows statistics from system boot up to the time of execution.
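Because the first report covers the whole time since boot, a script that wants current activity should skip it. The following sketch shows one way to do that on captured output; split_reports is a hypothetical helper, the sample text is the output above with the date omitted, and the final Blk_wrtn value of 72 is an assumption derived from 72.00 sectors/s over a 1-second interval.

```python
def split_reports(output: str) -> list[list[str]]:
    """Split `iostat -d` output into one list of device rows per report."""
    reports: list[list[str]] = []
    for line in output.splitlines():
        if line.startswith("Device"):
            reports.append([])          # a header row starts a new report
        elif line.strip() and reports:
            reports[-1].append(line)    # device row belonging to the current report
    return reports


sample = """\
Linux 2.6.16.60-0.21-smp (linux)

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               0.55         8.93        36.27    6737086   27367728
sdb               0.00         0.00         0.00        928          0

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               2.00         0.00        72.00          0         72
sdb               0.00         0.00         0.00          0          0
"""

# Drop the since-boot report and keep only the interval samples.
interval_reports = split_reports(sample)[1:]
for row in interval_reports[0]:
    name, tps, *_ = row.split()
    print(name, float(tps))
```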
3. Display read and write information in kilobytes (-k option)
The -k or -m option makes iostat report the relevant values in kilobytes or megabytes instead of sectors:

    linux # iostat -d -k
    Linux 2.6.16.60-0.21-smp (linux)

    Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
    sda               0.55         4.46        18.12    3368543   13686096
    sdb               0.00         0.00         0.00        464          0

In the output above, the values of kB_read/s, kB_wrtn/s, kB_read, and kB_wrtn are all in kilobytes. Each value is half of the corresponding sector count (1 KB = 512 bytes × 2).
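The halving can be verified against the read totals from the two runs. A small sketch, assuming 512-byte sectors as stated earlier; sectors_to_kb is a hypothetical helper name.

```python
SECTOR_BYTES = 512  # sector size stated in the column description


def sectors_to_kb(sectors: int) -> float:
    """Convert a sector count reported by iostat to kilobytes."""
    return sectors * SECTOR_BYTES / 1024


# Blk_read totals from `iostat -d 1 2`, compared with kB_read from `iostat -d -k`.
print("sda:", sectors_to_kb(6737086), "KB")
print("sdb:", sectors_to_kb(928), "KB")
```

The write totals do not line up as exactly, because the two commands were run at different moments and sda kept writing in between.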
4. More detailed I/O statistics (-x option)
To display more detailed I/O statistics for each device, use the -x option. It is usually turned on when analyzing an I/O bottleneck:

    linux # iostat -x -k -d 1
    Linux 2.6.16.60-0.21-smp (linux)

    ...
    Device:  rrqm/s   wrqm/s    r/s    w/s   rkB/s     wkB/s avgrq-sz avgqu-sz   await  svctm  %util
    sda        0.00  9915.00   1.00  90.00    4.00  34360.00   755.25    11.79  120.57   6.33  57.60

The columns above have the following meanings:
rrqm/s: number of read requests to the device that are merged per second (requests that read the same block are merged)
wrqm/s: number of write requests to the device that are merged per second
r/s: number of reads completed per second
w/s: number of writes completed per second
rkB/s: amount of data read per second, in kilobytes
wkB/s: amount of data written per second, in kilobytes
avgrq-sz: average amount of data per I/O operation, in sectors
avgqu-sz: average length of the queue of I/O requests waiting to be processed
await: average time per I/O request, in milliseconds, including both time spent waiting in the queue and processing time
svctm: average processing time per I/O request, in milliseconds
%util: proportion of time spent on I/O operations, that is, the proportion of time during which the I/O queue is not empty
From the example output above, we can obtain the following information:
1. About 33.5 MB of data is written to disk per second (wkB/s is 34360 KB)
2. There are 91 I/O operations per second (r/s + w/s), most of them writes
3. Each I/O request takes 120.57 milliseconds on average from queuing to completion (await), of which 6.33 milliseconds is processing time (svctm)
4. The queue of I/O requests waiting to be processed holds 11.79 requests on average
These values are also related to one another, so some can be calculated from others, for example:
util = (r/s + w/s) * (svctm / 1000)
For the example above: util = (1 + 90) * (6.33 / 1000) = 0.57603, that is, about 57.60%, which matches the %util column.
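The same calculation can be written as a short script, which makes it easy to cross-check other rows of iostat -x output. This is a sketch with hypothetical function names. The first function is the formula given above; the second relation (average request size = sectors transferred per second divided by requests per second) is not stated in the article and is an assumption that happens to reproduce the avgrq-sz value of the example.

```python
def utilization(reads_per_s: float, writes_per_s: float, svctm_ms: float) -> float:
    """Fraction of time the device is busy: requests per second times service time."""
    return (reads_per_s + writes_per_s) * (svctm_ms / 1000)


def avg_request_sectors(reads_per_s: float, writes_per_s: float,
                        rkb_per_s: float, wkb_per_s: float) -> float:
    """Average request size in 512-byte sectors (1 KB = 2 sectors)."""
    return (rkb_per_s + wkb_per_s) * 2 / (reads_per_s + writes_per_s)


# Values from the sda row of the `iostat -x -k -d 1` example.
util = utilization(1.00, 90.00, 6.33)
size = avg_request_sectors(1.00, 90.00, 4.00, 34360.00)
print(f"util: {util:.2%}")
print(f"avgrq-sz: {size:.2f} sectors")
```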

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.