Introduction: Most services today run on Linux, and Linux is used in a very wide range of applications, but problems still come up often. This article discusses the metrics used for performance monitoring. Performance monitoring comes down to disk I/O, memory, CPU, the number of TCP connections, network traffic, and process or thread activity. The commands used are iostat, vmstat, sar, mpstat, netstat, ss, iftop, free, pstree/ps, pidstat, and top (uptime). We go through them in more detail below.
One, disk I/O (iostat)
Much of the data on our machines is stored on disk, and much of what we read has to come from the disk. The disk is a slow device, however, and processes often block while waiting on it, so monitoring disk I/O is important. We use iostat to diagnose the condition of the disk.
tps: The number of transfers per second issued to the device, that is, the number of I/O requests per second
Blk_read/s: The amount of data read from the device per second, in blocks
Blk_wrtn/s: The amount of data written to the device per second, in blocks
Blk_read: Total number of blocks read
Blk_wrtn: Total number of blocks written
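As a rough illustration of where these device columns come from, the sketch below derives tps, Blk_read/s and Blk_wrtn/s from two samples of one device's counters in /proc/diskstats. It assumes the standard field order of that file and that one block equals one 512-byte sector. The helper names (parse_diskstats_line, basic_device_stats) are hypothetical, and real iostat versions may count additional request types in tps.

```python
from typing import Dict

# 0-based column positions in a whitespace-split /proc/diskstats line.
# Columns 0-2 are major number, minor number and device name.
READS_COMPLETED = 3
SECTORS_READ = 5
WRITES_COMPLETED = 7
SECTORS_WRITTEN = 9


def parse_diskstats_line(line: str) -> Dict[str, int]:
    """Pick out the cumulative counters needed for the basic iostat columns."""
    fields = line.split()
    return {
        "reads": int(fields[READS_COMPLETED]),
        "writes": int(fields[WRITES_COMPLETED]),
        "sectors_read": int(fields[SECTORS_READ]),
        "sectors_written": int(fields[SECTORS_WRITTEN]),
    }


def basic_device_stats(before: Dict[str, int], after: Dict[str, int],
                       interval_s: float) -> Dict[str, float]:
    """Turn two counter samples taken interval_s seconds apart into rates."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    delta = {key: after[key] - before[key] for key in before}
    return {
        # Completed reads plus writes per second (assumed definition of tps).
        "tps": (delta["reads"] + delta["writes"]) / interval_s,
        # Assumption: one block is one 512-byte sector.
        "Blk_read/s": delta["sectors_read"] / interval_s,
        "Blk_wrtn/s": delta["sectors_written"] / interval_s,
        "Blk_read": float(delta["sectors_read"]),
        "Blk_wrtn": float(delta["sectors_written"]),
    }
```

In practice the two samples would be the same device's line read from /proc/diskstats at the start and end of the sampling interval.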
%user: Percentage of CPU time spent running user-mode processes
%nice: Percentage of CPU time spent running user-mode processes with an adjusted (nice) priority
%system: Percentage of CPU time spent in kernel mode
%iowait: Percentage of time the CPU was idle while waiting for outstanding I/O
%steal: Percentage of time a virtual CPU had to wait while the hypervisor was serving another virtual CPU; this matters under virtualization
%idle: Percentage of time the CPU was idle with no outstanding I/O
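The CPU percentages above can be reproduced approximately from the aggregate "cpu" line of /proc/stat. The following is a minimal sketch, assuming the usual field order (user, nice, system, idle, iowait, irq, softirq, steal) and assuming that interrupt time is folded into %system. The function names are hypothetical and are not part of iostat.

```python
from typing import Dict

# Assumed order of the first eight counters on the "cpu" line of /proc/stat.
CPU_FIELDS = ("user", "nice", "system", "idle",
              "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line: str) -> Dict[str, int]:
    """Parse the aggregate 'cpu' line into named tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected a line starting with 'cpu'")
    values = [int(v) for v in parts[1:1 + len(CPU_FIELDS)]]
    # Older kernels expose fewer columns; treat the missing ones as zero.
    values += [0] * (len(CPU_FIELDS) - len(values))
    return dict(zip(CPU_FIELDS, values))


def cpu_percentages(before: Dict[str, int],
                    after: Dict[str, int]) -> Dict[str, float]:
    """Convert two tick samples into the percentages iostat reports."""
    delta = {name: after[name] - before[name] for name in CPU_FIELDS}
    total = sum(delta.values())
    if total <= 0:
        raise ValueError("no CPU time elapsed between the two samples")

    def pct(ticks: int) -> float:
        return 100.0 * ticks / total

    return {
        "%user": pct(delta["user"]),
        "%nice": pct(delta["nice"]),
        # Assumption: hardware and software interrupt time counts as system time.
        "%system": pct(delta["system"] + delta["irq"] + delta["softirq"]),
        "%iowait": pct(delta["iowait"]),
        "%steal": pct(delta["steal"]),
        "%idle": pct(delta["idle"]),
    }
```

Because every counter is cumulative since boot, only the difference between two samples says anything about the current load.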
iostat also has a commonly used option, -x, which displays extended statistics
rrqm/s: Number of read requests to the device merged per second (adjacent I/O requests combined into one)
wrqm/s: Number of write requests to the device merged per second
r/s: Number of read requests sent to the device per second
w/s: Number of write requests sent to the device per second
rsec/s: Number of sectors read from the device per second
wsec/s: Number of sectors written to the device per second
avgrq-sz: Average size of the requests issued to the device, in sectors
avgqu-sz: Average length of the request queue
await: Average time per I/O request, in milliseconds, including both the time spent waiting in the queue and the time spent being serviced
r_await: Average time per read I/O request, in milliseconds
w_await: Average time per write I/O request, in milliseconds
svctm: Average service time per I/O operation, in milliseconds. If svctm is close to await, I/O requests spend almost no time waiting; if await is much higher than svctm, requests are waiting too long in the I/O queue. Note that the iostat man page warns that this field is no longer reliable and marks it for removal in future sysstat versions
%util: The percentage of the sampling interval during which the device was busy handling I/O requests (device utilization, not CPU consumption). For example, if the sampling interval is 1s and the device spent 0.65s processing I/O and 0.35s idle, then %util = 0.65/1 = 65%. In general, a value near 100% means the device is running close to full load. For a multi-disk device, which serves requests concurrently, %util can reach 100% without the disk being the bottleneck
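The relationship between the extended columns can be sketched from two samples of a device's cumulative counters, such as those in /proc/diskstats. The dictionary keys and the function name below are hypothetical labels chosen for this example; the arithmetic follows the definitions above and is an approximation rather than iostat's exact implementation.

```python
from typing import Dict


def extended_device_stats(before: Dict[str, int], after: Dict[str, int],
                          interval_s: float) -> Dict[str, float]:
    """Derive iostat -x style columns from two cumulative counter samples.

    Expected keys (hypothetical names): reads, writes, sectors_read,
    sectors_written, read_ms, write_ms, io_ms (time the device was busy)
    and weighted_io_ms (busy time weighted by requests in flight).
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    d = {key: after[key] - before[key] for key in before}
    ios = d["reads"] + d["writes"]
    interval_ms = interval_s * 1000.0
    return {
        "r/s": d["reads"] / interval_s,
        "w/s": d["writes"] / interval_s,
        "rsec/s": d["sectors_read"] / interval_s,
        "wsec/s": d["sectors_written"] / interval_s,
        # Average request size in sectors.
        "avgrq-sz": (d["sectors_read"] + d["sectors_written"]) / ios if ios else 0.0,
        # Average number of requests queued or in flight.
        "avgqu-sz": d["weighted_io_ms"] / interval_ms,
        # Queue time plus service time per request, in milliseconds.
        "await": (d["read_ms"] + d["write_ms"]) / ios if ios else 0.0,
        # Share of the interval during which the device was busy.
        "%util": min(100.0, 100.0 * d["io_ms"] / interval_ms),
    }
```

With a 1s interval and 650 ms of busy time, this gives %util = 65, matching the example above.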
Two, memory (free)
On a Linux system we often need to check memory usage. The free command shows it.
The first row, Mem (which we can consider the view from the operating system level)
total: Total physical memory
used: Memory already allocated
free: Memory not yet allocated
shared: Shared memory, used primarily for IPC
buffers: Buffers for block devices
cached: Cache for file contents
The cache is a region of memory that acts as a buffer between processes and the hard disk. Data that a process reads or writes passes through the cache, so when that data is needed again it is read directly from the fast cache rather than from the "dirt road" of the hard disk, which greatly improves performance.
Here, buffers mostly hold the metadata for our data (directory names, file sizes, the blocks a file occupies, modification times, permissions, and so on), while the cache holds the contents of files we have read recently.
The -/+ buffers/cache row (which we can consider the view from the application level)
The -/+ buffers/cache row has two parts, -buffers/cache and +buffers/cache.
-buffers/cache = used (first row) - buffers - cached. This is the physical memory actually in use by programs.
+buffers/cache = free (first row) + buffers + cached. This is the memory available to programs, counting the buffers and cache that the system has only temporarily "borrowed".
total = (-buffers/cache) + (+buffers/cache)
So from the application level, available memory = free memory + buffers + cached
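The arithmetic behind the application-level view can be checked with a few lines of code. This is a minimal sketch; the function name and the result keys are hypothetical, and all values are taken to be in the same unit (kB here).

```python
from typing import Dict


def application_view(used: int, free: int, buffers: int,
                     cached: int) -> Dict[str, int]:
    """Recompute the application-level figures from the Mem row of free."""
    return {
        # Memory programs really occupy: buffers and cache are excluded.
        "used_by_applications": used - buffers - cached,
        # Memory programs could obtain: buffers and cache can be reclaimed.
        "available_to_applications": free + buffers + cached,
    }


# Example values in kB. The used figure assumes used = total - free.
total_kb, free_kb, buffers_kb, cached_kb = 1020128, 670772, 97780, 100980
view = application_view(total_kb - free_kb, free_kb, buffers_kb, cached_kb)
print(view)
```

The two results always add up to the total, since buffers and cache are simply moved from the used side to the available side.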
We can view the details as follows:
~ cat /proc/meminfo
MemTotal: 1020128 kB
MemFree: 670772 kB
Buffers: 97780 kB
Cached: 100980 kB
SwapCached: 0 kB
Active: 164988 kB
Inactive: 117296 kB
Active(anon): 83536 kB
Inactive(anon): kB
Active(file): 81452 kB
Inactive(file): 117136 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 92 kB
Writeback: 0 kB
AnonPages: 83504 kB
Mapped: 17500 kB
Shmem: 172 kB
Slab: 46696 kB
SReclaimable: 28652 kB
SUnreclaim: 18044 kB
KernelStack: 1744 kB
PageTables: 2636 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 510064 kB
Committed_AS: 343800 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 7112 kB
VmallocChunk: 34359727304 kB
HardwareCorrupted: 0 kB
AnonHugePages: 36864 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 8184 kB
DirectMap2M: 1040384 kB
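Output in this "name: value kB" form is easy to read from a script. The sketch below parses such text into a dictionary; parse_meminfo is a hypothetical helper, and lines without a numeric value are skipped rather than guessed.

```python
from typing import Dict


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse /proc/meminfo style text into {field name: numeric value}.

    Most values are in kB; the HugePages_* counters are plain page counts.
    """
    info: Dict[str, int] = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "name: value" line
        parts = rest.split()
        if not parts or not parts[0].isdigit():
            continue  # value missing or not numeric
        info[name.strip()] = int(parts[0])
    return info
```

On a Linux machine the input would typically be the contents of /proc/meminfo, for example parse_meminfo(open("/proc/meminfo").read()), after which fields such as MemFree, Buffers and Cached can be added together to estimate the memory available to applications.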
Permanent link to this article: http://www.linuxidc.com/Linux/2016-11/137022.htm
Performance anomaly targeting and performance monitoring on Linux