Locating performance anomalies and monitoring performance on Linux

Last Update:2016-11-17 Source: Internet

Author: User

Introduction: Most services today run on Linux. Although Linux is very widely used, it still runs into plenty of performance problems, so this article discusses the metrics we monitor. Performance monitoring comes down to disk I/O, memory, CPU, TCP connection counts, network traffic, and processes or threads. The commands used include iostat, vmstat, sar, mpstat, netstat, ss, iftop, free, pstree/ps, pidstat, top, and uptime; they are covered one by one below.

One, disk I/O (iostat)

Much of the data on our machines is stored on disk, and much of what we read has to come from disk. Disks are slow devices, however, and processes are often blocked waiting on them, so monitoring disk I/O is important. We use iostat to diagnose the state of the disks.

tps: The number of transfers per second issued to the device, that is, how many I/O requests it receives per second

Blk_read/s: The amount of data read from the device per second, in blocks

Blk_wrtn/s: The amount of data written to the device per second, in blocks

Blk_read: The total number of blocks read

Blk_wrtn: The total number of blocks written
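The per-second figures above are rates derived from cumulative counters. As a minimal sketch of that relationship, the rates can be computed from two counter samples taken an interval apart. The function name `device_rates` and the dictionary keys are hypothetical, chosen for illustration; they are not part of iostat.

```python
def device_rates(prev, curr, interval_s):
    """Derive per-second rates from two cumulative counter samples.

    prev and curr are hypothetical dicts holding cumulative counts of
    I/O requests ("ios"), blocks read ("blk_read") and blocks written
    ("blk_wrtn"), sampled interval_s seconds apart.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return {
        "tps": (curr["ios"] - prev["ios"]) / interval_s,
        "Blk_read/s": (curr["blk_read"] - prev["blk_read"]) / interval_s,
        "Blk_wrtn/s": (curr["blk_wrtn"] - prev["blk_wrtn"]) / interval_s,
    }


if __name__ == "__main__":
    # Made-up sample values, two seconds apart.
    before = {"ios": 1000, "blk_read": 8000, "blk_wrtn": 4000}
    after = {"ios": 1100, "blk_read": 8800, "blk_wrtn": 4400}
    print(device_rates(before, after, interval_s=2.0))
```

The totals (Blk_read, Blk_wrtn) keep growing for as long as the system is up, while the rates only describe the chosen interval, which is why a short interval is more useful for spotting a burst of activity.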

%user: The percentage of CPU time spent running user-mode processes

%nice: The percentage of CPU time spent running user-mode processes with a modified (nice) priority

%system: The percentage of CPU time spent running in kernel mode

%iowait: The percentage of time the CPU was idle while waiting for outstanding I/O

%steal: The percentage of time a virtual CPU was kept waiting while the hypervisor served another virtual machine; relevant only under virtualization

%idle: The percentage of time the CPU was idle
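Each of these percentages is a category's share of the CPU time that elapsed during the sampling interval. A minimal sketch of the arithmetic follows, assuming two samples of cumulative per-category tick counters; the function name and the six-category input are illustrative assumptions, and the real tools read more counters than this.

```python
def cpu_percentages(prev, curr):
    """Return each category's share (in percent) of the CPU time that
    elapsed between two samples of cumulative tick counters."""
    deltas = {name: curr[name] - prev[name] for name in curr}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("no CPU time elapsed between samples")
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


if __name__ == "__main__":
    # Made-up tick counts for illustration only.
    first = {"user": 0, "nice": 0, "system": 0,
             "iowait": 0, "steal": 0, "idle": 0}
    second = {"user": 30, "nice": 0, "system": 10,
              "iowait": 5, "steal": 0, "idle": 55}
    for name, pct in cpu_percentages(first, second).items():
        print(f"%{name}: {pct:.1f}")
```

Because the categories partition the elapsed time, the six percentages always sum to 100.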

iostat also has a commonly used option, -x, which displays extended statistics:

rrqm/s: The number of read requests to the device that are merged per second (adjacent I/O requests combined into one)

wrqm/s: The number of write requests to the device that are merged per second

r/s: The number of read requests sent to the device per second

w/s: The number of write requests sent to the device per second

rsec/s: The number of sectors read from the device per second

wsec/s: The number of sectors written to the device per second

avgrq-sz: The average size of a request, in sectors

avgqu-sz: The average length of the request queue

await: The average time taken to process each I/O request, including the time spent waiting in the queue

r_await: The average time taken to process each read I/O request

w_await: The average time taken to process each write I/O request

svctm: The average service time per I/O operation. If svctm is close to await, I/O requests spend almost no time waiting; if await is much higher than svctm, requests are waiting too long in the I/O queue.

%util: The percentage of the sampling interval during which the device was busy handling I/O requests. It measures device utilization, not CPU consumption. For example, if the interval is 1 s and the device spent 0.65 s processing I/O and 0.35 s idle, then %util = 0.65/1 = 65%. In general, a value near 100% means the device is running close to full load. (With a multi-disk device, however, the disks serve requests concurrently, so even a %util of 100% does not necessarily mean the disk is the bottleneck.)
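The %util example above is a simple ratio of busy time to elapsed time. A minimal sketch of that calculation, with a hypothetical function name chosen for illustration:

```python
def utilization_percent(busy_seconds, interval_seconds):
    """Percentage of the sampling interval the device spent handling I/O."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    if not 0 <= busy_seconds <= interval_seconds:
        raise ValueError("busy_seconds must lie within the interval")
    return 100.0 * busy_seconds / interval_seconds


if __name__ == "__main__":
    # The example from the text: 0.65 s busy in a 1 s interval.
    print(f"%util = {utilization_percent(0.65, 1.0):.0f}%")
```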

Two, memory (free)

On Linux we check memory usage with the free command.

The first row (memory as seen from the operating system's point of view):

total: The total amount of physical memory

used: The amount of memory already allocated

free: The amount of memory not yet allocated

shared: The amount of shared memory, used mainly for inter-process communication (IPC)

buffers: Memory used to buffer block devices

cached: Memory used to cache file contents

"Cache" is a region of memory that acts as a buffer between processes and the hard disk. A process writes data to the cache, and when that data is needed again it is read directly from this "high-speed" cache rather than from the slow "dirt road" of the hard disk, which greatly improves performance.

Here, buffers actually hold the metadata for our data (directory names, file sizes, the blocks that store each file, modification times, permissions, and so on), while cached holds the contents of files we have read recently.

The third row (memory as seen from the application's point of view):

The -/+ buffers/cache row has two parts, -buffers/cache and +buffers/cache.

-buffers/cache = used (first row) - buffers - cached. This is the physical memory actually in use by programs.

+buffers/cache = buffers + cached. This is the memory temporarily "lent" to the system for buffering and caching.

used (first row) = (+buffers/cache) + (-buffers/cache)

So from the application's point of view, available memory = free + buffers + cached.
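The arithmetic above can be checked with a short sketch. The function and key names are hypothetical; the sample figures are the MemTotal, MemFree, Buffers and Cached values (in kB) from the /proc/meminfo listing in this article, with used assumed to equal total minus free.

```python
def application_view(used, free, buffers, cached):
    """Recompute the application-level figures from the first row of free.

    All arguments are in the same unit (kB in the example below).
    """
    return {
        # Memory actually held by programs.
        "used_by_applications": used - buffers - cached,
        # Memory applications could obtain, since buffers and cache
        # can be reclaimed by the kernel.
        "available_to_applications": free + buffers + cached,
    }


if __name__ == "__main__":
    total, free, buffers, cached = 1020128, 670772, 97780, 100980
    used = total - free  # assumption: used = total - free
    view = application_view(used, free, buffers, cached)
    print(view)
    # The two application-level figures add up to total physical memory.
    print(sum(view.values()) == total)
```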

We can view the details in the following way.

~ cat /proc/meminfo

MemTotal:        1020128 kB
MemFree:          670772 kB
Buffers:           97780 kB
Cached:           100980 kB
SwapCached:            0 kB
Active:           164988 kB
Inactive:         117296 kB
Active(anon):      83536 kB
Inactive(anon):          kB
Active(file):      81452 kB
Inactive(file):   117136 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:                92 kB
Writeback:             0 kB
AnonPages:         83504 kB
Mapped:            17500 kB
Shmem:               172 kB
Slab:              46696 kB
SReclaimable:      28652 kB
SUnreclaim:        18044 kB
KernelStack:        1744 kB
PageTables:         2636 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:      510064 kB
Committed_AS:     343800 kB
VmallocTotal:   34359738367 kB
VmallocUsed:        7112 kB
VmallocChunk:   34359727304 kB
HardwareCorrupted:     0 kB
AnonHugePages:     36864 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:        8184 kB
DirectMap2M:     1040384 kB
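For monitoring scripts it is often handier to read these fields programmatically than to eyeball the listing. The following is a minimal sketch, assuming the usual "Name: value unit" layout of /proc/meminfo; the function names are hypothetical, and lines without a numeric value are simply skipped.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Values are returned as reported (kB for most fields, plain counts
    for the HugePages_* fields).
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live file (Linux only)."""
    with open(path) as handle:
        return parse_meminfo(handle.read())


if __name__ == "__main__":
    mem = read_meminfo()
    available = mem["MemFree"] + mem["Buffers"] + mem["Cached"]
    print(f"free + buffers + cached = {available} kB")
```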

Permanent link to this article: http://www.linuxidc.com/Linux/2016-11/137022.htm

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
