CentOS Performance Monitoring Series, Part 3: The atop Monitoring Tool in Detail

Last Update: 2015-06-27 Source: Internet

Author: User


Introduction

Linux is increasingly used as a server operating system because of its stability (of course, some will point out that Linux is just the OS kernel :). But does choosing Linux as the underlying operating system guarantee that our services stay stable 24x7? Not necessarily. Business functions are implemented by the programs running on the system, so choosing Linux is only the first step toward stable business functions; the greater part of the work is making sure that the business programs themselves do not become the weak link.

When a server has a problem, the external symptom is that a business function is unavailable. Internally, from the program's point of view, the cause may be a fault in the business process (a bug in the program itself) or human error on the server (a script or command executed improperly). From the point of view of system resources, the cause may be CPU contention, a memory leak, abnormal disk I/O, a network fault, and so on. With so many possible causes, how do we go about analyzing a problem after it occurs, and what tools can help us locate it?

Introduction to atop

This article introduces atop, a tool for monitoring Linux system resources and processes. It records the state of the system at a fixed interval; the collected data covers the usage of system resources (CPU, memory, disk, and network) and the activity of processes, and it can be saved to a log file on disk. After a server problem, we can retrieve the corresponding atop log file for analysis. atop is open source software, and both its source code and an RPM installation package are available.

How to use atop

After installing atop, we can see the current state of the system by typing the "atop" command at the command line.

Installation: yum install atop
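
As a small sketch of the step above, the following checks whether atop is on the PATH before suggesting how to start it. The helper name have_cmd is made up for this illustration. On CentOS the package may only become available after enabling an extra repository such as EPEL, which is an assumption about your setup.

```shell
# have_cmd NAME: succeed if NAME can be found on PATH (hypothetical helper).
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd atop; then
    # An optional numeric argument sets the refresh interval in seconds.
    echo "atop found; run 'atop 5' for a 5-second refresh interval"
else
    echo "atop not found; install it first, for example: yum install atop"
fi
```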

Meaning of the system resource monitoring fields

atop lists a large number of fields and values. What does each field mean, and which ones should we look at? All values are relative to the sampling interval. Let's start with the top half of the display, which shows system-level resources.

ATOP line: shows the hostname and the date and time of the sample

PRC line: shows overall process activity

The sys and user fields indicate the CPU time consumed by processes in kernel mode and user mode, respectively
The #proc field indicates the total number of processes
The #zombie field indicates the number of zombie processes
The #exit field indicates the number of processes that exited during the atop sampling interval
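
To cross-check the #zombie figure outside atop, one option is to count processes whose state code starts with Z. This is only a sketch: count_zombies is a hypothetical helper name, and it assumes a ps that can print the state column (as procps does with "-eo stat=").

```shell
# count_zombies: read process state codes (one per line) on stdin and
# print how many of them start with "Z" (zombie).
count_zombies() {
    awk '$1 ~ /^Z/ { n++ } END { print n + 0 }'
}

# Typical use on a live system:
#   ps -eo stat= | count_zombies
```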

CPU line: shows the usage of the CPU as a whole (that is, all cores of a multicore machine treated as one CPU resource). A CPU can spend its time executing processes, handling interrupts, or idling. There are two idle states: one in which the CPU is idle because active processes are waiting for disk I/O, and one in which it is completely idle.

The sys and user fields indicate the percentage of CPU time spent executing processes in kernel mode and user mode, respectively
The irq field indicates the percentage of CPU time spent handling interrupts
The idle field indicates the percentage of time the CPU was completely idle
The wait field indicates the percentage of time the CPU was idle because processes were waiting for disk I/O

The values of the fields in the CPU line add up to N x 100%, where N is the number of CPU cores.
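
The arithmetic behind that total can be sketched in a few lines of shell. The function name cpu_line_total is an illustrative assumption, and nproc (from GNU coreutils) is used here only as one way to obtain the core count.

```shell
# cpu_line_total CORES: print the total percentage that the fields of the
# CPU line should add up to on a machine with CORES cores.
cpu_line_total() {
    echo "$(( $1 * 100 ))%"
}

# Report the expected total for this machine, if nproc is available.
if command -v nproc >/dev/null 2>&1; then
    cpu_line_total "$(nproc)"
fi
```

On a 4-core machine, for instance, sys + user + irq + idle + wait should come to roughly 400%.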

cpu line: shows the usage of a single CPU core. The fields have the same meaning as in the CPU line, and their values add up to 100%.

CPL line: shows the CPU load

The avg1, avg5, and avg15 fields indicate the average number of processes in the run queue over the last 1, 5, and 15 minutes
The csw field indicates the number of context switches
The intr field indicates the number of interrupts
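
Load averages are easier to judge relative to the number of cores. The sketch below reads the same three averages from /proc/loadavg and divides the 1-minute value by the core count. The helper name load_per_core is an assumption of this example, and treating a per-core load near or above 1.00 as "busy" is a rule of thumb rather than anything atop itself reports.

```shell
# load_per_core LOAD CORES: print LOAD divided by CORES with two decimals.
load_per_core() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# The first three fields of /proc/loadavg are the 1-, 5- and 15-minute averages.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    read -r avg1 avg5 avg15 rest < /proc/loadavg
    echo "avg1 per core: $(load_per_core "$avg1" "$(nproc)")"
fi
```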

MEM line: shows memory usage

The tot field indicates the total amount of physical memory
The free field indicates the amount of free memory
The cache field indicates the amount of memory used for the page cache
The buff field indicates the amount of memory used for file system buffers
The slab field indicates the amount of memory allocated by the kernel
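
The kernel exposes similarly named counters in /proc/meminfo, which can help when comparing atop's numbers with other tools. The sketch below assumes that MemTotal, MemFree, Cached, Buffers, and Slab correspond to the tot, free, cache, buff, and slab fields; meminfo_kb is a hypothetical helper name.

```shell
# meminfo_kb FIELD [FILE]: print the value (in kB) of FIELD from a
# /proc/meminfo-style file; FILE defaults to /proc/meminfo.
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

if [ -r /proc/meminfo ]; then
    for field in MemTotal MemFree Cached Buffers Slab; do
        echo "$field: $(meminfo_kb "$field") kB"
    done
fi
```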

SWP line: shows swap space usage

The tot field indicates the total amount of swap space
The free field indicates the amount of free swap space

PAG line: shows virtual memory paging activity

The swin and swout fields indicate the number of memory pages swapped in and swapped out
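
Swap activity can also be observed directly from the kernel's cumulative counters. The sketch below assumes that the pswpin and pswpout counters in /proc/vmstat are the source of figures like swin and swout; vmstat_counter and swap_delta are hypothetical helper names.

```shell
# vmstat_counter NAME [FILE]: print counter NAME from a /proc/vmstat-style
# file; FILE defaults to /proc/vmstat.
vmstat_counter() {
    awk -v key="$1" '$1 == key { print $2 }' "${2:-/proc/vmstat}"
}

# swap_delta BEFORE AFTER: pages swapped between two readings of a counter.
swap_delta() {
    echo "$(( $2 - $1 ))"
}

# Example: pages swapped out during a 10-second window.
#   before=$(vmstat_counter pswpout); sleep 10; after=$(vmstat_counter pswpout)
#   swap_delta "$before" "$after"
```

Because the counters are cumulative since boot, only the difference between two readings says anything about the current interval.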

DSK line: shows disk usage. Each disk device has its own line; if there is also an sdb device, another DSK line is added for it.

The sda field identifies the disk device
The busy field indicates the percentage of time the disk was busy
The read and write fields indicate the number of read and write requests
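
A busy percentage of this kind can be approximated by hand from /proc/diskstats, whose 13th field is the cumulative time in milliseconds that a device has spent doing I/O. This is a sketch of the idea, not a statement of how atop computes its value; io_ms and busy_pct are hypothetical helper names.

```shell
# io_ms DEVICE [FILE]: cumulative milliseconds spent doing I/O for DEVICE,
# taken from field 13 of a /proc/diskstats-style file.
io_ms() {
    awk -v dev="$1" '$3 == dev { print $13 }' "${2:-/proc/diskstats}"
}

# busy_pct IO_MS INTERVAL_MS: share of the interval spent handling I/O.
busy_pct() {
    awk -v io="$1" -v iv="$2" 'BEGIN { printf "%.0f%%\n", io * 100 / iv }'
}

# Example: sda busy share over a 10-second window.
#   a=$(io_ms sda); sleep 10; b=$(io_ms sda)
#   busy_pct "$(( b - a ))" 10000
```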

NET lines: several NET lines show the network status, covering the transport layer (TCP and UDP), the IP layer, and each active network interface

The xxxi fields indicate the number of packets received by each layer or active network interface
The xxxo fields indicate the number of packets sent by each layer or active network interface

Process View

To present process information more comprehensively, atop provides a variety of process views.

Default view (Generic information)

On entering the atop interface, we see the default view of process information in the lower part of the screen. Pressing the g key returns to the default view from any other view.

Memory view (memory consumption)

The memory view shows how processes use memory; press the m key to enter it.

The lower half shows the virtual memory size (VSIZE) and resident memory size (RSIZE) of each process, the growth in virtual and physical memory during the last sampling interval (VGROW, RGROW), and the MEM column, which indicates the share of physical memory the process occupies.

In this example, the PAG line shows that system memory load is high and that pages are being swapped. In the process view, the VGROW and RGROW columns show that the memory used by the VirtualBox process has grown considerably, while the memory used by some other processes has shrunk (their VGROW or RGROW values are negative) to free up space for the VirtualBox process.

Command view (command line)

Press the c key to enter the command view, which shows the command line of each process.

Sometimes a "careless" colleague runs a script or command that drives system resource usage abnormally high; the command view makes it easy to find the command responsible.

atop logs

The pages sampled at each point in time are combined into an atop log file, which we can view with the "atop -r XXX" command. So how should atop log files be saved?

We can save atop log files as follows:

Save one atop log file per day, recording that day's information
Name log files "atop_YYYYMMDD"
Set a retention period and automatically delete log files older than that period
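
The naming scheme above can be sketched as a small helper that builds the path of a given day's log. The function name atop_log_path is hypothetical, and /var/log/atop is an assumed default for LOGPATH; adjust it to wherever your logs are actually stored.

```shell
# LOGPATH is an assumed location for the daily log files.
LOGPATH="${LOGPATH:-/var/log/atop}"

# atop_log_path [YYYYMMDD]: print the log file path for the given day
# (default: today).
atop_log_path() {
    echo "$LOGPATH/atop_${1:-$(date +%Y%m%d)}"
}

atop_log_path            # path of today's log
atop_log_path 20150627   # path of the log for 27 June 2015
```

A log for a specific day could then be opened with atop -r "$(atop_log_path 20150627)".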

In fact, the atop developers already provide this way of saving logs: the corresponding atop.daily script can be found in the source directory. In atop.daily, we can change the sampling interval by modifying the INTERVAL variable (the default is 10 minutes), and change the number of days that logs are kept by modifying the value in the following command (the default is 28 days):

(sleep 3; find $LOGPATH -name 'atop_*' -mtime +28 -exec rm {} \;) &
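
To see what the retention command does without touching real logs, the same find expression can be tried in a scratch directory. This is a sketch: prune_atop_logs is a hypothetical wrapper, and the file names and dates are made up for the demonstration.

```shell
# prune_atop_logs DIR DAYS: delete atop_* files in DIR that were last
# modified more than DAYS days ago.
prune_atop_logs() {
    find "$1" -type f -name 'atop_*' -mtime +"$2" -exec rm {} \;
}

# Demonstration in a temporary directory.
demo=$(mktemp -d)
touch -t 200001010000 "$demo/atop_20000101"   # file dated 1 January 2000
touch "$demo/atop_today"                      # file dated now
prune_atop_logs "$demo" 28
ls "$demo"                                    # only the fresh file should remain
rm -rf "$demo"
```

Changing the second argument corresponds to editing the +28 value in atop.daily.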

Finally, we edit the cron file so that the atop.daily script runs at midnight every day:

0 0 * * * root /etc/cron.daily/atop.daily

Summary

This article introduced atop, a tool for monitoring Linux system resources and processes, explained the meaning of some of the fields and process views in atop's output, and described how to save atop log files.

atop adjusts the displayed fields to the size of the terminal, so the set of fields you see when using atop may differ from those described here.


This article is an English version of an article originally published in Chinese on aliyun.com and is provided for information purposes only.
