Linux System and Performance Monitoring


1.0 Performance monitoring

Performance tuning is the process of finding system bottlenecks and adjusting the operating system to eliminate them. Many system administrators think that performance tuning is like cooking from a recipe: set a few system parameters and the problem is solved. In fact, this is not the case. Performance tuning means adjusting the operating system's subsystems to strike a balance among them and achieve the desired performance. These subsystems include CPU, memory, I/O, and network.

These subsystems are highly interdependent; a bottleneck in any one of them can cause problems in the others. For example:

  • Large numbers of page-in requests can fill the memory queues
  • Heavy network throughput can consume CPU resources
  • CPU resources may be exhausted just keeping the memory queues free
  • Large numbers of memory-to-disk write requests can exhaust CPU and I/O resources

To optimize a system, you must first locate the bottleneck. A problem may appear to lie in one subsystem when it is actually caused by the overload of another.

 

1.1 Determine the application type

To know where to start tuning, the most important step is to analyze and understand how a system behaves. Applications generally fall into two types:

  • I/O-bound: an I/O-bound application consumes large amounts of memory and storage, because it processes (in memory) large amounts of data. It generally does not consume much CPU or network (unless the storage system is built on the network); it typically uses the CPU only to generate an I/O request and then goes back to sleep. Databases are usually considered I/O-bound applications.
  • CPU-bound: as the name suggests, a CPU-bound application consumes large amounts of CPU to complete batch processing or mathematical operations. High-volume web servers, mail servers, and rendering servers of various kinds are usually considered CPU-bound applications.

 

1.2 Establish baseline data

Because expectations and configurations vary from system to system, there is no universal definition of good performance; whether a system has a performance problem can only be judged against that system's own performance expectations. We therefore first establish baseline data: measurements of each indicator taken while the system is in an acceptable state. Those measurements can be used for comparison later.

The following compares performance data from a system at its baseline and under high utilization:

Look at the last column (id), the CPU idle percentage. In the baseline output, the idle rate ranges between 79% and 100%; in the second output, the CPU has no idle time at all and is 100% occupied. The next step is to determine whether such high CPU usage is expected.

2.0 install monitoring tools

Most Unix systems ship with a set of monitoring tools that have been part of the system since the early days of UNIX. Linux distributions provide these tools either in the base system or as add-on packages; essentially every Linux distribution has packages containing them. Although many comparable open-source and third-party monitoring tools exist, this article focuses on the use of these built-in tools.

This article describes how to use the following tools to monitor system performance:

Tool      Description                                    Base  Repository
vmstat    All-purpose performance tool                   Yes   Yes
mpstat    Provides per-CPU statistics                    No    Yes
sar       All-purpose performance monitoring tool        No    Yes
iostat    Provides disk statistics                       No    Yes
netstat   Provides network statistics                    Yes   Yes
dstat     Monitoring statistics aggregator               No    In most distributions
iptraf    Traffic monitoring dashboard                   No    Yes
netperf   Network bandwidth tool                         No    In most distributions
ethtool   Reports on Ethernet interface configuration    Yes   Yes
iperf     Network bandwidth tool                         No    Yes
tcptrace  Packet analysis tool                           No    Yes

 
3.0 CPU Overview

How the CPU is utilized depends largely on what kind of task is running. The kernel scheduler serves two kinds of tasks: threads (from single- or multi-threaded processes) and interrupts. The scheduler assigns different priorities to different tasks, listed here from highest to lowest:

  • Interrupts — a device notifies the kernel that an operation has completed. For example, a NIC delivers a network packet, or a hardware driver signals a finished I/O request.
  • Kernel (system) processes — all kernel processing runs at this priority.
  • User processes — user space, where all software applications run. User space has the lowest priority in the kernel scheduling mechanism.

To help you understand how the kernel manages and schedules these tasks, the following sections describe some key concepts: context switching, run queues, and utilization.

 

3.1 Context switching

Most modern processor cores can run only one process or thread at a time. An N-way or N-core system can run N threads simultaneously. The Linux kernel in fact treats each core of a multi-core processor as an independent processor; for example, a single-processor dual-core system appears to the Linux kernel as a two-processor system.

A standard Linux kernel can support anywhere from roughly 50 to 50,000 process threads at the same time. With a single CPU, the kernel must schedule these threads fairly. Each thread is allocated a CPU time slice; once a thread uses up its slice, or is preempted by a higher-priority task (such as a hardware interrupt), it is placed back on the run queue while the higher-priority task takes the CPU. This switching between threads is called a context switch.

During each context switch, the kernel spends resources moving thread state out of the CPU registers and onto the run queue. The more context switches a system performs, the more overhead the kernel incurs just to complete scheduling.
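System-wide context-switch activity is exposed through the `ctxt` counter in /proc/stat. As a rough illustration (the sample values below are invented, not from a real system), this sketch pulls that cumulative counter out of /proc/stat-style text:

```python
def context_switches(proc_stat_text):
    """Extract the cumulative context-switch count from /proc/stat content."""
    for line in proc_stat_text.splitlines():
        if line.startswith("ctxt "):
            return int(line.split()[1])
    raise ValueError("no ctxt line in input")

# Illustrative /proc/stat excerpt (values invented for the example):
sample = "cpu  10132153 290696 3084719\nctxt 115315\nbtime 769041601\n"
print(context_switches(sample))  # 115315
```

Sampling this counter twice and taking the difference gives the number of context switches in the interval, which is essentially what vmstat's cs column reports.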

 

3.2 Run queues

Each CPU maintains a run queue of threads. Ideally, the scheduler is constantly executing threads. A thread is either sleeping (blocked, or waiting for I/O to complete) or runnable. When the CPU is over-utilized, the kernel scheduler may not be able to keep up, and runnable threads fill up the run queues. The longer the run queue, the longer threads wait to execute.

A well-known term describing run-queue state is "load". The system load is the sum of the number of threads currently executing and the number of threads waiting to execute. Suppose two threads are running on a dual-core system and four threads are waiting in the run queues; the system's current load is then 6. Tools such as top report the average system load (load average) over 1, 5, and 15 minutes.
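The load arithmetic described above is simple but worth pinning down; a minimal sketch of the example from this paragraph:

```python
def load_value(running, waiting):
    """System load = threads currently executing + threads waiting to execute."""
    return running + waiting

# The example from the text: 2 threads running on a dual-core box,
# 4 threads waiting in the run queues -> a load of 6.
print(load_value(2, 4))  # 6
```

Note that the load value counts threads, not CPUs, which is why it must always be interpreted relative to the number of cores in the system.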

 

3.3 CPU utilization

CPU utilization is the percentage of time the CPU is in use, and is an important indicator of system health. Most performance monitoring tools break CPU utilization into the following categories:

User time — percentage of CPU time spent executing user-space threads

System time — percentage of CPU time spent executing kernel threads and interrupts

Wait I/O — percentage of time the CPU is idle because all runnable threads are blocked waiting for I/O

Idle — percentage of time the CPU is completely idle

Note: wait I/O and idle are alike in that the run queue is empty. The difference is that under wait I/O the queue of threads waiting for I/O is not empty, while under idle that queue is empty as well.
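These four categories are usually derived from raw per-category tick counters and reported as percentages of the total. A small sketch of that conversion (the tick counts below are invented for illustration):

```python
def cpu_percentages(user, system, iowait, idle):
    """Convert raw CPU tick counts into the four percentage categories."""
    total = user + system + iowait + idle
    return {name: round(100 * ticks / total, 1)
            for name, ticks in
            (("us", user), ("sy", system), ("wa", iowait), ("id", idle))}

# Illustrative tick counts (not from a real system):
print(cpu_percentages(user=650, system=300, iowait=0, idle=50))
# {'us': 65.0, 'sy': 30.0, 'wa': 0.0, 'id': 5.0}
```

Tools like vmstat and mpstat perform essentially this calculation over the ticks accumulated during each sampling interval.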

 

4.0 CPU performance monitoring

CPU performance is generally measured along three dimensions: run queue, utilization, and context switching. As mentioned above, performance can only be judged against baseline data (or expectations). For most systems, some basic performance expectations are:

  • Run queue — no more than 1-3 runnable threads per processor. On a dual-core system, for example, the run-queue length should not exceed 6 (that is, the load value should not exceed about three times the number of cores).
  • CPU utilization — on a fully utilized CPU, the split should be approximately:
    • User time: 65%-70%
    • System time: 30%-35%
    • Idle: 0%-5%
  • Context switching — the acceptable number of context switches is tied to CPU utilization; as long as utilization stays within the proportions above, a large number of context switches is acceptable.
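As a rough sketch, the rules of thumb above can be encoded as a simple check. The thresholds are the guidelines from this section, not hard limits, and the idle check only applies to a fully utilized CPU:

```python
def check_cpu_expectations(ncores, runq_len, us, sy, idle):
    """Flag deviations from the rule-of-thumb expectations listed above.

    Returns a list of human-readable issues (empty if everything looks fine).
    """
    issues = []
    if runq_len > 3 * ncores:                      # 1-3 runnable threads per core
        issues.append("run queue longer than 3 per core")
    if not 65 <= us <= 70:                         # user time 65%-70%
        issues.append("user time outside 65-70%")
    if not 30 <= sy <= 35:                         # system time 30%-35%
        issues.append("system time outside 30-35%")
    if idle > 5:                                   # idle 0%-5% under full load
        issues.append("idle above 5%")
    return issues

# Illustrative readings (invented): an overloaded 2-core box.
print(check_cpu_expectations(ncores=2, runq_len=8, us=50, sy=45, idle=5))
```

A monitoring script could run such a check against each vmstat sample and alert only when a deviation persists across several intervals.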

Linux provides many tools for measuring these indicators; we will look first at vmstat and top.

 

4.1 Use of vmstat

The additional overhead of vmstat is very small, so it is feasible to keep it running on a heavily loaded system, even when you do not intend to collect its data over a long period. The tool has two running modes: average mode and sampling mode. In sampling mode, vmstat outputs one report at each specified interval; this mode is useful for gathering performance data under a sustained load. The following is sample output from vmstat at a 1-second interval:

The CPU-related columns in this output have the following meanings:

Column  Description
r       Length of the run queue: the number of runnable threads waiting to execute
b       Number of threads that are blocked, waiting for I/O to complete
in      Number of interrupts serviced
cs      Number of context switches
us      Percentage of CPU time spent running user-space threads
sy      Percentage of CPU time spent running system (kernel) threads, including interrupts
wa      Percentage of time the CPU spent waiting (all runnable threads blocked on I/O)
id      Percentage of time the CPU was completely idle
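A sketch of turning one such vmstat data line into named values. This assumes the 16-column field order of a typical procps vmstat report without the `st` column; the sample numbers are invented:

```python
# Assumed field order of a vmstat data line (procs/memory/swap/io/system/cpu),
# for versions that do not report the trailing "st" column.
FIELDS = ["r", "b", "swpd", "free", "buff", "cache",
          "si", "so", "bi", "bo", "in", "cs", "us", "sy", "id", "wa"]

def parse_vmstat_line(line):
    """Turn one vmstat data line into a {column_name: int} dict."""
    return dict(zip(FIELDS, map(int, line.split())))

# Illustrative data line (values invented for the example):
row = parse_vmstat_line(" 3  0 0 17708 15908 255184 0 0 0 0 1075 1400 97 3 0 0")
print(row["r"], row["cs"], row["us"])  # 3 1400 97
```

Feeding `vmstat 1` output through such a parser, line by line, is a simple way to log or alert on the indicators discussed in this section.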

 

4.2 Case study: sustained CPU consumption

In the following case, the system's CPU is completely consumed.

From the output above, we can draw the following inferences:

  • There are many interrupts and few context switches; it appears that a single process is repeatedly requesting access to a hardware device.
  • User-mode CPU consumption above 85%, together with the small number of context switches, further indicates that one process is monopolizing the CPU.
  • The run-queue length is at its acceptable upper limit, and exceeds it at a few points.

4.3 Case study: the scheduler is overloaded

In the following case, the kernel scheduler is saturated with context switching.

From the output above, we can draw the following inferences:

  • The number of context switches far exceeds the number of interrupts; the kernel must be spending a great deal of time on context switching.
  • The heavy context switching is causing unbalanced CPU utilization, visible in the extremely low user-mode CPU usage and the extremely high wait-I/O percentage.
  • Because the CPU is stuck waiting for I/O, the run queue builds up, and so does the number of threads blocked on I/O.

4.4 Use of the mpstat tool

If the system has multiple processor cores, you can use the mpstat command to monitor each core individually. The Linux kernel treats a dual-core processor as two processors, so a system with two dual-core processors is reported as four processors. mpstat provides CPU statistics similar to vmstat's, but broken down per CPU core.

 

4.5 Case Study: inadequate processor Load

In the following case, the system has four CPU cores, and two CPU-consuming processes make full use of these two cores (cpu0 and cpu1, the third core is executing kernel and System Call (cpu3), and the fourth core (cpu2) is idle.

The top command shows that three processes (nobody, MySQL, and Apache) occupy almost one of the CPU cores:

You can determine which process occupies the CPU kernel by using the hdr field of the ps command.

 

4.6 Conclusion

CPU performance monitoring comes down to the following points:

  • Check the run queues and ensure there are no more than 3 runnable threads per processor.
  • Ensure the CPU utilization split between user and system time is roughly between 65/35 and 70/30.
  • If the CPU spends more time than that in the system state, the cause may be more than simple overload; try adjusting process priorities.
  • The scheduler tends to reward I/O-bound processes with higher priority than CPU-bound ones.

5.0 virtual memory problems

Virtual memory uses disk as an extension of RAM, so that more memory appears to be available. When memory runs low, the kernel writes out memory blocks that have not been used recently to the disk; when that memory is accessed again, it is read back from disk into physical memory. These operations are completely transparent to users: a Linux application simply sees a large amount of available memory and does not know that part of that "memory" lives on disk. Of course, reading and writing disk is far slower than real memory (on the order of a thousand times slower, even for sequential access), so a program that touches a large amount of swapped-out virtual memory runs slowly. The part of the disk used as virtual memory is called the swap space (or swap partition).

 

5.1 Virtual memory pages

Virtual memory is divided into pages; on the x86 architecture, each page is 4 KB. When the kernel moves data between memory and disk, it does so page by page. The kernel writes memory pages both to the swap partition and to the file system. ①

 

5.2 Memory synchronization (original: kernel memory paging)

Memory synchronization is a routine operation; do not confuse it with memory swapping (page replacement). Periodically synchronizing memory pages to disk is called memory paging. ② After a program has run for a while, it may slowly exhaust memory; at some point, in order to allocate memory to other programs (or the current one), the kernel must swap out the least recently used memory to disk. If that memory has already been synchronized to disk, the swap-out is more efficient, because its contents need not be written out before the swap.

 

5.3 Memory reclaim mechanism

The memory reclaim mechanism exists to recover usable physical memory. The algorithm for choosing victim pages depends on the page type. The page types include:

Unreclaimable — locked pages, kernel pages, and reserved pages

Swappable — anonymous memory pages, backed by the swap partition

Syncable — memory pages backed by the file system

Discardable — static pages and discarded pages

Pages of the types other than unreclaimable can be reclaimed.

There are two main reclaim mechanisms: kswapd and "low on memory reclaim".

 

5.4 kswapd

The kswapd daemon ensures that a certain amount of free memory remains available. The kernel tracks two watermarks, pages_high and pages_low: when available memory falls below pages_low, kswapd begins freeing memory, 32 pages at a time, until free memory rises back above pages_high.

For each page it examines, kswapd performs the following operations:

  • If the page is unmodified, place it on the free list.
  • If the page is modified and backed by the file system, write its contents to disk. ③
  • If the page is modified and not file-backed (an anonymous page, backed by the swap partition), write its contents to the swap device.
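The watermark behavior described above can be modeled with a simplified sketch. This is only an illustration of the description, not kernel code, and the watermark values are invented:

```python
def kswapd_pass(free_pages, pages_low, pages_high):
    """Simplified model of kswapd's reclaim loop.

    If free memory has dropped below pages_low, reclaim 32 pages at a time
    until free memory rises above pages_high. Returns the resulting free-page
    count and the number of pages reclaimed.
    """
    reclaimed = 0
    if free_pages < pages_low:            # below the low watermark: start reclaiming
        while free_pages <= pages_high:   # keep going until above the high watermark
            free_pages += 32              # one reclaim batch of 32 pages
            reclaimed += 32
    return free_pages, reclaimed

# Illustrative watermarks (invented values, in pages):
print(kswapd_pass(free_pages=100, pages_low=256, pages_high=1024))  # (1028, 928)
```

The gap between the two watermarks gives kswapd hysteresis: it does not wake up again the moment a few pages are allocated.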

 

5.5 use pdflushComplete memory synchronization (original article: Kernel paging with pdflush)

The daemon pdflush is used to synchronize the memory pages (syncable) related to the file system to the disk. In other words, when the memory copy of a file is modified, pdflush writes the modification back to the disk.

When dirty pages 5 in the memory exceed 10%, pdflush synchronizes these dirty pages to the disk. The value of 10% can be adjusted by the kernel's VM. dirty_background_ratio.

In most cases, the pdflush and memory recovery mechanisms are independent of each other. When the kernel calls low on memory reclaim, lmr also calls pdflush to synchronize expired pages, in addition to other memory release operations.
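The 10% trigger can be expressed as a one-line check. This is a sketch of the rule above; the page counts are invented, and only the default of vm.dirty_background_ratio is taken from the text:

```python
def should_flush(dirty_pages, total_pages, dirty_background_ratio=10):
    """pdflush wakes up when dirty pages exceed the given percentage of memory.

    The default of 10 corresponds to the kernel's vm.dirty_background_ratio.
    """
    return 100 * dirty_pages / total_pages > dirty_background_ratio

# Illustrative page counts (invented): 12% of memory is dirty.
print(should_flush(dirty_pages=12_000, total_pages=100_000))  # True
```

Lowering vm.dirty_background_ratio makes writeback start earlier and keeps individual flushes smaller, at the cost of more frequent disk activity.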

 

5.6 Case study

While reporting CPU usage, the vmstat tool also reports virtual memory usage. The following vmstat columns relate to virtual memory:

Column  Description
swpd    Amount of virtual memory currently in use, in KB. As free memory reaches its lower limit, more data is swapped out to the swap partition.
free    Amount of free RAM, in KB.
buff    Amount of RAM used as buffers for read() and write() operations, in KB. ⑥
cache   Amount of RAM mapped into process address space, in KB. ⑥
so      Amount of data written out to the swap partition, in KB.
si      Amount of data read in from the swap partition to RAM, in KB.
bo      Number of disk blocks written out from RAM to the file system or swap partition.
bi      Number of disk blocks read in from the file system or swap partition to RAM.

The vmstat output below shows heavy virtual memory use during the peak of an I/O-bound application:

From the output above, we can draw the following inferences:

A large number of disk blocks (bi) are being read from the file system into memory; this is visible in the growth of cache.

Throughout this period, the amount of free memory holds steady around 17 MB, even while blocks keep moving from disk into RAM.

To maintain free memory, kswapd steals from the read/write cache (buff) and adds those pages to the free list; this is visible in the shrinking of buff.

kswapd then writes some dirty pages to swap space (so); this is also visible in the growth of virtual memory utilization (swpd).

 

5.7 Conclusion

Performance monitoring of virtual memory comes down to the following points:

  • The less paging and swapping activity, the better the response time, because the system is working out of RAM rather than disk.
  • Low free memory is generally a good sign, indicating that caches are being used efficiently, unless it is accompanied by sustained writes to the swap partition.
  • A system that is continuously operating on its swap partition is short of memory.
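A rough heuristic for the last point: given successive (si, so) readings from vmstat sampling intervals, flag sustained swapping only when every interval shows swap traffic. This is a sketch, and the sample readings are invented:

```python
def sustained_swapping(samples, threshold_kb=0):
    """Given (si, so) pairs from successive vmstat intervals, report whether
    the system swapped in every interval (a rough memory-pressure heuristic)."""
    return all(si > threshold_kb or so > threshold_kb for si, so in samples)

# Illustrative readings in KB (invented):
print(sustained_swapping([(0, 512), (128, 256), (64, 0)]))  # True
print(sustained_swapping([(0, 0), (0, 640), (0, 0)]))       # False
```

The distinction matters because an isolated burst of swap-out is normal housekeeping, while swap traffic in every interval indicates the working set no longer fits in RAM.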

 

Translator's notes:

① Writing memory to the file system: for example, when memory used as buffer or cache is released, its contents must be written back to the file system. Writing memory to the swap partition: for example, when anonymous pages are swapped out to extend real memory.

② "Memory paging" in the original does not appear to refer to the paging mechanism; from context, it refers to synchronization from memory to disk.

③ Memory pages related to the file system may be the buffer or cache portions of memory. For memory mapped to a swap file, it is unclear whether it counts as file-system-related or swap-related.

④ Similarities and differences between kswapd and pdflush:

Similarity: both write memory data to the disk.

Difference: their purposes differ. kswapd writes data out in order to free memory (the data must be saved to disk before the memory can be released); pdflush writes data out to keep memory and disk in sync.

Relationship: as mentioned in 5.2, pdflush's synchronization makes kswapd's swapping more efficient.

⑤ The original term is "dirty page", usually translated as such: a cached memory page whose contents are newer than the corresponding data on disk.

⑥ The explanations of buffer and cache in the original differ from those found in many other sources; I find the original's explanations somewhat imprecise.

Buffer: memory used to cache disk blocks and file metadata; this data can be accessed directly by block device and block number.

Cache: memory used to cache file data, which is accessed via inode.
