Monitor CPU (i)

Source: Internet
Author: User

In system operations, monitoring the CPU is a routine task. Which indicators should you pay attention to?

Run Queue Statistics

To see how busy the CPU is, we can make a simple judgment from the states of the processes in the system, for example by using the number of runnable processes and the number of blocked processes to gauge CPU utilization.

1) runnable

If a process is in the runnable state, it is waiting for CPU time together with the other runnable processes rather than getting CPU time immediately. This is what we usually call the "ready" state: the process can be executed at any time but is not currently executing.

A process in the runnable state is waiting for CPU time and is not consuming any. The queue that these waiting processes form is called the run queue (runqueue); the larger the run queue, the longer the wait. When the Linux scheduler selects the next process to execute from the run queue, that process gets CPU time and enters the running state.

The runnable and running states of a process are both represented in the Linux kernel by a single state constant, TASK_RUNNING. TASK_RUNNING means that the process is either being executed by a CPU or is ready to be dispatched by the scheduler at any time. If the process is not being executed at the moment, it is in the ready (runnable) state; if it is being executed, it is in the running state. When a process runs kernel code, we say it is in kernel mode, and when it executes the user's own code, we say it is in user mode. When a resource the process was waiting for becomes available, the process is woken up and becomes ready to run, which is again the ready state. All of these states are represented in the kernel in the same way, as TASK_RUNNING. A newly created process also starts in the TASK_RUNNING state.

When monitoring system state we often talk about load. Load is the number of processes in the TASK_RUNNING state, that is, the sum of the running and runnable processes. For example, if two processes are running and three are waiting to run (runnable), the system's load is five. (On Linux, the reported load also counts processes in uninterruptible sleep, typically waiting on disk I/O.) The load average is the average load over a given period, and it is typically displayed as three numbers covering the last 1, 5, and 15 minutes.
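The three load average figures can be read from /proc/loadavg. The following is a minimal sketch, assuming the field layout described in proc(5) (three averages, then runnable/total scheduling entities, then the last PID); the function name show_loadavg and the sample numbers are made up for illustration.

```shell
#!/bin/sh
# Print the fields of a /proc/loadavg-style line read from stdin.
# Assumed layout: 1min 5min 15min runnable/total last_pid
show_loadavg() {
    awk '{
        split($4, q, "/")   # q[1] = runnable right now, q[2] = total entities
        printf "1min=%s 5min=%s 15min=%s runnable=%s total=%s\n", $1, $2, $3, q[1], q[2]
    }'
}

# Sample line with made-up numbers so the sketch runs anywhere
echo "0.52 0.38 0.30 2/345 12345" | show_loadavg
```

On a live Linux system the same function can be fed with `show_loadavg < /proc/loadavg`.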

2) blocked, waiting for an event to complete

A blocked process may be waiting for data from an I/O operation or for the result of a system call.
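The runnable and blocked counts discussed above are exposed in /proc/stat as the procs_running and procs_blocked lines. A small sketch of pulling them out follows; the function name run_queue_counts and the sample excerpt are illustrative assumptions, not part of the original article.

```shell
#!/bin/sh
# Extract the runnable and blocked process counts from /proc/stat-style input.
run_queue_counts() {
    awk '
        $1 == "procs_running" { r = $2 }
        $1 == "procs_blocked" { b = $2 }
        END { printf "runnable=%d blocked=%d\n", r, b }
    '
}

# Sample excerpt with made-up numbers
run_queue_counts <<'EOF'
cpu  4705 150 1120 16250 520 30 45 0 0 0
ctxt 1990473
procs_running 3
procs_blocked 1
EOF
```

On a live system, `run_queue_counts < /proc/stat` gives the current values; vmstat reports similar figures in its r and b columns.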

Context Switches

Most CPUs can run only one process at a time. Some, such as those with hyper-threading technology, can run more than one process at a time, and Linux treats such a CPU as multiple single-threaded CPUs.

The Linux kernel constantly switches between processes, creating the illusion that a single CPU handles multiple tasks at the same time. Switching from one process to another is called a context switch. During a context switch, the CPU saves all of the old process's context information and loads all of the new process's context information. The context includes a large amount of information that Linux tracks for each process, in particular its resources: which process is executing, what memory is allocated, which files are open, and so on. A context switch moves a lot of information and therefore has a relatively high overhead, so where possible the number of context switches should be kept small.

To keep context switches to a minimum, you need to know what causes them. First, the kernel scheduler triggers context switches. To ensure that every process gets a fair share of CPU time, the kernel periodically interrupts the running process and, if appropriate, the scheduler starts another process instead of letting the current one continue. Every such periodic (timer) interrupt can trigger a context switch. The number of timer interrupts per second varies with the architecture and the kernel version. A simple way to find the number of interrupts per second is to monitor the /proc/interrupts file, as in the following example:

# cat /proc/interrupts | grep timer; sleep 10; cat /proc/interrupts | grep timer

  0:   24060043          XT-PIC  timer
  0:   24070093          XT-PIC  timer

The output shows how much the timer counter changed over the 10-second interval: (24070093 - 24060043) / 10 = 1005, so this system generates roughly 1000 timer interrupts per second.
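The same arithmetic can be wrapped in a small helper. This is only a sketch; the function name rate_per_sec is made up for illustration, and it uses integer division, so the result is truncated.

```shell
#!/bin/sh
# Events per second between two counter samples.
# Usage: rate_per_sec <first_sample> <second_sample> <seconds_between_samples>
rate_per_sec() {
    echo $(( ($2 - $1) / $3 ))
}

# The two timer counts from the output above, taken 10 seconds apart
rate_per_sec 24060043 24070093 10   # prints 1005
```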

In addition, if the number of context switches is much larger than the number of timer interrupts, the context switches are more likely caused by I/O requests or other long-running system calls (such as sleep). When an application requests an operation that cannot be completed immediately, the kernel starts a context switch: it suspends the requesting process and tries to switch to another runnable process, which keeps the CPU busy.
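To make that comparison, the context switch rate can be measured the same way as the timer rate. The total number of context switches since boot appears in /proc/stat on the line starting with ctxt. The sketch below uses two made-up samples taken 10 seconds apart; the function name ctxt_total and the numbers are assumptions for illustration.

```shell
#!/bin/sh
# Print the cumulative context switch count from /proc/stat-style input.
ctxt_total() {
    awk '$1 == "ctxt" { print $2 }'
}

# Two made-up samples, assumed to be taken 10 seconds apart
first=$(printf 'cpu  1 2 3 4\nctxt 1990473\n' | ctxt_total)
second=$(printf 'cpu  1 2 3 4\nctxt 2040473\n' | ctxt_total)
interval=10

echo "context switches/s: $(( (second - first) / interval ))"
```

On a live system the samples would come from `ctxt_total < /proc/stat` before and after a `sleep`. A rate of 5000 per second against roughly 1000 timer interrupts per second would suggest that most switches come from I/O or blocking system calls. vmstat shows the same rate in its cs column.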

Interrupts

The CPU also receives interrupt requests from hardware devices. An interrupt is usually triggered when a device has an event that the kernel needs to handle. For example, when a disk controller has fetched a block of data from the disk and the kernel needs to read that block, the disk controller triggers an interrupt. For each interrupt the kernel receives, it runs the interrupt handler registered for that interrupt if there is one; otherwise the interrupt is ignored. Interrupt handlers have a very high priority in the system and execute very quickly. Often, part of the interrupt processing does not need such a high priority, so there are also soft interrupt handlers (softirqs). If there are a lot of interrupts, the kernel spends a lot of time servicing them. You can check /proc/interrupts to see which CPU each interrupt occurred on.
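On a multi-CPU machine, /proc/interrupts has one counter column per CPU, so summing each column gives a rough picture of how interrupts are spread across CPUs. The following sketch assumes the usual layout (a header row of CPU names, then one row per interrupt source); the function name irq_per_cpu and the sample table are made up for illustration.

```shell
#!/bin/sh
# Sum the per-CPU interrupt counters of /proc/interrupts-style input.
irq_per_cpu() {
    awk '
        NR == 1 { ncpu = NF; next }            # header row: CPU0 CPU1 ...
        {
            for (i = 1; i <= ncpu; i++)
                if ($(i + 1) ~ /^[0-9]+$/)     # skip rows with fewer counters
                    total[i] += $(i + 1)
        }
        END {
            for (i = 1; i <= ncpu; i++)
                printf "CPU%d=%d\n", i - 1, total[i]
        }
    '
}

# Sample table with made-up numbers
irq_per_cpu <<'EOF'
           CPU0       CPU1
  0:        120          0   IO-APIC   2-edge      timer
  8:          0          1   IO-APIC   8-edge      rtc0
 24:       5000       7000   PCI-MSI   eth0
ERR:          0
EOF
```

On a live system, run `irq_per_cpu < /proc/interrupts`. A column that is much larger than the others indicates that one CPU is servicing most of the interrupts.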

CPU Utilization

CPU utilization is a very intuitive concept. At any moment the CPU is in one of seven states:

1) idle: the CPU is idle and waiting for work

2) user: the CPU is running a user process

3) system: the CPU is performing kernel work

4) nice: the time the CPU spends on processes whose priority has been changed with the nice command. On current Linux kernels this counter covers user-mode time of processes with a positive nice value (lowered priority) and is reported separately from user time. Documentation for some older tools notes that niced time is also counted within system and user time, in which case the displayed percentages may add up to more than 100%.

5) iowait: the time the CPU spends waiting for I/O operations to complete

6) irq: the time the CPU spends handling hard interrupts

7) softirq: the time the CPU spends handling soft interrupts
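These seven counters appear, in the order user, nice, system, idle, iowait, irq, softirq, on the cpu line of /proc/stat as cumulative tick counts since boot. Utilization over an interval is therefore the difference between two samples divided by the total difference. The sketch below works on two sample lines with made-up numbers; the function name cpu_percent is an assumption for illustration, and any further fields that newer kernels append (such as steal) are ignored.

```shell
#!/bin/sh
# Percentage of time spent in each CPU state between two "cpu" lines
# from /proc/stat. Usage: cpu_percent "<first line>" "<second line>"
cpu_percent() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 { for (i = 2; i <= 8; i++) prev[i] = $i; next }
        {
            split("user nice system idle iowait irq softirq", name, " ")
            for (i = 2; i <= 8; i++) { delta[i] = $i - prev[i]; total += delta[i] }
            if (total == 0) exit               # identical samples, nothing to report
            for (i = 2; i <= 8; i++)
                printf "%s=%.1f%%\n", name[i - 1], 100 * delta[i] / total
        }
    '
}

# Two made-up samples; the counters are cumulative ticks
cpu_percent "cpu  1000 0 500 8000 300 100 100" \
            "cpu  1400 50 650 8300 350 125 125"
```

For these samples the sketch reports user=40.0%, nice=5.0%, system=15.0%, idle=30.0%, iowait=5.0%, irq=2.5% and softirq=2.5%. On a live system the two lines could be captured with `grep '^cpu ' /proc/stat` before and after a `sleep`.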


This article is from the "Yun Weibang" blog. Please contact the author before reproducing it.
