The meaning of CPU load and CPU usage in Linux

Yesterday I reviewed the Nagios alerts and found that the CPU load on one of the servers, a CentOS machine, was too high. The alert read:

2011-2-15 (Tuesday) 17:50 WARNING - load average: 9.73, 10.67, 10.49

Alerts had also been sent in the previous two hours:

(Tuesday) 16:50 WARNING - load average: 10.52, 10.10, 10.06
2011-2-15 (Tuesday) 15:40 WARNING - load average: 8.27, 9.23, 9.48

1. What do the three numbers in the alert mean?

9.73, 10.67, and 10.49 are the average CPU load over the previous one minute, five minutes, and fifteen minutes respectively. The most important of the three is the last number, the average over the previous 15 minutes; the smaller it is, the better. CPU load here means the length of the task queue over a period of time, that is, the number of tasks that are using the CPU or waiting to use it during that period.

2. Besides Nagios, what other tools can show the CPU load?

The top and uptime commands both show the CPU load; top is especially useful.

3. How should CPU load be understood? Is it the same as CPU usage?

CPU load and CPU utilization are two different concepts, although both are displayed by the top command. CPU utilization is the percentage of CPU time that programs occupy, shown in real time, while CPU load is the average number of tasks that are using or waiting for the CPU over a period of time. High CPU utilization does not necessarily mean a high load.

An article on the Internet offers an interesting analogy that explains the difference with a public phone booth. One person is making a call and four people are waiting. Each person is limited to one minute; anyone who does not finish within the minute has to queue again for the next round. The phone is the CPU, and the people using or waiting for it are the tasks. While the booth is in use, some people leave after finishing their calls, some rejoin the queue because they did not finish, and some new people arrive. These changes correspond to the number of tasks rising and falling.

To measure the average load, we count the number of people every five seconds and average those counts over the past 1, 5, and 15 minutes. That gives the 1-minute, 5-minute, and 15-minute load averages.

Some people pick up the phone and talk for the whole minute. Others spend the first thirty seconds looking up the number or hesitating over whether to call, and only talk during the last thirty seconds. If we regard the phone as the CPU and the people as tasks, the first kind of person (task) has high CPU utilization and the second kind has low CPU utilization. Of course, a real CPU does not work for thirty seconds and then rest for thirty seconds. The point is only that some programs do a lot of computation, so their CPU utilization is high, while others do little, so theirs is low. Either way, CPU utilization is not necessarily related to the number of tasks in the queue.

4. Now that we understand CPU load, how can we reduce a server's CPU load?

The simplest way is to move to a server with better performance. Do not think only about a faster CPU; on its own that is of little use. Getting the best out of a CPU also needs the cooperation of the other hardware and software. When the rest of the server is properly configured, the number of CPUs and the number of cores per CPU determine how much load the machine can carry, because tasks are ultimately dispatched to CPU cores. Two CPUs are better than one, and a dual-core CPU is better than a single-core one. So remember that, apart from differences in CPU performance, CPU load has to be judged against the number of cores. As the saying goes: "As many cores as you have, that is how much load you can take."

5. For the alerts at the beginning of this article, how much load falls on each CPU?

That depends on the total number of cores in my server. Linux has a /proc directory that holds a virtual mapping of the running system. One of its files is cpuinfo, which stores information about the CPUs. We can open it directly or filter it by keyword; because it contains a lot of text, filtering is usually more practical. The /proc/cpuinfo file is organized by logical CPU rather than by physical CPU: each logical CPU has its own section, and logical CPU IDs start from 0. What a logical CPU is will be covered later. To read the file, a few fields need explaining:

processor: the logical CPU ID
model name: the model of the physical CPU
physical id: the ID of the physical CPU
cpu cores: the number of cores in the physical CPU

$> grep 'model name' /proc/cpuinfo | uniq
model name: Intel(R) Xeon(R) CPU E5320 @ 1.86GHz
$> grep 'physical id' /proc/cpuinfo | sort | uniq | wc -l
2
$> grep 'cpu cores' /proc/cpuinfo | uniq
2

So the CPU model of this server is Intel(R) Xeon(R) CPU E5320, and there are two CPUs. Each CPU is dual-core, which gives the server four cores in total. As mentioned above, CPU load has to be judged against the number of cores. The average load over the past 15 minutes was 10.49, so the load per CPU is 10.49 / 2 = 5.245, and the load per core is about 2.6. Is that reasonable? It depends on what an ideal CPU load looks like.

6. What is an ideal CPU load?

I personally agree with the view that a load of 0.7 or less per core is ideal. For this discussion we can ignore how fast a CPU is and how many tasks it can process per second, although in reality that does matter. When the CPU load is evaluated, the length of the task queue is sampled every 5 seconds. If the queue length is 1 at every sample, the CPU load is 1.
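
The sampling idea described above can be sketched in a few lines of shell. This is a simplified model under stated assumptions, not the kernel's actual code: it takes queue lengths sampled every 5 seconds and returns their plain average. The function name `mean_queue_length` is hypothetical and the sample values are made up for illustration. Note that the Linux kernel does not keep a plain average; it maintains exponentially damped moving averages, so real figures change more smoothly than this sketch suggests.

```shell
# Simplified model of a load average: the mean of queue-length samples
# taken at a fixed interval (every 5 seconds in this article).
# mean_queue_length is a hypothetical helper, not a system command.
mean_queue_length() {
    # Arguments: one queue-length sample per argument.
    printf '%s\n' "$@" | awk '
        { sum += $1; n++ }
        END { if (n > 0) printf "%.2f\n", sum / n; else print "0.00" }'
}

# Twelve made-up samples covering one minute (12 x 5 seconds).
mean_queue_length 1 1 2 0 1 3 1 1 0 2 1 1   # prints 1.17
```

A queue that holds exactly one task at every sample gives 1.00, which matches the "load is 1" case in the text.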

If we have only one single-core CPU and the load stays at 1, the CPU is fully used but no tasks are waiting in the queue, which is not bad. As mentioned above, my server has two dual-core CPUs, so there are four cores. If the load on each core is 1, the total load is 4. In other words, a CPU load that stays around 4 for a long time would be acceptable on my server. In fact the load has been above 9, which is a serious problem.

But even a load of 1 per core is not an ideal state! It means the CPU is always busy and never gets a chance to be idle. On the Internet it is said that the ideal state is a load of about 0.7 per core. I agree: the ideal CPU load for a server is 0.7 multiplied by the number of cores, which for my server (0.7 × 4 = 2.8) means staying below about 3.0.

7. About logical CPUs

The following description of logical CPUs comes from the Internet. Current servers generally use Hyper-Threading (HT) technology to improve CPU performance. Hyper-threading lets a single CPU execute multiple programs at the same time while sharing that CPU's resources. In theory, it executes two threads at the same time, as two CPUs would. However, unlike two real CPUs, the two threads do not each have their own resources. When both threads need the same resource at the same time, one of them must pause and give way until the resource is free. Therefore the performance of a hyper-threaded CPU is not equal to that of two CPUs. CPUs with Hyper-Threading Technology also have other restrictions.
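
To tie the sections together, here is a minimal sketch that counts logical CPUs and physical cores from a cpuinfo-style file and compares a load average with the 0.7-per-core guideline. All function names are hypothetical, and the 0.7 factor is the rule of thumb quoted above rather than a hard limit. The script assumes the file contains the `processor`, `physical id` and `cpu cores` fields described in section 5 (some virtual machines and non-x86 systems omit them) and that every physical CPU has the same number of cores.

```shell
# All function names here are hypothetical helpers for illustration.

# Number of logical CPUs: one "processor" entry per logical CPU.
logical_cpus() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# Number of physical cores: distinct physical ids times cores per CPU.
# Prints 0 if the file lacks the "physical id" or "cpu cores" fields.
physical_cores() {
    awk -F: '
        /^physical id/ { ids[$2] = 1 }
        /^cpu cores/   { cores = $2 + 0 }
        END { n = 0; for (id in ids) n++; print n * cores }
    ' "${1:-/proc/cpuinfo}"
}

# Load per core, e.g. per_core_load 10.49 4
per_core_load() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# Prints "ok" if load <= 0.7 * cores, otherwise "high".
check_load() {
    awk -v load="$1" -v cores="$2" 'BEGIN {
        if (load <= 0.7 * cores) print "ok"; else print "high"
    }'
}

# The server in this article: 2 CPUs x 2 cores, 15-minute load 10.49.
per_core_load 10.49 4   # prints 2.62
check_load 10.49 4      # prints high
```

On a machine with hyper-threading enabled, `logical_cpus` would report more CPUs than `physical_cores`. Which of the two counts to judge the load against is a judgment call; this article counts physical cores.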