uptime in detail: a plain-language explanation of the CPU load average

Source: Internet
Author: User

The uptime command is useful in two main scenarios: checking whether your machine has restarted recently (because of a hardware fault or for some other reason), and checking whether the CPU load is at a healthy level.
uptime
10:19:04 up 257 days, 12 users, load average: 2.10, 2.10, 2.09

1. 10:19:04 // the current system time
2. up 257 days // how long the host has been running since its last boot. A long uptime suggests the machine has been stable.
3. 12 users // the number of logged-in user sessions. This is the total number of sessions rather than the number of distinct users.
4. load average // the average system load over the last 1, 5, and 15 minutes
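
The four fields above can be pulled out of an uptime line programmatically. The following is only a minimal sketch: parse_uptime and UPTIME_RE are hypothetical names made up for this illustration, and the pattern assumes the English output format shown in this article (it may not match every uptime variant or locale).

```python
import re

# Hypothetical helper, not part of any library. The pattern assumes the
# English "up ..., N users, load average(s): a, b, c" layout shown above.
UPTIME_RE = re.compile(
    r"up\s+(?P<up>.+?),\s+(?P<users>\d+)\s+users?,\s+"
    r"load averages?:\s+(?P<l1>[\d.]+),?\s+(?P<l5>[\d.]+),?\s+(?P<l15>[\d.]+)"
)

def parse_uptime(line):
    """Return the uptime text, session count, and the three load averages."""
    match = UPTIME_RE.search(line)
    if match is None:
        raise ValueError(f"unrecognized uptime output: {line!r}")
    return {
        "up": match.group("up"),
        "users": int(match.group("users")),
        # Order: last 1 minute, last 5 minutes, last 15 minutes.
        "load": (
            float(match.group("l1")),
            float(match.group("l5")),
            float(match.group("l15")),
        ),
    }

sample = "10:19:04 up 257 days, 12 users, load average: 2.10, 2.10, 2.09"
info = parse_uptime(sample)
print(info["up"], info["users"], info["load"])
```

The same function also accepts the space-separated "load averages: 0.65 0.42 0.36" form, because the commas between the three values are optional in the pattern.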

The first three items are easy to understand. For the fourth, here is an easy-to-follow explanation found on the Internet.
Many people understand the load average like this: the three numbers represent the average system load over different periods (one minute, five minutes, and fifteen minutes), and the smaller the number, the better. A higher number means a higher server load, which may signal a problem on the server.

That is not the whole story. What factors make up the load average, and how can we tell whether the current value is "good" or "bad"? When should an unusual value worry us?

Before answering these questions, you need to understand what lies behind these values. The simplest case to start with is a server with one single-core processor.

A single-core processor is like a single-lane road. Imagine that you collect the toll for a bridge on this road, and that at times you are busy with the vehicles that want to cross. To do the job you need some information, such as how many vehicles are on the bridge and how many are waiting to cross. If no vehicles are waiting, you can wave a driver straight through. If there is a long line, you need to tell the drivers that it may take a while.


So we need some specific numbers to describe the current traffic conditions, for example:

• 0.00 means there is no traffic on the bridge at all. In fact, anything between 0.00 and 1.00 is much the same: traffic flows smoothly and arriving vehicles can cross without waiting.

• 1.00 means the bridge is exactly at capacity. This is not bad yet, but traffic is tight, and any additional vehicles will start to slow things down.
• Above 1.00, the bridge is over capacity and vehicles are queuing. How bad is it? 2.00 means there is twice as much traffic as the bridge can carry: one bridge's worth of vehicles is crossing and another bridge's worth is waiting. 3.00 is worse still: the traffic is three times what the bridge can handle, with two bridges' worth of vehicles waiting.


The above situation is very similar to processor load. The time a car spends crossing the bridge is like the time the processor spends running a thread. Unix systems call this quantity the run-queue length: the number of processes currently running on the processor cores plus the number of processes waiting in the queue.

Like the toll collector, you would rather your cars never had to wait. Ideally, then, the load average stays below 1.00. Occasional peaks above 1.00 are nothing unusual, but if the load stays there for a long time, there is a problem and you should start to worry.

  "So you are saying the ideal load is 1.00?"

Well, not exactly. A load of 1.00 means the system has no headroom left. In practice, experienced system administrators draw the line at 0.70:

• The "needs investigation" rule: 0.70. If your load average stays around 0.70 for a long time, take the time to find the cause before things get worse.

• The "fix it now" rule: 1.00. If your load average lingers at 1.00 for a long time, fix the problem immediately. Otherwise you will get a call from your boss in the middle of the night, which is never pleasant.

• The "workout at half past three in the morning" rule: 5.00. If your load average exceeds 5.00, you will lose sleep and then have to explain the cause in a meeting. In short, never let it get that far.
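
The three rules above amount to a small threshold table. The sketch below encodes them for a single-core host; classify_load and its labels are hypothetical names chosen for this illustration, and the rules in the text apply to load that persists over time, which a single reading cannot show.

```python
# Thresholds come from the three rules above (0.70, 1.00, 5.00).
# The function name and the labels are made up for this example.
def classify_load(load_per_core):
    """Map a load value for one core to the rule it triggers."""
    if load_per_core > 5.00:
        return "emergency"
    if load_per_core >= 1.00:
        return "fix now"
    if load_per_core >= 0.70:
        return "investigate"
    return "ok"

for value in (0.36, 0.70, 1.00, 2.10, 5.50):
    print(f"{value:.2f} -> {classify_load(value)}")
```

Treating the boundaries as inclusive (0.70 already counts as "investigate") is a choice made here for simplicity; the text only says "around 0.70".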

So what about multiple processors? My load average is 3.00, but the system is running fine!

Wow, you have a four-processor host? Then a load average of 3.00 is perfectly normal.
On a multi-processor system, the load average has to be read relative to the number of processor cores. Full (100%) load corresponds to 1.00 on a single-processor host, 2.00 on a dual-processor host, and 4.00 on a host with four processors.

Back to our bridge metaphor. I said that 1.00 describes a "one-lane road". On a single-lane bridge, 1.00 means the bridge is full of cars. On a dual-processor system the capacity is doubled, so at 1.00 there is still 50% of the system's resources left, because another lane is available.


So where a single processor is fully loaded at 1.00, a dual processor is fully loaded at 2.00, because it has twice the resources available.

Multi-core and multi-processor

Let's look at the difference between a multi-core processor and multiple processors. From a performance perspective, a host with one multi-core processor is roughly equivalent to a host with the same number of single-core processors. Of course, reality is more complicated: differences in cache size, processor frequency, and other factors can cause performance differences.

However, even though these factors make actual performance differ slightly, what matters when interpreting the load average is the total number of processor cores. This gives us two new rules:

• The "number of cores = maximum load" rule: on a multi-core system, your load average should not be higher than the total number of processor cores.
• The "a core is a core" rule: how the cores are distributed across physical processors does not matter. Two quad-core processors, four dual-core processors, and eight single-core processors all amount to the same thing: eight processor cores.
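
The two rules above come down to dividing the load by the core count. Here is a minimal sketch: load_per_core and within_core_limit are hypothetical helper names, while os.getloadavg() and os.cpu_count() are standard-library calls. Note that os.cpu_count() reports logical CPUs, which may be more than the number of physical cores, and that os.getloadavg() is only available on Unix-like systems.

```python
import os

def load_per_core(load, cores):
    """Scale a load average so that 1.00 means 'all cores fully used'."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return load / cores

def within_core_limit(load, cores):
    """The 'number of cores = maximum load' rule from the text."""
    return load <= cores

# Example from the text: a load of 3.00 on a four-processor host.
print(load_per_core(3.00, 4))        # 0.75
print(within_core_limit(3.00, 4))    # True

# Live values, where the platform provides them.
try:
    cores = os.cpu_count() or 1      # cpu_count() may return None
    one, five, fifteen = os.getloadavg()
    print(f"{cores} logical CPUs, 15-minute load per CPU: "
          f"{load_per_core(fifteen, cores):.2f}")
except (AttributeError, OSError):
    print("load average is not available on this platform")
```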


Let's take a look at some uptime output.
uptime
up 14 days, 7 users, load averages: 0.65 0.42 0.36

This is a dual-core processor, and the output shows plenty of idle capacity. In practice, even when its load peaks at 1.7, I have never had to worry about it.

So why are there three numbers? That can be confusing. We know that 0.65, 0.42, and 0.36 are the average system load over the last minute, the last five minutes, and the last fifteen minutes, respectively. This raises another question:

Which number should we use? One minute? Five minutes? Or fifteen minutes?

For the thresholds we have discussed, I think you should focus on the five- or fifteen-minute average. Frankly, if the load over the last minute is 1.00, the server can still be considered normal. However, if the fifteen-minute average stays at 1.00, it deserves attention (in my experience, this is the point at which you should add processors).
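
One way to act on this advice is to check the fifteen-minute value first and treat a high one-minute value on its own as a short spike. The sketch below is an assumption about how to combine the three numbers, not a standard algorithm; assess and its labels are hypothetical.

```python
# Hypothetical helper: weigh the long averages more heavily than the
# one-minute value, as the paragraph above suggests.
def assess(loads, cores):
    """loads is (1-minute, 5-minute, 15-minute); cores is the core count."""
    one, five, fifteen = (value / cores for value in loads)
    if fifteen >= 1.00:
        return "sustained"   # high for a long time: worth attention
    if five >= 1.00:
        return "building"    # not yet long-lived, but no longer a blip
    if one >= 1.00:
        return "spike"       # only the last minute is high
    return "ok"

print(assess((2.10, 2.10, 2.09), cores=1))   # sustained
print(assess((1.20, 0.50, 0.40), cores=1))   # spike
print(assess((0.65, 0.42, 0.36), cores=2))   # ok
```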

So how do I know how many processor cores my system has?

In Linux, you can use

cat /proc/cpuinfo

to obtain information about each processor on your system. If you only want a number, the count of CPUs, use the following command:

grep 'model name' /proc/cpuinfo | wc -l
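
The same count can be made from a script. This is a minimal sketch that mirrors the grep command by counting lines containing "model name"; count_model_names is a hypothetical name, the sample text is invented for the example, and on some architectures /proc/cpuinfo may not contain "model name" lines at all.

```python
def count_model_names(cpuinfo_text):
    """Count lines containing 'model name', like grep ... | wc -l."""
    return sum(1 for line in cpuinfo_text.splitlines() if "model name" in line)

# Invented two-CPU excerpt in the layout used by /proc/cpuinfo.
sample = """processor\t: 0
model name\t: Example CPU
processor\t: 1
model name\t: Example CPU
"""
print(count_model_names(sample))   # 2

# On a Linux host, the real file can be read the same way.
try:
    with open("/proc/cpuinfo") as handle:
        print(count_model_names(handle.read()))
except OSError:
    print("/proc/cpuinfo is not available on this system")
```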
