More and more people are getting started with Linux, whether on a VPS or on custom firmware flashed onto wireless routers (such as OpenWRT and Tomato). Along the way, the terms "Average system Load" or "Load Average" keep appearing on all kinds of server probe pages and system monitoring interfaces. Unlike Windows and Mac, which show CPU and memory usage as percentages, Linux represents the average system load with a few floating-point numbers separated by spaces. So what exactly do these numbers mean, and how can we use them to judge system load and stability?
Average system load - basic explanation
In the Linux shell, several commands show the load average, for example:
root@Slyar.com:~# uptime
12:49:10 up 182 days, 2 users, load average: 0.08, 0.04, 0.01
root@Slyar.com:~# w
12:49:18 up 182 days, 2 users, load average: 0.11, 0.07, 0.01
root@Slyar.com:~# top
top - 12:50:28 up 182 days, 2 users, load average: 0.02, 0.05, 0.00
First, a rough description of the three numbers: they are the average number of processes in the run queue over the past 1 minute, 5 minutes, and 15 minutes, respectively.
The run queue: every process that is not waiting on I/O, not in a WAIT state, and not killed goes into this queue.
There is also a more direct way to display the system's load average:
root@Slyar.com:~# cat /proc/loadavg
0.10 0.06 0.01 1/72 29632
The first three numbers are the load averages described above. In the fraction that follows, the denominator is the total number of processes on the system and the numerator is the number of processes currently running. The last number is the ID of the most recently created process.
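To make the field layout concrete, here is a minimal sketch that splits one line in /proc/loadavg format into labelled values. The function name parse_loadavg and the labels it prints are made up for illustration; they are not part of any standard tool.

```shell
#!/bin/sh
# parse_loadavg is a hypothetical helper, not a standard command.
# $1: one line in /proc/loadavg format, e.g. "0.10 0.06 0.01 1/72 29632"
parse_loadavg() {
    echo "$1" | awk '{
        split($4, procs, "/")      # fourth field has the form "running/total"
        print "load_1min="  $1
        print "load_5min="  $2
        print "load_15min=" $3
        print "running="    procs[1]
        print "total="      procs[2]
        print "last_pid="   $5
    }'
}

# Usage: parse the sample line shown in the article.
parse_loadavg "0.10 0.06 0.01 1/72 29632"
```

On a live system the argument could be "$(cat /proc/loadavg)" instead of the sample line.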
Average system load - advanced explanation
The explanation above barely explains anything. I wrote this article because I came across an article about Load Average by a foreign author and thought it explained the topic very well, so I decided to excerpt parts of it and translate them in my own words.
@Scoutapp Thanks for your article "Understanding Linux CPU Load". I have simply translated it to share with Chinese audiences.
To better understand system load, we use road traffic as an analogy.
1. Single-core CPU - single lane - numbers between 0.00 and 1.00 are normal
The traffic controller tells drivers what to do: if the road ahead is congested, they wait; if it is clear, they drive straight through.
Specifically:
A number between 0.00 and 1.00 means road conditions are very good and there is no congestion, so vehicles pass through unhindered.
1.00 means the road is still working normally but is at capacity, so conditions may deteriorate into congestion. At this point the system has no spare resources, and the administrator should optimize it.
Anything above 1.00 means road conditions are poor. At 2.00, as many vehicles are waiting as are already on the bridge. In this case, you must investigate.
2. Multi-core CPU - multiple lanes - the number divided by the number of CPU cores should be between 0.00 and 1.00
For multi-core CPUs, full load is "1.00 * number of CPU cores": 2.00 for a dual-core CPU and 4.00 for a quad-core CPU.
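The per-core arithmetic can be sketched in a few lines of shell. The helper name per_core_load is hypothetical, and the sketch assumes the core count passed in is a positive number.

```shell
#!/bin/sh
# per_core_load is a hypothetical helper: it divides a load figure by the
# number of CPU cores and prints the result with two decimals.
# $1: load average, $2: number of CPU cores (assumed to be positive)
per_core_load() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

per_core_load 2.00 2   # dual-core CPU at full load
per_core_load 3.00 4   # quad-core CPU with one lane still free
```

A result of 1.00 corresponds to every lane being full; anything below it leaves spare capacity.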
3. Safe average system load
The author believes that a load below 0.7 per core is safe, and that anything above 0.7 calls for optimization.
4. Which number should I check? 1 minute, 5 minutes, or 15 minutes?
The author thinks the 5-minute and 15-minute figures are the better indicators, that is, the last two of the three numbers.
5. How do I know how many cores my CPU has?
Use the following command to get the number of CPU cores directly:
grep 'model name' /proc/cpuinfo | wc -l
Conclusion
Get the number of CPU cores N, look at the last two of the three load numbers, and divide each by N. If the results are below 0.7, you don't have to worry.
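The rule of thumb above could be scripted roughly as follows. This is only a sketch: check_load is a made-up name, the words "ok" and "check" are arbitrary labels, the second sample line is invented for illustration, and N is assumed to be positive.

```shell
#!/bin/sh
# check_load is a hypothetical helper implementing the rule of thumb:
# print "ok" when both the 5-minute and 15-minute loads, divided by the
# core count, are below 0.7; otherwise print "check".
# $1: one line in /proc/loadavg format, $2: number of CPU cores N (positive)
check_load() {
    echo "$1" | awk -v n="$2" '{
        if ($2 / n < 0.7 && $3 / n < 0.7) print "ok"; else print "check"
    }'
}

check_load "0.10 0.06 0.01 1/72 29632" 1   # sample line from the article
check_load "3.20 3.10 2.95 4/120 1234" 4   # invented busy quad-core example
```

On a live system, the first argument could be "$(cat /proc/loadavg)" and the second the count printed by the grep command from section 5.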