Understanding the load average on Linux systems

Source: Internet
Author: User

I. What is the system load average?
On Linux, the uptime, w, and top commands all print the system load average. So what is the system load average?
The system load average is the average number of processes in the run queue over a given time interval. (Linux also counts tasks in uninterruptible sleep, typically waiting on disk I/O, so its figure can be higher than the number of runnable processes alone.) In the classic definition, a process is in the run queue when all of the following are true:
- It is not waiting for the result of an I/O operation
- It has not voluntarily entered a wait state (that is, it has not called wait)
- It has not been stopped (for example, it is not waiting to terminate)
For example:

The code is as follows:
[root@opendigest root]# uptime

7:51pm up 2 days, 5:43, 2 users, load average: 8.13, 5.90, 4.94

The last three numbers in the output are the average number of processes in the run queue over the past 1, 5, and 15 minutes.

As a rule of thumb, system performance is good as long as the number of active processes per CPU is no greater than 3. If the number of tasks per CPU is greater than 5, the machine has a serious performance problem. In the example above, assuming the system has two CPUs, the number of tasks per CPU is 8.13 / 2 = 4.065. That is above 3 but below 5, so performance is no longer good but is still acceptable.
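The rule of thumb above can be written down as a small calculation. The following Python sketch is only an illustration: the function names `per_cpu_load` and `classify` and the three labels are assumptions made for this example, while the thresholds 3 and 5 and the two-CPU figure come from the text.

```python
def per_cpu_load(load_average, cpus):
    """Return the load average divided by the number of CPUs."""
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    return load_average / cpus


def classify(load_average, cpus):
    """Apply the article's thresholds: <= 3 good, > 5 serious, otherwise acceptable."""
    value = per_cpu_load(load_average, cpus)
    if value <= 3:
        return "good"
    if value > 5:
        return "serious problem"
    return "acceptable"


if __name__ == "__main__":
    # The uptime example: 1-minute load 8.13 on an assumed two-CPU machine.
    print(per_cpu_load(8.13, 2), classify(8.13, 2))
```

The same function can be applied to the 5- and 15-minute figures to see whether a high value is a short spike or a sustained condition.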

II. The load average algorithm
The figures above are produced by sampling the number of active processes every 5 seconds and folding each sample into an exponentially damped average. If the result, divided by the number of CPUs, is higher than 5, the system is overloaded. The algorithm (extracted from the Linux 2.4 kernel source) is as follows:

File: include/linux/sched.h:

The code is as follows:
#define FSHIFT 11 /* nr of bits of precision */
#define FIXED_1 (1<<FSHIFT) /* 1.0 as fixed-point */
#define LOAD_FREQ (5*HZ) /* 5 sec intervals */
#define EXP_1 1884 /* 1/exp(5sec/1min) as fixed-point, 2048/pow(exp(1), 5.0/60) */
#define EXP_5 2014 /* 1/exp(5sec/5min), 2048/pow(exp(1), 5.0/300) */
#define EXP_15 2037 /* 1/exp(5sec/15min), 2048/pow(exp(1), 5.0/900) */
#define CALC_LOAD(load,exp,n) \
	load *= exp; \
	load += n*(FIXED_1-exp); \
	load >>= FSHIFT;

File: kernel/timer.c:

The code is as follows:
unsigned long avenrun[3];
static inline void calc_load(unsigned long ticks)
{
	unsigned long active_tasks; /* fixed-point */
	static int count = LOAD_FREQ;
	count -= ticks;
	if (count < 0) {
		count += LOAD_FREQ;
		active_tasks = count_active_tasks();
		CALC_LOAD(avenrun[0], EXP_1, active_tasks);
		CALC_LOAD(avenrun[1], EXP_5, active_tasks);
		CALC_LOAD(avenrun[2], EXP_15, active_tasks);
	}
}
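To see how the fixed-point update behaves, the arithmetic can be replayed outside the kernel. The Python sketch below is an illustration, not kernel code. It assumes FSHIFT is 11 (implied by the 2048 in the comments) and that the sampled task count is scaled by FIXED_1 before being passed in, since active_tasks is marked as fixed-point. The helper names `decay_constant` and `simulate` are made up for this example.

```python
import math

FSHIFT = 11                 # bits of fractional precision (2048 == 1 << 11)
FIXED_1 = 1 << FSHIFT       # 1.0 in fixed-point
EXP_1, EXP_5, EXP_15 = 1884, 2014, 2037


def decay_constant(minutes):
    """Recompute a decay factor: FIXED_1 / e^(5 s / window), rounded."""
    return round(FIXED_1 / math.exp(5.0 / (60.0 * minutes)))


def calc_load(load, exp, n):
    """One CALC_LOAD step on fixed-point integers."""
    load *= exp
    load += n * (FIXED_1 - exp)
    return load >> FSHIFT


def simulate(active_tasks, samples, exp=EXP_1):
    """Feed a constant task count for `samples` 5-second ticks, starting from 0."""
    load = 0
    for _ in range(samples):
        load = calc_load(load, exp, active_tasks * FIXED_1)
    return load / FIXED_1


if __name__ == "__main__":
    print([decay_constant(m) for m in (1, 5, 15)])
    print(simulate(2, 12))   # one minute of two active tasks
```

With two tasks constantly active, the 1-minute figure reaches roughly 1.26 after twelve samples (one minute) and only approaches 2 after several minutes. This is why the load average lags behind sudden changes in activity.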

File: fs/proc/proc_misc.c:

The code is as follows:
#define LOAD_INT(x) ((x) >> FSHIFT)
#define LOAD_FRAC(x) LOAD_INT(((x) & (FIXED_1-1)) * 100)
static int loadavg_read_proc(char *page, char **start, off_t off,
	int count, int *eof, void *data)
{
	int a, b, c;
	int len;
	a = avenrun[0] + (FIXED_1/200);
	b = avenrun[1] + (FIXED_1/200);
	c = avenrun[2] + (FIXED_1/200);
	len = sprintf(page, "%d.%02d %d.%02d %d.%02d %ld/%d %d\n",
		LOAD_INT(a), LOAD_FRAC(a),
		LOAD_INT(b), LOAD_FRAC(b),
		LOAD_INT(c), LOAD_FRAC(c),
		nr_running(), nr_threads, last_pid);
	return proc_calc_metrics(page, start, off, count, eof, len);
}
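The conversion from a fixed-point value to the two-decimal text in /proc/loadavg can be mirrored in a few lines. This Python sketch is an illustration under the assumption that FSHIFT is 11; the function name `format_load` is hypothetical. Adding FIXED_1/200 corresponds to adding 0.005, which rounds the value to the nearest hundredth before the fraction is truncated.

```python
FSHIFT = 11
FIXED_1 = 1 << FSHIFT


def load_int(x):
    """Integer part of a fixed-point value (LOAD_INT)."""
    return x >> FSHIFT


def load_frac(x):
    """First two decimal digits of a fixed-point value (LOAD_FRAC)."""
    return load_int((x & (FIXED_1 - 1)) * 100)


def format_load(raw):
    """Format one avenrun entry the way the proc handler does."""
    a = raw + FIXED_1 // 200   # add 0.005 so truncation rounds to nearest
    return "%d.%02d" % (load_int(a), load_frac(a))


if __name__ == "__main__":
    for raw in (0, 1024, 2048, 2588):
        print(raw, format_load(raw))
```

For instance, a raw value of 1024 is exactly half of FIXED_1 and is printed as 0.50.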

III. The meaning of the data in /proc/loadavg
The /proc file system is a virtual file system. It takes up no disk space and reflects the state of the running operating system held in memory; reading the files under /proc shows the current state of the system. Use the "cat /proc/loadavg" command to view the system load average. The output looks like this:
0.27 0.36 0.37 4/83 4828
The first three numbers are, as described above, the average number of processes over 1, 5, and 15 minutes. (Some people take them for a percentage of system load, but they are not; values of 200 or more sometimes appear.) Of the last two fields, the first is a fraction whose numerator is the number of processes currently running and whose denominator is the total number of processes; the second is the ID of the most recently created process.
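The five fields can be split apart programmatically. The following Python sketch is a minimal example; the function name `parse_loadavg` and the dictionary keys are assumptions chosen for illustration, and the sample line is the one shown above.

```python
def parse_loadavg(line):
    """Split one /proc/loadavg line into its five documented fields."""
    fields = line.split()
    load_1, load_5, load_15 = (float(v) for v in fields[:3])
    running, total = (int(v) for v in fields[3].split("/"))
    return {
        "load_1": load_1,
        "load_5": load_5,
        "load_15": load_15,
        "running": running,        # numerator: processes currently running
        "total": total,            # denominator: total number of processes
        "last_pid": int(fields[4]),
    }


if __name__ == "__main__":
    print(parse_loadavg("0.27 0.36 0.37 4/83 4828"))
```

On a live system the same function can be fed the contents of /proc/loadavg; Python's standard `os.getloadavg()` returns just the three averages.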

IV. Common commands for viewing the system load average
1, cat /proc/loadavg

The code is as follows:

cat /proc/loadavg

2, uptime
Name: uptime
Permissions: all users
Usage: uptime [-V]
Description: Without additional parameters, uptime reports the following information:
the current time, how long the system has been running, the number of users currently logged in, and the system load averages for the past 1, 5, and 15 minutes
Parameters: -V displays version information.
Example: uptime
The result is:

The code is as follows:
10:41am up 5 days, min, 1 users, load average: 0.00, 0.00, 1.99

3, w
Function description: Displays information about the users currently logged in to the system.
Syntax: w [-fhlsuV] [user name]
Supplementary note: This command shows which users are currently logged in to the system and what programs they are running. Run on its own, w displays all users; you can also specify a user name to display information about that user only.
Parameters
-f toggles the display of where each user logged in from.
-h does not display the header line.
-l uses the long format, which is the default.
-s uses the short format, which omits the login time and the JCPU and PCPU times.
-u ignores the user name when working out the current process and CPU times.
-V displays version information.
4, top
Function description: Displays and manages running processes.
Syntax: top [bciqsS] [d <interval seconds>] [n <number of iterations>]
Supplementary note: The top command shows the processes currently running on the system and lets you manage them with hotkeys through its interactive interface.
Parameters
b uses batch mode.
c lists the full command line of each process, including the command name, path, and arguments.
d <interval seconds> sets the interval between screen updates, in seconds.
i ignores idle and zombie processes.
n <number of iterations> sets how many times the display is updated before top exits.
q refreshes continuously, without any delay between updates.
s runs top in secure mode, which disables potentially dangerous interactive commands.
S uses cumulative mode; the effect is similar to the "-S" parameter of the ps command.
5, tload
Function description: Displays the system load.
Syntax: tload [-V] [-d <interval seconds>] [-s <scale>] [terminal]
Supplementary note: The tload command draws a simple graph of the system load in text mode using ASCII characters. If no terminal is given, the graph is displayed on the terminal where tload is run.
Parameters
-d <interval seconds> sets the interval between load samples, in seconds.
-s <scale> sets the vertical scale of the graph, in characters between graph ticks.
-V displays version information.

V. Load average: a more detailed interpretation
To understand the system load better, we can use road traffic as an analogy.

1, Single-core CPU = single-lane road: values between 0.00 and 1.00 are normal

A traffic controller tells drivers what to do: if the road ahead is congested, the driver waits; if the road ahead is clear, the driver can go straight through.

Specifically:

Values between 0.00 and 1.00 mean that traffic is flowing well. There is no congestion and vehicles pass without hindrance.

1.00 means the road is still working normally but is exactly full; any more traffic will cause congestion. At this point the system has no spare capacity, and the administrator should start optimizing.

Values above 1.00 mean the road is congested. At 2.00, as many vehicles are waiting as are on the bridge. You must investigate.

2, Multi-core CPU = multi-lane road: the value divided by the number of CPU cores should be between 0.00 and 1.00

On a multi-core CPU, full load corresponds to "1.00 * number of CPU cores": 2.00 for a dual-core CPU and 4.00 for a quad-core CPU.

3, A safe system load average

The author considers a per-core load below 0.7 to be safe; above 0.7, the system needs optimizing.

4, Which number should you look at: 1 minute, 5 minutes, or 15 minutes?

The author thinks it is better to look at the 5-minute and 15-minute values, that is, the last two numbers.

5, How do I find out how many cores my CPU has?

The number of CPU cores can be obtained directly with the following command:

The code is as follows:

grep 'model name' /proc/cpuinfo | wc -l
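The same count can be taken in a script. The Python sketch below mirrors the grep pipeline by counting lines that contain "model name"; the function name `count_model_name_lines` and the sample text are made up for illustration. Note that /proc/cpuinfo has one such entry per logical processor, and some architectures do not print a "model name" line at all, in which case the standard `os.cpu_count()` is a more portable choice.

```python
import os


def count_model_name_lines(cpuinfo_text):
    """Count lines containing 'model name', like grep 'model name' | wc -l."""
    return sum(1 for line in cpuinfo_text.splitlines() if "model name" in line)


if __name__ == "__main__":
    # Hypothetical two-processor excerpt in the /proc/cpuinfo layout.
    sample = (
        "processor\t: 0\n"
        "model name\t: Example CPU\n"
        "processor\t: 1\n"
        "model name\t: Example CPU\n"
    )
    print(count_model_name_lines(sample))
    print(os.cpu_count())  # portable alternative from the standard library
```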

Conclusion

Get the number of CPU cores N, take the last two load average figures (5 minutes and 15 minutes), and divide each by N. If the results are below 0.7, there is nothing to worry about.
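The conclusion amounts to a two-line check. The Python sketch below is one possible way to write it; the function name `load_is_safe` is an assumption for this example, and the 0.7 threshold is the author's rule of thumb rather than a fixed limit.

```python
import os


def load_is_safe(load_5, load_15, cores, threshold=0.7):
    """True if both the 5- and 15-minute loads per core are below the threshold."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return all(value / cores < threshold for value in (load_5, load_15))


if __name__ == "__main__":
    cores = os.cpu_count() or 1
    try:
        _, load_5, load_15 = os.getloadavg()
    except OSError:
        print("load average is not available on this platform")
    else:
        print(cores, load_5, load_15, load_is_safe(load_5, load_15, cores))
```

Applied to the earlier uptime example (5.90 and 4.94 on two CPUs), the check fails, which matches the reading that the machine is under heavy load.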
