So here is the question: who gets killed? The first reaction of anyone with a little Linux kernel knowledge is usually "whoever uses the most memory gets killed". That is indeed an important factor the kernel considers, but it is not the whole story. Looking at the kernel source, we can see that the victim is actually determined by /proc/<pid>/oom_score; every process has one, and its value is computed by the kernel's badness() function. Let's read the badness() function carefully.
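If you want to look at these values for a running process, you can simply read /proc/<pid>/oom_score (the computed score) and /proc/<pid>/oom_adj (the user-tunable adjustment, discussed below). The minimal sketch below just prints both for the current process; the helper function and buffer size are only illustrative.

#include <stdio.h>
#include <stdlib.h>

/* Read a small /proc file and print its content with a label. */
static void print_proc_value(const char *path)
{
    char buf[64];
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        perror(path);
        return;
    }
    if (fgets(buf, sizeof(buf), fp) != NULL)
        printf("%s: %s", path, buf);   /* the file content already ends with '\n' */
    fclose(fp);
}

int main(void)
{
    /* Every process has these two entries: oom_score is the value computed
       by badness(), oom_adj is the adjustment a user can set. */
    print_proc_value("/proc/self/oom_score");
    print_proc_value("/proc/self/oom_adj");
    return EXIT_SUCCESS;
}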
The comment block at the top of the badness() function states the ideas behind its algorithm:
1) We lose the minimum amount of work done
2) We recover a large amount of memory
3) We don't kill anything innocent of eating tons of memory
4) We want to kill the minimum amount of processes (one)
5) We try to kill the process the user expects us to kill; this algorithm has been meticulously tuned to meet the principle of least surprise ... (be careful when you set it)
In short, it tries to kill the smallest number of processes (ideally one) while reclaiming the largest amount of memory, which is consistent with killing the process that consumes the most memory.
/*
 * The memory size of the process is the basis for the badness.
 */
points = p->mm->total_vm;
The score starts from the size of the process's memory (total_vm). Note that swap is not counted here: the OOM killer's score reflects the memory footprint of the process itself and has nothing to do with swap. The more memory a process occupies, the higher its score and the more likely it is to be sacrificed.
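As a side note, you can see the number the score starts from in the VmSize line of /proc/<pid>/status, which reports total_vm in kB. The minimal sketch below just scans that file for the current process; the parsing is purely illustrative.

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void)
{
    /* VmSize in /proc/<pid>/status is the process's total_vm in kB,
       i.e. the value the badness() score starts from. */
    char line[256];
    FILE *fp = fopen("/proc/self/status", "r");
    if (fp == NULL) {
        perror("/proc/self/status");
        return EXIT_FAILURE;
    }
    while (fgets(line, sizeof(line), fp) != NULL) {
        if (strncmp(line, "VmSize:", 7) == 0) {
            printf("%s", line);        /* e.g. "VmSize:     4212 kB" */
            break;
        }
    }
    fclose(fp);
    return EXIT_SUCCESS;
}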
/*
 * Processes which fork a lot of child processes are likely
 * a good choice. We add the vmsize of the childs if they
 * have an own mm. This prevents forking servers to flood the
 * machine with an endless amount of childs
 */
...
if (chld->mm != p->mm && chld->mm)
        points += chld->mm->total_vm;
This means that the memory occupied by child processes (those with their own mm) is added to the parent's score.
s = int_sqrt(cpu_time);
if (s)
        points /= s;
s = int_sqrt(int_sqrt(run_time));
if (s)
        points /= s;
This means that the more CPU time a process has consumed, and the longer it has been running, the lower its score and the less likely it is to be killed. For example, with cpu_time = 10000 (in the kernel's internal units) the points are divided by int_sqrt(10000) = 100.
/*
 * Niced processes are most likely less important, so double
 * their badness points.
 */
if (task_nice(p) > 0)
        points *= 2;
If the process has a positive nice value (positive means lower priority, negative means higher priority), its points are doubled.
/*
 * Superuser processes are usually more important, so we make it
 * less likely that we kill those.
 */
if (cap_t(p->cap_effective) & CAP_TO_MASK(CAP_SYS_ADMIN) ||
    p->uid == 0 || p->euid == 0)
        points /= 4;
Superuser processes have their score divided by 4, so they are less likely to be killed.
/*
 * We don't want to kill a process with direct hardware access.
 * Not only could that mess up the hardware, but usually users
 * tend to only have this flag set on applications they think
 * of as important.
 */
if (cap_t(p->cap_effective) & CAP_TO_MASK(CAP_SYS_RAWIO))
        points /= 4;
Processes with direct (raw) hardware access also have their score divided by 4, so they too are less likely to be killed.
/*
 * Adjust the score by oomkilladj.
 */
if (p->oomkilladj) {
        if (p->oomkilladj > 0)
                points <<= p->oomkilladj;
        else
                points >>= -(p->oomkilladj);
}
Every process has an oomkilladj value (exposed through /proc/<pid>/oom_adj) that adjusts how likely the process is to be killed. The maximum is +15 and the minimum is -17; the larger the value, the more likely the process is to be killed. Because the value is applied as a bit shift, its impact is large: oom_adj = 10 multiplies the score by 2^10 = 1024, while oom_adj = -10 divides it by 1024, and the special value -17 disables OOM killing for the process entirely.
Here is a small program I wrote to experiment:
#define GIGABYTE (1024*1024*1024)
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    char *myblock = NULL;

    /* Reserve 1 GB of address space up front. */
    myblock = (char *) malloc(GIGABYTE);
    if (myblock == NULL) {
        printf("malloc failed\n");
        exit(1);
    }
    printf("Currently allocating 1GB\n");
    sleep(1);

    /* Touch the block 100 MB at a time so the pages actually get
       backed by physical memory. */
    int count = 0;
    while (count < 10)
    {
        memset(myblock, 1, 100*1024*1024);
        myblock = myblock + 100*1024*1024;
        count++;
        printf("Currently allocating %d00 MB\n", count);
        sleep(10);
    }
    exit(0);
}
The program first requests 1 GB of memory, then fills it in 100 MB steps. I ran three instances of it on a machine with 2 GB of RAM and 400 MB of swap. Let's look at the results:
test1, test2 and test3 each request 1 GB of virtual memory (VIRT), and every 10 seconds their resident memory (RES) grows by another 100 MB.
When physical memory runs short, the OS starts swapping and the available swap space begins to shrink.
Once there is no memory left to allocate, the test1 process is killed by the operating system. In dmesg we can see that test1 was killed by the OOM killer and that its oom_score was 1000.
The oom_adj of all three processes was left at the default value of 0. Now let's experiment with the effect of setting oom_adj. Restart the three processes; this time the PID of test2 is 12640.
Let's run the following command:
echo 15 > /proc/12640/oom_adj
After a while, swap space shrinks sharply, which basically means the OOM killer is about to be triggered.
Sure enough, and to nobody's surprise, process 12640 was the one that got killed.
So, to protect a process you cannot afford to lose, you can set its oom_adj accordingly. Of course, some people will say that all of this is caused by overcommitting memory in the first place: Linux provides vm.overcommit_memory to disable the overcommit feature, so why not just disable it? That has both advantages and disadvantages. Once overcommit is disabled, MySQL simply cannot request more memory than is actually available; but MySQL allocates memory dynamically in many places, and if such an allocation fails, MySQL will crash, which greatly increases the risk of downtime. That is why Linux overcommits by default.
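To make the overcommit trade-off concrete, here is a minimal sketch that keeps reserving 100 MB blocks without ever writing to them. With overcommit enabled the reservations can run far past physical RAM plus swap (on a 64-bit machine the loop may run for quite a while before malloc() fails); with vm.overcommit_memory = 2 (strict accounting) malloc() starts returning NULL much earlier, which is exactly the kind of allocation failure that would crash MySQL. The block size and messages are arbitrary.

#include <stdio.h>
#include <stdlib.h>

#define BLOCK_MB 100
#define BLOCK_BYTES (BLOCK_MB * 1024UL * 1024UL)

int main(void)
{
    unsigned long reserved_mb = 0;

    /* Keep reserving 100 MB blocks without ever touching them. With
       overcommit enabled (vm.overcommit_memory = 0 or 1) the reserved total
       can grow far beyond physical RAM plus swap, because no pages are
       actually committed yet; with strict accounting (vm.overcommit_memory = 2)
       malloc() starts returning NULL once the commit limit is reached. */
    while (malloc(BLOCK_BYTES) != NULL) {
        reserved_mb += BLOCK_MB;
        printf("reserved %lu MB of virtual memory so far\n", reserved_mb);
    }

    printf("malloc() failed after reserving %lu MB\n", reserved_mb);
    return 0;
}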
From the analysis above we can see that, unless oom_adj is set, MySQL will usually be the OOM killer's preferred victim, simply because MySQL is usually the largest memory consumer on the machine. So how can MySQL reduce the risk of being killed? In the next chapter we will look at how to avoid OOM from MySQL's own perspective.