Linux OOM Killer mechanism

Source: Internet
Author: User
Tags: system log

The Out of Memory (OOM) killer in Linux is a last resort for keeping enough memory available to the system. Once physical memory and the swap area are exhausted, the kernel uses an algorithm to judge which process is consuming the most resources, sends that process a signal, and forces it to terminate.

In short, when memory runs out the kernel looks for processes that consume too much memory, especially those that consume a lot of memory quickly, and kills one of them so that the system does not run out of memory completely.

Even when no memory can be obtained, this mechanism can run repeatedly to free memory, keep the system from stalling, and identify the processes that consume excessive memory.


A typical situation: one day a machine can suddenly no longer be reached over ssh or telnet, but it still answers ping. That rules out a network fault or a crashed machine, so most likely the sshd process was killed by the OOM killer.

After restarting the machine, the system log /var/log/messages will contain an error message similar to "Out of memory: Kill process 247 (sshd)".
Another situation can also leave ping working while ssh fails: too many network connections exhausting the system's file descriptors. That case is not considered here.
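
To check a log for such messages without reading it by hand, something like the following can help. It is only a sketch: the function name oom_victims is made up for illustration, and the exact message wording differs between kernel versions ("Kill process" on older kernels, "Killed process" on newer ones), so the pattern is an assumption that may need adjusting.

```shell
# List "PID name" for every process the log reports as killed by the
# OOM killer.
# Usage: oom_victims [logfile]   (default: /var/log/messages)
oom_victims() {
    logfile="${1:-/var/log/messages}"
    # Accept both "Kill process" and "Killed process" wording.
    grep -iE 'out of memory: kill(ed)? process [0-9]+' "$logfile" |
        sed -E 's/.*[Kk]ill(ed)? process ([0-9]+) \(([^)]*)\).*/\2 \3/'
}
```

Running oom_victims with no argument reads /var/log/messages; on systems that log elsewhere, pass the path of the kernel log file instead.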

In high-availability setups that use VIPs, this situation also easily leads to split-brain.


To prevent an important system process from being killed when the OOM mechanism triggers, set /proc/<pid>/oom_adj to -17, which disables the OOM killer for that process. The kernel calculates a score for each process using a specific algorithm to decide which process to kill, and the OOM score of each process can be read from /proc/<pid>/oom_score. (From kernel 2.6.36 onward, /proc/<pid>/oom_score_adj, with a range of -1000 to 1000, is the preferred interface, and oom_adj is kept for compatibility.)
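
To see which processes the kernel currently ranks highest, the scores can be read for every PID and sorted. A minimal sketch follows; the function name top_oom_scores and its optional procdir argument are not from any tool, they are just illustrative (procdir exists only so the function can be tried against a mock directory).

```shell
# Print "score pid name" for the processes with the highest oom_score.
# Usage: top_oom_scores [count] [procdir]   (defaults: 5, /proc)
top_oom_scores() {
    count="${1:-5}"
    procdir="${2:-/proc}"
    for dir in "$procdir"/[0-9]*; do
        # Processes can exit while we iterate, so skip unreadable entries.
        [ -r "$dir/oom_score" ] || continue
        score=$(cat "$dir/oom_score" 2>/dev/null) || continue
        name=$(cat "$dir/comm" 2>/dev/null) || name="?"
        echo "$score ${dir##*/} $name"
    done | sort -rn | head -n "$count"
}
```

For example, top_oom_scores 10 prints the ten processes that would be considered first if the OOM killer ran right now.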

Important processes typically include sshd and certain monitoring daemons; choose which processes to protect according to your own situation.

Protecting a process from being killed by the kernel can be done like this:

echo -17 > /proc/$PID/oom_adj

You can write a simple script and run it from crontab to keep important processes from being OOM-killed:

pgrep -f "/usr/sbin/sshd" | while read PID; do echo -17 > /proc/$PID/oom_adj; done

The "/usr/sbin/sshd" pattern can be replaced by whichever process you consider important, but be careful not to match the wrong one.
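
One way to lower the risk of matching the wrong process is to match the process name exactly with pgrep -x and to report writes that fail. The following is only a sketch of that idea: protect_pid and protect_by_name are illustrative names, and the optional procdir argument is a hypothetical hook for trying the function against a mock directory.

```shell
# Write -17 to oom_adj for one PID and report whether it worked.
# Usage: protect_pid pid [procdir]   (procdir defaults to /proc)
protect_pid() {
    pid="$1"
    procdir="${2:-/proc}"
    file="$procdir/$pid/oom_adj"
    if [ ! -w "$file" ]; then
        echo "cannot write $file (process gone, or not root?)" >&2
        return 1
    fi
    echo -17 > "$file"
}

# Protect every process whose name is exactly "$1" (pgrep -x), rather
# than any process whose command line merely contains the pattern.
protect_by_name() {
    pgrep -x "$1" | while read -r pid; do
        protect_pid "$pid"
    done
}
```

Called as protect_by_name sshd from a root crontab entry, this does the same job as the one-liner above but will not touch, say, an editor that happens to have "/usr/sbin/sshd" on its command line.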


    • How the process is selected
When memory is exhausted, the OOM killer looks at all processes and calculates a score for each one separately. It then sends the signal to the process with the highest score.
How the score is calculated
Many factors are considered when the OOM killer calculates scores. The following items 1 to 9 are checked for each process.
1. First, the score is based on the virtual memory size of the process, which can be checked with the VSZ column of the ps command or the VmSize field of /proc/<pid>/status. The more virtual memory a process consumes, the higher its initial score: 1 KB counts as 1 point, so a process consuming 1 GB of memory scores about 1024*1024.
2. If the process is executing the swapoff system call, the score is set to the maximum value (the maximum of unsigned long). Disabling swap works against relieving the memory shortage, so such a process immediately becomes the OOM killer's target.
3. If the process is a parent process, half of the memory size of each of its child processes is added to its score.
4. The score is adjusted according to the CPU time the process has used and the time since it started, on the assumption that a process that has run longer or done more work is more important and should keep a lower score.
5. The score is doubled for processes whose priority has been lowered with the nice command, that is, commands started with nice -n set to 1~19.
6. Privileged processes are generally more important, so their score is reduced to 1/4.
7. A process that has the capability CAP_SYS_RAWIO set via capset(2) has its score reduced to 1/4, because a process that operates hardware directly is judged to be important.
8. Regarding cgroups: if the process is allowed only memory nodes that are completely different from the memory nodes allowed for the process that caused the OOM killer to run, its score is reduced to 1/8.
9. Finally, the score is adjusted by the proc file system value oom_adj.
All processes are scored according to the rules above, and SIGKILL is sent to the process with the highest score (up to Linux 2.6.10, SIGTERM was sent if the capability CAP_SYS_RAWIO was set, and SIGKILL otherwise).
The score of each process can be checked in /proc/<pid>/oom_score.
However, the init process (PID 1) cannot become an OOM killer target. When the target process has child processes, the signal is sent to its child processes first.
After the signal has been sent to the target process, the kernel goes through all the processes in the system and also sends the signal to any process that shares the same memory space as the target, even if its thread group (TGID) is different.
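
A few of the rules above (items 1, 5, 6 and 7) can be sketched as plain arithmetic. This is not the kernel's code, only a simplified illustration of the description in this article; the function names vmsize_kb and sketch_score and their argument order are made up for the example.

```shell
# Read the VmSize value (in KB) from a /proc/<pid>/status file.
# Usage: vmsize_kb /proc/<pid>/status
vmsize_kb() {
    awk '/^VmSize:/ { print $2 }' "$1"
}

# Simplified score following items 1, 5, 6 and 7 of the list above.
# Usage: sketch_score vm_kb nice privileged rawio
#   vm_kb       virtual memory size in KB (1 KB = 1 point)
#   nice        nice value; 1..19 doubles the score
#   privileged  1 if the process is privileged, else 0 (score / 4)
#   rawio       1 if CAP_SYS_RAWIO is set, else 0 (score / 4)
sketch_score() {
    score="$1"
    if [ "$2" -gt 0 ]; then score=$((score * 2)); fi
    if [ "$3" -eq 1 ]; then score=$((score / 4)); fi
    if [ "$4" -eq 1 ]; then score=$((score / 4)); fi
    echo "$score"
}
```

For instance, sketch_score "$(vmsize_kb /proc/self/status)" 0 0 0 simply echoes the shell's own virtual memory size in KB, which is the starting score before any adjustment.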


As for why -17 is used rather than some other value (the default is 0): this is defined by the Linux kernel, as the kernel source shows.
Taking the linux-3.3.6 kernel source as an example, the relevant file is include/linux/oom.h. According to the source, oom_adj can take values from -16 to 15, where 15 is the maximum and -16 the minimum, and -17 disables OOM killing for the process. The oom_score is scaled by 2 to the power n, where n is the oom_adj value of the process, so the higher the oom_score, the more likely the kernel is to kill the process.
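
The effect of oom_adj described here can be worked through with shell arithmetic. This sketch models only the power-of-two scaling that the text describes; kernels that use oom_score_adj compute the score differently, and the function name apply_oom_adj is just illustrative.

```shell
# Apply an oom_adj value to a base score as the text describes:
# -17 disables killing, otherwise the score is scaled by 2^oom_adj.
# Usage: apply_oom_adj base_score oom_adj
apply_oom_adj() {
    base="$1"
    adj="$2"
    if [ "$adj" -lt -17 ] || [ "$adj" -gt 15 ]; then
        echo "oom_adj must be between -17 and 15" >&2
        return 1
    fi
    if [ "$adj" -eq -17 ]; then
        echo 0                          # never selected
    elif [ "$adj" -ge 0 ]; then
        echo $(( base << adj ))         # multiply by 2^adj
    else
        echo $(( base >> (-adj) ))      # divide by 2^(-adj)
    fi
}
```

With a base score of 1024, an oom_adj of 3 gives 8192 and an oom_adj of -2 gives 256, which shows why a value of 15 makes a process by far the most likely victim.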

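Of course, you can also stop the OOM killer from picking a victim by modifying a kernel parameter. Note that with vm.panic_on_oom=1 the kernel does not simply carry on: it panics when memory runs out instead of killing a process.

# sysctl -w vm.panic_on_oom=1
vm.panic_on_oom = 1    // 1 makes the kernel panic on OOM instead of killing a process; the default 0 leaves the OOM killer on
# sysctl -p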
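
To check what a given vm.panic_on_oom value means, a small helper like the one below can be used. It is a sketch with an illustrative function name; the three meanings follow the kernel's sysctl documentation, where 0 runs the OOM killer, 1 panics on a system-wide OOM, and 2 always panics.

```shell
# Translate a vm.panic_on_oom value into a short description.
# Usage: describe_panic_on_oom [value]
# With no argument, the current value is read from /proc/sys/vm/panic_on_oom.
describe_panic_on_oom() {
    value="${1:-$(cat /proc/sys/vm/panic_on_oom)}"
    case "$value" in
        0) echo "OOM killer enabled (default)" ;;
        1) echo "kernel panics on system-wide OOM" ;;
        2) echo "kernel always panics on OOM" ;;
        *) echo "unknown value: $value" >&2; return 1 ;;
    esac
}
```

Running describe_panic_on_oom with no argument on the test machine shows which of the three modes is active before starting the experiment.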


    • Test program

The command-line argument n is the amount of memory to consume, in GB. Set it according to the physical memory of the test environment; for example, my lab machine has 4 GB of memory, so 4 is enough.

The code is saved as mem.c and compiled with gcc -o mem mem.c

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SZ (1 << 12)

int main(int argc, char *argv[])
{
    unsigned long i;

    if (argc != 2)
        return 0;

    /* Number of gigabytes to allocate, taken from the command line. */
    int gb = atoi(argv[1]);

    for (i = 0; i < ((unsigned long)gb << 30) / PAGE_SZ; ++i) {
        void *m = malloc(PAGE_SZ);
        if (!m)
            break;
        /* Touch the page so the kernel really backs it with memory. */
        memset(m, 0, 1);
    }

    printf("Allocated %lu MB\n", (i * PAGE_SZ) >> 20);
    getchar();  /* keep the process alive until Enter is pressed */
    return 0;
}

Then run ./mem 4

If you do nothing else, running it directly shows that the system automatically OOM-kills the process.


If we instead do the following and set the process's oom_adj to -17:

pgrep -f "mem" | while read PID; do echo -17 > /proc/$PID/oom_adj; done

You will find that the system no longer kills the process that is consuming so much memory, but you will also find that the system responds slowly or even goes down!


    • Make any process the OOM target

One of the simplest ways to test this is to set the oom_adj of a process to 15 (the maximum), which makes it the most likely to be killed. Then execute the following command:


Run these experiments in a test environment. If you run them directly on a production system, do not blame the blogger for any adverse consequences.



Copyright notice: This is the blogger's original article and may not be reproduced without the blogger's permission.
