Recently, one VPS customer complained that MySQL had died for no apparent reason, and another complained that their VPS often crashed. Logging in to the console showed the familiar out-of-memory problem. It usually happens when applications request a large amount of memory at the same moment and the system runs short, which triggers the Out of Memory (OOM) killer in the Linux kernel. Instead of letting the system crash immediately, the OOM killer kills a process to free up memory. If you check the relevant log file (/var/log/messages), you will see Out of memory: Kill process messages like the following:
...
Out of memory: Kill process 9682 (mysqld) score 9 or sacrifice child
Killed process 9682, UID 27, (mysqld) total-vm:47388kB, anon-rss:3744kB, file-rss:80kB
httpd invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
httpd cpuset=/ mems_allowed=0
Pid: 8911, comm: httpd Not tainted 2.6.32-279.1.1.el6.i686 #1
...
21556 total pagecache pages
21049 pages in swap cache
Swap cache stats: add 12819103, delete 12798054, find 3188096/4634617
Free swap = 0kB
Total swap = 524280kB
131071 pages RAM
0 pages HighMem
3673 pages reserved
67960 pages shared
124940 pages non-shared
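To pull just the victims out of a long log, something like the following sketch may help. The function name list_oom_kills is made up for illustration, and the pattern only matches the "Out of memory: Kill process" wording shown in the log above; other kernel versions may word this line differently, and other distributions may log kernel messages to a different file.

```shell
# list_oom_kills [LOGFILE]
# Print "PID NAME" for every "Out of memory: Kill process" line in LOGFILE.
# Assumption: the log uses the message format shown in this article.
list_oom_kills() {
    logfile="${1:-/var/log/messages}"
    sed -n 's/.*Out of memory: Kill process \([0-9][0-9]*\) (\([^)]*\)).*/\1 \2/p' "$logfile"
}

# Usage (reading the log may require root):
#   list_oom_kills /var/log/messages
```

For the log excerpt above, this would print the single line "9682 mysqld".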
The Linux kernel allocates memory according to what applications request. Applications usually request more memory than they actually use, and the unused part, although it belongs to a process, sits idle. Reclaiming it directly would be troublesome for the kernel, so instead the kernel overcommits memory: it promises more than it has, putting this "idle" memory to indirect use and improving overall memory efficiency. Normally this causes no problems. Trouble starts when most applications actually use up the memory they asked for: their combined demand then exceeds physical memory (including swap), and the kernel (the OOM killer) must kill some processes to free up space and keep the system running. A bank makes a good analogy. When a few people withdraw money, the bank is not worried, since it has enough cash on hand to pay them. When everyone in the country (or the vast majority) wants to withdraw their money at the same time, the bank is in trouble, because it does not actually hold that much cash.
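One way to see how much the "bank" has promised is to compare the Committed_AS and CommitLimit fields of /proc/meminfo. The sketch below does that; the function name commit_status and its output format are invented for this example. Note that the kernel only enforces CommitLimit in strict mode (vm.overcommit_memory=2), so in the default mode Committed_AS can exceed it without anything going wrong yet.

```shell
# commit_status [MEMINFO_FILE]
# Report promised memory (Committed_AS) against CommitLimit.
# The file argument exists only so the function can be tried on a saved copy.
commit_status() {
    meminfo="${1:-/proc/meminfo}"
    awk '
        $1 == "CommitLimit:"  { limit = $2 }
        $1 == "Committed_AS:" { used = $2 }
        END {
            state = (used > limit) ? "overcommitted" : "within-limit"
            printf "CommitLimit=%dkB Committed_AS=%dkB %s\n", limit, used, state
        }
    ' "$meminfo"
}

# Usage:
#   commit_status
```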
To see when the kernel decides that memory is insufficient and how it selects a process to kill, refer to the kernel source file linux/mm/oom_kill.c. When the system runs out of memory, out_of_memory() is triggered, which calls select_bad_process() to pick a "bad" process to kill. How is a "bad" process identified? The choice is made by oom_badness(), and the selection algorithm is very simple: the worst process is the one that uses the most memory.
/**
 * oom_badness - heuristic function to determine which candidate task to kill
 * @p: task struct of which task we should calculate
 * @totalpages: total present RAM allowed for page allocation
 *
 * The heuristic for determining which task to kill is made to be as simple and
 * predictable as possible.  The goal is to return the highest value for the
 * task consuming the most memory to avoid subsequent oom failures.
 */
unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg,
                          const nodemask_t *nodemask, unsigned long totalpages)
{
    long points;
    long adj;

    if (oom_unkillable_task(p, memcg, nodemask))
        return 0;

    p = find_lock_task_mm(p);
    if (!p)
        return 0;

    adj = (long)p->signal->oom_score_adj;
    if (adj == OOM_SCORE_ADJ_MIN) {
        task_unlock(p);
        return 0;
    }

    /*
     * The baseline for the badness score is the proportion of RAM that each
     * task's rss, pagetable and swap space use.
     */
    points = get_mm_rss(p->mm) + p->mm->nr_ptes +
             get_mm_counter(p->mm, MM_SWAPENTS);
    task_unlock(p);

    /*
     * Root processes get 3% bonus, just like the __vm_enough_memory()
     * implementation used by LSMs.
     */
    if (has_capability_noaudit(p, CAP_SYS_ADMIN))
        adj -= 30;

    /* Normalize to oom_score_adj units */
    adj *= totalpages / 1000;
    points += adj;

    /*
     * Never return 0 for an eligible task regardless of the root bonus and
     * oom_score_adj (oom_score_adj can't be OOM_SCORE_ADJ_MIN here).
     */
    return points > 0 ? points : 1;
}
The comments in the code above are clear. Once you understand the algorithm, you can see why MySQL gets hit as an innocent bystander: it is usually the largest process (it generally uses the most memory on the system), so when the system runs out of memory it is always the first to be killed. The simplest fix is to add memory or to tune MySQL so that it uses less. Besides tuning MySQL, you can also tune the system itself (optimize Debian 5 and CentOS 5.x) so that it uses as little memory as possible and leaves more for applications such as MySQL. There is also a stopgap: adjust kernel parameters so that the MySQL process is less likely to be picked by the OOM killer.
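To get a feel for the numbers, the scoring arithmetic in oom_badness() can be replayed in shell. This is only a sketch of the calculation quoted above, not a way to read real kernel state: the function takes its inputs as arguments, the example page counts are invented, and 131071 is simply the "pages RAM" figure from the log, used here as a stand-in for totalpages.

```shell
# oom_badness RSS PTES SWAPENTS OOM_SCORE_ADJ IS_ROOT TOTALPAGES
# All sizes are in pages; IS_ROOT is 1 for a CAP_SYS_ADMIN process, else 0.
oom_badness() {
    rss=$1 ptes=$2 swapents=$3 adj=$4 is_root=$5 totalpages=$6

    # OOM_SCORE_ADJ_MIN (-1000) means the task is never selected.
    if [ "$adj" -eq -1000 ]; then
        echo 0
        return
    fi

    # Baseline: resident pages + page-table pages + swapped-out pages.
    points=$((rss + ptes + swapents))

    # Root processes get a 3% bonus (30 units of oom_score_adj).
    if [ "$is_root" -eq 1 ]; then
        adj=$((adj - 30))
    fi

    # Scale adj to pages; integer division, as in the C code.
    points=$((points + adj * (totalpages / 1000)))

    # An eligible task never scores 0.
    if [ "$points" -gt 0 ]; then echo "$points"; else echo 1; fi
}

oom_badness 10000 50 2000 0 0 131071     # prints 12050
oom_badness 10000 50 2000 -15 0 131071   # prints 10085
```

With these made-up inputs, an oom_score_adj of -15 lowers the score by 15 × 131 = 1965 pages, which shows why a small negative adjustment only helps when no other process is close in size.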
Configure OOM killer
We can adjust the OOM killer's behavior through kernel parameters so that the system does not keep killing one process after another. For example, we can have the kernel panic as soon as an OOM occurs and reboot automatically 10 seconds after the panic:
# sysctl -w vm.panic_on_oom=1
vm.panic_on_oom = 1
# sysctl -w kernel.panic=10
kernel.panic = 10
# echo "vm.panic_on_oom=1" >> /etc/sysctl.conf
# echo "kernel.panic=10" >> /etc/sysctl.conf
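Appending with echo adds a duplicate line every time the commands are repeated. A possible alternative, sketched below, replaces any existing setting for the key before adding the new one. The helper name set_sysctl_conf is hypothetical, and it moves the setting to the end of the file, which may not suit a carefully ordered configuration.

```shell
# set_sysctl_conf KEY VALUE [FILE]
# Drop existing "KEY = ..." lines from FILE, then append "KEY=VALUE".
# Comment lines and other keys are left untouched.
set_sysctl_conf() {
    key=$1 value=$2 conf="${3:-/etc/sysctl.conf}"
    touch "$conf" || return 1
    tmp=$(mktemp) || return 1
    awk -v k="$key" '
        { line = $0; sub(/^[ \t]*/, "", line); split(line, kv, /[ \t]*=/) }
        kv[1] != k { print }
    ' "$conf" > "$tmp"
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    # Write back through the original file to keep its owner and mode.
    cat "$tmp" > "$conf" && rm -f "$tmp"
}

# Usage (as root):
#   set_sysctl_conf vm.panic_on_oom 1
#   set_sysctl_conf kernel.panic 10
```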
From the oom_kill.c code above, we can see that oom_badness() scores every process and that the score (points) determines which process is killed. The score can be shifted through adj. Processes running with root privileges are generally considered important and should not be killed lightly, so they get a 3% discount (adj -= 30; the lower the score, the less likely the process is to be killed). From user space, we can set the oom_score_adj value of each process to make it less likely to be selected by the OOM killer. For example, if you do not want the MySQL process to be killed easily, find its process ID and set oom_score_adj to -15 (remember, the lower the score, the less likely the process is to be killed):
# ps aux | grep mysqld
mysql 2196 1.6 2.1 623800 44876 ? Ssl 09:42 0:00 /usr/sbin/mysqld
# cat /proc/2196/oom_score_adj
0
# echo -15 > /proc/2196/oom_score_adj
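The value written this way belongs to that one process, so it has to be applied again after mysqld restarts with a new PID. A small wrapper along these lines could be reused for that. The name set_oom_score_adj and the optional third argument (an alternate /proc root, there only so the function can be tried without root) are assumptions of this sketch.

```shell
# set_oom_score_adj VALUE PID [PROC_ROOT]
# Validate VALUE and write it to PROC_ROOT/PID/oom_score_adj.
set_oom_score_adj() {
    value=$1 pid=$2 procroot="${3:-/proc}"
    # The kernel accepts -1000 (never kill) through 1000.
    if [ "$value" -lt -1000 ] || [ "$value" -gt 1000 ]; then
        echo "oom_score_adj must be between -1000 and 1000" >&2
        return 1
    fi
    printf '%s\n' "$value" > "$procroot/$pid/oom_score_adj"
}

# Usage (as root): apply -15 to every running mysqld.
#   for pid in $(pgrep -x mysqld); do set_oom_score_adj -15 "$pid"; done
```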
Of course, the OOM killer can also be effectively disabled if needed, by turning off memory overcommit (not recommended in a production environment):
# sysctl -w vm.overcommit_memory=2
# echo "vm.overcommit_memory=2" >> /etc/sysctl.conf
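With overcommit turned off, the kernel refuses allocations beyond a commit limit of roughly swap plus a percentage of RAM, where the percentage is vm.overcommit_ratio (50 by default). The sketch below works that out for the machine in the log above, taking 131071 pages of 4 kB as 524284 kB of RAM. It deliberately ignores huge pages and the vm.overcommit_kbytes setting, and the function name commit_limit_kb is invented for the example.

```shell
# commit_limit_kb RAM_KB SWAP_KB [OVERCOMMIT_RATIO]
# Approximate CommitLimit: swap + RAM * ratio / 100 (huge pages ignored).
commit_limit_kb() {
    ram_kb=$1 swap_kb=$2 ratio="${3:-50}"
    echo $((swap_kb + ram_kb * ratio / 100))
}

commit_limit_kb 524284 524280 50   # prints 786422
```

On such a small VPS this leaves about 768 MB to be promised in total, which is one reason the setting can make applications fail to allocate memory long before RAM and swap are actually full.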
Find out the process most likely to be killed by OOM Killer
We know that from user space we can adjust a process's score through its oom_score_adj value. The resulting score can be read from oom_score. For example, the oom_score of the process with PID 981 is 18; after adjusting it by -15 through oom_score_adj as described above, it becomes 3:
# cat /proc/981/oom_score
18
# echo -15 > /proc/981/oom_score_adj
# cat /proc/981/oom_score
3
The following bash script prints the ten processes with the highest oom_score (those most likely to be killed by the OOM killer) on the current system:
# vi oomscore.sh
#!/bin/bash
for proc in $(find /proc -maxdepth 1 -regex '/proc/[0-9]+'); do
    printf "%2d %5d %s\n" \
        "$(cat $proc/oom_score)" \
        "$(basename $proc)" \
        "$(cat $proc/cmdline | tr '\0' ' ' | head -c 50)"
done 2>/dev/null | sort -nr | head -n 10

# chmod +x oomscore.sh
# ./oomscore.sh
18   981 /usr/sbin/mysqld
 4 31359 -bash
 4 31056 -bash
 1 31358 sshd: root@pts/6
 1 31244 sshd: vpsee [priv]
 1 31159 -bash
 1 31158 sudo -i
 1 31055 sshd: root@pts/3
 1 30912 sshd: vpsee [priv]
 1 29547 /usr/sbin/sshd -D
Original article: Understanding and configuring OOM Killer in Linux.