Recently one VPS customer complained that MySQL hangs for no apparent reason, and another complained that his VPS keeps crashing; logging in to the console showed that both were the common out-of-memory problem. This usually happens because applications request large amounts of memory and the system runs out, which triggers the out-of-memory (OOM) killer in the Linux kernel: the OOM killer kills a process to free memory for the system instead of letting the whole system crash immediately. If you examine the relevant log file (/var/log/messages) you will see "Out of memory: Kill process" messages similar to the following:
...
Out of memory: Kill process 9682 (mysqld) score 9 or sacrifice child
Killed process 9682, UID, (mysqld) total-vm:47388kB, anon-rss:3744kB, file-rss:80kB
httpd invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
httpd cpuset=/ mems_allowed=0
Pid: 8911, comm: httpd Not tainted 2.6.32-279.1.1.el6.i686 #1
...
21556 total pagecache pages
21049 pages in swap cache
Swap cache stats: add 12819103, delete 12798054, find 3188096/4634617
Free swap  = 0kB
Total swap = 524280kB
131071 pages RAM
0 pages HighMem
3673 pages reserved
67960 pages shared
124940 pages non-shared
The Linux kernel allocates memory on demand to applications, but an application usually does not immediately use everything it has asked for. To improve performance, the kernel would like to put this unused memory to work, but since it belongs to each process, reclaiming it directly is troublesome, so the kernel over-commits memory (over-commit) to indirectly make use of this "idle" memory and improve overall memory utilization. Normally this is not a problem, but trouble comes when most applications start consuming the memory they were promised: if the combined demand exceeds what physical memory (including swap) can provide, the kernel (the OOM killer) must kill some processes to free memory and keep the system running. The bank analogy may make this easier to understand: the bank is not afraid when a few people withdraw money, because it has enough on hand to cope; the trouble starts when the whole country (or the vast majority) tries to withdraw its money at the same time, because the bank does not actually have that much cash for everyone.
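You can see over-commit at work on most Linux systems by comparing how much memory the kernel has already promised to processes with the commit limit it would enforce if over-commit were disabled. A quick check (the field names come from /proc/meminfo; the numbers will of course differ from system to system):

# cat /proc/sys/vm/overcommit_memory                 # 0 = heuristic over-commit, the default
# grep -E 'CommitLimit|Committed_AS' /proc/meminfo   # Committed_AS (memory promised so far) can exceed CommitLimit, which is only enforced when vm.overcommit_memory=2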
When the kernel detects that the system is running out of memory, it picks and kills a process. For the details you can refer to the kernel source file linux/mm/oom_kill.c: out_of_memory() is triggered when the system is low on memory, and it then calls select_bad_process() to choose a "bad" process to kill. How is a "bad" process judged and chosen? Surely not at random. The selection is done by oom_badness(), and the algorithm and the idea behind it are simple and direct: the worst process is the one that consumes the most memory.
/**
 * oom_badness - heuristic function to determine which candidate task to kill
 * @p: task struct of which task we should calculate
 * @totalpages: total present RAM allowed for page allocation
 *
 * The heuristic for determining which task to kill is made to be as simple and
 * predictable as possible.  The goal is to return the highest value for the
 * task consuming the most memory to avoid subsequent oom failures.
 */
unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg,
			  const nodemask_t *nodemask, unsigned long totalpages)
{
	long points;
	long adj;

	if (oom_unkillable_task(p, memcg, nodemask))
		return 0;

	p = find_lock_task_mm(p);
	if (!p)
		return 0;

	adj = (long)p->signal->oom_score_adj;
	if (adj == OOM_SCORE_ADJ_MIN) {
		task_unlock(p);
		return 0;
	}

	/*
	 * The baseline for the badness score is the proportion of RAM that each
	 * task's rss, pagetable and swap space use.
	 */
	points = get_mm_rss(p->mm) + p->mm->nr_ptes +
		 get_mm_counter(p->mm, MM_SWAPENTS);
	task_unlock(p);

	/*
	 * Root processes get 3% bonus, just like the __vm_enough_memory()
	 * implementation used by LSMs.
	 */
	if (has_capability_noaudit(p, CAP_SYS_ADMIN))
		adj -= 30;

	/* Normalize to oom_score_adj units */
	adj *= totalpages / 1000;
	points += adj;

	/*
	 * Never return 0 for an eligible task regardless of the root bonus and
	 * oom_score_adj (oom_score_adj can't be OOM_SCORE_ADJ_MIN here).
	 */
	return points > 0 ? points : 1;
}
The comments in the code above are very clear. Once we understand this algorithm we also understand why MySQL always takes the bullet: it is the biggest target (it usually occupies the most memory on the system), so whenever the system runs out of memory it is unfortunately always the first to be killed. The simplest fix is to add more memory, or to tune MySQL so it uses less memory; besides tuning MySQL you can also optimize the system (optimize Debian 5, optimize CentOS 5.x) so that the OS itself uses as little memory as possible and applications such as MySQL can use more. As a stopgap you can also adjust kernel parameters so that the MySQL process is less likely to be picked by the OOM killer.
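Before touching any kernel parameters it is worth confirming which processes are actually the big memory consumers (and therefore the most likely OOM victims). A quick check, sorted by resident set size:

# free -m                                   # overall memory and swap usage
# ps -eo pid,rss,comm --sort=-rss | head    # top memory consumers by resident set size (RSS, in KB)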
Configuring OOM Killer
We can adjust the behavior of the OOM killer through a few kernel parameters so that the system does not go around killing processes. For example, we can make an OOM trigger a kernel panic, and have the kernel automatically reboot the system 10 seconds after the panic:
# sysctl -w vm.panic_on_oom=1
vm.panic_on_oom = 1
# sysctl -w kernel.panic=10
kernel.panic = 10
# echo "vm.panic_on_oom=1" >> /etc/sysctl.conf
# echo "kernel.panic=10" >> /etc/sysctl.conf
From the oom_kill.c code above we can see that oom_badness() gives each process a score, and the level of points decides which process gets killed. The points can be shifted by adj: processes running as root are considered important and should not be killed lightly, so they get a 3% bonus on their points (adj -= 30); the lower the score, the less likely the process is to be killed. From user space we can decide which processes are not so easily killed by the OOM killer by manipulating each process's oom_score_adj (or the older oom_adj) kernel parameter. For example, if you do not want the MySQL process to be killed easily, find the PID of the running mysqld and set its oom_score_adj to -15 (remember, the lower the points, the less likely it is to be killed):
# ps aux | grep mysqld
mysql     2196  1.6  2.1 623800 44876 ?   Ssl  09:42   0:00 /usr/sbin/mysqld
# cat /proc/2196/oom_score_adj
0
# echo -15 > /proc/2196/oom_score_adj
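Note that values written to /proc/<pid>/oom_score_adj do not survive a restart of the process, so after mysqld is restarted the adjustment has to be applied again. A minimal sketch of a helper script that re-applies it (the -15 value and the mysqld process name simply follow the example above):

#!/bin/bash
# re-apply a lower oom_score_adj to mysqld after it (re)starts,
# since /proc settings are lost when the process exits
pid=$(pgrep -o mysqld)                     # oldest (main) mysqld process, if any
if [ -n "$pid" ]; then
    echo -15 > /proc/$pid/oom_score_adj
    echo "oom_score_adj set to -15 for mysqld (pid $pid)"
else
    echo "mysqld is not running" >&2
fi

A script like this could be run from the MySQL init script or a cron job so the adjustment is not forgotten.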
Of course, if needed, you can also turn off the OOM killer completely (not recommended in production environments):
# sysctl -w vm.overcommit_memory=2
# echo "vm.overcommit_memory=2" >> /etc/sysctl.conf
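Keep in mind that with vm.overcommit_memory=2 the kernel enforces a hard commit limit of swap plus vm.overcommit_ratio percent of physical RAM (50 by default), so on a machine with little swap large allocations can start failing even though memory is still free. It may be necessary to raise the ratio as well (the value 80 below is only an illustration):

# sysctl vm.overcommit_ratio                # defaults to 50 (percent of physical RAM counted toward the limit)
# sysctl -w vm.overcommit_ratio=80          # raise the limit if allocations start failing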
Finding the process most likely to be killed by the OOM Killer
We know that from user space we can adjust a process's score by manipulating its oom_score_adj kernel parameter, and the resulting score can be read from the oom_score parameter. For example, looking at the process with PID 981: after its oom_score_adj is adjusted as described above (-15), its oom_score drops from 18 to 3:
# cat /proc/981/oom_score
18
# echo -15 > /proc/981/oom_score_adj
# cat /proc/981/oom_score
3
The following bash script prints the 10 processes with the highest oom_score (the ones most likely to be killed by the OOM killer) on the current system:
# vi oomscore.sh
#!/bin/bash
for proc in $(find /proc -maxdepth 1 -regex '/proc/[0-9]+'); do
    printf "%2d %5d %s\n" \
        "$(cat $proc/oom_score)" \
        "$(basename $proc)" \
        "$(cat $proc/cmdline | tr '\0' ' ' | head -c 50)"
done 2>/dev/null | sort -nr | head -n 10

# chmod +x oomscore.sh
# ./oomscore.sh
18   981 /usr/sbin/mysqld
 4 31359 -bash
 4 31056 -bash
 1 31358 sshd: vpsee@pts/6
 1 31244 sshd: vpsee [priv]
 1 31159 -bash
 1 31158 sudo -i
 1 31055 sshd: vpsee@pts/3
 1 30912 sshd: vpsee [priv]
 1 29547 /usr/sbin/sshd -D