/var/log/kern.log "Out of memory: Kill process &lt;PID&gt; (java) score &lt;SCORE&gt; or sacrifice child" message


Out of Memory: Kill process or sacrifice child. June 5, by Jaan Angerpikk. Filed under: Memory Leaks

It is 6 AM. I am awake, summarizing the sequence of events leading to my way-too-early wake-up call. As these stories go, my phone alarm went off. Sleepy and grumpy, I checked the phone to see whether I was really crazy enough to set the wake-up alarm at 5 AM. No, it was our monitoring system indicating that one of the Plumbr services had gone down.

As a seasoned veteran in the domain, I made the first correct step towards a solution by turning on the espresso machine. With a cup of coffee I was equipped to tackle the problem. First suspect: the application itself, which seemed to have behaved completely normally before the crash. No errors, no warning signs, no trace of any suspects in the application logs.

The monitoring we have in place had noticed the death of the process and had already restarted the crashed service. But as I already had caffeine in my bloodstream, I started to gather more evidence. Minutes later I found myself staring at the following in /var/log/kern.log:

Jun  4 07:41:59 plumbr kernel: [70667120.897649] Out of memory: Kill process 29957 (java) score 366 or sacrifice child
Jun  4 07:41:59 plumbr kernel: [70667120.897701] Killed process 29957 (java) total-vm:2532680kB, anon-rss:1416508kB, file-rss:0kB

Apparently we had become victims of Linux kernel internals. As you all know, Linux is built with a bunch of unholy creatures (called 'daemons'). Those daemons are shepherded by several kernel jobs, one of which seems to be an especially sinister entity. All modern Linux kernels have a built-in mechanism called the "Out of memory killer" which can annihilate your processes under extremely low memory conditions. When such a condition is detected, the killer is activated and picks a process to kill. The target is picked using a set of heuristics scoring all processes and selecting the one with the worst score to kill.
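The score those heuristics produce is exposed through procfs, so you can inspect it for any process yourself. A minimal sketch using standard Linux procfs paths (the exact values will of course differ on your machine):

```shell
# Every process has a "badness" score the OOM killer would use to rank it;
# higher means a more likely victim.
cat /proc/self/oom_score      # current heuristic score
cat /proc/self/oom_score_adj  # user-supplied adjustment, -1000..1000
```

Substituting a real PID for `self` shows the score of any other process you can read.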

Understanding the "Out of Memory killer"

By default, Linux kernels allow processes to request more memory than is currently available in the system. This makes all the sense in the world, considering that most processes never actually use all of the memory they allocate. The easiest comparison would be with cable operators. They sell all their consumers a 100Mbit download promise, far exceeding the actual bandwidth present in their network. The bet is on the fact that the users will not all simultaneously use their allocated download limit. Thus one 10Gbit link can successfully serve more users than simple math would permit.

A side effect of this approach becomes visible if some of your programs are on the path to depleting the system's memory. This can lead to extremely low memory conditions, where no pages can be allocated to a process. You might have faced a situation where not even the root account can kill the offending task. To prevent such situations, the killer activates and identifies the process to be killed.

You can read more about fine-tuning the behaviour of the 'Out of memory killer' in this article in the Red Hat documentation.

Did you know that 20% of Java applications have memory leaks? Don't kill your application; instead, find and fix leaks with Plumbr in minutes.

What is triggering the out of memory killer?

Now that we have the context, it is still unclear what triggered the "killer" and woke me up at 5 AM. Some more investigation revealed that:

    • The configuration in /proc/sys/vm/overcommit_memory allowed overcommitting memory: it was set to 1, indicating that every malloc() should succeed.
    • The application was running on an EC2 m1.small instance. EC2 instances have swapping disabled by default.
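Both conditions are easy to verify from a shell. A quick sketch (the /proc path and the free output format are standard Linux, not specific to our setup):

```shell
# 1 means "always overcommit": every malloc() is allowed to succeed
# regardless of available memory (0 = heuristic, 2 = strict accounting).
cat /proc/sys/vm/overcommit_memory

# With swapping disabled, the Swap line reports 0 total, as on default
# EC2 images.
free -m
```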

Those facts, combined with a sudden spike in traffic to our services, resulted in the application requesting more and more memory to support those extra users. The overcommitting configuration allowed it to allocate more and more memory for this greedy process, eventually triggering the "Out of memory killer", which was doing exactly what it is meant to do: killing our application and waking me up in the middle of the night.

Example

When I described the behaviour to our engineers, one of them was interested enough to create a small test case reproducing the error. When you compile and launch the following Java code snippet on Linux (I used the latest stable Ubuntu version):

package eu.plumbr.demo;

public class OOM {

  public static void main(String[] args) {
    java.util.List<int[]> l = new java.util.ArrayList();
    for (int i = 10000; i < 100000; i++) {
      try {
        l.add(new int[100_000_000]);
      } catch (Throwable t) {
        t.printStackTrace();
      }
    }
  }
}

Then you will face the very same Out of memory: Kill process &lt;PID&gt; (java) score &lt;SCORE&gt; or sacrifice child message.

Note that you might need to tweak the swapfile and heap sizes; in my test case I used a 2g heap specified via -Xmx2g and the following configuration for swap:

swapoff -a
dd if=/dev/zero of=swapfile bs=1024 count=655360
mkswap swapfile
swapon swapfile
Solution?

There are several ways to handle such a situation. In our example, we just migrated the system to an instance with more memory. I also considered allowing swapping, but after consulting with engineering I was reminded of the fact that garbage collection processes on the JVM are not good at operating under swapping, so that option was off the table.

Other possibilities would involve fine-tuning the OOM killer, scaling the load horizontally across several small instances or reducing the memory requirements of the application.
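For the fine-tuning route, the per-process knob is /proc/&lt;pid&gt;/oom_score_adj. As a hedged sketch: a process may always raise its own adjustment (making itself a preferred victim) without privileges, while lowering it, down to -1000 which exempts the process entirely, requires root:

```shell
# Make the current shell a preferred OOM victim; children inherit the value.
echo 500 > /proc/$$/oom_score_adj
cat /proc/$$/oom_score_adj   # prints 500

# Exempting a critical process instead would need root, e.g.:
# echo -1000 > /proc/<pid>/oom_score_adj
```

Protecting the JVM this way only moves the problem to some other process, though, so it is a stopgap rather than a fix.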

If you found the study interesting, follow Plumbr on Twitter or RSS; we keep publishing our insights about Java internals.
