NodeManager runs out of heap memory [the whole bug-fixing process]

Source: Internet
Author: User

Problem:

I wrote an application that runs on YARN and found that, over time, the NodeManager (NM) runs out of memory. Raising the NodeManager heap from 1 GB to 2 GB did not prevent the OOM either.

Enable JMX monitoring on the NM:

    -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=50079 -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false

Then connect with JConsole.

Most of the memory is held in the old generation, and forcing a GC has no effect, which indicates that the objects are still referenced.
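The per-pool figures that JConsole shows can also be read programmatically over the same JMX port. The following is a minimal sketch, not part of the original investigation: the class name `HeapPoolReport` and its methods are made up for illustration, and the remote branch assumes the NM was started with the flags above (no authentication, no SSL). Without arguments it reports on its own JVM.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical helper: lists heap memory pools (eden, survivor, old gen) of a JVM.
class HeapPoolReport {

    /** RMI connector URL for a JVM started with -Dcom.sun.management.jmxremote.port=<port>. */
    static String serviceUrl(String host, int port) {
        return "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi";
    }

    /** One line per heap pool: name, used bytes, max bytes (-1 means undefined). */
    static List<String> heapPools(MBeanServerConnection conn) {
        List<String> lines = new ArrayList<>();
        try {
            for (MemoryPoolMXBean pool
                    : ManagementFactory.getPlatformMXBeans(conn, MemoryPoolMXBean.class)) {
                MemoryUsage usage = pool.getUsage();
                if (pool.getType() != MemoryType.HEAP || usage == null) {
                    continue; // skip non-heap pools and pools that are no longer valid
                }
                lines.add(pool.getName() + " used=" + usage.getUsed() + " max=" + usage.getMax());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 2) {
            // Remote JVM, e.g. arguments: <nm-host> 50079
            JMXServiceURL url = new JMXServiceURL(serviceUrl(args[0], Integer.parseInt(args[1])));
            try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
                heapPools(connector.getMBeanServerConnection()).forEach(System.out::println);
            }
        } else {
            // No arguments: report on this JVM itself
            heapPools(ManagementFactory.getPlatformMBeanServer()).forEach(System.out::println);
        }
    }
}
```

Pool names depend on the garbage collector in use, so the old-generation pool has to be picked out by name from the printed list.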

I added a crontab entry that writes a snapshot of the in-memory object data every half hour:

    */30 * * * * jmap -histo 18114 > /home/yarn/log/`date`.log
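Once two snapshots exist, the per-class growth can be computed instead of read off by eye. Below is a rough sketch that assumes the snapshots use the `jmap -histo` row layout (`num: instances bytes class`); the class `HistoDiff` and its methods are hypothetical names. The sample rows in `main` are the `[C` and `[B` figures from the two snapshots compared in this post.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: compares the byte totals of two class histograms.
class HistoDiff {

    /** Parses rows of the form "num: instances bytes class" into class name -> bytes. */
    static Map<String, Long> bytesByClass(String histo) {
        Map<String, Long> result = new LinkedHashMap<>();
        for (String row : histo.split("\n")) {
            String[] fields = row.trim().split("\\s+");
            if (fields.length < 4 || !fields[0].endsWith(":")) {
                continue; // skip header, separator, total and blank lines
            }
            result.put(fields[3], Long.parseLong(fields[2]));
        }
        return result;
    }

    /** Bytes gained by className between the two snapshots (negative if it shrank). */
    static long growth(String before, String after, String className) {
        return bytesByClass(after).getOrDefault(className, 0L)
                - bytesByClass(before).getOrDefault(className, 0L);
    }

    public static void main(String[] args) {
        String before = "   1:  33014  312864176  [C\n   7:  21287  3447608  [B\n";
        String after  = "   1:  33219  622929048  [C\n   7:  21346  3470040  [B\n";
        System.out.println("[C grew by " + growth(before, after, "[C") + " bytes");
        System.out.println("[B grew by " + growth(before, after, "[B") + " bytes");
    }
}
```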

Comparing snapshots shows that almost all of the memory growth is in the [C class (char[]). The number of instances barely changes while the total size grows, presumably because something keeps appending to a StringBuilder.

    :/home/yarn/log# diff Mon\ Aug\ ...log Mon\ Aug\ ...log
    4,10c4,10
    <    1:     33014    312864176  [C
    <    2:     90784     11817968  <constMethodKlass>
    <    3:     90784     11632640  <methodKlass>
    <    4:      7756      8908000  <constantPoolKlass>
    <    5:      7756      5780072  <instanceKlassKlass>
    <    6:      6493      4882976  <constantPoolCacheKlass>
    <    7:     21287      3447608  [B
    ---
    >    1:     33219    622929048  [C
    >    2:     90897     11830824  <constMethodKlass>
    >    3:     90897     11647104  <methodKlass>
    >    4:      7807      8948632  <constantPoolKlass>
    >    5:      7807      5810392  <instanceKlassKlass>
    >    6:      6543      4913344  <constantPoolCacheKlass>
    >    7:     21346      3470040  [B
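A pattern like this (about the same number of `[C` instances, roughly twice the bytes) is what a single long-lived builder produces when text keeps being appended to it. The sketch below only illustrates that effect and is not taken from the NodeManager code; `AppendGrowth` is a made-up name.

```java
// Illustration only: one StringBuilder instance whose backing array keeps growing.
class AppendGrowth {

    /** Appends `lines` copies of `line` to a single builder and returns its capacity in chars. */
    static int capacityAfter(int lines, String line) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < lines; i++) {
            sb.append(line).append('\n');
        }
        // Still exactly one builder object; only its internal array got bigger.
        return sb.capacity();
    }

    public static void main(String[] args) {
        String line = "some line written to stderr by a child process";
        for (int n : new int[] {0, 1_000, 100_000}) {
            System.out.println(n + " lines appended, capacity = " + capacityAfter(n, line) + " chars");
        }
    }
}
```

On the JVM generation shown in the histogram the backing array is a `char[]`, which is why the growth shows up under `[C`. On Java 9 and later, where strings are stored compactly, the same growth would typically appear under `[B` instead.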

Next I tried the JDK tools:

    jmap -dump:live,format=b,file=[file] [pid]
    jhat -J-Xmx1024m [file]

The output was hard to make sense of.

In the end I used MAT (Eclipse Memory Analyzer) to analyze the problem graphically: https://eclipse.org/mat/

The analysis led to Shell.java.

It turns out that when NodeManager launches a command, it keeps recording the command's standard error output in a variable. That variable is not released while the program runs, so the GC cannot reclaim the space. Once the cause is known, the fix is simple: when launching the command, redirect both standard output and standard error to a file so that NodeManager never receives them, as follows:

    // Add log redirect params
    vargs.add("1>>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/"
            + VoidBoxConfiguration.VOIDBOX_PROXY_LOG_FILENAME);
    vargs.add("2>>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/"
            + VoidBoxConfiguration.VOIDBOX_PROXY_LOG_FILENAME);

Still, this is a latent bug in NodeManager that needs to be fixed: its own stability should not depend on how the programs it launches behave.

One suggestion is to rotate errMsg so that it cannot grow without bound.
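What "rotating" the error buffer could look like is sketched below. This is not Hadoop code and not a proposed patch: `BoundedErrorBuffer` is a hypothetical class that keeps only the most recent `maxChars` characters, so a chatty child process cannot grow the parent's heap indefinitely.

```java
// Hypothetical sketch: an error buffer that retains only the newest maxChars characters.
class BoundedErrorBuffer {
    private final int maxChars;
    private final StringBuilder buf = new StringBuilder();

    BoundedErrorBuffer(int maxChars) {
        if (maxChars <= 0) {
            throw new IllegalArgumentException("maxChars must be positive");
        }
        this.maxChars = maxChars;
    }

    /** Appends one line of stderr output, discarding the oldest text beyond the limit. */
    synchronized void appendLine(String line) {
        if (line.length() >= maxChars) {
            // The line alone exceeds the limit: keep only its tail plus the newline.
            buf.setLength(0);
            buf.append(line, line.length() - maxChars + 1, line.length()).append('\n');
            return;
        }
        buf.append(line).append('\n');
        int overflow = buf.length() - maxChars;
        if (overflow > 0) {
            buf.delete(0, overflow); // drop the oldest characters
        }
    }

    synchronized String contents() {
        return buf.toString();
    }

    public static void main(String[] args) {
        BoundedErrorBuffer errors = new BoundedErrorBuffer(64);
        for (int i = 0; i < 1_000_000; i++) {
            errors.appendLine("warning number " + i);
        }
        System.out.println("retained " + errors.contents().length() + " chars:");
        System.out.print(errors.contents());
    }
}
```

Keeping the tail rather than the head is a design choice made here on the assumption that the last lines of stderr are the most useful when a command fails; a real fix might prefer a different policy.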
