Erlang Memory Leak Analysis


As our projects rely more and more on Erlang, the problems we run into also grow. A while back the system hit a high memory consumption problem, and this article records the troubleshooting and analysis process. The online system runs Erlang R16B02.

Problem description

Several online systems showed soaring memory usage after running for a while. The system model is simple: each incoming network connection is handed to a new process taken from a pool. The top command showed that memory was being eaten by the Erlang VM process, while netstat showed only a few thousand network connections. The problem looked like an Erlang memory leak.

Analysis method

One advantage of an Erlang system is that you can go straight into the live node and analyze the problem right at the production site. Our system is managed with rebar, and there are several ways to get into the running system.

Native Login

You can log in directly to the online machine and attach to the running Erlang system with a command like the following.
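
For a rebar/reltool-generated release, attaching usually looks like this (a minimal sketch; the release name myapp and its path are hypothetical):

$ cd /path/to/rel/myapp
$ ./bin/myapp attach    %% attach to the node started via run_erl; Ctrl-D detaches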

Through the remote shell

Get the Erlang system cookie

$ ps -ef | grep beam  %% find the -setcookie argument

Open a new shell and start a node with the same cookie but a different node name

$ erl -setcookie cookiename -name [email protected]

Enter the system by starting a remote shell

Erlang R16B02 (erts-5.10.3) [source] [64-bit] [smp:2:2] [async-threads:10] [hipe] [kernel-poll:false]

Eshell V5.10.3  (abort with ^G)
([email protected])1> net_adm:ping('[email protected]').
pong
([email protected])2> nodes().
['[email protected]']
([email protected])3>
User switch command
 --> h
  c [nn]            - connect to job
  i [nn]            - interrupt job
  k [nn]            - kill job
  j                 - list all jobs
  s [shell]         - start local shell
  r [node [shell]]  - start remote shell
  q                 - quit erlang
  ? | h             - this message
 --> r '[email protected]'
 --> j
   1  {shell,start,[init]}
   2* {'[email protected]',shell,start,[]}
 --> c 2

Analysis process

Erlang has many tools for analyzing system information, such as appmon and webtool. But with system memory already critically low, there was no way to start these tools; fortunately the Erlang shell was still available.

The Erlang shell comes with a lot of useful commands, which can be listed with help().

> help().
Erlang system memory consumption

The top output showed it was a memory problem, so the first step was to look at Erlang's own view of its memory consumption.

> erlang:memory().

erlang:memory() shows the memory allocated by the Erlang emulator: the total memory, the memory consumed by atoms, the memory consumed by processes, and so on.
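
Individual categories can also be queried directly, which is handy for tracking one number over time; these are standard erlang:memory/1 calls (the selection below is our own, not from the original session):

> erlang:memory(total).      %% total memory allocated by the emulator
> erlang:memory(processes).  %% memory used by Erlang processes
> erlang:memory(atom).       %% memory used by atoms
> erlang:memory(binary).     %% memory used by off-heap binaries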

Number of Erlang processes created

On the online system the main memory consumption turned out to be in processes, so the next step was to determine whether individual processes were leaking memory or whether simply too many processes had been created.

> erlang:system_info(process_limit).  %% maximum number of processes the system can create
> erlang:system_info(process_count).  %% number of processes currently in the system

erlang:system_info/1 returns information about the current system, such as the number of processes and ports. Running the commands above was a surprise: with only 2-3k network connections, there were already more than 100,000 Erlang processes. Processes were being created, but because of the code or some other reason they never exited and their heaps were never released.
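
With that many processes alive, it helps to know what kind of process is piling up. A rough shell sketch (our own addition, not part of the original analysis) groups processes by their initial call and prints the five most common ones:

> F = fun(P, D) ->
          case erlang:process_info(P, initial_call) of
              {initial_call, MFA} -> dict:update_counter(MFA, 1, D);
              undefined -> D    %% process exited while we were scanning
          end
      end,
  lists:sublist(
      lists:reverse(lists:keysort(2, dict:to_list(
          lists:foldl(F, dict:new(), erlang:processes())))), 5).

Processes started through proc_lib all report {proc_lib,init_p,5} here, so the grouping is coarse, but it is usually enough to see which part of the system is responsible.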

View information for a single process

Since processes were piling up for some reason, the cause had to be found inside the processes themselves.

First, get the PID of one of the stacking processes

> i().           %% print a summary of the system and its processes
> i(0,61,886).   %% (0,61,886) is the pid
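
If the i() output is too long to scan by eye, a one-liner like the following (our own sketch, not from the original session) ranks processes by mailbox size and points straight at the ones that are stacking up:

> lists:sublist(
      lists:reverse(lists:keysort(2,
          [{P, Len} || P <- erlang:processes(),
                       {message_queue_len, Len} <-
                           [erlang:process_info(P, message_queue_len)]])),
      5).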

There were a lot of processes hanging there. Looking at the information for specific PIDs showed that their message_queue held several messages that had never been processed. Next comes the powerful erlang:process_info() function, which can retrieve a rather rich set of information about a process.

> erlang:process_info(pid(0,61,886), current_stacktrace).
> rp(erlang:process_info(pid(0,61,886), backtrace)).

Viewing the backtrace of one of these processes turned up the following information

0x00007fbd6f18dbf8 Return addr 0x00007fbff201aa00 (gen_event:rpc/2 + 96)
y(0)     #Ref<0.0.2014.142287>
y(1)     infinity
y(2)     {sync_notify,{log,{lager_msg,[], ..........}}
y(3)     <0.61.886>
y(4)     <0.89.0>
y(5)     []

The process was stuck in lager, a third-party Erlang logging library.

Cause of the problem

Checking the lager documentation turned up the following information

Prior to Lager 2.0, the gen_event at the core of lager operated purely in synchronous mode. Asynchronous mode is faster, but has no protection against message queue overload. In Lager 2.0, the gen_event takes a hybrid approach. It polls its own mailbox size and toggles the messaging between synchronous and asynchronous depending on mailbox size.

{async_threshold, 20}, {async_threshold_window, 5}

This will use async messaging until the mailbox exceeds 20 messages, at which point synchronous messaging will be used, and switch back to asynchronous when the size reduces to 20 - 5 = 15.

If you wish to disable this behaviour, simply set it to 'undefined'. It defaults to a low number to prevent the mailbox growing rapidly beyond the limit and causing problems. In general, lager should process messages as fast as they come in, so getting behind should be relatively exceptional anyway.

So lager has a configuration item that sets how many unhandled messages may accumulate; once the backlog exceeds that threshold, logging switches to synchronous mode!
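
These thresholds live in lager's application environment. A minimal sys.config sketch (the values and the console handler are illustrative, not the article's actual configuration):

[
 {lager, [
   %% switch to synchronous logging once the gen_event mailbox exceeds 20 messages
   {async_threshold, 20},
   %% switch back to asynchronous once it drops below 20 - 5 = 15
   {async_threshold_window, 5},
   {handlers, [
     {lager_console_backend, info}
   ]}
 ]}
].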

The system in question had debug logging turned on, and the resulting flood of log messages overwhelmed it.
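
As an immediate mitigation (our own suggestion rather than a step from the original troubleshooting), lager allows the log level of a backend to be raised at runtime so that debug messages stop flooding the event handler:

> lager:set_loglevel(lager_console_backend, warning).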

Others have run into similar problems as well; this thread was a great help to our analysis, thank you.

Summary

Erlang provides a wealth of tools for going into a live system and analyzing problems on the spot, which helps locate issues quickly and efficiently. At the same time, the powerful Erlang/OTP platform gives the system a solid guarantee of stability. We will keep digging into Erlang and look forward to sharing more hands-on experience.

About the author

Weibo @liaolinbo, chief engineer at Cloud Ba. Previously worked at Oracle.
