Erlang process heap garbage collection mechanism

By http://blog.csdn.net/mycwq

Each Erlang process is created with its own PCB, stack, and private heap. Erlang cannot know in advance how a newly created process will be used, so the memory allocated at creation is relatively small. If the allocated space turns out to be insufficient, the Erlang GC grows the heap to meet the demand. If the heap is larger than needed, the GC shrinks it and the memory is reclaimed.
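The heap growth described above can be observed from inside a process. The following is a minimal sketch, not taken from the original article: the module name heap_growth and the function demo/0 are made up for illustration, and it assumes the default min_heap_size is in effect. It reads total_heap_size (in words) before and after building a large list that is still alive at the second measurement.

```erlang
-module(heap_growth).
-export([demo/0]).

%% Spawns a fresh process, measures its total heap size (in words)
%% before and after it builds a 100000-element list, and returns
%% {SizeBefore, SizeAfter}.
demo() ->
    Parent = self(),
    spawn(fun() ->
        {total_heap_size, S0} = erlang:process_info(self(), total_heap_size),
        L = lists:seq(1, 100000),
        {total_heap_size, S1} = erlang:process_info(self(), total_heap_size),
        %% L is used after the second measurement, so it is still live
        %% when the heap size is read.
        Parent ! {heap_sizes, S0, S1, length(L)}
    end),
    receive
        {heap_sizes, Before, After, _Len} -> {Before, After}
    after 5000 -> timeout
    end.
```

Calling heap_growth:demo() in the shell returns a pair whose second element is much larger than the first, because the list alone needs two words per element.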

The GC of the Erlang process heap is a generational GC. Generational GC rests on a statistical observation: most data is short-lived, and the newest data is the most likely to be used. Erlang separates data into a young heap and an old heap: the young heap holds new data, and the old heap holds old data, that is, data that has survived GC.

GC of the Erlang process heap has two main modes: shallow scan (minor collection) and deep scan (major collection).

Shallow scan (minor collection)

When the young heap runs out of space, Erlang scans the young heap and copies the live data into a newly allocated young heap. Data that has already survived an earlier scan is moved to the old heap instead, and the original young heap is then freed.

Within the young heap, Erlang uses a high-water mark to separate data that has already survived a scan from data that has not yet been scanned. The data moved from the young heap to the old heap is the data below the high-water mark, that is, the data that has survived at least one scan.
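One way to see shallow scans happening is the minor_gcs counter in the garbage_collection item of erlang:process_info/2. The sketch below is illustrative only: the module name minor_gc_count and the helper churn/1 are made up, and the contents of the garbage_collection list are documented as subject to change, so the presence of minor_gcs is an assumption that holds on current OTP releases.

```erlang
-module(minor_gc_count).
-export([demo/0]).

%% Spawns a fresh process that creates a lot of short-lived data and
%% then reports its minor_gcs counter (minor collections since the
%% last major collection).
demo() ->
    Parent = self(),
    spawn(fun() ->
        churn(1000),
        {garbage_collection, Info} =
            erlang:process_info(self(), garbage_collection),
        Parent ! {minor_gcs, proplists:get_value(minor_gcs, Info)}
    end),
    receive
        {minor_gcs, N} -> N
    after 5000 -> timeout
    end.

%% Builds a 1000-element list N times; each list becomes garbage as
%% soon as the next iteration starts.
churn(0) -> ok;
churn(N) ->
    _ = lists:seq(1, 1000),
    churn(N - 1).
```

The exact number returned varies from run to run, because the counter starts over whenever a deep scan takes place.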

Deep scan (major collection)

A deep scan is generally triggered when the old heap runs out of space. Erlang scans both the young heap and the old heap, copies the live data into a newly allocated young heap, and frees the original heaps.

A deep scan is also triggered by a manual GC call, or when the number of GC runs since the last deep scan exceeds the fullsweep_after limit.
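The manual trigger can be tried out with erlang:garbage_collect/1, which runs a deep scan on the given process. This is a rough sketch under the same assumption as before (that minor_gcs is reported by process_info/2); the module name major_gc_demo and its helpers are hypothetical.

```erlang
-module(major_gc_demo).
-export([demo/0]).

%% Returns {MinorBefore, MinorAfter}: the minor_gcs counter of a
%% worker process before and after a forced deep scan.
demo() ->
    Parent = self(),
    Pid = spawn(fun() ->
        churn(200),
        Parent ! {ready, self()},
        receive stop -> ok end
    end),
    receive {ready, Pid} -> ok after 5000 -> exit(timeout) end,
    Before = minor_gcs(Pid),
    %% Force a major collection on the (now idle) worker.
    true = erlang:garbage_collect(Pid),
    After = minor_gcs(Pid),
    Pid ! stop,
    {Before, After}.

minor_gcs(Pid) ->
    {garbage_collection, Info} =
        erlang:process_info(Pid, garbage_collection),
    proplists:get_value(minor_gcs, Info).

%% Creates short-lived lists so that some shallow scans take place.
churn(0) -> ok;
churn(N) ->
    _ = lists:seq(1, 1000),
    churn(N - 1).
```

Because a deep scan restarts the count of shallow scans, the second value should drop back toward zero while the worker sits idle in its receive.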



Control garbage collection

Take a game gateway process as an example. A gateway process usually handles a large number of messages, and most of them are simply forwarded, so it benefits from a large initial heap and from garbage being reclaimed quickly.

spawn_opt(Fun, [{min_heap_size, 5000}, {min_bin_vheap_size, 100000}, {fullsweep_after, 500}])

First, let's look at the default values of these parameters:
1> erlang:system_info(min_heap_size).
{min_heap_size,233}
2> erlang:system_info(min_bin_vheap_size).
{min_bin_vheap_size,46368}
3> erlang:system_info(fullsweep_after).
{fullsweep_after,65535}

min_heap_size is the minimum heap size of the process.

This parameter is used in two places: it is the size of the heap when the process is first created, and it is the lower bound the heap is kept at when GC shrinks it. min_bin_vheap_size is the minimum binary virtual heap size of the process. Both parameters are measured in words. A large enough initial heap reduces the number of minor collections and avoids the overhead of repeatedly allocating and reclaiming memory.

fullsweep_after controls the frequency of deep scans.

This parameter sets how many GC runs may take place before a deep scan is forced. The default value is 65535, which is rather large.

Taken together, the three parameters above make sure the process starts with enough memory, which cuts the overhead of repeatedly requesting memory. When that memory still runs short, GC allocates more, and a deep scan is forced once 500 GC runs have taken place.
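To check that the options actually took effect, the settings can be read back with erlang:process_info/2. The following is a small sketch; the module name gateway_opts is made up, and the idle receive loop merely stands in for a real gateway process.

```erlang
-module(gateway_opts).
-export([demo/0]).

%% Spawns a process with the gateway settings from the article and
%% returns the values the VM reports for it.
demo() ->
    Opts = [{min_heap_size, 5000},
            {min_bin_vheap_size, 100000},
            {fullsweep_after, 500}],
    Pid = spawn_opt(fun() -> receive stop -> ok end end, Opts),
    {garbage_collection, Info} =
        erlang:process_info(Pid, garbage_collection),
    Pid ! stop,
    %% Keep only the three settings of interest, in the order of Opts.
    [{K, proplists:get_value(K, Info)} || {K, _} <- Opts].
```

The two heap sizes reported may be somewhat larger than requested, since the VM rounds them up to the next heap size it supports; fullsweep_after is reported as given.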


Manual garbage collection

The previous section used fullsweep_after to control GC. GC can also be triggered manually.
This code appears in RabbitMQ; you can run the function periodically in your own project:
gc() ->
    [erlang:garbage_collect(P) || P <- erlang:processes(),
        {status, waiting} == erlang:process_info(P, status)],
    erlang:garbage_collect(),
    ok.
Of course, you can also add extra conditions, such as running GC only on processes that occupy more than 50 MB of memory.
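A memory threshold like the one just mentioned could be added along these lines. This is only a sketch, not RabbitMQ code: the module name gc_big and the function gc_above/1 are hypothetical, and the threshold is given in bytes because that is the unit of the memory item of process_info/2.

```erlang
-module(gc_big).
-export([gc_above/1]).

%% Garbage-collects every waiting process whose memory exceeds
%% Threshold bytes. Returns the pids that were collected.
gc_above(Threshold) ->
    [begin erlang:garbage_collect(P), P end
     || P <- erlang:processes(),
        is_big_and_waiting(P, Threshold)].

is_big_and_waiting(P, Threshold) ->
    %% process_info/2 returns undefined if the process has already
    %% exited; the catch-all clause covers that case as well as
    %% processes that are not waiting.
    case erlang:process_info(P, [status, memory]) of
        [{status, waiting}, {memory, M}] -> M > Threshold;
        _ -> false
    end.
```

For the 50 MB example, the call would be gc_big:gc_above(50 * 1024 * 1024).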

Memory occupied by an Erlang process

Use the following to check the memory occupied by an Erlang process. You can also try it with other parameters.
Fun = fun() -> receive after infinity -> ok end end.
erlang:process_info(erlang:spawn(Fun), memory).
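As one way of trying other parameters, the same idle process can be spawned with and without a larger min_heap_size and the two memory figures compared. A minimal sketch, with a made-up module name proc_mem, assuming the system default min_heap_size is smaller than 5000 words:

```erlang
-module(proc_mem).
-export([compare/0]).

%% Returns {DefaultBytes, LargeHeapBytes}: the memory of an idle
%% process spawned with default options and of one spawned with
%% min_heap_size set to 5000 words.
compare() ->
    Fun = fun() -> receive after infinity -> ok end end,
    P1 = spawn(Fun),
    P2 = spawn_opt(Fun, [{min_heap_size, 5000}]),
    {memory, M1} = erlang:process_info(P1, memory),
    {memory, M2} = erlang:process_info(P2, memory),
    %% The processes wait forever, so stop them explicitly.
    exit(P1, kill),
    exit(P2, kill),
    {M1, M2}.
```

The memory item is reported in bytes and covers the heap, the stack, and internal structures, so the second figure is the larger one.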

Side effects of erlang garbage collection

As mentioned above, the GC of the Erlang process heap is generational, but that only describes the overall strategy. Underneath, Erlang still uses mark-sweep. Mark-sweep GC runs periodically, which has two drawbacks: first, garbage is not reclaimed promptly; second, a GC run is expensive and interrupts execution. However, each Erlang process has its own heap, which can be collected independently. In addition, each heap is relatively small, and Erlang variables are single-assignment, so there is no need to trace repeated updates. As a result, the GC pause of one Erlang process does not cause a global interruption.

Erlang document reference
GC in Erlang works independently on each Erlang process, i.e. each Erlang process has its own heap, and that heap is GCed independently of other processes' heaps.
The current default GC is a "stop the world" generational mark-sweep collector. On Erlang systems running with multiple threads (the default on systems with more than one core), GC stops work on the Erlang process being GCed, but other Erlang processes on other OS threads within the same VM continue to run. The time the process spends stopped is normally short because the size of one process' heap is normally relatively small; much smaller than the combined size of all processes' heaps.
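Since each process is collected on its own, GC activity can also be observed per process. The sketch below uses erlang:trace/3 with the garbage_collection flag; it assumes the trace message tags gc_major_start and gc_major_end used by OTP 19 and later, and the module name gc_trace is hypothetical.

```erlang
-module(gc_trace).
-export([demo/0]).

%% Traces the GC events of a single worker process while a
%% collection is forced on it, and returns the event tags seen.
demo() ->
    Pid = spawn(fun() ->
        _ = lists:seq(1, 10000),
        receive stop -> ok end
    end),
    %% The calling process becomes the tracer of Pid only.
    1 = erlang:trace(Pid, true, [garbage_collection]),
    true = erlang:garbage_collect(Pid),
    Events = collect(Pid, []),
    Pid ! stop,
    Events.

%% Gathers trace message tags until none arrive for 500 ms.
collect(Pid, Acc) ->
    receive
        {trace, Pid, Tag, _Info} -> collect(Pid, [Tag | Acc])
    after 500 -> lists:reverse(Acc)
    end.
```

Only the traced worker produces events here; the calling process and every other process keep running while the worker is being collected.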

Conclusion

This article covered the GC of the Erlang process heap. Erlang has other GC mechanisms as well. For example, the shared binary heap and the heap fragments outside the process heap use reference-counting GC. They are not discussed here; I will cover them in the next article.


References:

http://blog.csdn.net/mycwq/article/details/26613275

http://www.cnblogs.com/me-sa/archive/2011/11/13/erlang0014.html
