Java automatic memory management

Last Update:2017-01-13 Source: Internet

Author: User


I recently read two books on the Java virtual machine, including their chapters on automatic memory management. They cover similar material, so here I summarize the three main aspects: memory division, memory allocation, and memory recovery.

Memory division (runtime data areas)

Figure: JVM runtime data areas

From a thread's point of view, the runtime data areas can be divided into thread-shared and thread-private areas. The gray area on the left of the figure above is shared by all threads and includes the heap, the method area, and the runtime constant pool. The area on the right is private to each thread and includes the program counter and the virtual machine stack.

Heap

The heap is the largest region of memory managed by the virtual machine and is shared by all threads. It mainly stores object instances, although not every object is necessarily allocated on the heap. The heap is also the main area managed by the garbage collector.

Method area

Like the heap, the method area is shared by all threads. It stores data such as class information, constants, static variables, and code compiled by the just-in-time (JIT) compiler. The Java Virtual Machine Specification describes the method area as logically part of the heap, but implementations are free to manage it separately. HotSpot, for example, implemented it as the permanent generation before JDK 8 and has used Metaspace in native memory since then.
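To see how a running JVM reports these areas, the standard java.lang.management API lists its memory pools and tags each one as heap or non-heap. The sketch below is only an illustration: the class name MemoryPools is made up for this example, and the pool names printed depend on the JVM and the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

// Illustrative helper (hypothetical name): groups the JVM's memory pools by type.
final class MemoryPools {
    /** Names of the memory pools the running JVM reports for the given type. */
    static List<String> poolNames(MemoryType type) {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == type) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Pool names are implementation-specific, so no fixed output is shown here.
        System.out.println("heap:     " + poolNames(MemoryType.HEAP));
        System.out.println("non-heap: " + poolNames(MemoryType.NON_HEAP));
    }
}
```

On a recent HotSpot JVM the non-heap list typically contains a pool for class metadata, which is where the method area's class information lives.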

Runtime constant pool

The runtime constant pool is part of the method area. It holds the constant pool information from the class file, mainly literals and symbolic references. Constants do not have to be generated at compile time: new constants can also be added to the pool at run time, for example through the intern() method of the String class.
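A small sketch of the intern() behavior mentioned above. The class and method names are invented for this example; the only API used is String.intern(). A string assembled at run time is a separate object from the literal, but interning it returns the pooled instance.

```java
// Illustrative only: InternDemo and its method are hypothetical names.
final class InternDemo {
    /** True when a string built at run time is a distinct object, yet interns to the literal. */
    static boolean internedMatchesLiteral() {
        String literal = "runtime-pool";   // literal, resolved through the constant pool
        String built = new StringBuilder("runtime").append("-pool").toString(); // created at run time
        boolean distinctObjects = built != literal;          // reference comparison on purpose
        boolean sameAfterIntern = built.intern() == literal; // intern() returns the pooled instance
        return distinctObjects && sameAfterIntern;
    }

    public static void main(String[] args) {
        System.out.println(internedMatchesLiteral());
    }
}
```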

Program counters

The program counter is similar to the program counter in an operating system, except that it holds the address of the bytecode instruction currently being executed. After the bytecode interpreter executes an instruction, it updates the program counter to point to the next instruction. Each thread needs its own program counter so that a multithreaded program resumes at the correct instruction after a thread switch and the counters of different threads do not affect one another.

Virtual Machine stack

The virtual machine stack describes the memory model of Java method execution. Its basic unit is the stack frame, and a stack frame is created each time a method is invoked. The virtual machine always executes the method that corresponds to the frame at the top of the stack. When one method calls another, a new frame for the called method is created and pushed onto the stack. When the called method returns, its return value is passed to the caller's frame and its own frame is popped off the stack.
A stack frame stores the local variable table, the operand stack, dynamic linking information, and the method return address. The size of the local variable table is determined at compile time; the table holds the local variables of the method. The operand stack is used for computation: for an addition, for example, two operands are popped from the top of the stack and the result is pushed back.
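The way an operand stack is used can be sketched with a toy evaluator. This is not how the JVM is implemented: the instruction names "add" and "mul" and the class OperandStackDemo are assumptions made for this example, loosely modeled on bytecode instructions that pop two operands and push one result.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of an operand stack; instruction names are hypothetical.
final class OperandStackDemo {
    /** Runs a tiny program: integer tokens are pushed, "add"/"mul" pop two operands and push the result. */
    static int run(String... program) {
        Deque<Integer> operands = new ArrayDeque<>();
        for (String instruction : program) {
            switch (instruction) {
                case "add": {
                    int right = operands.pop();
                    int left = operands.pop();
                    operands.push(left + right);
                    break;
                }
                case "mul": {
                    int right = operands.pop();
                    int left = operands.pop();
                    operands.push(left * right);
                    break;
                }
                default:
                    operands.push(Integer.parseInt(instruction)); // treat anything else as a constant
            }
        }
        return operands.pop(); // the result is whatever remains on top
    }
}
```

For example, run("1", "2", "add") pushes 1 and 2, pops both for the addition, and leaves 3 on the stack.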

One area not mentioned above is the native method stack. It is similar to the virtual machine stack but serves native methods. The Java Virtual Machine Specification does not mandate how it is implemented, so each virtual machine is free to implement it as it sees fit.

Memory allocation

Java heap generation division

As mentioned above, the heap is the main area of memory management. It stores object instances and can be further divided into the young generation and the old generation. The young generation consists of the Eden space, the From Survivor space, and the To Survivor space. This division exists mainly to make memory recovery easier; the purpose of each space becomes clear in the discussion of memory recovery.

When Java code creates an object with new, the JVM first checks whether the operand of the new instruction resolves to a symbolic reference to a class in the constant pool. It then checks whether the class that this reference denotes has already been loaded, resolved, and initialized; if not, the class is loaded first. Once the class is loaded, the amount of memory an instance requires is known, and the JVM allocates memory for the new object.

The next question is where the object is allocated. In general, objects are allocated in the Eden space of the heap. If thread-local allocation buffers are enabled, the object is first allocated in the thread's TLAB (Thread Local Allocation Buffer). A TLAB is a thread-private part of the Eden space and takes up about 1% of Eden. If allocation in Eden fails, a young-generation garbage collection is triggered. Large objects that need a large amount of contiguous memory are allocated directly in the old generation.

Another concept involved is escape analysis. As mentioned above, not all objects are allocated on the heap: some can be allocated on the stack, which here means the local variable table in a virtual machine stack frame. Escape analysis is an analysis the JVM performs as groundwork for performance optimizations; its goal is to determine the scope of an object. If an object is used only inside the method body, memory for it can be allocated on the stack. The stack frame is destroyed when the method exits, so the object never needs to be garbage collected. However, once an object created inside the method is referenced from outside it, the object escapes and cannot be allocated on the stack.
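The difference between an object that stays inside a method and one that escapes can be shown with two small methods. The names EscapeDemo, Point, and lastPoint are made up for this sketch. Whether the JIT compiler actually avoids the heap allocation in the first method depends on the JVM and its settings, so the code only illustrates the two situations and does not prove where the memory ends up.

```java
// Illustrative only: all names here are hypothetical.
final class EscapeDemo {
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    static Point lastPoint; // reachable from outside any single method call

    /** The Point never leaves this method, so it is a candidate for stack allocation. */
    static int sumNoEscape(int x, int y) {
        Point p = new Point(x, y);
        return p.x + p.y;
    }

    /** The Point is stored in a static field, so it escapes the method and must stay on the heap. */
    static int sumEscaping(int x, int y) {
        Point p = new Point(x, y);
        lastPoint = p;
        return p.x + p.y;
    }
}
```

Both methods return the same value; only the lifetime of the Point differs, and that lifetime is exactly what escape analysis tries to establish.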

Memory recovery

Memory recovery involves several questions: which memory needs to be recycled, when it is recycled, and how it is recycled.

Determination of recyclable objects

Two commonly used algorithms are reference counting and root search (also known as reachability analysis).

Reference counting adds a counter to each object. Whenever a new reference to the object is created, the counter is incremented by 1; when a reference becomes invalid, the counter is decremented by 1. An object whose counter is 0 is no longer in use. The drawback of this approach is that it cannot easily handle circular references: if two otherwise unused objects reference each other, their counters never reach 0 and the garbage collector never reclaims them.
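The circular reference problem can be reproduced with a hand-rolled counter. This is a simulation only: Java itself does not expose reference counts, and the Node class and its refCount field are assumptions made for the example.

```java
// Simulation of reference counting; all names are hypothetical.
final class RefCountDemo {
    static final class Node {
        int refCount; // number of references currently pointing at this node
        Node field;   // a single outgoing reference

        /** Redirects the outgoing reference and keeps both counters up to date. */
        void pointTo(Node target) {
            if (field != null) {
                field.refCount--;
            }
            field = target;
            if (target != null) {
                target.refCount++;
            }
        }
    }

    /** Builds a two-node cycle, drops the external references, and returns the remaining counts. */
    static int[] countsAfterDroppingCycle() {
        Node a = new Node();
        Node b = new Node();
        a.refCount++;  // external reference held by a local variable
        b.refCount++;  // external reference held by a local variable
        a.pointTo(b);
        b.pointTo(a);
        a.refCount--;  // simulate the external reference to a going away
        b.refCount--;  // simulate the external reference to b going away
        return new int[] { a.refCount, b.refCount };
    }
}
```

Both counters end at 1 rather than 0, so a pure reference-counting collector would keep the two nodes alive even though nothing outside the cycle can reach them.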

The root search algorithm starts from a set of objects called "GC Roots" and searches downward from these nodes. The paths traversed are called reference chains. When no reference chain connects an object to any GC Root, the object is unreachable and can be reclaimed. GC Roots include objects referenced from the local variable tables of stack frames, objects referenced by static fields of classes in the method area, objects referenced by constants in the method area, and objects referenced from the native method stack.
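The search from GC Roots is an ordinary graph traversal. The sketch below models objects as strings and references as a map from an object to the objects it points at; this representation and the class name Reachability are assumptions for illustration, not the JVM's data structures.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy reachability analysis over a hypothetical object graph.
final class Reachability {
    /** Returns every object connected to at least one GC root by a reference chain. */
    static Set<String> reachable(Map<String, List<String>> references, List<String> gcRoots) {
        Set<String> marked = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(gcRoots);
        while (!pending.isEmpty()) {
            String object = pending.pop();
            if (marked.add(object)) { // first visit: follow its outgoing references
                pending.addAll(references.getOrDefault(object, Collections.<String>emptyList()));
            }
        }
        return marked;
    }

    /** Example graph: root -> a -> b, plus an unreachable cycle c <-> d. */
    static Set<String> demo() {
        Map<String, List<String>> references = new HashMap<>();
        references.put("root", Arrays.asList("a"));
        references.put("a", Arrays.asList("b"));
        references.put("c", Arrays.asList("d"));
        references.put("d", Arrays.asList("c"));
        return reachable(references, Arrays.asList("root"));
    }
}
```

In the demo graph, c and d reference each other but are not reachable from the root, so they are not marked. This is the case that reference counting cannot handle.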

Garbage collection algorithm

These algorithms include the mark-sweep algorithm, the copying algorithm, and the mark-compact algorithm.

Mark-sweep algorithm: first mark all objects to be reclaimed, then reclaim every marked object once marking is complete. Neither the marking nor the sweeping phase is very efficient, and sweeping leaves behind many non-contiguous memory fragments. With too much fragmentation, a later allocation of a large object may fail to find enough contiguous memory and trigger another garbage collection early. The free fragments can be recorded in a free list, which supplies the memory address for the next allocation.
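The fragmentation problem and the free list can be sketched on a toy heap of fixed-size cells. In this example the boolean array records which cells are still live, and the sweep turns every other cell into free-list entries. The class name, the cell model, and the first-fit policy are all assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy sweep phase and free-list allocator; names and layout are hypothetical.
final class MarkSweepDemo {
    /** Sweep: every non-live cell is freed; returns the free list as {start, length} runs. */
    static List<int[]> sweep(boolean[] live) {
        List<int[]> freeList = new ArrayList<>();
        int i = 0;
        while (i < live.length) {
            if (live[i]) {
                i++;
                continue;
            }
            int start = i;
            while (i < live.length && !live[i]) {
                i++;
            }
            freeList.add(new int[] { start, i - start });
        }
        return freeList;
    }

    /** First-fit allocation from the free list; returns the start address, or -1 if no run is big enough. */
    static int allocate(List<int[]> freeList, int size) {
        for (int[] run : freeList) {
            if (run[1] >= size) {
                int address = run[0];
                run[0] += size; // shrink the run from the front
                run[1] -= size;
                return address;
            }
        }
        return -1;
    }
}
```

With live cells at positions 0, 2, 5, and 7 of an 8-cell heap, four cells are free in total, yet a request for three contiguous cells fails because the largest free run has length two.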

Copying algorithm: memory is divided into two equal halves and only one half is used at a time. When that half is used up, the surviving objects are copied to the other half and the used half is then cleared in a single step. This is efficient, but it costs half of the memory space. The copying algorithm is used for young-generation garbage collection in the heap: surviving objects in the Eden space and the From Survivor space are copied to the To Survivor space, and the From and To Survivor spaces then swap roles. (If the generational age of a surviving object in the Eden space or the From Survivor space exceeds a certain threshold, or if the To Survivor space is full, the object is moved directly to the old generation.) In HotSpot, the default ratio of the Eden space to the two Survivor spaces is 8:1:1. The copying algorithm works for the young generation because most young-generation objects have a very short life cycle.
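A young-generation collection along these lines can be sketched with lists standing in for the Eden, From, To, and old spaces. Everything here is a simplified model: the class CopyingDemo, the per-object live flag, the capacity counted in objects, and the exact promotion rule are assumptions for the example, and a real collector works on raw memory with more detailed aging rules.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a copying young-generation collection; names and rules are hypothetical.
final class CopyingDemo {
    static final class Obj {
        final String name;
        final boolean live; // whether the object is still reachable
        int age;            // number of collections survived so far

        Obj(String name, boolean live, int age) {
            this.name = name;
            this.live = live;
            this.age = age;
        }
    }

    final List<Obj> eden = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();
    List<Obj> from = new ArrayList<>();
    List<Obj> to = new ArrayList<>();
    final int toCapacity;      // assumed capacity of the To space, counted in objects
    final int tenureThreshold; // assumed age above which an object is promoted

    CopyingDemo(int toCapacity, int tenureThreshold) {
        this.toCapacity = toCapacity;
        this.tenureThreshold = tenureThreshold;
    }

    /** Copies survivors from Eden and From into To (or the old generation), then swaps From and To. */
    void minorGc() {
        List<Obj> candidates = new ArrayList<>(eden);
        candidates.addAll(from);
        for (Obj o : candidates) {
            if (!o.live) {
                continue; // dead objects are simply not copied
            }
            o.age++;
            if (o.age > tenureThreshold || to.size() >= toCapacity) {
                old.add(o); // too old, or To is full: promote
            } else {
                to.add(o);
            }
        }
        eden.clear();
        from.clear();
        List<Obj> previousFrom = from; // the emptied From becomes the next To
        from = to;
        to = previousFrom;
    }
}
```

After minorGc() the Eden list and the new To list are empty, the survivors sit in the new From list, and promoted objects have moved to the old list.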

Mark-compact algorithm: similar to the mark-sweep algorithm, except that all surviving objects are moved to one end of the memory and everything beyond that boundary is then cleared. This algorithm is used for garbage collection in the old generation. Because compaction leaves one large contiguous free area, a single pointer to the start of that area is enough to indicate where the next allocation goes.
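The compaction step and the resulting allocation pointer can be sketched on an array of cells. The array model and the name MarkCompactDemo are assumptions for this example; a real collector also has to update every reference to a moved object, which is omitted here.

```java
// Toy compaction over a hypothetical cell array; reference fix-up is omitted.
final class MarkCompactDemo {
    /** Slides live cells toward index 0, clears the rest, and returns the index of the first free cell. */
    static int compact(String[] heap, boolean[] live) {
        int free = 0;
        for (int i = 0; i < heap.length; i++) {
            if (live[i]) {
                heap[free] = heap[i]; // move the survivor down to the next free slot
                free++;
            }
        }
        for (int i = free; i < heap.length; i++) {
            heap[i] = null; // everything past the boundary is cleared
        }
        return free; // the next allocation can start here
    }
}
```

The returned index plays the role of the pointer described above: allocating n cells only requires checking that free + n fits in the heap and then advancing the pointer.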

Garbage collector

There are many garbage collectors, and a virtual machine integrates several of them; this article does not go through them one by one. One thing worth mentioning is the Stop-the-World mechanism: in general, worker threads must pause for a period of time while garbage collection is in progress. Whichever collector is used, there is always some Stop-the-World time.

Another consideration is the trade-off between throughput and low latency. Throughput is the ratio of the CPU time spent running user code to the total CPU time consumed. You can use -XX:MaxGCPauseMillis to set a target for the maximum Stop-the-World pause caused by garbage collection. However, lowering this value to reduce latency makes the young generation smaller; a smaller space is exhausted sooner, which leads to more frequent GC. The total GC time may then increase, reducing program throughput.
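A rough throughput figure for a running program can be estimated with the standard java.lang.management API. Note the assumptions: the sketch uses the elapsed collection time reported by the collectors and the JVM uptime, not CPU time, so it only approximates the definition above. The class name GcThroughput is made up for this example.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Approximation based on elapsed time, not CPU time; names are hypothetical.
final class GcThroughput {
    /** Total time in milliseconds that the collectors report having spent so far. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long time = gc.getCollectionTime(); // -1 when the collector does not report a time
            if (time > 0) {
                total += time;
            }
        }
        return total;
    }

    /** Fraction of the total time that was not spent in garbage collection. */
    static double throughput(long gcMillis, long totalMillis) {
        return 1.0 - (double) gcMillis / totalMillis;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.println(throughput(totalGcMillis(), uptime));
    }
}
```

Comparing this figure across runs with different -XX:MaxGCPauseMillis values is one way to observe the trade-off: a lower pause target tends to increase the total GC time and lower the result.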

This article is an English version of an article originally published in Chinese on aliyun.com.
