I haven't written a blog post in a long time, and I'm rather ashamed of that. Today let's talk about Java memory management.
Brief introduction
One of Java's advantages over traditional languages such as C and C++ is automatic memory management, which relieves developers of the task of managing memory themselves.
This article outlines how the JVM manages memory in the J2SE 5.0 release, and offers some suggestions and references for selecting a collector and configuring its parameters.
Manual vs Automatic memory management
Memory management is the process of recognizing when allocated objects are no longer in use and freeing the space they occupy. In many programming languages, memory management is the developer's responsibility. The task is complex, however, and leads to many common errors that cause unexpected behavior or crashes. As a result, a large share of developer time is spent debugging and fixing such errors.
One frequent problem in manual memory management is the dangling reference. The space occupied by an object may be freed while other references to that object still exist. If that space is later reallocated to a new object, an access through one of the old references produces unpredictable results.
Another common problem is the space leak. A leak occurs when memory is allocated and then becomes unreferenced without ever being released. For example, if a developer intends to free a linked list but, because of a small bug, frees only the head node, the remaining nodes can no longer be reached by the program and so can never be freed or reused. If enough leaks accumulate, they consume memory until none is left.
An alternative to manual memory management, now common especially in object-oriented languages, is automatic management by a program called a garbage collector. Automatic memory management also allows interfaces to be written at a higher level of abstraction.
The garbage collector solves the dangling reference problem because an object that is still referenced is never reclaimed. It also solves the space leak problem because leaked space belongs to objects that are no longer referenced, and those are exactly the objects the collector reclaims.
The concept of the garbage collector
The garbage collector has some responsibilities:
- Allocating memory
- Ensuring that referenced objects remain in memory
- Freeing up the space occupied by those unreachable objects
Objects that are referenced are said to be live. Objects that are no longer referenced are considered dead and are termed garbage. The process of finding dead objects and freeing the space they occupy is called garbage collection.
The garbage collector solves many memory management problems, but not all of them. For example, a developer can keep creating objects and holding references to them until no memory is left. Garbage collection is also a complex task in its own right that consumes time and resources.
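A minimal Java sketch of the point above, assuming a hypothetical `RETAINED` list that stands in for a cache the program never clears. As long as the list holds a strong reference, the collector must keep the object, no matter how often it runs. The class and method names are made up for illustration.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

class RetentionDemo {
    // Hypothetical never-cleared cache: everything added here stays reachable.
    static final List<byte[]> RETAINED = new ArrayList<>();

    /** Allocates a block, keeps a strong reference to it, and returns a weak one. */
    static WeakReference<byte[]> allocateAndRetain(int bytes) {
        byte[] block = new byte[bytes];
        RETAINED.add(block);               // the strong reference that pins the block
        return new WeakReference<>(block); // lets us observe whether it was collected
    }

    public static void main(String[] args) {
        WeakReference<byte[]> ref = allocateAndRetain(1024);
        System.gc(); // only a hint to the JVM, but the block cannot be collected anyway
        System.out.println("still reachable: " + (ref.get() != null));
    }
}
```

Removing the entry from `RETAINED` is what would make the block eligible for collection; the collector cannot guess that the program no longer needs it.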
The algorithms used to organize memory and to allocate and free space are handled by the garbage collector and hidden from the developer. Space is usually allocated from a large pool of memory called the heap.
The timing of garbage collection is usually decided by the garbage collector itself. Typically, the entire heap or a subset of it is collected when it fills up or reaches a certain occupancy threshold.
Fulfilling an allocation request means finding a block of unused memory of the right size in the heap, and this is not a simple task. The main problem for a dynamic allocation algorithm is avoiding fragmentation while keeping both allocation and deallocation as efficient as possible.
Desirable garbage collector characteristics
A garbage collector must be both safe and comprehensive. That is, live data must never be freed by mistake, and garbage should not remain unreclaimed for more than a few collection cycles.
It is also desirable for the garbage collector to run efficiently, without introducing long pauses during which the application is not running. As in most systems, however, there are trade-offs between time, space, and frequency. For example, if the heap is small, each collection is fast, but the heap fills up more quickly and so requires more frequent collections. Conversely, a larger heap takes longer to fill, so collections are less frequent, but each one may take longer.
Another desirable characteristic is the limiting of fragmentation. When the memory occupied by garbage objects is reclaimed, the free space may end up as small chunks scattered across several regions. When that happens, there may be no single contiguous area large enough to satisfy a request for a large object. One way to eliminate fragmentation is called compaction.
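Fragmentation is easy to see in a toy model. The sketch below is an illustration only, not how the JVM represents its heap: it assumes a hypothetical 16-cell heap modeled as a `boolean[]` in which every other 2-cell object has been freed. Half of the heap is free, yet no request larger than 2 cells can be satisfied.

```java
class FragmentationDemo {
    /** Returns the length of the longest run of free (false) cells. */
    static int largestFreeRun(boolean[] used) {
        int best = 0;
        int run = 0;
        for (boolean u : used) {
            run = u ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }

    /** Returns the total number of free cells, contiguous or not. */
    static int totalFree(boolean[] used) {
        int free = 0;
        for (boolean u : used) {
            if (!u) free++;
        }
        return free;
    }

    /** Builds a 16-cell heap: cells 0-1 used, 2-3 free, 4-5 used, and so on. */
    static boolean[] checkerboardHeap() {
        boolean[] used = new boolean[16];
        for (int i = 0; i < used.length; i++) {
            used[i] = (i / 2) % 2 == 0;
        }
        return used;
    }

    public static void main(String[] args) {
        boolean[] heap = checkerboardHeap();
        System.out.println("total free cells: " + totalFree(heap));
        System.out.println("largest contiguous free block: " + largestFreeRun(heap));
    }
}
```

Compaction would move the eight used cells together, turning the same eight free cells into one contiguous block.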
Scalability is also important. Allocation should not become a scalability bottleneck for multithreaded applications on multiprocessor systems, and neither should collection.
Design options
When designing and choosing a garbage collection algorithm, it is often necessary to make some choices:
- Serial versus parallel collection. With serial collection, only one thing happens at a time. For example, even when multiple CPUs are available, only one is used to perform the collection. With parallel collection, the garbage collection task is split into parts that are executed simultaneously on different CPUs. Parallel operation lets the collection finish sooner, at the cost of additional complexity and potential fragmentation.
- Concurrent versus stop-the-world collection. When a stop-the-world collection runs, execution of the application is completely suspended for the duration of the collection. Alternatively, garbage collection work can be performed concurrently, that is, at the same time as the application runs. Typically, a concurrent collector does most of its work concurrently but still needs a few short stop-the-world pauses. A stop-the-world collector is simpler than a concurrent one because the heap is frozen during collection and objects do not change. Its drawback is that some applications cannot tolerate being paused. With a concurrent collector, pauses are shorter, but the collector must take extra care because it operates on objects that the application may be updating at the same time. This extra work adds overhead that can affect performance, and a concurrent collector generally needs a larger heap.
- Compacting versus non-compacting versus copying. Once the collector has determined which objects in memory are live and which are garbage, it can compact the memory by moving all live objects together and completely reclaiming the remaining space. After compaction, allocating space for a new object is easy and fast: a simple pointer tracks the next free location. A non-compacting collector, by contrast, releases the space occupied by garbage objects in place. It does not move the live objects together, so it cannot create the large contiguous free region that a compacting collector provides. The advantage of non-compaction is that collection completes faster; the drawback is potential fragmentation. In general, allocation is also more expensive from a heap collected in place than from a compacted one, because the heap may have to be searched for a contiguous area large enough for the new object. A third option is a copying collector, which copies all live objects to a different memory area. The benefit is that the source area can then be treated as empty and available for fast, easy allocation. The drawback is the extra time needed for copying and the extra space that may be required.
Performance metrics
Several metrics are used to evaluate garbage collector performance:
- Throughput: the percentage of total time not spent in garbage collection.
- Garbage collection overhead: the inverse of throughput, that is, the percentage of total time spent in garbage collection.
- Pause time: the length of time during which application execution is stopped while garbage collection runs.
- Collection frequency: how often collection occurs, relative to application execution.
- Footprint: a measure of size, such as the size of the heap.
- Promptness: the time between when an object becomes garbage and when the memory it occupies becomes available.
An interactive application may require low pause times, whereas total execution time matters more to a non-interactive one. A real-time application demands small upper bounds on both garbage collection pauses and the proportion of time spent in the collector. On a personal computer or embedded system, a small footprint may be the main concern.
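Throughput and overhead can be estimated from inside a running JVM. The sketch below uses the standard `java.lang.management` API (available since J2SE 5.0); the class name `GcMetrics` and its helper methods are made up for illustration. Note that `getCollectionTime()` is only an approximate accumulated time, and for collectors that work concurrently it is not the same thing as pause time, so treat the result as a rough figure.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcMetrics {
    /** Throughput: the fraction of total time NOT spent in garbage collection. */
    static double throughput(long gcMillis, long totalMillis) {
        if (totalMillis <= 0) {
            throw new IllegalArgumentException("totalMillis must be positive");
        }
        return 1.0 - (double) gcMillis / totalMillis;
    }

    /** Sums the accumulated collection time reported by every collector in this JVM. */
    static long totalGcMillis() {
        long sum = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) sum += t;
        }
        return sum;
    }

    public static void main(String[] args) {
        long uptime = Math.max(1, ManagementFactory.getRuntimeMXBean().getUptime());
        long gc = totalGcMillis();
        System.out.printf("GC time: %d ms of %d ms, throughput: %.2f%%%n",
                gc, uptime, 100.0 * throughput(gc, uptime));
    }
}
```

Garbage collection overhead is simply one minus the value returned by `throughput`.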
Generational collection
With generational collection, memory is divided into generations, that is, separate pools holding objects of different ages. For example, the most widely used configuration in Java has two generations: a young generation for young objects and an old generation for old objects.
Different generations can be collected with different algorithms, each optimized for the characteristics of its generation. Generational collection relies on the following observations, known as the weak generational hypothesis, about applications written in many languages, including Java:
- Most allocated objects do not survive for long periods of time.
- Few references from older objects to younger objects exist.
Young generation collections occur relatively frequently and are fast, because the young generation is usually small and likely to contain many objects that are no longer referenced.
Objects that survive some number of young generation collections are eventually promoted to the old generation. The old generation is typically larger than the young generation, and its occupancy grows more slowly. As a result, old generation collections are infrequent but take considerably longer to complete.
The algorithm chosen for the young generation usually favors speed, since young generation collections are frequent. The old generation, on the other hand, is typically managed by an algorithm that is more space efficient, because the old generation takes up most of the heap and its algorithm has to work well with low garbage densities.
Garbage collectors in the J2SE 5.0 JVM
The J2SE 5.0 JVM includes four garbage collectors, all of them generational. This section describes the generations and the types of collection, and discusses why object allocation is often fast and efficient. It then provides detailed information about each collector.
HotSpot generations
In the HotSpot JVM, memory is organized into three generations: the young generation and old generation mentioned earlier, plus a permanent generation. Most objects are initially allocated in the young generation. The old generation contains objects that have survived a number of young generation collections, as well as some large objects that may be allocated directly in the old generation. The permanent generation holds objects that the JVM finds convenient to have the garbage collector manage, such as objects describing classes and methods, as well as the classes and methods themselves.
The young generation consists of an area called Eden plus two smaller survivor spaces.
Most objects are initially allocated in Eden. (As mentioned earlier, a few large objects may be allocated directly in the old generation.) The survivor spaces hold objects that have survived at least one young generation collection, giving them additional chances to die before they are considered old enough to be promoted to the old generation. At any given time, one survivor space (labeled From) holds such objects, while the other (labeled To) is empty and remains unused until the next collection.
Garbage collection types
When the young generation fills up, a young generation collection (sometimes called a minor collection) is performed on just that generation. When the old or permanent generation fills up, a full collection (sometimes called a major collection) is typically performed, in which all generations are collected. Normally the young generation is collected first, using the algorithm designed specifically for it, because that is usually the most efficient way to identify garbage in the young generation. The old generation collection algorithm is then run on both the old and permanent generations. If compaction occurs, each generation is compacted separately.
Sometimes the old generation is too full to accept the objects that would be promoted from the young generation if the young generation were collected first. In that case, for all collectors except CMS, the young generation algorithm is not run. Instead, the old generation algorithm is used on the entire heap. (The CMS old generation algorithm is a special case because it cannot collect the young generation.)
Fast allocation
In many cases, large contiguous blocks of memory are available from which to allocate objects. Allocation from such blocks is very efficient using a simple bump-the-pointer technique: a pointer tracks the end of the most recently allocated object. When a new allocation request arrives, the JVM only needs to check whether the object fits in the space between that pointer and the end of the generation and, if so, advance the pointer and initialize the object.
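Bump-the-pointer allocation can be sketched in a few lines. This is a toy model under stated assumptions, not JVM code: addresses are plain `int` offsets into an imaginary region, and `BumpAllocator` is a hypothetical name. The whole allocation path is one comparison and one addition.

```java
class BumpAllocator {
    private final int end; // one past the last usable address in the region
    private int top = 0;   // end of the most recently allocated object

    BumpAllocator(int capacity) {
        this.end = capacity;
    }

    /** Returns the start offset of the new object, or -1 if it does not fit. */
    int allocate(int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        if (end - top < size) {
            return -1; // not enough room; a real JVM would trigger a collection here
        }
        int start = top;
        top += size;   // "bump" the pointer past the new object
        return start;
    }

    int used() {
        return top;
    }

    public static void main(String[] args) {
        BumpAllocator region = new BumpAllocator(100);
        System.out.println(region.allocate(40)); // first object starts at offset 0
        System.out.println(region.allocate(40)); // second object starts right after it
        System.out.println(region.allocate(40)); // does not fit in the remaining 20
    }
}
```

There is no free list to search, which is why allocation after compaction or copying is so cheap.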
For multithreaded applications, allocation must be thread-safe. If a global lock were used to guarantee this, allocation into a generation would become a performance bottleneck. Instead, the JVM uses a technique called thread-local allocation buffers (TLABs). Each thread is given its own buffer, a small portion of the generation, from which to allocate, which improves multithreaded allocation throughput. Because only one thread allocates into each TLAB, allocation there can use bump-the-pointer without any locking. Synchronization is needed only in the infrequent case where a thread fills its TLAB and has to obtain a new one. Several techniques are also used to minimize the space wasted by TLABs; on average, the waste is less than 1% of Eden. Together, TLABs and bump-the-pointer make allocation very efficient, requiring only about 10 native instructions.
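The TLAB idea can be modeled the same way. The sketch below is a simplification under stated assumptions: `TlabHeap`, `Tlab`, and the fixed 64-byte buffer size are hypothetical, addresses are `int` offsets, and real TLABs are sized adaptively by the JVM. The point it illustrates is that only the refill step is synchronized, while the common allocation path touches thread-private fields only.

```java
class TlabHeap {
    static final int TLAB_SIZE = 64; // hypothetical fixed buffer size
    private final int edenEnd;
    private int sharedTop = 0;       // next unclaimed address in Eden (shared state)

    TlabHeap(int edenSize) {
        this.edenEnd = edenSize;
    }

    /** The only synchronized step: hand a fresh buffer to a thread. */
    private synchronized int claimBuffer() {
        if (edenEnd - sharedTop < TLAB_SIZE) return -1; // Eden exhausted
        int start = sharedTop;
        sharedTop += TLAB_SIZE;
        return start;
    }

    synchronized int claimed() {
        return sharedTop;
    }

    Tlab newTlab() {
        return new Tlab();
    }

    /** One buffer per thread; its fields are touched by the owning thread only. */
    final class Tlab {
        private int top = 0;
        private int end = 0;

        /** Returns the address of the new object, or -1 if Eden is exhausted. */
        int allocate(int size) {
            if (size <= 0 || size > TLAB_SIZE) {
                throw new IllegalArgumentException("size must be in 1.." + TLAB_SIZE);
            }
            if (end - top < size) {      // buffer full: refill (the rare path)
                int start = claimBuffer();
                if (start < 0) return -1;
                top = start;             // leftover space in the old buffer is wasted
                end = start + TLAB_SIZE;
            }
            int address = top;           // common path: unsynchronized bump-the-pointer
            top += size;
            return address;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        TlabHeap heap = new TlabHeap(1024);
        Runnable worker = () -> {
            Tlab tlab = heap.newTlab();  // each thread owns its own buffer
            for (int i = 0; i < 4; i++) tlab.allocate(16);
        };
        Thread t1 = new Thread(worker);
        Thread t2 = new Thread(worker);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println("Eden claimed: " + heap.claimed() + " bytes");
    }
}
```

The space left at the end of a buffer when it is refilled is the waste mentioned in the text.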
Serial collector
With the serial collector, both young and old generation collections are performed serially (using a single CPU), in a stop-the-world fashion: application execution is halted while a collection takes place.
Serial collection--Young generation
A young generation collection with the serial collector works as follows. Live objects in Eden are copied to the empty survivor space, labeled To, except for objects too large to fit in To, which are copied directly to the old generation. Live objects in the occupied survivor space, labeled From, that are still relatively young are also copied to To, while objects that are relatively old are copied to the old generation. Note that if To becomes full, the live objects in Eden and From that have not yet been copied go straight to the old generation, regardless of how many young generation collections they have survived. Any objects left in Eden or From after the live objects have been copied are, by definition, not live.
After a young generation collection completes, both Eden and the From survivor space are empty, and only the To survivor space contains live objects. At this point From and To swap roles.
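The copying steps above can be sketched as a small simulation. It makes several simplifying assumptions that are not in the original text: liveness is a flag set by the caller rather than the result of tracing references, every object has size 1 (so the "too large for To" case is not modeled), and `TENURE_AGE = 2` is a hypothetical promotion threshold. All class and field names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class YoungGenDemo {
    static final int TENURE_AGE = 2; // hypothetical promotion threshold

    static final class Obj {
        final String name;
        boolean live = true; // stand-in for "still reachable"
        int age = 0;         // number of young collections survived
        Obj(String name) { this.name = name; }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();
    List<Obj> to = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();
    final int survivorCapacity;

    YoungGenDemo(int survivorCapacity) {
        this.survivorCapacity = survivorCapacity;
    }

    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o); // new objects start in Eden
        return o;
    }

    void minorCollect() {
        for (Obj o : eden) if (o.live) copy(o); // live Eden objects are evacuated
        for (Obj o : from) if (o.live) copy(o); // so are live From objects
        eden.clear();                            // whatever was not copied is garbage
        from.clear();
        List<Obj> tmp = from;                    // From and To swap roles
        from = to;
        to = tmp;
    }

    private void copy(Obj o) {
        o.age++;
        if (o.age >= TENURE_AGE || to.size() >= survivorCapacity) {
            old.add(o); // old enough, or To is full: promote to the old generation
        } else {
            to.add(o);
        }
    }

    public static void main(String[] args) {
        YoungGenDemo heap = new YoungGenDemo(2);
        heap.allocate("a");
        heap.allocate("b").live = false;
        heap.allocate("c");
        heap.minorCollect();
        System.out.println("survivors: " + heap.from.size() + ", promoted: " + heap.old.size());
    }
}
```

Because Eden and From are simply cleared after the copy, the cost of a young collection depends on the number of live objects, not on the amount of garbage.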
Serial collection--Old generation
The serial collector collects the old generation with a mark-sweep-compact algorithm. In the mark phase, the collector identifies which objects are still live. In the sweep phase, garbage is identified and removed. The collector then performs sliding compaction, sliding the live objects toward the beginning of the old generation space (and likewise for the permanent generation), which leaves a single large contiguous free area at the end. After the collection, the old generation can therefore again use bump-the-pointer for fast allocation.
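Sliding compaction can be illustrated on a toy heap. In the sketch below, which is an illustration rather than the collector's actual implementation, the heap is a `String[]` with one object per slot, and the result of the mark phase is supplied by the caller as a `boolean[]` instead of being computed by tracing references. The method name `sweepAndCompact` is hypothetical.

```java
import java.util.Arrays;

class SlidingCompactionDemo {
    /**
     * Slides every marked (live) object toward index 0, preserving order,
     * clears the vacated slots, and returns the index of the first free slot.
     */
    static int sweepAndCompact(String[] heap, boolean[] marked) {
        int free = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && marked[i]) {
                heap[free] = heap[i]; // slide the live object down
                free++;
            }
        }
        for (int i = free; i < heap.length; i++) {
            heap[i] = null;           // everything past the live objects is now free
        }
        return free;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "B", "C", "D", "E", null};
        boolean[] marked = {true, false, true, false, true, false};
        int top = sweepAndCompact(heap, marked);
        System.out.println(Arrays.toString(heap) + ", next free slot: " + top);
    }
}
```

The returned index plays the role of the allocation pointer: after compaction, new objects can again be placed with bump-the-pointer starting there.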
When to use the serial collector
The serial collector is generally the collector of choice for applications that run on client-style machines and have no strict requirement for low pause times. On today's hardware, the serial collector can manage a 64 MB heap with worst-case full collection pauses of less than half a second.
(J2SE 5.0 was released in September 2004, so this figure reflects hardware performance at that time.)
Selecting the serial collector
In J2SE 5.0, the serial collector is the default on machines that are not server-class. On other machines, you can explicitly request the serial collector with the following command line option:
-XX:+UseSerialGC
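To confirm that the option took effect, a program can inspect its own JVM. The sketch below uses the standard `java.lang.management` API; the class name `CollectorCheck` and its helpers are made up for illustration. On HotSpot the serial collector's two beans are typically reported as `Copy` and `MarkSweepCompact`, but treat those names as an assumption that may differ between JVM vendors and versions.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class CollectorCheck {
    /** True if the serial collector was explicitly requested on the command line. */
    static boolean serialRequested(List<String> jvmArgs) {
        return jvmArgs.contains("-XX:+UseSerialGC");
    }

    /** Names of the collectors this JVM is actually running with. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("serial collector requested: " + serialRequested(jvmArgs));
        System.out.println("active collectors: " + collectorNames());
    }
}
```

Run it as `java -XX:+UseSerialGC CollectorCheck` and compare the output with a run that omits the option.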
Java Memory Management (i)