I. JVM memory areas
For C and C++ programmers, memory management means holding "ownership" of every object and taking responsibility for each object's life from beginning to end.
For Java programmers, the virtual machine's automatic memory management means there is no need to write a matching delete/free for every new operation, and memory leaks and out-of-memory problems are less likely.
1. Memory areas
According to the Java Virtual Machine Specification (Java SE 7), the memory managed by the Java virtual machine includes the following runtime data areas.
Thread-private memory areas:
- Program counter: can be seen as a line-number indicator for the bytecode that the current thread is executing. The bytecode interpreter selects the next bytecode instruction to execute by changing the value of this counter.
- Virtual machine stack: each Java method execution creates a stack frame that stores the local variable table, operand stack, dynamic linking information, method exit, and so on. Each method call, from invocation to completion, corresponds to one stack frame being pushed onto and popped from the virtual machine stack.
- Local variable table (part of each stack frame): holds the primitive types known at compile time (boolean, byte, char, int, and so on), object references (reference type), and the returnAddress type (which points to the address of a bytecode instruction).
- Native method stack: holds the stack frames for native method execution.
Memory areas shared by all threads:
- Heap: stores object instances and arrays.
- Method area: stores class information loaded by the virtual machine, constants (final), static variables, code compiled by the just-in-time (JIT) compiler, and similar data.
- Runtime constant pool (part of the method area): holds the literals and symbolic references generated at compile time; new constants may also be added to the pool at run time.
2. Object creation
At the language level, creating an object (clone and deserialization aside) usually just takes the new keyword. What does object creation look like inside the virtual machine?
When the virtual machine encounters a new instruction:
1. Class loading: check that the instruction's operand resolves to a symbolic reference to a class in the constant pool, and that this class has already been loaded, resolved, and initialized. If not, perform class loading first.
2. Allocate memory for the new object: the amount of memory the object needs is fully determined once the class is loaded, so allocation amounts to carving a block of that size out of the Java heap.
  - Ways to allocate free memory:
    - Bump the pointer: if heap memory is regular, with used memory on one side, free memory on the other, and a pointer in between as the dividing mark, allocation just moves the pointer toward the free side by the size of the object.
    - Free list: if heap memory is not regular, the virtual machine must maintain a list of available memory blocks; allocation finds a large enough block in the list and updates the list.
  - Keeping object creation thread-safe under concurrency: for example, while memory is being allocated to object A and the pointer has not yet been updated, object B might be allocated using the original pointer. There are two solutions:
    - CAS with retry on failure
    - Thread-local allocation buffer (TLAB): partition allocation by thread, that is, each thread pre-allocates a small chunk of memory in the Java heap and allocates from it.
3. Zero-initialize the allocated memory: this ensures the object's instance fields can be used in Java code without explicit assignment, and reads return the zero value of each field's type (for example, an int field defaults to 0).
4. Set the object header: record the object's class metadata information, hash code, GC generational age, and so on.
5. Execute the <init> method: initialize the object as the programmer intends.
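The bump-the-pointer scheme combined with CAS retry from step 2 can be sketched as a toy allocator. This is only an illustration under stated assumptions: the class BumpAllocator, its methods, and the use of plain byte offsets instead of real addresses are all hypothetical and are not HotSpot code.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Toy model of bump-the-pointer allocation over a fixed-size region (hypothetical, not HotSpot code). */
final class BumpAllocator {
    private final long capacity;
    private final AtomicLong top = new AtomicLong(0); // offset of the first free byte

    BumpAllocator(long capacity) {
        this.capacity = capacity;
    }

    /** Returns the start offset of the new block, or -1 if the region is full. */
    long allocate(long size) {
        while (true) {
            long oldTop = top.get();
            long newTop = oldTop + size;
            if (newTop > capacity) {
                return -1; // a real VM would trigger a GC at this point
            }
            // CAS: only one thread can move the pointer away from oldTop; losers loop and retry.
            if (top.compareAndSet(oldTop, newTop)) {
                return oldTop;
            }
        }
    }

    /** Number of bytes handed out so far. */
    long used() {
        return top.get();
    }
}
```

A TLAB avoids most of these CAS operations: a thread would claim a whole chunk with one successful CAS and then bump a private, unsynchronized pointer inside that chunk.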
3. Memory layout of objects
In the HotSpot virtual machine, the layout of an object in memory is divided into 3 regions:
- Object header (Header):
  - Mark Word: stores the object's own runtime data, for example hash code, GC generational age, lock state flags, the lock held by a thread, biased thread ID, and so on. For space efficiency, the Mark Word is designed as a non-fixed data structure that reuses its own space depending on the state of the object.
  - Class pointer: a pointer to the object's class metadata, which the virtual machine uses to determine which class the object is an instance of.
  - If the object is a Java array, the object header also needs a field that records the array length.
- Instance data (Instance Data): the useful information the object actually stores, that is, the contents of the fields of various types defined in the program code.
- Alignment padding (Padding): acts as a placeholder. HotSpot requires that an object's start address be a multiple of 8 bytes, so the object's size must also be a multiple of 8 bytes. When the header plus instance data does not add up to an aligned size, padding fills the gap.
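The alignment rule can be checked with a little arithmetic. The sketch below is illustrative only: ObjectLayout and its methods are hypothetical names, and the 12-byte header used in the usage note (an 8-byte Mark Word plus a 4-byte compressed class pointer) is an assumption about a typical 64-bit configuration, not something stated in the text above.

```java
/** Hypothetical helper that computes alignment padding for an 8-byte-aligned object. */
final class ObjectLayout {
    static final int ALIGNMENT = 8; // object alignment in bytes, as described above

    /** Rounds size up to the next multiple of ALIGNMENT. */
    static long alignUp(long size) {
        return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    /** Padding bytes needed after a header of headerBytes and fields of fieldBytes. */
    static long padding(long headerBytes, long fieldBytes) {
        long raw = headerBytes + fieldBytes;
        return alignUp(raw) - raw;
    }
}
```

Under the assumed 12-byte header, an object with a single byte field occupies 13 raw bytes and is padded by 3 bytes to 16, while an object with a single int field occupies exactly 16 bytes and needs no padding.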
4. Out-of-memory errors
Apart from the program counter, every other runtime area of the JVM can throw an OutOfMemoryError.
1. Heap overflow, OutOfMemoryError: Java heap space
Cause: the Java heap stores object instances. If objects keep being created and there is a reachable path from GC Roots to them, so that the GC cannot reclaim them, an OOM is raised once the objects reach the maximum capacity of the heap.
Workaround:
- The parameter -XX:+HeapDumpOnOutOfMemoryError makes the virtual machine write a heap dump snapshot when an out-of-memory error occurs.
- Analyze the heap dump with a memory analysis tool (e.g. Eclipse Memory Analyzer) to determine whether the problem is a memory leak or a memory overflow.
- If it is a memory leak: use the tool to examine the type information of the leaked objects and their reference chains to GC Roots, work out why the collector cannot reclaim them, and locate the leaking code.
- If it is a memory overflow: check the heap parameters -Xms and -Xmx to see whether they can be increased, and check the code for objects whose life cycle or holding time is too long, in order to reduce memory consumption at run time.
2. Stack overflow, StackOverflowError
Cause:
- StackOverflowError: the stack depth requested by the thread exceeds the maximum the virtual machine allows.
- OutOfMemoryError: the virtual machine cannot obtain enough memory when expanding the stack.
Workaround:
- Check the code for infinite recursion. Configure -Xss to increase the stack size of each thread; this reduces the number of threads that can be created, so it is a trade-off.
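The StackOverflowError case is easy to reproduce with unbounded recursion. The following is a minimal sketch; StackDepthProbe is a hypothetical name, and the depth it reports depends on the JVM, the frame size, and the -Xss setting, so no particular number should be expected.

```java
/** Hypothetical probe that recurses until the thread's stack is exhausted. */
final class StackDepthProbe {
    private static int depth;

    private static void recurse() {
        depth++;
        recurse(); // no termination condition: infinite recursion
    }

    /** Returns the recursion depth reached before StackOverflowError was thrown. */
    static int measure() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The thread requested a stack deeper than the virtual machine allows.
        }
        return depth;
    }
}
```

Running the same probe with a larger -Xss value would normally report a greater depth, which is a quick way to see the effect of the option.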
II. Garbage collection strategy
1. Determining whether objects are alive
Almost all object instances are stored in the heap. Before reclaiming the heap, the collector must first determine which objects are still "alive" and which are "dead".
1. Reference counting
Add a reference counter to each object. Whenever something references the object, the counter is incremented by 1; when a reference becomes invalid, the counter is decremented by 1. When the counter is 0, the object is no longer referenced.
Mainstream Java virtual machines do not use this algorithm, mainly because it has difficulty handling circular references between objects.
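The circular reference problem can be shown with hand-maintained counters. This is a toy model under stated assumptions: RcObject and RefCountDemo are hypothetical classes, and the counters are updated manually to mimic what a reference-counting runtime would do.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy reference-counted object; the counts are maintained by hand for illustration. */
final class RcObject {
    final String name;
    int refCount;
    final List<RcObject> fields = new ArrayList<>();

    RcObject(String name) {
        this.name = name;
    }

    /** Stores a reference to target in this object and increments target's counter. */
    void addField(RcObject target) {
        fields.add(target);
        target.refCount++;
    }
}

final class RefCountDemo {
    /** Builds a cycle a <-> b, drops the external references, and returns the remaining counts. */
    static int[] cycleCounts() {
        RcObject a = new RcObject("a");
        RcObject b = new RcObject("b");
        a.refCount++; // external reference held by a local variable
        b.refCount++;
        a.addField(b);
        b.addField(a);
        a.refCount--; // the local variables go out of scope
        b.refCount--;
        return new int[] { a.refCount, b.refCount };
    }
}
```

Both counters stay at 1 even though nothing outside the cycle refers to the two objects, so a pure reference-counting collector would never reclaim them.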
2. Reachability analysis
A set of objects called "GC Roots" serves as the starting points. The search proceeds downward from these nodes, and the paths traversed are called reference chains. When no reference chain connects an object to any GC Root (that is, the object is unreachable from the GC Roots), the object is unusable and is judged to be reclaimable.
In Java, the objects that can serve as GC Roots include:
- Objects referenced from the virtual machine stack (the local variable table in a stack frame)
- Objects referenced by static class fields in the method area
- Objects referenced by constants (final) in the method area
- Objects referenced by JNI in the native method stack
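Reachability analysis is essentially a graph traversal from the roots. The sketch below models objects as string names and references as adjacency lists; the Reachability class and the sample graph in demo() are hypothetical and exist only for illustration.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of reachability analysis over a graph of named objects. */
final class Reachability {
    /** Returns every node reachable from roots by following the edges in graph. */
    static Set<String> reachable(Map<String, List<String>> graph, List<String> roots) {
        Set<String> marked = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String node = pending.pop();
            if (marked.add(node)) { // first visit: follow its outgoing references
                pending.addAll(graph.getOrDefault(node, Collections.<String>emptyList()));
            }
        }
        return marked;
    }

    /** Sample graph: root -> a -> b, plus an unreachable cycle c <-> d. */
    static Set<String> demo() {
        Map<String, List<String>> graph = new HashMap<>();
        graph.put("root", Arrays.asList("a"));
        graph.put("a", Arrays.asList("b"));
        graph.put("c", Arrays.asList("d"));
        graph.put("d", Arrays.asList("c"));
        return reachable(graph, Arrays.asList("root"));
    }
}
```

In the sample graph, c and d reference each other but are never marked, so they are reclaimable. This is the case that reference counting cannot handle.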
2. Garbage collection areas
Garbage collection mainly reclaims heap memory. In the heap, a single GC of the young generation in a typical application reclaims 70%~95% of the space, while GC of the permanent generation is far less efficient.
The "cost-effectiveness" of garbage collection in the method area is generally low. It mainly reclaims two things: obsolete constants and useless classes.
- Obsolete constants: if a string in the constant pool, for example "ABC", is not referenced by any String object, the "ABC" constant is removed from the constant pool. Symbolic references to other classes, methods, and fields in the constant pool are handled similarly.
- Useless classes: a class must meet all 3 of the following conditions to be considered a "useless class":
  - All instances of the class in the heap have been reclaimed
  - The ClassLoader that loaded the class has been reclaimed
  - The java.lang.Class object of the class is not referenced anywhere, so the methods of the class cannot be accessed through reflection anywhere
Off-heap memory is memory allocated outside the Java virtual machine heap, including memory the JVM itself allocates while running, memory allocated in JNI, and memory allocated through java.nio.DirectByteBuffer; it is managed directly by the operating system. This reduces the impact of GC on the application to some extent. However, because the JVM does not manage this memory directly, there is a risk of OOM; the JVM startup parameter -XX:MaxDirectMemorySize limits the amount of off-heap memory that can be requested.
A DirectByteBuffer object represents a block of off-heap memory. It holds a Cleaner object, which records the start address, size, and capacity of the off-heap block. The Cleaner frees the off-heap memory only after a GC finds the DirectByteBuffer unreachable, which may not happen until the next Full GC.
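As a small illustration of off-heap allocation, the sketch below uses the standard java.nio.ByteBuffer API. The class DirectMemoryDemo and its method names are hypothetical; only ByteBuffer.allocateDirect, allocate, isDirect, putInt, and getInt come from the JDK.

```java
import java.nio.ByteBuffer;

/** Hypothetical demo of allocating and using off-heap memory through a direct buffer. */
final class DirectMemoryDemo {
    /** Allocates 1 KB off-heap, writes a value at offset 0, and reads it back. */
    static int roundTrip(int value) {
        // Direct buffers live outside the Java heap and count against -XX:MaxDirectMemorySize.
        ByteBuffer buf = ByteBuffer.allocateDirect(1024);
        buf.putInt(0, value);
        return buf.getInt(0);
    }

    /** True if allocateDirect yields a direct buffer and allocate yields a heap buffer. */
    static boolean directVersusHeap() {
        return ByteBuffer.allocateDirect(16).isDirect()
                && !ByteBuffer.allocate(16).isDirect();
    }
}
```

Because the off-heap block is released only after the small on-heap DirectByteBuffer object is collected, code that allocates many large direct buffers can exhaust off-heap memory well before the heap itself looks full.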
3. Garbage collection algorithms
① Mark-sweep algorithm
The mark-sweep algorithm has two phases, "mark" and "sweep". The mark phase marks all live objects. The sweep phase reclaims the objects that were not marked (dead objects).
It has two main shortcomings:
- Efficiency: neither marking nor sweeping is very efficient.
- Space: mark-sweep leaves behind a lot of non-contiguous memory fragments. With too much fragmentation, a later allocation of a large object may fail to find enough contiguous memory and trigger another GC early.
② Copying algorithm
The copying algorithm divides the available memory into two halves of equal size and uses only one at a time. When that half is exhausted, the surviving objects are copied to the other half and the used half is then cleared in one pass.
Allocation then never has to deal with fragmentation, but the price is that usable memory shrinks to half of the original. When the object survival rate is high, more copying is required and efficiency drops.
Today's commercial virtual machines use the copying algorithm to collect the young generation. IBM research shows that 98% of young-generation objects die young, so the space does not need to be split 1:1. HotSpot divides the young generation into 1 large Eden space and 2 small Survivor spaces with a default size ratio of 8:1:1, and uses Eden plus one Survivor at a time. During collection, the surviving objects in Eden and the in-use Survivor are copied to the other Survivor, and then Eden and the used Survivor are cleared. So 90% of the young generation is usable at any time and only 10% is "wasted". When more than 10% of the objects survive a collection, that is, the remaining Survivor space is not large enough, the old generation provides an allocation guarantee and these objects go directly into the old generation.
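The 8:1:1 arithmetic above can be worked through in a few lines. This is a sketch only: YoungGenLayout is a hypothetical helper, and the ratio parameter mirrors the meaning of HotSpot's -XX:SurvivorRatio option (Eden size divided by one Survivor size).

```java
/** Hypothetical helper that splits a young generation into Eden and two Survivor spaces. */
final class YoungGenLayout {
    /**
     * Returns {eden, survivor, usable} for a young generation of youngBytes
     * split as eden : survivor : survivor = survivorRatio : 1 : 1.
     */
    static long[] split(long youngBytes, int survivorRatio) {
        long survivor = youngBytes / (survivorRatio + 2);
        long eden = youngBytes - 2 * survivor;
        long usable = eden + survivor; // Eden plus the one Survivor currently in use
        return new long[] { eden, survivor, usable };
    }
}
```

For a 100 MB young generation with a ratio of 8, this gives 80 MB of Eden, 10 MB per Survivor, and 90 MB usable, which matches the 90%/10% split described above.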
③ Mark-compact algorithm
The mark-compact algorithm has the same "mark" phase as mark-sweep, but the follow-up step moves all surviving objects toward one end and then clears the memory beyond the end boundary.
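The compaction step can be sketched on a toy heap modelled as an array of slots. Everything here is an assumption made for illustration: MarkCompactDemo is a hypothetical class, each slot holds one object name (null means free), and the live array stands in for the result of the mark phase.

```java
/** Hypothetical model of the compact phase on a heap represented as an array of slots. */
final class MarkCompactDemo {
    /**
     * Slides every marked object toward index 0, preserving order, then clears
     * the slots beyond the boundary. Returns the boundary (index of the first free slot).
     */
    static int compact(String[] heap, boolean[] live) {
        int free = 0; // next destination slot at the low end
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && live[i]) {
                heap[free] = heap[i];
                live[free] = true;
                free++;
            }
        }
        for (int i = free; i < heap.length; i++) {
            heap[i] = null; // memory beyond the end boundary is cleared
            live[i] = false;
        }
        return free;
    }
}
```

After compaction, all free space forms one contiguous block starting at the returned boundary, which is why mark-compact avoids the fragmentation left by mark-sweep.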
④ Generational collection
Memory is divided into several blocks according to object lifetimes. Generally the heap is split into a "young generation" and an "old generation", and each generation uses the collection algorithm that suits its characteristics.
In the young generation, only a few objects survive each GC, so the copying algorithm is used; collection costs only the copying of the few survivors.
In the old generation, object survival rates are high and there is no extra space to act as an allocation guarantee, so the "mark-sweep" or "mark-compact" algorithm must be used.
4. Garbage collector comparison
Garbage collection performance metrics:
- Throughput: time running user code / (time running user code + garbage collection time). The higher the throughput, the more efficiently the CPU is used and the better the algorithm.
- Maximum pause time: the longest time application threads are paused because of a GC. The shorter the pause, the better the algorithm.
High throughput and low pause times cannot both be maximized. For maximum throughput, the JVM must run the GC as rarely as possible, only when it is forced to, for example when the young or old generation is full. But deferring GC means each run has more to do, because more objects have accumulated on the heap waiting to be reclaimed, so each GC takes longer and both the average and the maximum pause time rise.
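The trade-off between the two metrics can be made concrete with a small calculation. The sketch below is illustrative: GcMetrics is a hypothetical class, and the pause figures in the usage note are made-up numbers, not measurements of any real collector.

```java
/** Hypothetical helper that computes the two metrics from a list of GC pauses. */
final class GcMetrics {
    /** Throughput = user time / (user time + GC time), where wall time = user time + GC time. */
    static double throughput(long wallMillis, long[] pausesMillis) {
        long gc = 0;
        for (long p : pausesMillis) {
            gc += p;
        }
        return (double) (wallMillis - gc) / wallMillis;
    }

    /** Longest single pause in milliseconds. */
    static long maxPause(long[] pausesMillis) {
        long max = 0;
        for (long p : pausesMillis) {
            max = Math.max(max, p);
        }
        return max;
    }
}
```

Over a 100-second run, two 500 ms pauses give a throughput of 0.99 with a maximum pause of 500 ms, while one hundred 30 ms pauses give a throughput of 0.97 with a maximum pause of 30 ms: the first profile wins on throughput, the second on pause time.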
A garbage collector is a concrete implementation of a memory reclamation algorithm. This article mainly covers the garbage collectors in the HotSpot virtual machine.
If two collectors are linked, they can be used together. The functions of each garbage collector are compared in the following table:
Which garbage collector should I use?
1. Client programs: generally use -XX:+UseSerialGC (Serial + Serial Old). In particular, when several JVMs run on a single machine, each JVM can also use this combination.
2. Throughput-first server programs (compute-intensive): -XX:+UseParallelGC or -XX:+UseParallelOldGC
3. Response-time-first server programs: -XX:+UseConcMarkSweepGC
4. Server programs that prioritize response time but also care about throughput: -XX:+UseG1GC
5. CMS garbage collector
The CMS (Concurrent Mark Sweep) collector aims for the shortest collection pause time. B/S systems and Internet site back ends care about response speed and want the shortest possible pauses to give users a better experience; CMS fits such applications well.
① Execution process
CMS is implemented on the "mark-sweep" algorithm and collects the old generation. As described under the copying algorithm above, 98% of young-generation objects die young, so the young generation is divided into 1 Eden and 2 Survivor spaces (default size ratio 8:1:1); each time, Eden and one Survivor are used, live objects are copied to the other Survivor, and the previously used Eden and Survivor are cleared.
Its operation is divided into the following stages:
1. Initial mark (requires Stop The World): mark the objects that GC Roots can reach directly; this is fast.
2. Concurrent mark (runs alongside user threads): the GC Roots tracing process. For example, if A is an object directly associated with a GC Root and A references B, then A is marked in the initial-mark phase and B is marked in this phase.
3. Concurrent preclean (runs alongside user threads): concurrently looks for objects that, during the concurrent-mark phase, were promoted from the young generation to the old generation, were allocated directly in the old generation (large objects), or were updated by user threads, in order to reduce the work of the "remark" phase.
4. Remark (requires Stop The World): corrects the marking for objects that changed or were newly created by user threads while the concurrent mark and concurrent preclean phases ran alongside them. The user threads are paused, and the GC threads rescan those objects, redo the reachability analysis, and mark the live objects. The STW time of this phase is longer than the "initial mark" phase but far shorter than the "concurrent mark" phase.
5. Concurrent sweep (runs alongside user threads): remove unreachable objects and reclaim the heap space they occupy. New garbage produced during this phase cannot be cleared in this GC and must wait for the next one; it is called floating garbage.
6. Concurrent reset: reset the internal data structures of CMS in preparation for the next CMS cycle.
How are objects modified during the concurrent-mark phase handled?
In the CMS process above, how does step 3, concurrent preclean, deal with objects modified during the concurrent-mark phase? Suppose the references at initial mark are A→B→C, and during concurrent mark the user program changes them to A→C, with B no longer referencing C. If C were left unmarked by the "concurrent mark" phase, it would be reclaimed, which must not happen.
This is solved by tri-color marking, which divides the objects in the GC into three states:
- Black: the object and its children have all been scanned; it is not garbage and will not be collected.
- Gray: the object itself has been scanned, but its children have not yet been scanned.
- White: the object has not been scanned. After all objects are scanned, those still white are unreachable and are treated as garbage.
At initial mark, A is marked gray (it is being scanned); then the references of A are scanned, B is marked gray, and A, now fully scanned, turns black.
During concurrent mark, if the user thread changes a reference of A to point to C (A→C), the write barrier notices that a reference to a white object (C) is being stored into a field of a black object (A) and marks the white object C gray. This is incremental update.
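The A→B→C scenario can be replayed with a toy tri-color marker. This is a sketch under stated assumptions: TriColorDemo, Node, and writeRef are hypothetical names, marking is single-threaded, and the "concurrent" mutation is simulated by changing references between marking steps.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical single-threaded model of tri-color marking with an incremental-update barrier. */
final class TriColorDemo {
    enum Color { WHITE, GRAY, BLACK }

    static final class Node {
        final String name;
        Color color = Color.WHITE;
        final List<Node> refs = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    private final Deque<Node> grayStack = new ArrayDeque<>();

    /** Turns a white object gray and queues it for scanning. */
    private void shade(Node n) {
        if (n.color == Color.WHITE) {
            n.color = Color.GRAY;
            grayStack.push(n);
        }
    }

    void markRoot(Node root) {
        shade(root);
    }

    /** Scans one gray object: shades its children, then turns it black. Returns false when none remain. */
    boolean step() {
        Node n = grayStack.poll();
        if (n == null) {
            return false;
        }
        for (Node child : n.refs) {
            shade(child);
        }
        n.color = Color.BLACK;
        return true;
    }

    /** Mutator store guarded by the write barrier: a black holder gaining a white target re-grays the target. */
    void writeRef(Node holder, Node target) {
        holder.refs.add(target);
        if (holder.color == Color.BLACK) {
            shade(target);
        }
    }

    /** Replays A -> B -> C changing to A -> C during marking; returns the final colors of A, B, C. */
    static Color[] demo(boolean useBarrier) {
        Node a = new Node("A");
        Node b = new Node("B");
        Node c = new Node("C");
        a.refs.add(b);
        b.refs.add(c);
        TriColorDemo gc = new TriColorDemo();
        gc.markRoot(a);
        gc.step(); // A is scanned: A turns black, B turns gray, C is still white
        b.refs.remove(c); // mutator: B no longer references C
        if (useBarrier) {
            gc.writeRef(a, c); // mutator: A -> C, seen by the barrier
        } else {
            a.refs.add(c); // same store without a barrier
        }
        while (gc.step()) {
            // keep scanning until no gray objects remain
        }
        return new Color[] { a.color, b.color, c.color };
    }
}
```

With the barrier, C ends up black and survives; without it, C stays white even though A references it, which is exactly the lost-object case described above.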
How are references from old-generation objects to young-generation objects handled during GC?
The JVM uses card marking so that a Minor GC does not need to scan the entire old generation. The old generation is divided into fixed-size chunks, each corresponding to a card; if an old-generation object changes or comes to point to a young-generation object, its card is marked dirty. During a young GC, the dirty cards are added to the GC Roots scanning range, which avoids scanning the whole old generation.
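Card marking can be modelled with a boolean array indexed by address. The sketch is illustrative: CardTable and its methods are hypothetical, addresses are plain offsets into the old generation, and the 512-byte card size is an assumption chosen to match the commonly cited HotSpot default.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a card table covering the old generation. */
final class CardTable {
    static final int CARD_SHIFT = 9; // assumed card size: 2^9 = 512 bytes
    private final boolean[] dirty;

    CardTable(long oldGenBytes) {
        long cardBytes = 1L << CARD_SHIFT;
        dirty = new boolean[(int) ((oldGenBytes + cardBytes - 1) >>> CARD_SHIFT)];
    }

    /** Write barrier hook: called when the reference field at oldGenOffset is updated. */
    void markDirty(long oldGenOffset) {
        dirty[(int) (oldGenOffset >>> CARD_SHIFT)] = true;
    }

    /** Indices of the cards that a young GC must scan as additional roots. */
    List<Integer> dirtyCards() {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < dirty.length; i++) {
            if (dirty[i]) {
                result.add(i);
            }
        }
        return result;
    }

    /** Resets all cards once the young GC has processed them. */
    void clear() {
        java.util.Arrays.fill(dirty, false);
    }
}
```

For a 4096-byte old generation this yields 8 cards; reference writes at offsets 0 and 1030 dirty cards 0 and 2, so a young GC scans 1024 bytes instead of the whole old generation.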
② Advantages and disadvantages of CMS
Advantages:
1. Concurrent collection and low pauses; some official Sun documents also call it the Concurrent Low Pause Collector.
Disadvantages:
1. Very sensitive to CPU resources: the concurrent phases do not pause user threads, but they occupy some threads (CPU resources), which slows the application down and reduces total throughput.
2. Produces memory fragmentation: because it is based on "mark-sweep", a large amount of fragmentation remains after collection, which makes allocating large objects troublesome.
3. Needs a larger heap: the application keeps running during CMS marking and keeps allocating heap space, so some space must be reserved to ensure the running program still has memory available before CMS finishes reclaiming the heap.
③ CMS tuning strategy
-XX:CMSInitiatingOccupancyFraction=70: this value is the old-generation occupancy threshold. The default is 92; when set to 70, a CMS collection is triggered once the old generation is 70% full. Setting it too high risks longer STW pauses; setting it too low hurts throughput.
-XX:+UseCMSCompactAtFullCollection, -XX:CMSFullGCsBeforeCompaction=4: after 4 Full GCs without compaction, perform one with memory compaction. Used to eliminate memory fragmentation.
-XX:ConcGCThreads=n: the number of threads used in the concurrent CMS phases. By default CMS uses (number of CPUs + 3) / 4 collection threads. More threads speed up concurrent collection but add synchronization overhead.
Young generation: if young GC is too frequent, enlarge the young generation; if young GC takes too long, shrink it. Try to reclaim most of the garbage during young GC.
6. G1
G1 (Garbage-First) is a garbage collector for server applications. It is designed to minimize STW pause times, and it usually treats limiting GC pause times as more important than maximizing throughput. In Java 9, G1 became the default garbage collector.
① Execution process
The memory layout of G1 differs greatly from that of other collectors: it divides the entire Java heap into n equal-sized regions, each occupying a contiguous range of virtual memory addresses. The young and old generations are no longer physically separated; each is a collection of regions.
The region size can be specified with -XX:G1HeapRegionSize; if it is not set, the heap is divided evenly into about 2048 regions by default. G1 is still a generational collector; besides Eden, Survivor, and Old regions there are Humongous regions, which store huge objects (an object that occupies more than 50% of a region's capacity), reducing the negative impact of short-lived huge objects on the collector.
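The region sizing and the humongous threshold can be approximated as follows. This is a simplified sketch, not the actual HotSpot heuristic: G1Regions is a hypothetical class, and the power-of-two rounding and the 1 MB to 32 MB bounds are assumptions about the default ergonomics.

```java
/** Hypothetical approximation of how a G1 region size and the humongous threshold are derived. */
final class G1Regions {
    static final long MIN_REGION = 1L << 20;  // assumed lower bound: 1 MB
    static final long MAX_REGION = 32L << 20; // assumed upper bound: 32 MB
    static final int TARGET_REGIONS = 2048;   // target region count from the text above

    /** Picks a power-of-two region size aiming at about 2048 regions. */
    static long regionSize(long heapBytes) {
        long size = Math.max(heapBytes / TARGET_REGIONS, MIN_REGION);
        size = Long.highestOneBit(size); // round down to a power of two
        return Math.min(size, MAX_REGION);
    }

    /** An object is humongous when it needs more than half of a region. */
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes > regionBytes / 2;
    }
}
```

Under these assumptions a 4 GB heap gets 2 MB regions, so any object larger than 1 MB would be allocated in Humongous regions.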
G1's operation is divided into the following steps:
1. Global concurrent marking: concurrent marking based on SATB (snapshot-at-the-beginning). Once marking completes, G1 knows which regions are mostly garbage, and it collects first the regions that yield the most free space, which is why it is named Garbage-First.
1.1 Initial mark (STW, brief): mark the objects that GC Roots can reach directly and push their fields onto the scan stack
1.2 Concurrent mark (runs concurrently with user threads, takes longer): GC threads repeatedly pop references from the scan stack and mark recursively until the scan stack is empty
1.3 Final mark (STW): mark the references that changed while the user program ran during "concurrent mark" (the objects recorded by the write barrier)
1.4 Cleanup (STW): count the surviving marked objects in each region; a region with no surviving objects is reclaimed as a whole into the list of allocatable regions
2. Copy surviving objects: copy the surviving objects in some of the regions to empty regions, then reclaim the space of the original regions.
G1 collections come in two types, young GC and mixed GC. A young GC selects all young-generation regions and controls its cost by controlling the number of young regions. A mixed GC selects all young-generation regions plus some old-generation regions that the global concurrent marking statistics show to have a high yield, choosing as many high-yield old regions as fit within the user-specified pause time. G1 does not rely on Full GC to collect the old generation; that is done by mixed GC, although a Full GC can still occur as a fallback when mixed GCs cannot free memory fast enough.
G1 uses a Remembered Set to track references into a region, which avoids full-heap scans. Each region has its own Remembered Set. When the program writes a reference-type field, a write barrier intercepts the write and checks whether the referenced object is in a different region; if it is, the reference is recorded, via the card table, in the Remembered Set of the region that the referenced object belongs to. During GC, adding the Remembered Set to the GC Roots enumeration range guarantees that nothing is missed without scanning the whole heap.
② Comparison of G1 and CMS
The design goal of G1 is to replace the CMS collector. G1 has some clear advantages over CMS:
1. Simple, practical performance tuning: -XX:+UseG1GC -Xmx32g; these two parameters are enough for a production environment, enabling G1 with a maximum heap of 32 GB. -XX:MaxGCPauseMillis=n sets the desired GC pause time. Because the physical division between the young and old generations is gone, there is no need to size each generation.
2. Predictable STW pause time: besides pursuing low pauses, G1 builds a predictable pause-time model that lets users state that GC pauses should not exceed n milliseconds. It does this by tracking the value of the garbage in each region (the space that would be reclaimed and the time reclaiming would take), maintaining a priority list in the background, and, within the allowed collection time, collecting the highest-value regions first, so that G1 gets the highest possible collection efficiency in a limited amount of time.
3. Space compaction: between two regions G1 is based on the "copying" algorithm, so it does not produce memory fragmentation while running, and allocating a large object will not trigger an early Full GC for lack of contiguous space.
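The value-based selection behind the predictable pause time in point 2 can be sketched as a greedy choice. This is a simplification under stated assumptions: CollectionSetChooser and Region are hypothetical classes, the predicted times are given as inputs rather than derived from statistics, and real G1 uses a more elaborate prediction model.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical greedy model of picking regions by value within a pause budget. */
final class CollectionSetChooser {
    static final class Region {
        final String name;
        final long reclaimableMb;     // space expected to be freed
        final double predictedMillis; // time expected to collect the region

        Region(String name, long reclaimableMb, double predictedMillis) {
            this.name = name;
            this.reclaimableMb = reclaimableMb;
            this.predictedMillis = predictedMillis;
        }

        /** Value = space reclaimed per millisecond of pause. */
        double value() {
            return reclaimableMb / predictedMillis;
        }
    }

    /** Picks regions in descending value order while they still fit into the pause budget. */
    static List<String> choose(List<Region> candidates, double pauseBudgetMillis) {
        List<Region> sorted = new ArrayList<>(candidates);
        sorted.sort((x, y) -> Double.compare(y.value(), x.value()));
        List<String> chosen = new ArrayList<>();
        double spent = 0;
        for (Region r : sorted) {
            if (spent + r.predictedMillis <= pauseBudgetMillis) {
                chosen.add(r.name);
                spent += r.predictedMillis;
            }
        }
        return chosen;
    }

    /** Four sample regions with made-up figures and a 12 ms budget. */
    static List<String> demo() {
        List<Region> regions = Arrays.asList(
                new Region("R1", 8, 10.0),
                new Region("R2", 6, 4.0),
                new Region("R3", 2, 8.0),
                new Region("R4", 5, 5.0));
        return choose(regions, 12.0);
    }
}
```

In the sample, R2 and R4 are chosen: together they free 11 MB in a predicted 9 ms, whereas the larger but slower R1 would exceed the 12 ms budget.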
③ G1 tuning strategy
-XX:MaxGCPauseMillis=n: sets the maximum GC pause time target. The target is not guaranteed; the JVM makes a best effort to meet it. Setting it too small (<50 ms) is not recommended.
-XX:InitiatingHeapOccupancyPercent=n: the threshold that starts the G1 concurrent marking cycle leading to mixed GC, expressed as the percentage of the total G1 heap that is occupied.
Avoid explicitly setting the young generation size with options such as -Xmn or -XX:NewRatio, because a fixed young generation size overrides the pause time target.