In-depth understanding of Java virtual machine (JVM) performance: garbage collection

Source: Internet
Author: User

First, the basic garbage collection algorithms


1. By basic collection strategy

1) Reference counting

Each time a reference to an object is added, the object's count is incremented; each time a reference is removed, the count is decremented. During garbage collection, only objects whose count is 0 are collected. The fatal weakness of this algorithm is that it cannot handle circular references.
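The circular-reference problem can be illustrated with a toy model. The sketch below is not how any JVM works internally; the `Node` class, its `refCount` field and the `setNext` helper are hypothetical names made up for the illustration. It simulates the counts by hand and shows that two objects that only reference each other never reach a count of 0.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of reference counting (hypothetical classes, not a JVM API).
class RefCountDemo {
    static final class Node {
        final String name;
        int refCount = 0; // number of references currently pointing at this node
        Node next;        // a single outgoing reference
        Node(String name) { this.name = name; }
    }

    // Point from.next at 'to', adjusting counts the way a counting collector would.
    static void setNext(Node from, Node to) {
        if (from.next != null) from.next.refCount--;
        from.next = to;
        if (to != null) to.refCount++;
    }

    // Names of the objects a counting collector could reclaim (count == 0).
    static List<String> collectable(List<Node> heap) {
        List<String> result = new ArrayList<>();
        for (Node n : heap) {
            if (n.refCount == 0) result.add(n.name);
        }
        return result;
    }

    static List<String> demo() {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c");
        List<Node> heap = Arrays.asList(a, b, c);
        a.refCount++;   // simulated root reference to a
        c.refCount++;   // simulated root reference to c
        setNext(a, b);  // a -> b
        setNext(b, a);  // b -> a, which closes the cycle
        a.refCount--;   // the program drops its reference to a
        c.refCount--;   // the program drops its reference to c
        return collectable(heap);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [c]
    }
}
```

After the root references are dropped, `a` and `b` are unreachable, but each still has a count of 1 because of the other, so only `c` is reported as collectable.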

2) Mark-sweep

The algorithm runs in two phases. The first phase starts from the reference root nodes and marks every object that is referenced. The second phase traverses the entire heap and clears the unmarked objects.

The drawbacks are that the algorithm must pause the entire application and that it produces memory fragmentation.
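The two phases can be sketched over a toy object graph. This is a simplified illustration under assumed names (`Obj`, `mark`, `sweep` are hypothetical), with the heap modelled as a plain list; it does not model memory layout, so fragmentation is not shown.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Toy mark-sweep over a list-based "heap" (hypothetical classes, not a JVM API).
class MarkSweepDemo {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        boolean marked;
        Obj(String name) { this.name = name; }
    }

    // Phase 1: mark everything reachable from the roots (iterative depth-first walk).
    static void mark(List<Obj> roots) {
        ArrayDeque<Obj> stack = new ArrayDeque<>(roots);
        while (!stack.isEmpty()) {
            Obj o = stack.pop();
            if (o.marked) continue;
            o.marked = true;
            for (Obj r : o.refs) stack.push(r);
        }
    }

    // Phase 2: walk the whole heap, remove unmarked objects, reset marks on survivors.
    static List<String> sweep(List<Obj> heap) {
        List<String> freed = new ArrayList<>();
        Iterator<Obj> it = heap.iterator();
        while (it.hasNext()) {
            Obj o = it.next();
            if (o.marked) {
                o.marked = false;
            } else {
                freed.add(o.name);
                it.remove();
            }
        }
        return freed;
    }

    static List<String> demo() {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c"), d = new Obj("d");
        a.refs.add(b);  // a -> b, reachable from the root
        c.refs.add(d);  // c <-> d form an unreachable cycle
        d.refs.add(c);
        List<Obj> heap = new ArrayList<>(Arrays.asList(a, b, c, d));
        mark(Arrays.asList(a)); // a is the only root
        return sweep(heap);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [c, d]
    }
}
```

Unlike reference counting, the unreachable cycle between `c` and `d` is reclaimed, because neither object is ever marked.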

3) Copying

The memory space is divided into two equal areas, and only one of them is used at a time. During garbage collection, the collector traverses the area currently in use and copies the live objects to the other area.

The algorithm processes only the objects that are in use, so the copying cost is relatively small, and memory is compacted as the objects are copied, so there is no fragmentation.

The drawback is that it needs twice the memory space.
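A minimal sketch of the idea follows, assuming a toy heap in which the to-space is a list and a forwarding table maps each old object to its copy. The `Obj`, `collect` and `copy` names are hypothetical. The to-space list doubles as the work queue, in the style of Cheney's algorithm.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Toy copying collector (hypothetical classes, not a JVM API).
class CopyingDemo {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        Obj(String name) { this.name = name; }
    }

    // Copies every object reachable from the roots into a fresh to-space.
    // Whatever stays behind in from-space is garbage and is dropped wholesale.
    static List<Obj> collect(List<Obj> roots) {
        List<Obj> toSpace = new ArrayList<>();
        Map<Obj, Obj> forwarded = new IdentityHashMap<>();
        for (int i = 0; i < roots.size(); i++) {
            roots.set(i, copy(roots.get(i), toSpace, forwarded));
        }
        // Scan the copies in order and redirect their references to copies as well.
        for (int scan = 0; scan < toSpace.size(); scan++) {
            Obj o = toSpace.get(scan);
            for (int i = 0; i < o.refs.size(); i++) {
                o.refs.set(i, copy(o.refs.get(i), toSpace, forwarded));
            }
        }
        return toSpace;
    }

    // Returns the to-space copy of 'old', creating it on first visit.
    static Obj copy(Obj old, List<Obj> toSpace, Map<Obj, Obj> forwarded) {
        Obj existing = forwarded.get(old);
        if (existing != null) return existing;
        Obj fresh = new Obj(old.name);
        fresh.refs.addAll(old.refs); // still point at old objects until scanned
        forwarded.put(old, fresh);
        toSpace.add(fresh);
        return fresh;
    }

    static List<String> demo() {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c");
        a.refs.add(b);
        b.refs.add(a);      // a live cycle is copied once, not endlessly
        c.refs.add(a);      // c is unreachable, so it is never copied
        List<Obj> roots = new ArrayList<>(Arrays.asList(a));
        List<String> names = new ArrayList<>();
        for (Obj o : collect(roots)) names.add(o.name);
        return names;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [a, b]
    }
}
```

The work done is proportional to the number of live objects; the garbage object `c` is never visited at all.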

4) Mark-compact

This algorithm combines the advantages of the mark-sweep and copying algorithms. It also runs in two phases. The first phase starts from the root nodes and marks every referenced object. The second phase traverses the entire heap, clears the unmarked objects, and compacts the surviving objects into one end of the heap, laying them out in order.

This algorithm avoids the fragmentation problem of mark-sweep and also avoids the space problem of the copying algorithm.
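The compaction step can be sketched with the heap modelled as an array of slots, where `null` means a free slot. This is an illustrative model only; `Obj`, `address` and the helper methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy mark-compact over an array of slots (hypothetical classes, not a JVM API).
class MarkCompactDemo {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        boolean marked;
        int address; // index of the slot the object currently occupies
        Obj(String name) { this.name = name; }
    }

    // Phase 1: mark everything reachable from a root (recursive, fine for a small demo).
    static void mark(Obj o) {
        if (o == null || o.marked) return;
        o.marked = true;
        for (Obj r : o.refs) mark(r);
    }

    // Phase 2: slide marked objects toward slot 0, keeping their order.
    // Returns the number of live objects, which is also the first free slot.
    static int compact(Obj[] heap) {
        int free = 0;
        for (int i = 0; i < heap.length; i++) {
            Obj o = heap[i];
            if (o != null && o.marked) {
                o.marked = false;
                o.address = free;
                heap[free++] = o;
            }
        }
        Arrays.fill(heap, free, heap.length, null); // everything after is free space
        return free;
    }

    static String layout(Obj[] heap) {
        StringBuilder sb = new StringBuilder();
        for (Obj o : heap) sb.append(o == null ? "-" : o.name);
        return sb.toString();
    }

    static String demo() {
        Obj a = new Obj("a"), x = new Obj("x"), b = new Obj("b"),
            y = new Obj("y"), c = new Obj("c");
        a.refs.add(c);
        Obj[] heap = { a, x, b, y, c }; // x and y are garbage between live objects
        mark(a);                        // roots: a and b
        mark(b);
        compact(heap);
        return layout(heap);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints abc--
    }
}
```

The free slots end up contiguous at the end of the array, and no second half of the heap has to be reserved.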

2. By partitioning approach

1) Incremental Collection

A real-time garbage collection algorithm, that is, garbage is collected while the application is running.

2) Generational collection

This garbage collection algorithm is derived from an analysis of object life cycles. Objects are divided into young, old, and permanent generations, and objects with different life cycles are collected with different algorithms (chosen from the methods above). Current garbage collectors (since J2SE 1.2) use this algorithm.

3. By system threads

1) Serial Collection

Garbage collection is done by a single thread. Because no coordination between threads is needed, it is easy to implement and efficient. Its limitation is just as obvious: it cannot take advantage of multiple processors, so this kind of collection suits single-processor machines. It can also be used on multiprocessor machines when the amount of data is small (around 100 MB).

2) Parallel Collection

Garbage collection is done by multiple threads, so it is fast and efficient. In theory, the more CPUs there are, the more the advantage of a parallel collector shows.

3) Concurrent Collection

Serial collection and parallel collection both have to stop the entire running environment while they work, with only the garbage collector running, so the system pauses noticeably during garbage collection, and the larger the heap, the longer the pause. Concurrent collection, by contrast, does most of its work while the application threads keep running, which shortens these pauses.

Second, how to identify garbage

Apart from reference counting, the algorithms described above all decide what is garbage in the same way: starting from the root nodes, they traverse the object references to find the surviving objects. So where does this traversal start?

Recall the difference between the heap and the stack: the stack is where the program actually executes, so to find out which objects are in use, you need to start from the Java stack. A stack corresponds to one thread, so if there are multiple threads, the stacks of all of those threads must be examined.

Besides the stacks, the registers in use at run time also hold program data. Taking the references in the stacks and registers as starting points, we can find objects in the heap, and from those objects find references to further objects in the heap. The expansion continues until it ends at null references or primitive values, forming an object tree whose root is the object referenced from the Java stack. If the stack holds several references, several object trees are formed. The objects in these trees are needed by the running system and cannot be collected. All remaining objects are unreachable and can be reclaimed as garbage.

Therefore, garbage collection starts from a set of root objects (Java stacks, static variables, registers, and so on). The simplest Java stack is that of the main function that the Java program executes. This way of collecting is the mark-sweep approach described above.

Third, the generational garbage collection strategy

  1. Why generational?

Different objects have different life cycles, so objects with different life cycles can be collected in different ways to make collection more efficient.

A running Java program creates a large number of objects. Some of them carry business information, such as session objects for HTTP requests, threads, and socket connections. These are tied directly to the business logic and therefore live longer. Other objects are mainly temporary variables created while the program runs, and their life cycles are short. String objects are an example: because String is immutable, the system creates a large number of them, and some can be collected after being used only once.

If object lifetimes are not distinguished, every garbage collection has to process the whole heap, which takes a relatively long time. Each collection must also traverse all surviving objects, yet for long-lived objects this traversal is wasted effort: they may have been traversed many times and are still alive. Generational garbage collection therefore applies divide and conquer: it splits the heap into generations, places objects with different life cycles in different generations, and uses the collection method best suited to each generation.

  2. How are the generations divided?

1) Young generation

All newly created objects are first placed in the young generation. The goal of the young generation is to collect short-lived objects as quickly as possible. The young generation is divided into three areas: one Eden area and two survivor areas (in general). Most objects are created in Eden. When Eden is full, the surviving objects are copied to one of the two survivor areas. When that survivor area is full, its surviving objects are copied to the other survivor area. When that one is full as well, the objects that came from the first survivor area and are still alive are copied to the old generation (tenured). Note that the two survivor areas are symmetrical and have no fixed order, so the same area may hold both objects copied from Eden and objects copied from the other survivor area, while only objects that came over from the first survivor area are copied to the old generation. One survivor area is always empty. Depending on the program's needs, the survivor areas can also be configured as more than two, which lengthens the time objects spend in the young generation and reduces the chance of their being moved to the old generation.

2) Old generation

Objects that are still alive after N garbage collections in the young generation are moved to the old generation. The old generation can therefore be regarded as holding objects with longer life cycles.
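The movement of objects between Eden, the survivor areas and the old generation can be sketched as a small simulation. It is a deliberate simplification under stated assumptions: every name here (`YoungGenDemo`, `tenureThreshold`, the `live` flag) is hypothetical, liveness is set by hand instead of being traced from roots, and promotion is driven purely by an age counter (the N above) instead of by a survivor area filling up.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a young generation with an age-based promotion rule.
class YoungGenDemo {
    static final class Obj {
        final String name;
        int age = 0;          // number of young collections survived
        boolean live = true;  // set by hand in this demo instead of tracing roots
        Obj(String name) { this.name = name; }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>(); // survivor area currently holding objects
    List<Obj> to = new ArrayList<>();   // survivor area kept empty between collections
    final List<Obj> old = new ArrayList<>();
    final int tenureThreshold;          // the "N" collections before promotion

    YoungGenDemo(int tenureThreshold) { this.tenureThreshold = tenureThreshold; }

    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);
        return o;
    }

    // One young-generation collection: copy survivors out of Eden and 'from'.
    void minorGc() {
        for (Obj o : eden) evacuate(o);
        for (Obj o : from) evacuate(o);
        eden.clear();
        from.clear();
        List<Obj> tmp = from; // swap roles so that 'to' is empty again
        from = to;
        to = tmp;
    }

    private void evacuate(Obj o) {
        if (!o.live) return;  // dead objects are simply not copied
        o.age++;
        if (o.age >= tenureThreshold) old.add(o); else to.add(o);
    }

    static List<String> names(List<Obj> objs) {
        List<String> out = new ArrayList<>();
        for (Obj o : objs) out.add(o.name);
        return out;
    }

    static YoungGenDemo demo() {
        YoungGenDemo gen = new YoungGenDemo(2);
        gen.allocate("session");             // long-lived object
        gen.allocate("tmp1").live = false;   // short-lived object
        gen.minorGc();                       // session survives once, moves to a survivor area
        gen.allocate("tmp2").live = false;
        gen.minorGc();                       // session survives twice, is promoted
        return gen;
    }

    public static void main(String[] args) {
        YoungGenDemo gen = demo();
        System.out.println("old generation: " + names(gen.old)); // prints old generation: [session]
    }
}
```

The temporary objects never leave Eden, while the long-lived object is copied once between survivor areas and then promoted, so later young collections no longer touch it.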

3) Permanent generation

Used to store static data, such as Java classes and methods. The permanent generation has no significant effect on garbage collection, but some applications, for example those using Hibernate, dynamically generate or load classes, and a relatively large permanent generation is then needed to hold the classes added at run time. The size of the permanent generation is set with -XX:MaxPermSize=<n>.

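4) The resulting layout of the heap is as follows: the young generation (Eden plus two survivor areas), the old generation (tenured), and the permanent generation (perm).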
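On a running JVM, the memory areas can be listed through the standard `java.lang.management` API. The sketch below only prints what the JVM reports; the pool names depend on the JVM version and the collector in use, so they will not necessarily match the names used in this article (for example, Java 8 and later report Metaspace instead of a permanent generation). The class name `MemoryPoolsDemo` is made up for the example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

// Lists the memory pools of the JVM this program runs on.
class MemoryPoolsDemo {
    static List<String> poolDescriptions() {
        List<String> out = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // getType() is HEAP or NON_HEAP; getName() is chosen by the JVM.
            out.add(pool.getType() + ": " + pool.getName());
        }
        return out;
    }

    public static void main(String[] args) {
        for (String line : poolDescriptions()) {
            System.out.println(line);
        }
    }
}
```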

5) When is garbage collection triggered?

Because objects are handled by generation, the areas collected and the timing of collection differ as well. There are two types of GC: scavenge GC and full GC.

    Scavenge GC

In general, a scavenge GC is triggered when a new object is created and the request for space in Eden fails. It collects the Eden area, clears the dead objects, moves the surviving objects to a survivor area, and then tidies up the two survivor areas. This kind of GC works on the young generation only and does not affect the old generation. Because most objects start out in Eden, and Eden is not very large, GC of Eden is frequent. A fast and efficient algorithm is therefore generally needed so that Eden is freed up as soon as possible.

Full GC
Collects the entire heap, including young, tenured and perm. A full GC is slower than a scavenge GC because the entire heap has to be collected, so the number of full GCs should be kept as low as possible. A large part of JVM tuning consists of tuning full GC. A full GC may be caused by the following:

    • The old generation (tenured) is full
    • The permanent generation (perm) is full
    • System.gc() is called explicitly
    • The allocation policy for the heap's areas changed dynamically after the last GC
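The effect of an explicit `System.gc()` call can be observed with the standard `GarbageCollectorMXBean` interface. This is a small sketch, not a tuning tool: `System.gc()` is only a request, so the JVM may ignore it and the counts may stay the same, and the collector names printed depend on the JVM and the collector selected. The class name `GcCountDemo` is made up for the example.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Prints collection counts before and after an explicit System.gc() request.
class GcCountDemo {
    // Sum of collections reported by all collectors of this JVM.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 when the count is undefined
            if (count > 0) total += count;
        }
        return total;
    }

    public static void main(String[] args) {
        long before = totalCollections();
        System.gc(); // a request only; the JVM decides whether to collect
        long after = totalCollections();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount() + " collections");
        }
        System.out.println("collections before=" + before + ", after=" + after);
    }
}
```

Because an explicit call can force a collection of the whole heap, calling `System.gc()` from application code works against the goal of keeping full GCs rare.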
