An elementary introduction to some basic concepts of the garbage collector (GC) in Java

I. Basic collection algorithms

1. Reference counting
An older collection algorithm. Each object carries a count that is incremented whenever a reference to it is added and decremented whenever a reference is removed. During garbage collection, only objects whose count is 0 are reclaimed. The fatal weakness of this algorithm is that it cannot handle circular references.
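The circular-reference problem can be seen in a minimal sketch. The classes `RcObject` and `RefCountDemo` below are hypothetical and exist only for illustration; the JVM does not expose reference counts, so the counts are maintained by hand.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy object with a manually maintained reference count.
class RcObject {
    final String name;
    int refCount = 0;
    final List<RcObject> fields = new ArrayList<>();

    RcObject(String name) { this.name = name; }

    // Adding a reference increments the target's count.
    void addRef(RcObject target) {
        fields.add(target);
        target.refCount++;
    }

    // Removing a reference decrements the target's count.
    void removeRef(RcObject target) {
        if (fields.remove(target)) {
            target.refCount--;
        }
    }
}

class RefCountDemo {
    // Builds root -> a -> b -> a, then drops the only external reference to a.
    // Returns the counts {a, b} a reference-counting collector would see.
    static int[] cycleCounts() {
        RcObject root = new RcObject("root");
        RcObject a = new RcObject("a");
        RcObject b = new RcObject("b");
        root.addRef(a);
        a.addRef(b);
        b.addRef(a);        // circular reference
        root.removeRef(a);  // a and b are now unreachable from root
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = cycleCounts();
        // Neither count reaches 0, so neither object would be reclaimed.
        System.out.println("a=" + counts[0] + ", b=" + counts[1]); // prints "a=1, b=1"
    }
}
```

Both objects are unreachable from the root, yet each still holds a count of 1 because of the other, which is why a pure reference-counting collector leaks them.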
2. Mark-sweep
This algorithm runs in two phases. The first phase starts from the root references and marks every object that is reachable; the second phase traverses the entire heap and clears the unmarked objects. The algorithm has to suspend the whole application while it runs, and it leaves the memory fragmented.
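The two phases can be sketched on a toy object graph. This is only an illustration under simplifying assumptions: the "heap" is a plain list, and the names `MsNode` and `MarkSweepDemo` are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Hypothetical heap object: a mark bit plus outgoing references.
class MsNode {
    final String name;
    boolean marked = false;
    final List<MsNode> refs = new ArrayList<>();
    MsNode(String name) { this.name = name; }
}

class MarkSweepDemo {
    // Phase 1: mark everything reachable from the roots.
    static void mark(List<MsNode> roots) {
        Deque<MsNode> stack = new ArrayDeque<>(roots);
        while (!stack.isEmpty()) {
            MsNode n = stack.pop();
            if (n.marked) continue;
            n.marked = true;
            for (MsNode r : n.refs) stack.push(r);
        }
    }

    // Phase 2: walk the whole heap, drop unmarked objects, reset marks.
    static List<String> sweep(List<MsNode> heap) {
        List<String> freed = new ArrayList<>();
        Iterator<MsNode> it = heap.iterator();
        while (it.hasNext()) {
            MsNode n = it.next();
            if (n.marked) {
                n.marked = false;   // reset for the next cycle
            } else {
                freed.add(n.name);
                it.remove();
            }
        }
        return freed;
    }

    static List<String> run() {
        MsNode a = new MsNode("a"), b = new MsNode("b");
        MsNode c = new MsNode("c"), d = new MsNode("d");
        a.refs.add(b);
        c.refs.add(d);
        d.refs.add(c);              // unreachable cycle
        List<MsNode> heap = new ArrayList<>(Arrays.asList(a, b, c, d));
        mark(Collections.singletonList(a));
        return sweep(heap);
    }

    public static void main(String[] args) {
        System.out.println("freed: " + run()); // prints "freed: [c, d]"
    }
}
```

Unlike reference counting, the unreachable cycle `c`/`d` is reclaimed, because liveness is decided by reachability from the roots rather than by counts.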
3. Copying
This algorithm divides the memory space into two equal regions and uses only one of them at a time. During garbage collection it traverses the region currently in use and copies the objects still in use to the other region. Because it only processes live objects each time, the copying cost is small, and since the objects are laid out afresh when they are copied, there is no fragmentation problem. The disadvantage is equally obvious: it requires twice the memory space.
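A minimal sketch of the copying step, assuming liveness has already been determined and is passed in as a boolean array (a real collector would trace it from the roots). `CopyingDemo` is a hypothetical name, and array slots stand in for memory.

```java
import java.util.Arrays;

class CopyingDemo {
    // Copies live entries of fromSpace into a fresh toSpace of equal size,
    // packing them at the front. null marks a free slot.
    static String[] collect(String[] fromSpace, boolean[] live) {
        String[] toSpace = new String[fromSpace.length];
        int next = 0;
        for (int i = 0; i < fromSpace.length; i++) {
            if (fromSpace[i] != null && live[i]) {
                toSpace[next++] = fromSpace[i];
            }
        }
        return toSpace;
    }

    static String[] run() {
        String[] from = { "a", "b", "c", "d", null, null };
        boolean[] live = { true, false, true, false, false, false };
        return collect(from, live);
    }

    public static void main(String[] args) {
        // prints "[a, c, null, null, null, null]"
        System.out.println(Arrays.toString(run()));
    }
}
```

Only the two live objects are touched, and they end up contiguous in the new region; the price is the second, equally sized array.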
4. Mark-compact
This algorithm combines the advantages of the mark-sweep and copying algorithms. It also has two phases: the first phase marks all reachable objects starting from the root nodes; the second phase traverses the entire heap, clears the unmarked objects, and "compacts" the surviving objects toward one end of the heap, laying them out one after another. It avoids the fragmentation problem of mark-sweep as well as the space overhead of copying.
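The compaction phase can be sketched as an in-place slide. As a simplification, the mark bits are supplied directly instead of being computed by a mark phase, and `CompactDemo` is a hypothetical name.

```java
import java.util.Arrays;

class CompactDemo {
    // Slides marked entries toward index 0 in place and clears the rest.
    // Returns the index of the first free slot after compaction.
    static int compact(String[] heap, boolean[] marked) {
        int free = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && marked[i]) {
                heap[free++] = heap[i];   // free <= i, so nothing live is overwritten
            }
        }
        for (int i = free; i < heap.length; i++) {
            heap[i] = null;
        }
        return free;
    }

    public static void main(String[] args) {
        String[] heap = { "a", "b", "c", "d", "e" };
        boolean[] marked = { true, false, false, true, true };
        int free = compact(heap, marked);
        // prints "[a, d, e, null, null] free=3"
        System.out.println(Arrays.toString(heap) + " free=" + free);
    }
}
```

Compared with the copying approach, no second region is needed, but every surviving object may have to be moved within the same space.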
5. Incremental collection
A real-time style of garbage collection: collection is performed while the application is running. It is not clear why the collectors in JDK 5.0 do not use this algorithm.
6. Generational collection

A garbage collection approach derived from analysis of object lifetimes. Objects are divided into the young generation, the old generation, and the permanent generation, and objects with different lifetimes are collected with different algorithms (chosen from those above). Garbage collectors today (since J2SE 1.2) use this approach.

1. Young (young generation)
The young generation is divided into three areas: one Eden area and two survivor areas. Most objects are created in Eden. When Eden fills up, the objects still alive are copied to one of the two survivor areas. When that survivor area fills up, its live objects are copied to the other survivor area; when that one fills up as well, objects that came from the first survivor area and are still alive are copied to the old generation (tenured). Note that the two survivor areas are symmetric, with no fixed order between them, so a single survivor area may hold both objects copied from Eden and objects copied from the other survivor area, while objects copied to the old generation come only from the first survivor area. In addition, one of the survivor areas is always empty.
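The movement between Eden, the two survivor areas, and the old generation can be sketched with a small simulation. Everything here is a simplifying assumption for illustration: the class name `YoungGenDemo`, the survivor capacity of two objects, the explicit list of live names, and the choice to promote anything that does not fit into the target survivor area. It is not how HotSpot decides promotion in detail.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class YoungGenDemo {
    static final int SURVIVOR_CAPACITY = 2; // hypothetical size, in objects

    final List<String> eden = new ArrayList<>();
    List<String> from = new ArrayList<>();  // survivor area currently holding objects
    List<String> to = new ArrayList<>();    // survivor area kept empty between GCs
    final List<String> tenured = new ArrayList<>();

    // One young-generation collection; 'live' lists the names still referenced.
    void minorGc(List<String> live) {
        // Objects surviving in Eden go to the empty survivor area first.
        for (String o : eden) {
            if (live.contains(o)) place(o);
        }
        // Survivors of the previous round follow; overflow is promoted.
        for (String o : from) {
            if (live.contains(o)) place(o);
        }
        eden.clear();
        from.clear();
        // Swap roles so that one survivor area is always empty.
        List<String> tmp = from;
        from = to;
        to = tmp;
    }

    private void place(String o) {
        if (to.size() < SURVIVOR_CAPACITY) to.add(o); else tenured.add(o);
    }

    static YoungGenDemo run() {
        YoungGenDemo heap = new YoungGenDemo();
        heap.eden.addAll(Arrays.asList("a", "b", "c"));
        heap.minorGc(Arrays.asList("a", "b"));        // c is garbage
        heap.eden.addAll(Arrays.asList("d", "e"));
        heap.minorGc(Arrays.asList("a", "b", "d"));   // e is garbage
        return heap;
    }

    public static void main(String[] args) {
        YoungGenDemo heap = run();
        // prints "survivor=[d, a] tenured=[b] eden=[]"
        System.out.println("survivor=" + heap.from + " tenured=" + heap.tenured
                + " eden=" + heap.eden);
    }
}
```

After the second collection the object `b`, which came from the previously used survivor area and no longer fits, ends up in the old generation, while Eden and one survivor area are empty.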
2. Tenured (old generation)
The old generation holds objects that have survived in the young generation. In general, it stores objects with longer lifetimes.
3. Perm (permanent generation)
Used to store static data such as Java classes and methods. The permanent generation has no significant effect on garbage collection, but some applications, such as Hibernate, dynamically generate or load classes, in which case a larger permanent generation is needed to hold the classes added at run time. Its size is set with -XX:MaxPermSize=<n>.


II. GC types
There are two types of GC: scavenge GC and full GC.

1. Scavenge GC
In general, a scavenge GC is triggered when a new object is created and the allocation request in Eden fails. It collects the Eden area, clears the dead objects, and moves the surviving objects to a survivor area. It then tidies up the two survivor areas.
2. Full GC
Collects the whole heap, including young, tenured, and perm. A full GC is slower than a scavenge GC, so the number of full GCs should be kept as low as possible. A full GC can occur for the following reasons:
* The tenured generation is full
* The perm generation is full
* System.gc() is called explicitly
* The allocation policy for the heap's generations changed dynamically after the last GC


Generational garbage collection process demo (figures 1 to 4 not included)

II. Garbage collectors


There are three main types of collectors: serial collectors, parallel collectors, and concurrent collectors.

1. Serial collector

Uses a single thread for all garbage collection. Because no coordination between threads is needed, it is relatively efficient. However, it cannot take advantage of multiple processors, so this collector suits single-processor machines. It can also be used on multiprocessor machines when the amount of data is small (around 100 MB). Enable it with -XX:+UseSerialGC.

2. Parallel collector
1. Collecting the young generation in parallel reduces garbage collection time. It is typically used on multithreaded, multiprocessor machines. Enable it with -XX:+UseParallelGC. The parallel collector was introduced in J2SE 5.0 update 6 and enhanced in Java SE 6.0 so that the old generation can also be collected in parallel. If the old generation is not collected in parallel, it is collected by a single thread, which limits scalability. Enable parallel old-generation collection with -XX:+UseParallelOldGC.
2. Use -XX:ParallelGCThreads=<n> to set the number of threads used for parallel garbage collection. This value can be set equal to the number of processors in the machine.
3. This collector can be configured as follows:
* Maximum garbage collection pause: the longest pause allowed for garbage collection, specified with -XX:MaxGCPauseMillis=<n>, where <n> is in milliseconds. If this value is specified, the heap size and related garbage collection parameters are adjusted to try to reach it. Setting this value may reduce the throughput of the application.
* Throughput: throughput is expressed as the ratio of garbage collection time to non-garbage-collection time, set with -XX:GCTimeRatio=<n>; the fraction of time spent in garbage collection is 1/(1+n). For example, -XX:GCTimeRatio=19 means that 5% of the time is used for garbage collection. The default is 99, that is, 1% of the time is used for garbage collection.
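The arithmetic behind -XX:GCTimeRatio can be checked with a few lines. The class and method names below are hypothetical; only the formula 1/(1+n) comes from the text above.

```java
class GcTimeRatioDemo {
    // Fraction of total time allowed for GC under -XX:GCTimeRatio=n: 1 / (1 + n).
    static double gcFraction(int n) {
        return 1.0 / (1 + n);
    }

    public static void main(String[] args) {
        System.out.printf("n=19 -> %.0f%%%n", gcFraction(19) * 100); // 5%
        System.out.printf("n=99 -> %.0f%%%n", gcFraction(99) * 100); // 1%
    }
}
```

A larger n therefore asks the collector to spend a smaller share of time in garbage collection.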

3. Concurrent collector
Does most of its work concurrently (without stopping the application), so garbage collection pauses the application only briefly. This collector suits medium and large applications with strict response-time requirements. Enable it with -XX:+UseConcMarkSweepGC.
1. The concurrent collector mainly reduces pause times for the old generation. It uses a separate garbage collection thread to trace reachable objects without stopping the application. In each old-generation collection cycle, the concurrent collector briefly pauses the whole application at the start of the collection and pauses it again partway through. The second pause is slightly longer than the first, and during it multiple threads perform garbage collection at the same time.
2. The concurrent collector trades processor time for shorter pauses. On a system with N processors, the concurrent part of the collection uses K of the N available processors, typically with 1<=K<=N/4.
3. On a host with only one processor, the concurrent collector can be set to incremental mode to obtain shorter pause times as well.
4. Floating garbage: because collection happens while the application is running, some garbage is produced during the collection and is still present when it finishes. This "floating garbage" can only be reclaimed in the next collection cycle. The concurrent collector therefore generally needs about 20% of the space reserved for floating garbage.
5. Concurrent mode failure: the concurrent collector collects while the application runs, so the heap must have enough free space for the program to use during the collection; otherwise the heap fills up before the collection finishes. In this case a "concurrent mode failure" occurs, and the whole application is paused for garbage collection.
6. Starting the concurrent collector: because concurrent collection happens while the application runs, enough memory must remain available to the program until the collection completes; otherwise a "concurrent mode failure" occurs. Use -XX:CMSInitiatingOccupancyFraction=<n> to specify the old-generation occupancy percentage at which a concurrent collection starts.

4. Summary
* Serial collector:
--Applicable: small amounts of data (around 100 MB), a single processor, and no response-time requirements.
--Disadvantage: suitable only for small applications.
* Parallel collector:
--Applicable: medium and large applications with high throughput requirements, multiple CPUs, and no response-time requirements. Examples: background processing, scientific computing.
--Disadvantage: application response times may be longer.
* Concurrent collector:
--Applicable: medium and large applications with strict response-time requirements and multiple CPUs. Examples: web servers/application servers, telecom switching, integrated development environments.
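When trying out the collectors summarized above, it can help to confirm which ones the running JVM actually uses. A minimal sketch using the standard `java.lang.management` API follows; the class name `ActiveCollectors` is hypothetical, and the collector names that get printed are specific to the JVM implementation and version.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class ActiveCollectors {
    // Names of the collectors registered in the running JVM.
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and accumulated time of collections performed so far.
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", time(ms)=" + gc.getCollectionTime());
        }
    }
}
```

Running the same program with different collector flags and comparing the reported names, counts, and times gives a quick sanity check that a flag took effect.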

III. Basic principles of GC
GC (garbage collection) refers to the garbage collector in Java/.NET.
Java developed out of C++. It dropped some of C++'s cumbersome, error-prone features and introduced new concepts, including that of a counter; one of these is the GC mechanism (C# borrowed it from Java).
Memory management is where programmers most easily run into trouble: forgetting to free memory, or freeing it incorrectly, can make a program or system unstable or even crash it. Java's GC automatically monitors whether an object has gone out of scope and reclaims its memory, and the Java language provides no explicit operation for freeing allocated memory. Java's memory management is therefore really the management of objects, including their allocation and release.
Programmers allocate objects with the new keyword. To release an object, they simply assign null to all references to it so that the program can no longer access it; we call such an object "unreachable." The GC is responsible for reclaiming the memory of all unreachable objects.
From the GC's point of view, once a programmer creates an object, the GC begins tracking its address, size, and usage. Typically, the GC records and manages all objects in the heap as a directed graph, and uses it to determine which objects are reachable and which are unreachable. When the GC determines that some objects are unreachable, it is responsible for reclaiming their memory. However, to ensure that GC can be implemented on different platforms, the Java specification does not strictly define many aspects of GC behavior. For example, there are no clear rules on which collection algorithm to use or when to collect. As a result, different JVM implementations often use different algorithms, which brings a good deal of uncertainty to Java development. This article examines several questions related to how GC works in order to reduce the negative impact of this uncertainty on Java programs.

IV. GC generations
In the JVM memory model, the heap is divided into two large blocks: the young generation and the old generation.

1) The young generation contains a space called Eden, which mainly holds newly created objects, and two survivor spaces (from and to). The survivor spaces are always the same size and hold the objects that survive each garbage collection.
2) The old generation mainly holds the application's long-lived objects.
3) In the young generation, garbage collection generally uses the copying algorithm, which is fast. At each GC, surviving objects are first copied from Eden to one of the survivor spaces; when the survivor space is full, the remaining live objects are copied directly to the old generation. Eden is therefore empty after every GC.
4) In the old generation, garbage collection generally uses the mark-compact algorithm, which is slower but needs less memory.
5) Garbage collection is multi-level: level 0 is a full collection, which also reclaims garbage in the old generation; level 1 and above are partial collections, which reclaim only garbage in the young generation. An out-of-memory error usually occurs when, even after the old or perm generation has been collected, there is still no memory available for a new Java object.

V. Incremental GC
In the JVM, GC is typically implemented by one thread or a group of threads. Like the user program, it occupies heap space, and it consumes CPU while it runs.
The application stops running while the GC is running. If the GC runs for a long time, the user can feel the Java program pause; on the other hand, if the GC runs too briefly, the reclamation rate may be too low, meaning that many objects that should be reclaimed are not and still occupy a lot of memory. When designing a GC, therefore, pause time must be weighed against reclamation rate. A good GC implementation lets users choose the settings they need: some memory-limited devices are very sensitive to memory usage and want the GC to reclaim memory precisely, without caring much about program speed, whereas some real-time online games cannot tolerate long interruptions.
Incremental GC reduces the impact of GC on the user program by using a collection algorithm that splits one long interruption into many short ones. Although an incremental GC may be less efficient overall than a regular GC, it reduces the program's maximum pause time.
The HotSpot JVM provided with the Sun JDK supports incremental GC. By default the HotSpot JVM does not use it; to enable incremental GC, add the -Xincgc option when running the Java program.
The HotSpot JVM's incremental GC is implemented with the train GC algorithm. The basic idea is to group (layer) all objects in the heap according to how they are created and used, placing frequently used and related objects in the same group, and to keep adjusting the groups as the program runs. When the GC runs, it always reclaims the oldest (least recently accessed) objects, and if an entire group is reclaimable, the GC reclaims the whole group. In this way each GC run reclaims only a certain proportion of the unreachable objects, which keeps the program running smoothly.
