Java memory model and GC principle

Source: Internet
Author: User

Go to: http://blog.csdn.net/ithomer/article/details/6252552

An excellent Java programmer must understand the Java memory model, how GC works, how to optimize GC performance, and how to interact with the GC within its limits. Some applications, such as embedded systems and real-time systems, have high performance requirements, and only by improving the efficiency of memory management across the board can the performance of the whole application be improved.

This article discusses the JVM memory model, the GC's working principle, and several key issues in GC to improve the performance of Java programs from a GC perspective.

One, Java memory model

According to the official definition, the Java virtual machine has a heap: the runtime data area from which memory for all class instances and arrays is allocated.

The JVM primarily manages two types of memory: heap and non-heap. Heap memory is created when the Java virtual machine starts, and non-heap memory is the memory outside the JVM heap.

In short, the heap is the memory that Java code can reach and is left to the developer, while non-heap memory is what the JVM keeps for itself. It contains the method area, the memory required for the JVM's internal processing or optimization (such as the code cache used by the JIT compiler, the just-in-time compiler), and per-class structures such as the runtime constant pool, field and method data, and the code of methods and constructors.

The JVM memory contains the following sections:

    • Heap memory: stores Java objects
    • Non-heap memory: stores class-loading information and other metadata
    • Miscellaneous (other): stores the JVM's own code, etc.

When the JVM starts, it reserves a fixed amount of memory for the heap. The JVM does not necessarily use all of it, but the reserved memory is guaranteed not to be used by other processes; its size is specified by the -Xmx parameter. Another portion is allocated to the JVM at startup as the initial heap; its size is specified by the -Xms parameter.

For Eclipse, these parameters are configured in the file: Eclipse/eclipse.ini

By default, when free heap memory falls below 40%, the JVM grows the heap until it reaches the -Xmx maximum. This threshold can be specified with -XX:MinHeapFreeRatio.

By default, when free heap memory exceeds 70%, the JVM shrinks the heap until it reaches the -Xms minimum. This threshold can be specified with -XX:MaxHeapFreeRatio.

The non-heap (permanent generation) size can be set with -XX:MaxPermSize; see my Baidu blog for details.
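The heap and non-heap figures described above can be read from inside a running program. The following is a minimal sketch using the standard java.lang.management API and Runtime; the class name HeapReport is made up for illustration, and the printed values depend on the JVM and its startup parameters.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper class, for illustration only.
class HeapReport {

    // Heap usage as reported by the JVM (init, used, committed, max).
    static MemoryUsage heap() {
        MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
        return bean.getHeapMemoryUsage();
    }

    // Non-heap usage (method area, code cache, and so on).
    static MemoryUsage nonHeap() {
        return ManagementFactory.getMemoryMXBean().getNonHeapMemoryUsage();
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() is the most the heap may grow to (related to -Xmx).
        System.out.println("max heap (bytes):        " + rt.maxMemory());
        // totalMemory() is the heap currently committed to the JVM.
        System.out.println("committed heap (bytes):  " + rt.totalMemory());
        // freeMemory() is the unused part of the committed heap.
        System.out.println("free in committed heap:  " + rt.freeMemory());
        System.out.println("heap:     " + heap());
        System.out.println("non-heap: " + nonHeap());
    }
}
```

Running it with different -Xms and -Xmx values should show the committed and max figures change accordingly.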

Two, Java memory allocation

Java's memory management is actually the management of variables and objects, including the allocation and release of objects.

The JVM memory request process is as follows:

    1. The JVM tries to allocate a chunk of memory in Eden for the new Java object.
    2. If Eden has enough space, the memory request ends here.
    3. Otherwise, the JVM tries to free all inactive objects in Eden (a garbage collection of level 1 or higher). If Eden still cannot fit the new object after that, the JVM tries to move some of the active objects in Eden into the survivor area.
    4. The survivor area serves as an intermediate exchange area between Eden and old. When the old area has enough space, the objects in the survivor area are moved to the old area; otherwise they remain in the survivor area.
    5. When the old area does not have enough space, the JVM performs a full garbage collection (level 0) in the old area.
    6. If, after the full garbage collection, the survivor and old areas still cannot hold the objects copied from Eden, the JVM cannot make room in Eden for the new object and an "out of memory" error occurs.
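The six steps above can be sketched as a toy model. This is not how a real JVM is implemented: the classes ToyObject, ToyRegion, and ToyHeap, the abstract size units, and the "move whatever fits" policy are all assumptions made up for illustration, and the liveness flag is set by hand instead of being computed by a collector.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Toy model only: names, sizes, and policies are illustrative assumptions.
class ToyObject {
    final int size;
    boolean live = true;   // the caller flips this to simulate an inactive object
    ToyObject(int size) { this.size = size; }
}

class ToyRegion {
    final int capacity;
    final List<ToyObject> objects = new ArrayList<>();
    ToyRegion(int capacity) { this.capacity = capacity; }

    int free() {
        int used = 0;
        for (ToyObject o : objects) used += o.size;
        return capacity - used;
    }

    // Discard inactive objects.
    void sweep() { objects.removeIf(o -> !o.live); }

    // Move objects to the target region while they fit; the rest stay here.
    void moveInto(ToyRegion target) {
        for (Iterator<ToyObject> it = objects.iterator(); it.hasNext(); ) {
            ToyObject o = it.next();
            if (o.size <= target.free()) {
                target.objects.add(o);
                it.remove();
            }
        }
    }
}

class ToyHeap {
    final ToyRegion eden, survivor, old;

    ToyHeap(int edenSize, int survivorSize, int oldSize) {
        eden = new ToyRegion(edenSize);
        survivor = new ToyRegion(survivorSize);
        old = new ToyRegion(oldSize);
    }

    ToyObject allocate(int size) {
        if (eden.free() < size) {             // step 2 did not succeed
            eden.sweep();                     // step 3: collect the young generation
            survivor.sweep();
            if (eden.free() < size) {
                eden.moveInto(survivor);      // step 3: move active objects to survivor
            }
            if (eden.free() < size) {
                survivor.moveInto(old);       // step 4: promote when old has room
                eden.moveInto(survivor);
            }
            if (eden.free() < size) {
                old.sweep();                  // step 5: full collection of the old area
                survivor.moveInto(old);
                eden.moveInto(survivor);
            }
            if (eden.free() < size) {         // step 6: nothing helped
                throw new IllegalStateException("toy heap: out of memory");
            }
        }
        ToyObject obj = new ToyObject(size);  // steps 1 and 2: place the object in Eden
        eden.objects.add(obj);
        return obj;
    }
}
```

For example, with new ToyHeap(4, 2, 4), allocating objects of size 2 and keeping them all live pushes the earliest objects from Eden to survivor and then to old, and eventually the allocation fails once all three regions are full of live objects.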

Three, fundamentals of GC

GC (Garbage Collection) is the garbage collector in Java and .NET.

Java grew out of C++. It discarded some of C++'s cumbersome, error-prone features and introduced new concepts such as the counter, one of which is the GC mechanism (C# borrowed it from Java).

Memory management is where programmers are most prone to problems: forgotten or faulty memory reclamation can make a program or system unstable or even crash it. The GC provided by Java automatically monitors whether an object has gone out of scope and reclaims its memory, and the Java language provides no explicit operation for releasing allocated memory. Java's memory management is therefore really the management of objects, including their allocation and release.

For programmers, objects are allocated with the new keyword. To release an object, assign null to all references to it so that the program can no longer access it; we then call the object "unreachable". The GC is responsible for reclaiming the memory of all unreachable objects.

For the GC, when a programmer creates an object, the GC starts monitoring the object's address, size, and usage. Typically, a GC uses a directed graph to record and manage all objects in the heap, and in this way determines which objects are "reachable" and which are "unreachable". When the GC determines that some objects are unreachable, it is responsible for reclaiming their memory. However, to ensure that the GC can be implemented on different platforms, the Java specification does not strictly define many aspects of GC behavior. For example, there are no firm rules about which reclamation algorithm to use or when to collect. Different JVM implementations therefore often use different algorithms, which brings a good deal of uncertainty to Java development. This article examines several issues related to how the GC works and tries to reduce the negative impact of this uncertainty on Java programs.
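The idea of deciding reachability from a graph of objects can be illustrated with a small traversal. This is only a sketch of the marking idea, not the algorithm any particular JVM uses; GraphNode and Reachability are hypothetical names, and the "roots" stand in for things like local variables and static fields.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a heap object that holds references to other objects.
class GraphNode {
    final String name;
    final List<GraphNode> refs = new ArrayList<>();
    GraphNode(String name) { this.name = name; }
}

class Reachability {
    // Returns every node that can be reached by following references from the roots.
    static Set<GraphNode> mark(List<GraphNode> roots) {
        // Identity-based set: two nodes are the same only if they are the same object.
        Set<GraphNode> marked = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<GraphNode> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            GraphNode node = pending.pop();
            if (marked.add(node)) {          // first visit: follow its references
                pending.addAll(node.refs);
            }
        }
        return marked;
    }
}
```

Anything not in the returned set is "unreachable" in the sense used above. Note that two objects referring only to each other are still unreachable if no root leads to them, which is why a graph traversal handles cycles that simple reference counting would miss.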

Four, GC generations

The heap in the JVM memory model is divided into two blocks: the young generation and the old generation.

1) The young generation contains an Eden space, mainly for newly created objects, and two survivor spaces (from and to). The survivor spaces are always the same size and hold the objects that survive each garbage collection.

2) The old generation mainly holds objects with long life cycles in the application.

3) In the young generation, garbage collection generally uses a copying algorithm, which is fast. At each GC, the surviving objects are first copied from Eden to a survivor space; when the survivor space is full, the remaining live objects are copied directly to the old generation. Eden is therefore empty after each GC.

4) In the old generation, garbage collection generally uses a mark-compact algorithm, which is slower but reduces memory requirements.

5) Garbage collection has levels. Level 0 is a full garbage collection, which also collects the old generation. A collection of level 1 or higher only reclaims garbage in the young generation. Memory overflow usually occurs when, after a garbage collection of the old or perm generation, there is still no room for new Java objects.
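To see how a particular JVM actually divides its memory into generations, one can list its memory pools and collectors. The following sketch uses the standard java.lang.management API; the class name GenerationReport is made up, and the pool and collector names printed vary with the JVM version and the collector selected.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, for illustration only.
class GenerationReport {

    // One entry per memory pool, e.g. Eden, survivor, and old generation pools.
    static List<String> poolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            names.add(pool.getType() + ": " + pool.getName());
        }
        return names;
    }

    // One entry per collector, with the number of collections it has run so far.
    static List<String> collectorSummaries() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName() + ": collections=" + gc.getCollectionCount());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String name : poolNames()) System.out.println(name);
        for (String line : collectorSummaries()) System.out.println(line);
    }
}
```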

Five, incremental GC

A GC in the JVM is typically implemented by one process or a group of processes. Like a user program, it occupies heap space and consumes CPU while running.

When the GC process runs, the application stops. So when a GC run takes a long time, the user notices the Java program pausing. On the other hand, if GC runs are too short, the reclamation rate may be too low, meaning that many objects that should be reclaimed are not and still occupy a lot of memory. GC design must therefore trade off pause time against reclamation rate. A good GC implementation lets users choose the settings they need. For example, memory-constrained devices are very sensitive to memory usage and want the GC to reclaim memory precisely, without caring much about program speed. Real-time online games, by contrast, cannot tolerate long interruptions of the program.

Incremental GC uses a reclamation algorithm that splits one long interruption into many small ones, reducing the impact of GC on the user program. Although an incremental GC may be less efficient than an ordinary GC in overall performance, it reduces the program's maximum pause time.
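The trade-off can be shown with a very small model in which collection work is done in bounded slices. This is a sketch of the slicing idea only, not a garbage collector: IncrementalSweeper is a hypothetical name, and each "work item" simply stands for one unit of collection work.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model: splits a fixed amount of work into slices of bounded size.
class IncrementalSweeper {
    private final Deque<Integer> pending = new ArrayDeque<>();
    int pauses = 0;        // how many times the "application" was interrupted
    int longestPause = 0;  // largest number of work items done in one interruption

    IncrementalSweeper(int workItems) {
        for (int i = 0; i < workItems; i++) pending.add(i);
    }

    // Does at most `budget` items of work, then hands control back.
    // Returns true once all work is finished.
    boolean step(int budget) {
        int done = 0;
        while (done < budget && !pending.isEmpty()) {
            pending.poll();
            done++;
        }
        if (done > 0) {
            pauses++;
            longestPause = Math.max(longestPause, done);
        }
        return pending.isEmpty();
    }
}
```

With 10 work items and a budget of 3 per step, the work finishes in 4 pauses of at most 3 items each, whereas a single step with a budget of 10 would finish in one pause of 10 items. More pauses, but a shorter longest pause.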

The HotSpot JVM provided by the Sun JDK supports incremental GC. By default the HotSpot JVM does not use incremental GC; to enable it, add the -Xincgc parameter when running the Java program.

The HotSpot JVM's incremental GC is based on the train GC algorithm. Its basic idea is to group (layer) all objects in the heap by creation time and usage, putting frequently used and related objects into the same group, and to keep adjusting the groups as the program runs. When the GC runs, it always reclaims the oldest (least recently accessed) objects first, and if a whole group is reclaimable, it reclaims the whole group. In this way, each GC run reclaims only a certain proportion of the unreachable objects, which keeps the program running smoothly.

Six, the finalize function in detail

finalize is a method defined in the Object class; see my open source project: src-jdk1.7.0_02

protected void finalize() throws Throwable { }

The access modifier of this method is protected. Because all classes are subclasses of Object, user classes can easily access this method.

Because finalize does not chain calls automatically, we must chain them by hand, so the last statement of a finalize function is usually super.finalize(). This gives bottom-up finalize calls: we release our own resources first, then the parent class's. According to the Java Language Specification, the JVM guarantees that the object is unreachable before finalize is called, but it does not guarantee that finalize will be called at all. The specification also guarantees that finalize runs at most once.
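The manual chaining described above might look like the following sketch. The class names are hypothetical, and the demo calls finalize() directly only to show the call order; in real code the JVM decides if and when it runs. Note also that finalize has been deprecated since Java 9, so newer compilers will warn about it; the @SuppressWarnings annotations are there for that reason.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical classes, for illustration only.
class BaseResource {
    protected final List<String> log;
    BaseResource(List<String> log) { this.log = log; }

    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() throws Throwable {
        log.add("BaseResource.finalize");     // release the parent's resources
        super.finalize();
    }
}

class DerivedResource extends BaseResource {
    DerivedResource(List<String> log) { super(log); }

    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() throws Throwable {
        try {
            log.add("DerivedResource.finalize"); // release our own resources first
        } finally {
            super.finalize();                    // then chain to the parent, even on failure
        }
    }
}

class FinalizeChainDemo {
    // Calls finalize() by hand purely to show the bottom-up order.
    static List<String> run() {
        List<String> log = Collections.synchronizedList(new ArrayList<String>());
        DerivedResource resource = new DerivedResource(log);
        try {
            resource.finalize();
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
        return new ArrayList<>(log);
    }
}
```

Putting super.finalize() in a finally block is a design choice that keeps the chain intact even if the subclass cleanup throws.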

Many Java beginners think this method is like a destructor in C++ and put the release of many objects and resources into it. In fact this is not a good approach, for three reasons:

First, to support finalize, the GC has to do a lot of extra work for objects that override it.

Second, after finalize runs, the object may become reachable again, so the GC must check once more whether the object is reachable. Using finalize therefore reduces GC performance.

Third, because the time at which the GC calls finalize is indeterminate, the time at which resources are released this way is also indeterminate.

In general, finalize is used to release resources that are hard to control and very important, such as certain I/O operations and data connections. Releasing these resources is critical to the whole application. In such cases, the program itself should be primarily responsible for managing (including releasing) these resources, with finalize as a supplementary release path, forming a double-insurance mechanism; the program should not rely on finalize alone to release resources.
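A sketch of this double-insurance arrangement is shown below, assuming a made-up ManagedConnection class. The program releases the resource explicitly through close() (here via try-with-resources, available since Java 7), and finalize only acts as a fallback in case close() was never called.

```java
// Hypothetical resource class, for illustration only.
class ManagedConnection implements AutoCloseable {
    private boolean open = true;

    synchronized boolean isOpen() { return open; }

    // Primary release path: called by the program. Safe to call more than once.
    @Override
    public synchronized void close() {
        open = false;
    }

    // Supplementary release path: only a safety net, with no timing guarantee.
    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() throws Throwable {
        try {
            close();
        } finally {
            super.finalize();
        }
    }
}

class ManagedConnectionDemo {
    // Returns whether the connection is still open after the try block ends.
    static boolean openAfterUse() {
        ManagedConnection connection = new ManagedConnection();
        try (ManagedConnection inUse = connection) {
            if (!inUse.isOpen()) {
                throw new IllegalStateException("connection should be open while in use");
            }
            // ... work with the connection here ...
        }
        return connection.isOpen();   // close() has already run at this point
    }
}
```

Because close() is idempotent, it does no harm if finalize later calls it a second time.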

The example below shows that an object may become reachable again after its finalize function is called, and that an object's finalize runs only once.

    class MyObject {
        Test main; // Keeps the Test object so that finalize can restore reachability

        public MyObject(Test t) {
            main = t; // Save the Test object
        }

        @Override
        protected void finalize() {
            main.ref = this; // Restore this object so that it is reachable again
            System.out.println("This is finalize"); // Shows that finalize runs only once
        }
    }

    class Test {
        MyObject ref;

        public static void main(String[] args) {
            Test test = new Test();
            test.ref = new MyObject(test);
            test.ref = null; // The MyObject object is now unreachable; finalize will be called
            System.gc();
            System.runFinalization(); // finalize runs on a separate thread; wait for pending finalizers
            if (test.ref != null)
                System.out.println("My object is still alive");
        }
    }

Output:

This is finalize

My object is still alive

Note that in this example, although the MyObject object becomes reachable again in finalize, finalize is not called the next time the object is collected, because finalize is called only once.

Seven, program interaction with GC

How does a program interact with the GC? Java 2 enhanced memory management by adding the java.lang.ref package; see my open source project: src-jdk1.7.0_02

It defines three reference classes: SoftReference, WeakReference, and PhantomReference.

By using these reference classes, programmers can interact with the GC to some extent and so improve its efficiency. The reference strength of these classes lies between that of reachable and unreachable objects.

Creating a reference object is also easy. For example, to create a soft reference object, first create an object and refer to it in the normal way (making it reachable), then create a SoftReference to the object, and finally set the normal reference to null. The object is then held only by a soft reference, and we call it a soft reference object.

The main feature of a soft reference is that it is a relatively strong reference. Such memory is reclaimed only when memory is insufficient, so while memory is plentiful these objects are usually not reclaimed. In addition, these references are guaranteed to be cleared (set to null) before Java throws an OutOfMemoryError. Soft references can be used to cache commonly used images, making maximum use of memory without causing an OutOfMemoryError. The following is pseudo-code for this kind of reference:

    // Request an Image object
    Image image = new Image(); // Create an Image object
    ...
    // Use the image
    ...
    // After using the image, wrap it in a soft reference and release the strong reference
    SoftReference<Image> sr = new SoftReference<Image>(image);
    image = null;
    ...
    // Next time the image is needed
    image = sr.get();
    if (image == null) {
        // The GC has reclaimed the image because memory was low, so it must be reloaded
        image = new Image();
        sr = new SoftReference<Image>(image);
    }
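The same pattern can be wrapped in a small reusable cache. The following is a minimal sketch, not a production cache: SoftCache and its loader function are hypothetical names, it is not thread-safe, and it never removes map entries whose referents have been cleared.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical cache class, for illustration only.
class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> entries = new HashMap<>();
    private final Function<K, V> loader;
    int loads = 0;   // how many times the loader had to run

    SoftCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        SoftReference<V> ref = entries.get(key);
        // ref.get() returns null if the GC has cleared the soft reference.
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            value = loader.apply(key);   // load or reload the value
            loads++;
            entries.put(key, new SoftReference<>(value));
        }
        return value;
    }
}
```

A caller would construct it with something like new SoftCache<String, byte[]>(name -> loadImageBytes(name)), where loadImageBytes is whatever loading routine the application already has (assumed here, not defined). The important detail is that the check is on the result of get(), not on the SoftReference itself, since the reference object stays non-null even after its referent is reclaimed.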

The biggest difference between weak reference objects and soft reference objects is that when the GC runs, it uses an algorithm to decide whether to reclaim a soft reference object, whereas it always reclaims weak reference objects. Weak reference objects are therefore reclaimed more easily and more quickly. Although the GC always reclaims weak objects when it runs, groups of weak objects with complex relationships often need several GC runs to be fully reclaimed. Weak reference objects are often used in Map structures to refer to objects holding large amounts of data; once the strong reference to such an object is set to null, the GC can quickly reclaim the object's space.
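The behavior of a weak reference can be observed with a short sketch like the one below (WeakReferenceDemo is a made-up name). The first method is deterministic: a weak reference never clears while a strong reference still exists. The second depends on the GC actually running, and System.gc() is only a request, so its result is typical but not guaranteed.

```java
import java.lang.ref.WeakReference;

// Hypothetical demo class, for illustration only.
class WeakReferenceDemo {

    // While a strong reference exists, the weak reference still returns the object.
    static boolean aliveWhileStronglyHeld() {
        Object payload = new Object();
        WeakReference<Object> weak = new WeakReference<>(payload);
        System.gc();
        // payload is used here, so it is still strongly reachable at this point.
        return weak.get() == payload;
    }

    // With no strong reference left, the referent is usually gone after a GC run.
    // Not guaranteed: System.gc() is only a hint to the JVM.
    static boolean clearedAfterRelease() {
        WeakReference<Object> weak = new WeakReference<>(new Object());
        System.gc();
        return weak.get() == null;
    }
}
```

For the Map use case mentioned above, the standard library already provides java.util.WeakHashMap, whose keys are held by weak references.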

Phantom references are used less often and mainly assist the finalize function. A phantom reference object refers to an object that has finished running finalize and is unreachable, but has not yet been reclaimed by the GC. Such an object can support some later-stage cleanup after finalize, and by overriding the clear() method of Reference we can make the resource reclamation mechanism more flexible.
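A phantom reference is normally used together with a ReferenceQueue, which is how the program learns that the object is about to be reclaimed. The sketch below (PhantomDemo is a made-up name) shows two properties: get() on a phantom reference always returns null, and the reference is placed on the queue after the GC finds the referent unreachable. The second method depends on GC timing, so it takes a timeout and may return false.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

// Hypothetical demo class, for illustration only.
class PhantomDemo {

    // A phantom reference never hands back its referent, even while it is alive.
    static boolean getAlwaysReturnsNull() {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object resource = new Object();
        PhantomReference<Object> phantom = new PhantomReference<>(resource, queue);
        return phantom.get() == null;
    }

    // Waits up to timeoutMillis (must be positive; 0 would wait forever)
    // for the phantom reference to appear on the queue after a GC request.
    static boolean enqueuedAfterCollection(long timeoutMillis) throws InterruptedException {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(new Object(), queue);
        System.gc();   // only a request; collection is not guaranteed
        Reference<?> polled = queue.remove(timeoutMillis);
        return polled == phantom;
    }
}
```

Once the reference shows up on the queue, the program can run whatever late cleanup it associates with that object.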

Eight, Java programming recommendations

Based on how the GC works, we can use a number of techniques to make the GC run more efficiently and better meet the application's requirements. Some programming suggestions:

1) The most basic suggestion is to release references to useless objects as soon as possible. Most programmers rely on temporary variables, whose references are dropped automatically when execution leaves their scope. With this approach, pay special attention to complex object graphs such as arrays, queues, trees, and graphs, whose objects reference one another in complicated ways. The GC is generally less efficient at reclaiming such objects. If the program allows, assign null to unused references as soon as possible to speed up the GC's work.

2) Use finalize as little as possible. finalize is an opportunity Java gives programmers to release objects or resources, but it increases the GC's workload, so avoid reclaiming resources through finalize as far as possible.

3) If you have pictures that are used frequently, you can use the soft reference type. It keeps the pictures in memory as long as possible for the program to use, without causing an OutOfMemoryError.

4) Pay attention to collection data types, including arrays, trees, graphs, linked lists, and similar data structures, which are more complex for the GC to reclaim. Also be careful with global variables and static variables. These variables tend to cause dangling references, which waste memory.

5) When the program has some idle waiting time, the programmer can call System.gc() manually to ask the GC to run, but the Java Language Specification does not guarantee that the GC will actually run. Using incremental GC can shorten the pause time of Java programs.
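As an illustration of suggestions 1) and 4), consider a container that keeps its elements in an array. If a removed element's slot is not cleared, the array still holds a reference to it and the GC cannot reclaim the object. The following is a minimal sketch with a made-up SimpleStack class; slotIsCleared exists only so the effect can be checked.

```java
import java.util.Arrays;

// Hypothetical array-backed stack, for illustration only.
class SimpleStack {
    private Object[] items = new Object[4];
    private int size = 0;

    void push(Object item) {
        if (size == items.length) {
            items = Arrays.copyOf(items, size * 2);   // grow when full
        }
        items[size++] = item;
    }

    Object pop() {
        if (size == 0) {
            throw new IllegalStateException("stack is empty");
        }
        Object item = items[--size];
        items[size] = null;   // drop the stale reference so the GC can reclaim the object
        return item;
    }

    // Inspection helper for the example: true if the array slot holds no reference.
    boolean slotIsCleared(int index) {
        return items[index] == null;
    }
}
```

Without the items[size] = null line the stack would still work correctly, but every popped object would stay reachable through the array until its slot happened to be overwritten.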
