GC and memory leaks in Java

Source: Internet
Author: User

More than 20 years after its inception, Java remains one of the most widely used languages. This rests on the many technologies and features Java provides, which let developers write efficient programs gracefully. Today, let's talk about a basic but very important technology in Java: memory management.

Readers who know C or C++ are aware that in those languages we allocate and release memory ourselves: every malloc needs a matching free, and every new needs a matching delete. Otherwise memory leaks and overflow problems arise, with very bad consequences. In Java development we rarely have to worry about this, because the JVM provides automatic memory management and does the work of freeing memory for us.

Analysis of JVM Memory

Although the JVM does the work of memory management, we still need to know what it is doing, so let's take a look.

When the JVM starts, it performs a series of tasks, one of which is to set up the runtime memory. This memory is divided into five main areas, each with a different function.

Program counter

Records the address of the next instruction to be run. The "address" can be either a native pointer or an offset in the method's bytecode relative to the method's first instruction. If the thread is executing a native method, the value of the program counter is undefined. In a multithreaded environment each thread has its own program counter; when the JVM switches threads, the current thread's program counter is saved so that the thread can resume where it left off the next time it gets execution time.

VM Stack

The virtual machine stack is the memory model of Java method execution. Each time a method is invoked, a frame is created on the stack to store the local variable table, the operand stack, the dynamic link, and the method exit. The frame is pushed when the method call starts and popped when the method finishes executing. Each thread has its own stack.

Dynamic link: a symbolic reference to a method in the constant pool, which has to be resolved to a direct reference at run time.

Method exit: the program counter value of the caller of the currently executing method, or the address of the exception-handling table.

The stack size can be configured with -Xss<size>. When nested calls are used improperly (for example, recursion that never terminates), frames keep being pushed until the stack space is fully occupied, and a StackOverflowError is thrown.
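As a minimal sketch of the overflow described above, the following program recurses without a base case and reports how many frames were pushed before the error. The class and method names (StackDepthDemo, recurse, measureDepth) are made up for this illustration, and the depth reached depends on the JVM and its stack size setting.

```java
// Hypothetical demo class: counts how deep unbounded recursion gets
// before the thread's stack is exhausted.
final class StackDepthDemo {
    private static int depth = 0;

    // No base case on purpose: every call pushes one more frame.
    private static void recurse() {
        depth++;
        recurse();
    }

    // Returns the depth reached when StackOverflowError was thrown.
    static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The frames have been unwound by the time we get here.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Frames pushed before overflow: " + measureDepth());
    }
}
```

Running it with a larger or smaller stack size should change the reported depth accordingly.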

Native method stack

Heap

The heap is where object instances are stored; almost all object instances are allocated in the heap. The heap is shared by all threads, which is why multithreaded code needs synchronization mechanisms.

The heap is the primary area managed by the GC. Before reclaiming heap memory, the GC first determines which objects are dead (objects that can no longer be used).

There are two algorithms for judging whether an object is still alive: the reference counting algorithm and the reachability analysis algorithm.

The reference counting algorithm adds a reference counter to each object. Whenever a reference points to the object, the counter is incremented; whenever such a reference goes away, it is decremented. Any object whose counter is 0 can no longer be used. The algorithm is simple to implement, but it has trouble with circular references: two objects that refer to each other keep each other's counters above 0 even when nothing else refers to them.
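The circular reference problem can be illustrated with a toy model. This is only a sketch: the Node class and its explicit refCount field are invented for the example and are not how any real JVM tracks objects.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of reference counting; not how a real JVM manages objects.
final class RefCountDemo {
    // A hypothetical "object" carrying an explicit reference counter.
    static final class Node {
        final String name;
        int refCount = 0;
        final List<Node> fields = new ArrayList<>();

        Node(String name) { this.name = name; }

        // Storing a reference in a field increments the target's counter.
        void pointTo(Node target) {
            fields.add(target);
            target.refCount++;
        }
    }

    // Builds a <-> b, then drops both external references.
    // Returns the two counters once nothing outside the cycle refers to them.
    static int[] countsAfterDroppingExternalRefs() {
        Node a = new Node("a");
        Node b = new Node("b");
        a.refCount++;   // external reference held by local variable a
        b.refCount++;   // external reference held by local variable b
        a.pointTo(b);
        b.pointTo(a);
        a.refCount--;   // simulate "a = null"
        b.refCount--;   // simulate "b = null"
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = countsAfterDroppingExternalRefs();
        // Both counters stay at 1, so a pure reference counter never frees the pair.
        System.out.println("a=" + counts[0] + ", b=" + counts[1]);
    }
}
```

Neither counter reaches 0, even though the program can no longer reach either node.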

The reachability analysis algorithm is the one the Java language uses. Its basic idea is to take a set of objects called "GC Roots" as starting points and search downward from them. The paths traversed by the search are called reference chains. When no reference chain connects an object to any GC Root, the object can no longer be used.

In the Java language, GC Roots include the following objects:

    1. Objects referenced in the virtual machine stack
    2. Objects referenced by JNI in the native method stack
    3. Objects referenced by class static member variables in the method area
    4. Objects referenced by constants in the method area
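The search from GC Roots can be sketched as a graph traversal. The Obj class and the explicit list of roots below are assumptions made for the illustration; a real collector works on the actual stacks and static fields rather than on a hand-built graph.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ReachabilityDemo {
    // Hypothetical object with outgoing references.
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        Obj(String name) { this.name = name; }
    }

    // Returns every object reachable from the given roots by following references.
    static Set<Obj> reachableFrom(List<Obj> roots) {
        Set<Obj> marked = new HashSet<>();
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj current = pending.pop();
            if (marked.add(current)) {       // first visit: mark it and follow its references
                pending.addAll(current.refs);
            }
        }
        return marked;
    }

    // Builds root -> a -> b plus an isolated cycle c <-> d,
    // and returns the names of the objects that are NOT reachable.
    static List<String> unreachableNames() {
        Obj root = new Obj("root"), a = new Obj("a"), b = new Obj("b");
        Obj c = new Obj("c"), d = new Obj("d");
        root.refs.add(a);
        a.refs.add(b);
        c.refs.add(d);
        d.refs.add(c);
        Set<Obj> live = reachableFrom(List.of(root));
        List<String> dead = new ArrayList<>();
        for (Obj o : List.of(root, a, b, c, d)) {
            if (!live.contains(o)) dead.add(o.name);
        }
        return dead;
    }

    public static void main(String[] args) {
        System.out.println("Unreachable: " + unreachableNames()); // prints [c, d]
    }
}
```

Unlike reference counting, this approach treats the cycle c <-> d as garbage, because no chain leads to it from a root.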

As you can see, deciding whether an object is alive is all about references. Since JDK 1.2, Java has divided references into strong references (Strong Reference), soft references (Soft Reference), weak references (Weak Reference), and phantom references (Phantom Reference).

    1. Strong references

A strong reference is a reference in the original sense. The referenced object is not reclaimed as long as the strong reference exists.

    2. Soft references

The SoftReference class represents a soft reference. Before the system would throw an out-of-memory error, objects associated only with soft references are included in the collection scope for a second round of collection.

    3. Weak references

The WeakReference class represents a weak reference. Objects associated only with weak references survive only until the next garbage collection.

    4. Phantom references

The PhantomReference class represents a phantom reference (sometimes translated as "virtual reference"). A phantom reference does not affect the lifetime of the associated object, and the object instance cannot be obtained through it. Its only purpose is to receive a system notification when the object is reclaimed by the GC.

The heap size can be controlled with -Xmx and -Xms. Mainstream JVMs use generational collection. Under this approach, the Java heap is divided into the young generation (Young Generation) and the old generation (Old Generation). The young generation is further divided into Eden and the survivor area, and the survivor area is made up of From space and To space. New objects are allocated in the young generation; when Eden runs out of space, surviving objects are moved to the survivor area. The size of the young generation can be controlled with -Xmn, and -XX:SurvivorRatio controls the ratio of Eden to survivor space. The old generation stores objects that are still alive after multiple young-generation garbage collections (also known as minor GCs).
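To see how a running JVM is actually laid out, the standard management API can list the memory pools. This is a small sketch; the class name HeapLayoutDemo is made up, and the pool names printed depend on the JVM and the collector in use, so they may not match the Eden/survivor/old naming exactly.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;

final class HeapLayoutDemo {
    // Upper bound the JVM will try to use for the heap (influenced by -Xmx).
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        System.out.println("Max heap (bytes): " + maxHeapBytes());
        // Pool names vary between JVMs and collectors.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            System.out.println(pool.getType() + " : " + pool.getName());
        }
    }
}
```

Starting the program with different -Xmx values should change the reported maximum.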

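The permanent generation (Permanent space) serves as the method area.

Method area

The method area is also shared by all threads. It holds loaded class information, static variables, constants, and code compiled by the just-in-time (JIT) compiler. -XX:MaxPermSize sets the size of the method area. (In HotSpot, this applies before JDK 8; JDK 8 replaced the permanent generation with Metaspace, sized with -XX:MaxMetaspaceSize.)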

Direct Memory

Direct memory is not part of the virtual machine's runtime data areas. It is off-heap memory allocated directly through native libraries and then manipulated through a DirectByteBuffer object, stored in the Java heap, that acts as a reference to this memory.
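In application code, direct memory is usually obtained through ByteBuffer.allocateDirect. A minimal sketch follows; the class name DirectMemoryDemo and the helper roundTrip are made up for the example.

```java
import java.nio.ByteBuffer;

final class DirectMemoryDemo {
    // Writes a value into off-heap memory and reads it back.
    static int roundTrip(int value) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(16); // memory outside the Java heap
        buffer.putInt(0, value);                            // absolute write at index 0
        return buffer.getInt(0);                            // absolute read at index 0
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(16);
        System.out.println("isDirect = " + buffer.isDirect());
        System.out.println("roundTrip(42) = " + roundTrip(42));
    }
}
```

The buffer object itself is a small heap object; the 16 bytes it manages live outside the heap.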

Memory allocation and recycling policies

So far, four relatively mature garbage collection algorithms have been developed for the JVM: 1. the mark-sweep algorithm; 2. the copying algorithm; 3. the mark-compact algorithm; 4. the generational collection algorithm.

1. Mark-sweep algorithm

This algorithm works in two phases: mark and sweep. All objects that need to be reclaimed are marked first, and once marking is complete all marked objects are reclaimed. This algorithm leaves a large number of discontiguous memory fragments. When large objects are allocated frequently and the JVM cannot find a big enough contiguous chunk in the young generation, it has to collect more often (there is a mechanism for allocating large objects directly in the old generation).

2. Copying algorithm

This algorithm divides memory into two equal blocks and uses only one at a time. When the block in use is exhausted, the surviving objects are copied to the other block and the used block is cleaned up in one pass. This is efficient and avoids memory fragmentation, but only half of the memory can be used at any time, which is a high price.

3. Mark-compact algorithm

This is an upgraded version of the mark-sweep algorithm. After the mark phase is complete, instead of reclaiming the dead objects in place, the collector moves the surviving objects toward one end and then clears the memory beyond the boundary.
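The difference in fragmentation between sweeping and compacting can be seen in a toy model where the heap is an array of slots. This is a sketch only: the slot array, the live flags, and the class name CompactionDemo are assumptions for the illustration, and real collectors operate on raw memory rather than on arrays of strings.

```java
final class CompactionDemo {
    // Length of the longest run of free (null) slots.
    static int largestFreeBlock(String[] heap) {
        int best = 0, run = 0;
        for (String slot : heap) {
            run = (slot == null) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    // "Sweep": free the dead slots in place; survivors stay where they are.
    static String[] sweep(String[] heap, boolean[] live) {
        String[] result = new String[heap.length];
        for (int i = 0; i < heap.length; i++) {
            if (live[i]) result[i] = heap[i];
        }
        return result;
    }

    // "Compact": slide the survivors to the front, leaving one contiguous free region.
    static String[] compact(String[] heap, boolean[] live) {
        String[] result = new String[heap.length];
        int next = 0;
        for (int i = 0; i < heap.length; i++) {
            if (live[i]) result[next++] = heap[i];
        }
        return result;
    }

    public static void main(String[] args) {
        String[] heap = { "A", "B", "C", "D", "E", "F" };
        boolean[] live = { true, false, true, false, true, false };
        System.out.println("after sweep:   " + largestFreeBlock(sweep(heap, live)));   // 1
        System.out.println("after compact: " + largestFreeBlock(compact(heap, live))); // 3
    }
}
```

Both variants free three slots, but only compaction leaves room for an object that needs three contiguous slots.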

4. Generational collection algorithm

This is the algorithm used by current commercial virtual machines. Memory is first divided into several blocks according to object lifetimes, namely the young generation and the old generation, and a different collection algorithm is then chosen to suit each generation. In the young generation, most objects die in each garbage collection and only a few survive, so the copying algorithm is chosen. In the old generation, object survival rates are relatively high, so the mark-compact (or mark-sweep) algorithm is used.

GC execution mechanism

Because objects are handled by generation, garbage collection covers different areas at different times. There are two types of GC: minor GC (also called scavenge GC) and full GC.

  Minor GC

In general, a minor GC is triggered when a new object is created and the request for space in Eden fails. The GC runs on the Eden area, clears the non-surviving objects, and moves the surviving objects to the survivor area; it then tidies up the two survivor spaces. This kind of GC works on the young generation and does not affect the old generation. Because most objects start out in Eden, and Eden is not very large, GC in Eden is frequent. A fast, efficient algorithm is therefore needed so that Eden becomes free again as soon as possible.

  Full GC

A full GC cleans up the entire heap, including young, tenured, and perm. It is slower than a minor GC because it has to collect across the whole heap, so the number of full GCs should be kept to a minimum. A large part of JVM tuning is tuning full GC. A full GC may be caused by the following:

1. The old generation (tenured) is full

2. The permanent generation (perm) is full

3. System.gc() is called explicitly

4. The allocation policy for the heap's regions changed dynamically after the last GC
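One way to observe how often collections happen is the standard GarbageCollectorMXBean interface. The sketch below sums the collection counts before and after a System.gc() request. The class name GcCountDemo is invented for the example, the collector names printed depend on the JVM, and System.gc() is only a request that the JVM is free to ignore.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcCountDemo {
    // Sum of collections reported by all collectors; -1 means "undefined" and is skipped.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) total += count;
        }
        return total;
    }

    public static void main(String[] args) {
        long before = totalCollections();
        System.gc(); // only a request; the JVM may ignore it
        long after = totalCollections();
        System.out.println("Collections before: " + before + ", after: " + after);
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + " -> " + gc.getCollectionCount());
        }
    }
}
```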

Common memory leaks in Java

    1. Database connections, network connections, IO connections, and the like that are never closed with an explicit call to close() can cause memory leaks
    2. Listeners can also cause memory leaks when an object is released without its listener being removed
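Both cases have a conventional remedy: try-with-resources for anything closeable, and a matching remove call for every listener that is registered. The sketch below uses invented stand-ins (Resource, EventSource) rather than a real connection or UI class.

```java
import java.util.ArrayList;
import java.util.List;

final class CleanupDemo {
    // Hypothetical stand-in for a connection or stream.
    static final class Resource implements AutoCloseable {
        boolean closed = false;
        @Override public void close() { closed = true; }
    }

    // Hypothetical event source that keeps strong references to its listeners.
    static final class EventSource {
        private final List<Runnable> listeners = new ArrayList<>();
        void addListener(Runnable l) { listeners.add(l); }
        void removeListener(Runnable l) { listeners.remove(l); }
        int listenerCount() { return listeners.size(); }
    }

    // try-with-resources calls close() when the block exits, even if the body throws.
    static boolean useAndClose() {
        Resource r = new Resource();
        try (Resource opened = r) {
            // ... work with the resource ...
        }
        return r.closed;
    }

    // Register a listener, then remove it when its owner is done.
    static int registerThenRemove(EventSource source) {
        Runnable listener = () -> System.out.println("event");
        source.addListener(listener);
        source.removeListener(listener); // without this, the source keeps the listener reachable
        return source.listenerCount();
    }

    public static void main(String[] args) {
        System.out.println("closed: " + useAndClose());
        System.out.println("listeners left: " + registerThenRemove(new EventSource()));
    }
}
```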

Java is a garbage-collected language, and developers don't have to manage memory allocation by hand. Memory leaks are still common in Java, however. If they are not dealt with, memory that cannot be freed is wasted, the heap eventually fills up, and the program crashes.

Memory leaks

When it comes to memory leaks, we also have to mention memory overflow. The two concepts are easily confused, so let's look at each.

    • Memory leak: the program requests memory from the system (new) and does not release it after it has finished using it. The memory stays occupied, and neither we nor the program can use it until the program ends. That is a memory leak.

    • Memory overflow: the program requests more memory than the system can provide, and an out-of-memory error (OOM) results. For example, a car seats at most 5 people, but if you force 10 in, the car bursts.

A large number of memory leaks can cause a memory overflow (OOM).

Memory

To understand memory leaks, it is essential to understand memory.
Java runs in the JVM's virtual memory environment, and the JVM's memory can be divided into three areas: the heap, the stack, and the method area.

    • Stack: a simple data structure that is widely used in computing. Its most notable feature is LIFO (last in, first out). For example, when we put clothes in a box, the first ones go to the bottom, and we have to take out the ones put in later before we can reach those underneath. Only primitive values and object references (not the objects themselves) are stored in the stack.

    • Heap: heap memory stores the objects and arrays created with new. Memory allocated in the heap is managed by the JVM's automatic garbage collector. A JVM has only one heap, shared by all threads. The heap holds the objects themselves (including their fields), not local primitive variables or local object references.

    • Method area: also called the static area. Like the heap, it is shared by all threads. The method area contains all class information and static variables.

With a rough understanding of memory in place, the question to consider is:
Which part of memory is it that lets us cause a memory leak?

Analysis of memory leak causes

In Java, the JVM stack records method invocations, and each thread owns a stack. As a thread runs, each new method call adds a memory unit, the frame, to the stack. The frame holds the parameters, local variables, and return address of the method call. A local variable in Java can only be a primitive-type variable (such as an int) or a reference to an object. Therefore only primitive variables and object references are stored in the stack; the referenced objects are stored in the heap.

When a method finishes running, its frame is removed from the stack, and the space occupied by all the local variables and parameters in the frame is freed. The thread returns to the calling method and continues executing; when the stack is empty, the program has run to its end.

The heap stores objects. In Java, heap memory is not cleared when a method ends, so objects created inside a method remain in the heap after the method returns.

In summary, the stack cleans up unused memory on its own. But if we keep creating new objects, the heap's memory will eventually be exhausted. Java therefore introduced garbage collection (GC) to reclaim heap memory. If an object remains referenced, though, it cannot be reclaimed, and the memory it occupies is wasted and cannot be reused. So objects that the GC cannot reclaim are the cause of memory leaks!
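A common way to end up with objects the GC cannot reclaim is a long-lived collection that is only ever added to. The sketch below is a deliberately small illustration; the names LeakDemo, CACHE, and handleRequest are made up.

```java
import java.util.ArrayList;
import java.util.List;

final class LeakDemo {
    // A static collection lives as long as the class does,
    // so everything it contains stays reachable.
    private static final List<byte[]> CACHE = new ArrayList<>();

    // The local variable disappears with the frame when the method returns,
    // but the list still references the array in the heap.
    static void handleRequest() {
        byte[] buffer = new byte[1024];
        CACHE.add(buffer);
    }

    static int retainedBuffers() { return CACHE.size(); }

    // Dropping the references is what makes the arrays collectable again.
    static void release() { CACHE.clear(); }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) handleRequest();
        System.out.println("Still reachable: " + retainedBuffers());
        release();
        System.out.println("After release: " + retainedBuffers());
    }
}
```

If handleRequest were called in an endless loop without release, the heap would eventually fill up and the JVM would throw an OutOfMemoryError.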

Garbage collection mechanism

Garbage collection (GC) automatically clears objects in the heap that are no longer used. In Java, objects are used through references. If no reference points to an object, the object can no longer be processed or called, and it is said to be unreachable. Garbage collection frees the memory occupied by unreachable objects.

Implementation idea: we take the stack as the root, traverse the references held in the stack, and from there traverse the objects in the heap. Because references are removed from the stack as frames are popped, following the references from the stack lets us find the objects in the heap that nothing points to. These are the unreachable objects, and they are garbage collected.


Figure: the idea behind the garbage collection implementation

If you hold a strong reference to an object, the garbage collector is unable to reclaim the object in memory.

Reference types

In versions before JDK 1.2, if an object was not referenced by any variable, the program could no longer use it. That is, the program could use an object only while it was in the reachable state. Starting with JDK 1.2, references to objects are divided into 4 levels, giving the program more flexible control over an object's life cycle. From highest to lowest, the 4 levels are: strong references, soft references, weak references, and phantom references.

1. Strong references (Strong Reference)
The most common reference type in everyday coding. A typical form is: A a = new A(); The strong reference itself is stored in stack memory and holds the address of the object in the heap. In general, once no strong reference points to an in-memory object any more, the garbage collector begins to consider that memory for collection. For example, after a = null there is no other reference to the A object that was just allocated in the heap, and its heap memory is reclaimed when the system performs garbage collection.

2. Soft references (Soft Reference)
Soft references (class java.lang.ref.SoftReference) are generally used as follows:

A a = new A();
SoftReference<A> srA = new SoftReference<A>(a);

The object a soft reference refers to is garbage collected only when both of the following conditions hold:
1. No strong reference points to the object;
2. The virtual machine is low on memory.
In effect, a SoftReference extends the time its referent occupies heap memory: the garbage collector does not reclaim that heap space until the virtual machine is short of memory.
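Because softly referenced objects are kept until memory runs short, soft references are often used for memory-sensitive caches. The following is a minimal sketch of that pattern; SoftCacheDemo and its load method are hypothetical, and load stands in for any expensive operation such as reading a file.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

final class SoftCacheDemo {
    private final Map<String, SoftReference<byte[]>> cache = new HashMap<>();

    // Hypothetical expensive load, e.g. reading an image from disk.
    private static byte[] load(String key) {
        return new byte[64 * 1024];
    }

    byte[] get(String key) {
        SoftReference<byte[]> ref = cache.get(key);
        // get() returns null if the entry was never cached or the GC has already cleared it.
        byte[] value = (ref == null) ? null : ref.get();
        if (value == null) {
            value = load(key);
            cache.put(key, new SoftReference<>(value));
        }
        return value;
    }

    public static void main(String[] args) {
        SoftCacheDemo demo = new SoftCacheDemo();
        byte[] first = demo.get("picture");
        byte[] second = demo.get("picture");
        System.out.println("same instance reused: " + (first == second));
    }
}
```

The caller always has to handle the case where the referent has been cleared, which is why get falls back to load.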

3. Weak references (Weak Reference)
Similarly, weak references are generally used as follows:

A a = new A();
WeakReference<A> wrA = new WeakReference<A>(a);

A WeakReference does not change when its referent is garbage collected: once no strong reference points to the object, the object enters the normal garbage collection process.
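The behaviour can be observed with a few lines of code. This is only a sketch: the class name WeakRefDemo is invented, and because System.gc() is merely a request, the program reports whichever outcome it sees rather than assuming the referent was cleared.

```java
import java.lang.ref.WeakReference;

final class WeakRefDemo {
    // While a strong reference exists, get() returns the same object.
    static boolean visibleWhileStronglyHeld() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        return weak.get() == strong;
    }

    public static void main(String[] args) {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.out.println("visible while strongly held: " + (weak.get() == strong));

        strong = null;  // drop the only strong reference
        System.gc();    // a request only; collection is not guaranteed
        System.out.println("after gc request: "
                + (weak.get() == null ? "cleared" : "still present"));
    }
}
```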

4. Phantom references (Phantom Reference)
Compared with SoftReference and WeakReference, PhantomReference differs mainly as follows:
1. PhantomReference has only one constructor:

PhantomReference(T referent, ReferenceQueue<? super T> q)

2. Whether or not a strong reference points to the object a PhantomReference refers to, PhantomReference's get() method always returns null.

Therefore a PhantomReference must be used together with a ReferenceQueue.
As with WeakReference, a PhantomReference does not change when its referent is garbage collected.
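A short sketch of the pairing with ReferenceQueue follows. The class name PhantomRefDemo is made up, and whether the notification arrives within the one-second wait depends on the JVM actually collecting the referent, so the program prints whichever outcome occurs.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

final class PhantomRefDemo {
    // get() returns null even while the referent is still strongly held.
    static boolean getAlwaysReturnsNull() {
        Object referent = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(referent, queue);
        return phantom.get() == null;
    }

    public static void main(String[] args) throws InterruptedException {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object referent = new Object();
        PhantomReference<Object> phantom = new PhantomReference<>(referent, queue);

        referent = null;  // drop the only strong reference
        System.gc();      // a request only; collection is not guaranteed

        // Wait up to one second for the reference to be enqueued.
        Reference<?> enqueued = queue.remove(1000);
        System.out.println(enqueued == phantom
                ? "notified: referent was collected"
                : "no notification yet");
    }
}
```

The queue is the only channel through which the program learns anything from a phantom reference, which is why the constructor requires one.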

Causes of memory leaks

If you hold a strong reference to an object, the garbage collector is unable to reclaim the object in memory.

The true cause of a memory leak is holding a strong reference to an object and not releasing it in time. The memory stays occupied, space is wasted, and a memory overflow may even result!

In Android, memory leaks typically occur in two scenarios:
    • Process-global static variables. These outlive the application's state and hold a strong reference to an Activity.
    • Threads that outlive the Activity life cycle without having their strong reference to the Activity cleared.

Check your project for the following scenarios:

      • Static Activities
      • Static views
      • Inner Classes
      • Anonymous Classes
      • Handler
      • Threads
      • TimerTask
      • Sensor Manager
