This post is a sample from the Garbage Collection Handbook, which will be published in a few weeks. It also lets you get familiar with the basics of garbage collection, the subject of the book's first chapter.
At first glance, garbage collection should do what its name says: find and throw away garbage. In fact, it does the opposite. Garbage collection tracks all objects that are still in use and treats everything else as garbage. With this in mind, let's take a closer look at how automated memory reclamation, known as garbage collection, is implemented in the JVM.
Manual memory management
Before introducing the modern version of garbage collection, let's briefly review the days when you had to allocate and release memory manually and explicitly. If you forgot to release a piece of memory, it could not be reused. The memory stayed claimed but unused. This scenario is called a memory leak.
Here is a simple example of manual memory management written in C:
    int send_request() {
        size_t n = read_size();
        int *elements = malloc(n * sizeof(int));

        if (read_elements(n, elements) < n) {
            // elements not freed!
            return -1;
        }

        // ...

        free(elements);
        return 0;
    }
As you can see, it is easy to forget to free memory. Memory leaks used to be a very common problem. You could only fight them by constantly fixing your own code. There was therefore a need for a more elegant way to free unused memory automatically and reduce the likelihood of human error. This automation is known as garbage collection (GC).
Smart pointers
One of the early approaches to automatic garbage collection is reference counting. For each object, you keep track of how many times it is referenced, and when that count drops to 0, the object can be safely reclaimed. The shared pointer in C++ is a well-known example:
    int send_request() {
        size_t n = read_size();
        shared_ptr<vector<int>> elements = make_shared<vector<int>>();

        if (read_elements(n, elements) < n) {
            return -1;
        }

        return 0;
    }
The shared_ptr we use keeps track of the number of references to the object. The count goes up by one when you pass the pointer to someone else and goes down by one when a copy leaves scope. As soon as the count reaches 0, the shared_ptr automatically deletes the underlying vector. Admittedly, as some readers have pointed out, code like this is unlikely to appear in practice, but it is enough for a demonstration.
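The counting behaviour described here can be observed directly through `use_count()`. The following is a minimal sketch; the helper name `observe_counts` is made up for this illustration and is not part of the original example.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical helper: records the owner count before, during and after
// a second shared_ptr to the same vector exists.
std::vector<long> observe_counts() {
    std::vector<long> counts;
    auto elements = std::make_shared<std::vector<int>>();
    counts.push_back(elements.use_count());   // only `elements` owns the vector

    {
        auto copy = elements;                 // handing it on adds an owner
        counts.push_back(copy.use_count());
    }                                         // `copy` leaves scope, count drops

    assert(elements.use_count() == 1);        // back to a single owner
    counts.push_back(elements.use_count());
    return counts;                            // vector is deleted when `elements` dies
}
```

Calling `observe_counts()` yields the counts seen at the three points: one owner, two owners while the copy exists, and one owner again.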
Automatic memory management
In the C++ code above, we still had to state explicitly when we wanted memory management to be taken care of. What if every object used this mechanism? That would be very convenient, because developers would no longer have to think about cleaning up memory. The runtime automatically knows which memory is no longer in use and releases it. In other words, it collects the garbage automatically. The first garbage collector was introduced in 1959 for Lisp, and the technology has been evolving ever since.
Reference counting
The idea we have just demonstrated with C++ shared pointers can be applied to all objects. Many languages, such as Perl, Python and PHP, take this approach. It is easiest to explain with a picture:
The green cloud represents the objects that the program is still using. Technically, these are things like local variables in a method that is currently executing, or static variables. The details differ between programming languages, so we will not focus on them here.
The blue circles represent live objects in memory, and you can see how many other objects refer to each of them. The gray circles are objects that nothing references any longer. They are therefore garbage and can be cleaned up by the garbage collector.
This looks good, doesn't it? It does, but the method has a major flaw. It is easy to end up with a detached cycle of objects, none of which is in scope, but whose references to each other keep their counts above 0. Here is an example:
See, the red objects are actually garbage that the application no longer uses. Because of this limitation of reference counting, there is still a memory leak.
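The leak can be reproduced with C++ shared pointers. This is a minimal sketch, not taken from the original article: `Node`, `alive` and `leaked_after_cycle` are names invented for the illustration.

```cpp
#include <cassert>
#include <memory>

// Hypothetical node type; `alive` counts instances not yet destroyed.
struct Node {
    static int alive;
    std::shared_ptr<Node> other;   // strong (counted) reference to a peer
    Node()  { ++alive; }
    ~Node() { --alive; }
};
int Node::alive = 0;

// Builds two nodes that point at each other, drops every outside
// reference, and returns how many of them are still alive afterwards.
// The two nodes are leaked on purpose to show the flaw.
int leaked_after_cycle() {
    int before = Node::alive;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->other = b;
        b->other = a;                  // a <-> b now form a cycle
        assert(a.use_count() == 2);    // one from `a`, one from b->other
    }   // `a` and `b` leave scope: each count drops to 1, never to 0
    return Node::alive - before;
}
```

Here `leaked_after_cycle()` returns 2: neither destructor runs, even though the program can no longer reach either node.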
There are several ways to solve this problem, such as using special "weak" references or running a separate algorithm that collects cycles. The languages mentioned earlier, Perl, Python and PHP, all handle cycles in one way or another, but that is beyond the scope of this article. Instead, we will look in detail at the approach used by the JVM.
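As an illustration of the first option, here is a sketch of a weak reference breaking a cycle, written with C++ `std::weak_ptr`. It does not show how Perl, Python or PHP do it; `Item`, `alive` and `alive_after_weak_cycle` are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type with one strong and one weak link.
struct Item {
    static int alive;
    std::shared_ptr<Item> next;    // strong: keeps the peer alive
    std::weak_ptr<Item>   back;    // weak: observes without adding to the count
    Item()  { ++alive; }
    ~Item() { --alive; }
};
int Item::alive = 0;

// Links two items in both directions, one direction weakly, and returns
// how many of them survive once all outside references are gone.
int alive_after_weak_cycle() {
    int before = Item::alive;
    {
        auto a = std::make_shared<Item>();
        auto b = std::make_shared<Item>();
        a->next = b;                   // strong reference a -> b
        b->back = a;                   // weak reference b -> a
        assert(a.use_count() == 1);    // the weak link is not counted
        assert(b.use_count() == 2);
    }   // `a` reaches 0 and is destroyed, which releases `b` as well
    return Item::alive - before;
}
```

With the back link held weakly, `alive_after_weak_cycle()` returns 0. The cost is that the programmer has to decide which direction of each cycle is weak.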
Mark and sweep
First, the JVM is more specific about what makes an object reachable. Instead of the vaguely defined green cloud from before, there is a clear and explicit set of objects called Garbage Collection Roots:
- Local variables
- Active threads
- Static fields
- JNI references
- Others (to be discussed later)
The JVM uses the mark-and-sweep algorithm to keep track of all reachable (live) objects and to make sure that the memory held by unreachable objects can be reused. This consists of two steps:
- Marking traverses all reachable objects, starting from the GC roots, and records information about them in native memory.
- Sweeping makes sure that the memory addresses occupied by unreachable objects can be reused by later allocations.
The different GC algorithms in the JVM, such as Parallel Scavenge, Parallel Mark+Copy and CMS, implement these phases slightly differently, but conceptually they all follow the two steps described above.
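The two steps can be sketched with a toy heap. This is only an illustration of the idea under simplifying assumptions (objects live in a vector, references are indices, freed slots are merely flagged); it is not how any JVM collector is implemented, and `Obj`, `ToyHeap` and `freed_in_demo` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy heap object: `refs` holds the indices of the objects it references.
struct Obj {
    std::vector<std::size_t> refs;
    bool marked = false;   // set during the mark phase
    bool live = true;      // false once the slot has been swept
};

struct ToyHeap {
    std::vector<Obj> objects;
    std::vector<std::size_t> roots;    // stand-in for the GC roots

    std::size_t allocate() {
        objects.emplace_back();
        return objects.size() - 1;
    }

    // Mark: walk everything reachable from the roots.
    void mark() {
        std::vector<std::size_t> pending(roots);
        while (!pending.empty()) {
            std::size_t i = pending.back();
            pending.pop_back();
            if (objects[i].marked) continue;   // already visited, so cycles terminate
            objects[i].marked = true;
            for (std::size_t r : objects[i].refs) pending.push_back(r);
        }
    }

    // Sweep: release every unmarked slot and clear the marks.
    std::size_t sweep() {
        std::size_t freed = 0;
        for (Obj& o : objects) {
            if (o.live && !o.marked) {
                o.live = false;
                o.refs.clear();
                ++freed;
            }
            o.marked = false;
        }
        return freed;
    }

    std::size_t collect() { mark(); return sweep(); }
};

// Builds root -> a -> b plus an unreachable cycle c <-> d, then collects.
std::size_t freed_in_demo() {
    ToyHeap heap;
    std::size_t a = heap.allocate(), b = heap.allocate();
    std::size_t c = heap.allocate(), d = heap.allocate();
    heap.roots.push_back(a);
    heap.objects[a].refs.push_back(b);
    heap.objects[c].refs.push_back(d);
    heap.objects[d].refs.push_back(c);
    std::size_t freed = heap.collect();
    assert(heap.objects[a].live && heap.objects[b].live);
    return freed;
}
```

In this sketch `freed_in_demo()` returns 2: the cycle `c <-> d` is reclaimed because nothing reachable from a root points into it, regardless of how its members reference each other.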
The most important thing about this approach is that cycles of objects no longer leak:
The downside is that the application's threads need to be paused for the collection to take place, because references cannot be tracked reliably while they keep changing. Such a situation, in which the application is temporarily stopped so that the JVM can do its housekeeping, is called a Stop The World (STW) pause. These pauses can be triggered for many reasons, but garbage collection is by far the most common one.