About garbage collection

Source: Internet
Author: User
"It is hard to believe that Java can be as fast as C++, or even faster."

In my own experience, this statement holds. However, I have also found that many doubts about Java's speed stem from early implementations. Those implementations were not particularly efficient, so there was no model to point to that could explain how Java might be fast.

Part of how I think about speed comes from the C++ model. C++ concentrates on doing everything statically, at compile time, so the runtime image of the program is small and fast. C++ is also built directly on the C model, mainly for backward compatibility, but sometimes simply because something worked a particular way in C and that was therefore the easiest approach in C++. One of the most important cases is the way memory is managed in C and C++. It underlies one of the main arguments for why Java must be slow: in Java, all objects must be created on the heap.

In C++, creating objects on the stack is fast. When execution enters a scope, the stack pointer moves down once to allocate storage for all the stack-based objects created in that scope. When execution leaves the scope (after all the local destructors have been called), the stack pointer moves back up once. Creating heap objects in C++, however, is usually much slower, because it is built on the C notion of a heap: a large pool of memory that must be recycled. When you call delete in C++, the released memory leaves a hole in the heap. When you later call new, the storage allocator must search for a way to fit the new object into one of the existing holes; otherwise the heap would soon be used up. This search for available memory is the main reason heap allocation has such a significant performance cost in C++, and it is why creating stack-based objects is much faster.

Again, because so much of the work in C++ is done at compile time, this design makes sense. In Java, however, some things happen much more dynamically, and that changes the model. When it comes to creating objects, the garbage collector turns out to have a significant effect on the speed of object creation. At first this sounds odd (why should releasing storage affect allocating storage?), but it is how some JVMs work, and it means that allocating storage for heap objects in Java can be almost as fast as creating storage on the stack in C++.

You can think of the C++ heap (and a slow implementation of a Java heap) as a yard in which each object stakes out its own plot. Later that "real estate" is abandoned and must be reused. In some JVMs, however, the Java heap works quite differently. It is more like a conveyor belt that moves forward every time a new object is allocated. This means that object storage can be allocated very quickly. The "heap pointer" is simply moved forward into untouched territory, so allocation is almost identical to stack allocation in C++ (there is a little extra overhead for bookkeeping, but it is far cheaper than searching for storage).
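The "conveyor belt" idea can be illustrated with a minimal sketch. The class `BumpAllocator` and its methods are hypothetical names chosen for this illustration, and the byte array only stands in for a contiguous region of memory; a real JVM does this in native code with far more bookkeeping.

```java
// Hypothetical sketch of conveyor-belt ("bump pointer") allocation.
final class BumpAllocator {
    private final byte[] heap;  // stand-in for a contiguous memory region
    private int top = 0;        // the "heap pointer": offset of the next free byte

    BumpAllocator(int capacityBytes) {
        heap = new byte[capacityBytes];
    }

    /** Returns the offset of the newly allocated block, or -1 if the region is full. */
    int allocate(int sizeBytes) {
        if (sizeBytes < 0 || sizeBytes > heap.length - top) {
            return -1;          // a real collector would trigger a collection here
        }
        int address = top;
        top += sizeBytes;       // allocation is just moving the pointer forward
        return address;
    }

    /** Number of bytes handed out so far. */
    int used() {
        return top;
    }
}
```

Note that there is no search for a free hole: the cost of `allocate` is one comparison and one addition, which is why this scheme is comparable to moving a stack pointer.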

Now, you may object that the heap is not really a conveyor belt. If you treated it that way, you would eventually cause a great deal of paging (which badly hurts performance) and later run out of memory altogether. The trick is the garbage collector. While it collects the garbage, it also compacts all the objects in the heap, which effectively moves the "heap pointer" back toward the beginning of the conveyor belt and further away from a page fault. The garbage collector rearranges things so that the high-speed, seemingly unlimited heap model can be used when allocating storage.

To understand how this works, we first need a better idea of how the different garbage collector (GC) schemes operate. A simple but slow GC technique is reference counting. Each object contains a reference counter. Every time a handle (reference) is attached to the object, the counter is incremented. Every time a handle goes out of scope or is set to null, the counter is decremented. Managing reference counts is therefore a small but constant overhead that occurs throughout the lifetime of the program. The garbage collector moves through the entire list of objects, and when it finds one whose reference count is 0, it releases the storage that object occupies. The drawback is that if objects refer to each other in a cycle, their reference counts can be nonzero even though they are garbage. Locating such self-referential groups requires a lot of extra work from the garbage collector. Reference counting is commonly used to explain one kind of garbage collection, but it does not seem to be used in any JVM implementation.
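The cycle problem is easier to see in a small sketch. Everything here (`RcObject`, `retain`, `release`, `addField`) is a hypothetical illustration written for this article, not how any JVM is implemented.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of reference counting; not taken from any JVM.
final class RcObject {
    final String name;
    int refCount = 0;
    boolean freed = false;
    final List<RcObject> fields = new ArrayList<>();

    RcObject(String name) {
        this.name = name;
    }

    /** A handle now points at this object. */
    void retain() {
        refCount++;
    }

    /** A handle went out of scope or was set to null. */
    void release() {
        if (freed) {
            return;
        }
        refCount--;
        if (refCount == 0) {
            freed = true;                 // storage would be released here
            for (RcObject child : fields) {
                child.release();          // drop the handles this object held
            }
        }
    }

    /** Stores a handle to target inside this object. */
    void addField(RcObject target) {
        fields.add(target);
        target.retain();
    }
}
```

If two objects hold handles to each other and every outside handle is released, each count stays at 1, so neither object is ever freed even though the program can no longer reach them.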

Faster schemes are not based on reference counting. Instead, they rely on the idea that every live object must ultimately be traceable back to a handle that lives either on the stack or in static storage. The chain may pass through several layers of objects. So if you start from the stack and the static storage area and walk through all the handles, you will find all the live objects. For each handle you find, you trace into the object it points to, then follow all the handles in that object into the objects they point to, and so on, until you have traversed the entire web that originates from handles on the stack or in static storage. Every object you pass through must still be alive. Note that detached self-referential groups cause no problem here: they are simply never found, so they are automatically treated as garbage.
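The tracing walk described above can be sketched as a graph traversal from a set of roots. `GraphNode` and `Tracer.reachable` are hypothetical names used only for this illustration; the roots collection stands in for the handles on the stack and in static storage.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical object model: each node holds handles to other nodes.
final class GraphNode {
    final String name;
    final List<GraphNode> refs = new ArrayList<>();

    GraphNode(String name) {
        this.name = name;
    }
}

final class Tracer {
    /** Returns every object reachable from the given roots. */
    static Set<GraphNode> reachable(Collection<GraphNode> roots) {
        // Identity-based set: objects are compared by address, not by equals().
        Set<GraphNode> live = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<GraphNode> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            GraphNode node = pending.pop();
            if (live.add(node)) {           // first visit: follow its handles
                pending.addAll(node.refs);
            }
        }
        return live;
    }
}
```

Anything not in the returned set is garbage, including groups of objects that only refer to each other, because the walk never reaches them.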

In the approach described here, the JVM uses an "adaptive" garbage collection scheme. What it does with the live objects it finds depends on the variant currently in use. One of these variants is "stop and copy". This means that, for reasons that will become apparent shortly, the program is first stopped (this is not a background collection scheme). Then each live object that is found is copied from one heap to another, leaving all the garbage behind. In addition, as objects are copied into the new heap they are packed end to end. This compacts the new heap (and allows new storage to be handed out simply from the end, as described above).

Of course, when an object is moved from one place to another, all the handles (references) pointing to that object must be changed. The handle that led to the object during tracing, whether from the heap or from the static storage area, can be changed immediately. However, other handles pointing to the same object may be encountered later during the walk. These are fixed up as they are found (you can imagine a hash table that maps old addresses to new addresses).
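A minimal sketch of copying with a forwarding table follows. `CopyObj`, `CopyingCollector`, and the list used as the "to" space are assumptions made for this illustration; a real collector copies raw memory and stores forwarding addresses in the old objects rather than in a map. The recursion is kept for brevity and would need an explicit work list for deep object graphs.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical object model for the sketch.
final class CopyObj {
    final String name;
    final List<CopyObj> refs = new ArrayList<>();

    CopyObj(String name) {
        this.name = name;
    }
}

final class CopyingCollector {
    /** Copies everything reachable from roots into toSpace; returns the relocated roots. */
    static List<CopyObj> collect(List<CopyObj> roots, List<CopyObj> toSpace) {
        // Maps each old object to its copy (the "old address to new address" table).
        Map<CopyObj, CopyObj> forwarding = new IdentityHashMap<>();
        List<CopyObj> newRoots = new ArrayList<>();
        for (CopyObj root : roots) {
            newRoots.add(copy(root, forwarding, toSpace));
        }
        return newRoots;
    }

    private static CopyObj copy(CopyObj old, Map<CopyObj, CopyObj> forwarding,
                                List<CopyObj> toSpace) {
        CopyObj moved = forwarding.get(old);
        if (moved != null) {
            return moved;                   // already relocated: just fix the handle
        }
        moved = new CopyObj(old.name);
        forwarding.put(old, moved);         // record before recursing so cycles terminate
        toSpace.add(moved);                 // packed end to end in the new space
        for (CopyObj child : old.refs) {
            moved.refs.add(copy(child, forwarding, toSpace));
        }
        return moved;
    }
}
```

Objects that are never reached are never copied, so the garbage is left behind in the old space without any per-object work.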

Two issues make copy collectors inefficient. The first is that there are two heaps, and all the memory is moved back and forth between these two separate heaps, so twice as much memory must be maintained as is actually needed. To deal with this, some JVMs allocate the heap in chunks as needed and simply copy from one chunk to another.

The second issue is the copying itself. Once a program becomes stable, it may generate little or no garbage. Even so, a copy collector still copies all the memory from one place to another, which is wasteful. To avoid this, some JVMs detect that no new garbage is being generated and switch to a different scheme (this is the "adaptive" part). The other scheme is called "mark and sweep", which Sun's JVM uses all the time. For general use, mark and sweep is fairly slow, but when you know that you are generating little or no garbage, it is fast.

Mark and sweep follows the same logic: start from the stack and static storage, and trace through all the handles to find live objects. Each time a live object is found, however, a flag is set in it to mark it, and nothing is collected yet. The sweep starts only when the marking process has finished. During the sweep, the dead objects are released, but no copying of any kind takes place. So if the collector decides to compact a fragmented heap, it does so by shuffling objects around.

The name "stop and copy" reflects the fact that this kind of garbage collection is not performed in the background; instead, the program is stopped while the collection occurs. Sun's documentation describes garbage collection in many places as a low-priority background process, but that turned out to be a theoretical experiment that did not work out. In practice, Sun's garbage collector runs when memory gets low. In addition, mark and sweep also requires the program to be stopped.
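The two phases of mark and sweep can be sketched as follows. `HeapCell` and `MarkSweep.collect` are hypothetical names for this illustration, and the `heap` list stands in for the collector's list of all allocated objects. The method is assumed to run while the program is stopped.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Hypothetical object model: each cell carries its own mark flag.
final class HeapCell {
    final String name;
    final List<HeapCell> refs = new ArrayList<>();
    boolean marked = false;

    HeapCell(String name) {
        this.name = name;
    }
}

final class MarkSweep {
    /** Removes unreachable cells from heap and returns how many were freed. */
    static int collect(List<HeapCell> heap, List<HeapCell> roots) {
        // Mark phase: flag every cell reachable from the roots.
        Deque<HeapCell> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            HeapCell cell = pending.pop();
            if (!cell.marked) {
                cell.marked = true;
                pending.addAll(cell.refs);
            }
        }
        // Sweep phase: release unmarked cells; nothing is copied or moved.
        int freed = 0;
        Iterator<HeapCell> it = heap.iterator();
        while (it.hasNext()) {
            HeapCell cell = it.next();
            if (cell.marked) {
                cell.marked = false;        // reset the flag for the next cycle
            } else {
                it.remove();
                freed++;
            }
        }
        return freed;
    }
}
```

When almost everything is live, the sweep frees very little and no object has to move, which is the case in which this scheme is cheaper than copying.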

As mentioned earlier, in the JVM described here memory is allocated in large blocks. If a large object is allocated, it gets its own block. Strict stop and copy requires every live object to be copied from the source heap to a new heap before the old heap can be released, which takes a lot of memory. With blocks, the garbage collector can usually use dead blocks as the destination for the objects it copies while it collects. Each block has a generation count that tracks whether it is alive. Normally, only the blocks created since the previous garbage collection are compacted. All other blocks have their generation count incremented if they have been referenced from somewhere. This handles the common case of many short-lived temporary objects. Periodically, a full sweep is performed: large objects are still not copied (only their generation count is incremented), while blocks containing small objects are copied and compacted. The JVM monitors the efficiency of the garbage collector. If all objects are long-lived and collection becomes a waste of time, it switches to mark and sweep. Similarly, the JVM tracks how successful mark and sweep is, and if the heap becomes increasingly fragmented, it switches back to stop and copy. This is where the "adaptive" part comes from, and it can be summarized as: "switch automatically between stop and copy and mark and sweep as needed".
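The switching behavior can be sketched as a small policy object. The class `AdaptivePolicy`, the two measurements passed to `update`, and both threshold values are assumptions invented for this illustration; the text does not say what the JVM actually measures or where it draws the line.

```java
// Hypothetical sketch of the adaptive switch; thresholds are illustrative assumptions.
final class AdaptivePolicy {
    enum Mode { STOP_AND_COPY, MARK_AND_SWEEP }

    private static final double HIGH_SURVIVAL = 0.9;       // assumed threshold
    private static final double HIGH_FRAGMENTATION = 0.5;  // assumed threshold

    private Mode mode = Mode.STOP_AND_COPY;

    /**
     * Called after each collection.
     * survivalRate: fraction of objects that were still live (0.0 to 1.0).
     * fragmentation: fraction of free space that is split into small holes (0.0 to 1.0).
     */
    Mode update(double survivalRate, double fragmentation) {
        if (mode == Mode.STOP_AND_COPY && survivalRate >= HIGH_SURVIVAL) {
            mode = Mode.MARK_AND_SWEEP;   // copying long-lived objects is wasted work
        } else if (mode == Mode.MARK_AND_SWEEP && fragmentation >= HIGH_FRAGMENTATION) {
            mode = Mode.STOP_AND_COPY;    // copying compacts a fragmented heap
        }
        return mode;
    }
}
```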

A JVM can also use a number of other speedups. An especially important one involves the loader and the just-in-time (JIT) compiler. When a class must be loaded (usually the first time you want to create an object of that class), the .class file is located and the bytecode for that class is brought into memory. At this point, one approach is simply to JIT compile all the code, but this has two drawbacks. First, it takes a little more time, which can add up over the lifetime of the program. Second, it increases the size of the executable (bytecode is much more compact than expanded JIT code), which may cause paging and noticeably slow down the program. The alternative is lazy evaluation: code is not JIT compiled until necessary. In this way, code that is never executed may never be JIT compiled.
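The lazy approach amounts to "compile on first call". The sketch below only simulates it: `LazyJit`, `load`, and `invoke` are hypothetical names, the lambdas stand in for loaded bytecode, and "compilation" is reduced to caching and counting, so it shows the bookkeeping rather than any real code generation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntUnaryOperator;

// Hypothetical simulation of lazy JIT compilation.
final class LazyJit {
    private final Map<String, IntUnaryOperator> bytecode = new HashMap<>();  // loaded, not compiled
    private final Map<String, IntUnaryOperator> compiled = new HashMap<>();  // compiled on demand
    int compileCount = 0;

    /** Loading a method only stores its "bytecode"; nothing is compiled yet. */
    void load(String method, IntUnaryOperator body) {
        bytecode.put(method, body);
    }

    /** Compiles the method the first time it is called, then reuses the result. */
    int invoke(String method, int arg) {
        IntUnaryOperator code = compiled.get(method);
        if (code == null) {
            IntUnaryOperator source = bytecode.get(method);
            if (source == null) {
                throw new IllegalArgumentException("unknown method: " + method);
            }
            code = source;                 // simulated compilation step
            compiled.put(method, code);
            compileCount++;
        }
        return code.applyAsInt(arg);
    }
}
```

A method that is loaded but never invoked never reaches the compilation step, so it costs neither compile time nor extra memory for compiled code.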

Because the JVM is external to the browser, you might expect to benefit from the speed improvements of a particular JVM while using any browser. Unfortunately, JVMs do not currently interoperate with different browsers. To get the benefits of a particular JVM, you must either use a browser that has that JVM built in or run standalone Java applications.
