Visualizing Garbage Collection Algorithms

Source: Internet
Author: User

Links: blog.jobbole.com/77280/

Github.com/kenfox/gc-viz

Most developers take automatic garbage collection for granted. It is just another useful feature that the language runtime provides to make our work easier.

But if you try to look inside a modern garbage collector, it is hard to see how it actually works. Unless you already understand what it is trying to do and how it can go wrong, the thousands of implementation details will overwhelm you.

I built a tool that implements five different garbage collection algorithms and produces animations of their run-time behavior. You can get the animations and the code from Github.com/kenfox/gc-viz. It surprised me how much a simple animation reveals about these important algorithms.

Cleanup after task completion: aka No GC

The simplest way to clean up garbage is to wait until a task is done and dispose of everything at once. This technique is surprisingly useful, especially if a task can be broken down into many small tasks. The Apache web server, for example, creates a small memory pool for each request and frees the entire pool when the request completes.
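The per-request pool idea can be sketched in a few lines. This is only an illustration, not Apache's actual pool API: the `Arena` class and `handle_request` function are hypothetical names invented here, and Python's own memory management is ignored for the sake of the model.

```python
class Arena:
    """A per-task pool: objects are only ever added, then dropped together."""

    def __init__(self):
        self._objects = []

    def alloc(self, value):
        # Record the object in the pool; nothing is freed individually.
        self._objects.append(value)
        return value

    def live_count(self):
        return len(self._objects)

    def release(self):
        # Dispose of everything at once and report how many objects went away.
        freed = len(self._objects)
        self._objects.clear()
        return freed


def handle_request(payload):
    """Hypothetical request handler: allocate freely, clean up once at the end."""
    arena = Arena()
    words = arena.alloc(payload.split())
    lengths = arena.alloc([len(w) for w in words])
    result = sum(lengths)
    freed = arena.release()
    return result, freed
```

Calling `handle_request("a bb ccc")` returns `(6, 2)`: the result, plus the two temporary objects that were thrown away together when the request finished.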

The animation shows a running program. The whole image represents the program's memory. Memory starts out black, which indicates that it is unused. Areas flashing bright green and yellow are being read or written. The colors fade over time, so you can see how memory has been used as well as the current activity. If you look closely, you will notice areas of memory that the program starts to ignore as it runs. These areas have become garbage: the program can no longer reach or use them. Memory that is not garbage is live.

The program fits easily in memory, so there is no need to clean up garbage while it runs. I will use this same simple program in all of the following examples.

Reference Counting Collector

Another simple solution is to keep a count of how many times each resource (here, an object in memory) is in use, and to dispose of it when the count drops to zero. This is the technique developers most often use when adding garbage collection to an existing system, because it is the only garbage collector that integrates easily with other resource managers and existing code bases. Apple learned this lesson after releasing a mark-sweep garbage collector for Objective-C. It caused so many problems that they had to withdraw the feature and replace it with automatic reference counting, which works much better.

The animation shows the same program, but this time it disposes of garbage by keeping a reference count on each object in memory. Flashing red indicates reference counting activity. The advantage of reference counting is that garbage is detected quickly: you can see a red flash followed immediately by the area turning black.

Unfortunately, reference counting has many problems. Worst of all, it cannot handle cyclic structures. These are very common: anything with a parent pointer or a back reference creates a cycle, which will leak memory. Reference counting is also expensive: in the animation you can see that the red flashes continue even when memory usage is not growing. CPUs are fast but memory is slow, and the counters are constantly being loaded from and stored to memory. Updating all of these counters also makes it hard to keep data read-only or thread-safe.
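A toy model makes the cycle problem concrete. The sketch below is not taken from gc-viz; the `Heap` class and the object names are hypothetical, and the "heap" is just a pair of dictionaries. An acyclic chain is freed as soon as its last reference goes away, while a parent/child cycle is never freed.

```python
class Heap:
    """Toy heap: every object has a reference count and outgoing references."""

    def __init__(self):
        self.counts = {}  # object name -> reference count
        self.refs = {}    # object name -> names of the objects it points to

    def new(self, name):
        # A fresh object starts with one reference: the variable holding it.
        self.counts[name] = 1
        self.refs[name] = []

    def add_ref(self, src, dst):
        self.refs[src].append(dst)
        self.counts[dst] += 1

    def release(self, name):
        # Drop one reference; free the object when the count reaches zero.
        self.counts[name] -= 1
        if self.counts[name] == 0:
            children = self.refs.pop(name)
            del self.counts[name]
            for child in children:
                self.release(child)  # freeing cascades to referenced objects

    def live(self):
        return sorted(self.counts)


def demo():
    chain = Heap()
    chain.new("a")
    chain.new("b")
    chain.add_ref("a", "b")
    chain.release("b")  # the variable for b goes away; a still points to b
    chain.release("a")  # a is freed, which in turn frees b

    cycle = Heap()
    cycle.new("parent")
    cycle.new("child")
    cycle.add_ref("parent", "child")
    cycle.add_ref("child", "parent")  # back reference closes the cycle
    cycle.release("parent")
    cycle.release("child")  # both counts are stuck at 1: a leak

    return chain.live(), cycle.live()
```

`demo()` returns `([], ['child', 'parent'])`: the chain is fully reclaimed, but the two objects in the cycle keep each other alive even though the program can no longer reach them.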

Reference counting is an amortized algorithm (its overhead is spread across the whole run of the program), but it is only accidentally amortized and cannot guarantee response times. For example, suppose a program works with a large tree structure. The last piece of code that uses the tree triggers the disposal of the entire tree, and Murphy's law says this will happen at the worst possible moment. None of the other algorithms here are amortized, so whether this accidental amortization helps depends on your data. (All of these algorithms have concurrent or partially concurrent variants, but those are beyond what my tool can demonstrate.)
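The tree example can be sketched as follows, under the assumption that each node is referenced only by its parent. `Node`, `build`, and `release` are hypothetical names for this illustration. The point is that one ordinary-looking release can do work proportional to the size of the whole tree.

```python
class Node:
    def __init__(self, children=()):
        self.count = 1  # one reference: the parent, or the variable for the root
        self.children = list(children)


def build(depth, fanout):
    """Build a complete tree with the given depth and fanout."""
    if depth == 0:
        return Node()
    return Node(build(depth - 1, fanout) for _ in range(fanout))


def release(node):
    """Drop one reference and return how many objects were freed as a result."""
    node.count -= 1
    if node.count > 0:
        return 0
    freed = 1
    for child in node.children:
        freed += release(child)  # each child loses its only reference too
    return freed
```

Releasing the root of `build(10, 2)` frees 2047 nodes in a single call, whereas releasing a node that something else still references frees nothing. The pause therefore depends on the shape of the data, not on the line of code that drops the reference.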

Mark-Sweep Collector

Mark-sweep eliminates some of the problems of reference counting. It handles cyclic structures easily, and its overhead is lower because it does not need to maintain counts.

What the algorithm gives up is immediate detection of garbage. In the animation, the program runs for a while without any red flashes, then a sudden burst of red flashes shows the live objects being marked. After marking is complete, the collector sweeps through all of memory and disposes of the garbage. You will also notice that many areas turn black at once, rather than gradually over time as with reference counting.

Mark-sweep requires a more consistent implementation than reference counting and is difficult to retrofit into an existing system. The mark phase needs to traverse all live data, even data encapsulated inside objects. If an object does not support traversal, retrofitting mark-sweep into the code is probably too risky. Another weakness of mark-sweep is that the sweep phase must walk all of memory to find garbage. That is not a problem for a system that produces little garbage, but the functional programming style that is now common produces a lot of it.
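The two phases described above (marking everything reachable, then walking all of memory to free the rest) can be sketched like this. It is a minimal model, not the gc-viz implementation: the heap is assumed to be a dictionary mapping each object name to the names it references, and `mark`, `sweep`, and `collect` are hypothetical function names.

```python
def mark(heap, roots):
    """Mark phase: return the set of objects reachable from the roots."""
    marked = set()
    stack = list(roots)
    while stack:
        name = stack.pop()
        if name in marked:
            continue  # already visited; this is what makes cycles harmless
        marked.add(name)
        stack.extend(heap[name])
    return marked


def sweep(heap, marked):
    """Sweep phase: visit every object in the heap and free the unmarked ones."""
    garbage = [name for name in heap if name not in marked]
    for name in garbage:
        del heap[name]
    return sorted(garbage)


def collect(heap, roots):
    return sweep(heap, mark(heap, roots))
```

For example, with `heap = {"a": ["b"], "b": [], "x": ["y"], "y": ["x"]}` and `roots = ["a"]`, `collect` returns `['x', 'y']`. The unreachable cycle is reclaimed without any counters, but note that `sweep` has to look at every object in the heap, live or not.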

Mark-Compact Collector

In the previous animations you may have noticed that objects never move. Once an object is allocated, its location never changes, even when it ends up as an island surrounded by fragments of black, unused memory. The next two algorithms change this, each in a completely different way.

Mark-compact does not dispose of memory just by marking it as free. Instead, it moves live objects down into the free space. Objects always keep their memory order (an object allocated before another always sits at a lower address), but the gaps left by disposed objects are closed up as the objects move down.

Moving objects means that new objects can always be created at the end of used memory. This is called a "bump" allocator. It is as cheap as stack allocation, but without the limits on stack size. Some systems that use bump allocators do not even store data on a call stack: they allocate call frames on the heap and treat them like any other object.
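Allocation at the end of used memory amounts to advancing a single pointer. The sketch below assumes a fixed-size heap measured in bytes; `BumpAllocator` is a hypothetical name, and a real system would trigger a collection instead of raising an error when the heap fills up.

```python
class BumpAllocator:
    """Allocate by advancing one pointer; no free lists, no searching."""

    def __init__(self, size):
        self.size = size
        self.top = 0  # everything below `top` is in use

    def alloc(self, nbytes):
        if self.top + nbytes > self.size:
            # Assumption for this sketch: just fail instead of collecting.
            raise MemoryError("heap full")
        addr = self.top
        self.top += nbytes  # "bump" the pointer past the new object
        return addr
```

With a 16-byte heap, `alloc(4)` returns address 0 and `alloc(8)` returns address 4. The cost of each allocation is a bounds check and an addition, which is why it compares to stack allocation.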

Another advantage, sometimes more theoretical than practical, is that once objects are compacted the program's memory access patterns are friendlier to hardware caches. It is far from certain that you will see this benefit, though. The memory allocators used with reference counting and mark-sweep are complex, but they are also well debugged and very efficient.

Mark-compact is a complex algorithm that requires several passes over all allocated objects. In the animation, you can see the red flashes of live objects being marked, followed by many reads and writes as destinations are computed, objects are moved, and finally references are fixed up to point to the moved objects. The main benefit of all this complexity is very low memory overhead. Oracle's HotSpot JVM uses several different garbage collection algorithms, and its tenured object space uses mark-compact.
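The passes listed above (mark, compute destinations, fix references, move) can be sketched on a toy heap. The representation is an assumption made for this example: the heap is a list of slots, each slot is either `None` (free) or a list of the slot indices it references, and every object occupies exactly one slot. `mark_compact` is a hypothetical name.

```python
def mark_compact(heap, roots):
    """Compact `heap` in place; return (new_roots, first_free_slot)."""
    # Pass 1: mark everything reachable from the roots.
    marked = set()
    stack = list(roots)
    while stack:
        i = stack.pop()
        if i not in marked:
            marked.add(i)
            stack.extend(heap[i])

    # Pass 2: compute each live object's destination, keeping address order.
    forward = {}
    next_free = 0
    for i in range(len(heap)):
        if i in marked:
            forward[i] = next_free
            next_free += 1

    # Pass 3: rewrite references so they point at the destinations.
    for old in forward:
        heap[old] = [forward[ref] for ref in heap[old]]

    # Pass 4: slide objects down. Destinations never exceed the old address,
    # so moving in ascending order cannot overwrite an unmoved live object.
    for old in sorted(forward):
        new = forward[old]
        if new != old:
            heap[new] = heap[old]
            heap[old] = None

    # Everything from next_free upward is now free space.
    for i in range(next_free, len(heap)):
        heap[i] = None

    return [forward[r] for r in roots], next_free
```

Starting from `heap = [[2], [3], [4], [1], []]` with `roots = [0]`, the heap becomes `[[1], [2], [], None, None]` and the function returns `([0], 3)`. The live chain has been packed into the first three slots, and slot 3 is where a bump allocator would place the next object. The only extra storage is the forwarding table, which a real collector typically keeps in the object headers.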

Copying Collector

The last algorithm I animated is the foundation of most high-performance garbage collection systems. It is a moving collector like mark-compact, but it is very simple to implement. It uses two memory spaces and copies live objects back and forth between them. In practice, more than two spaces are used, one for each generation of objects: new objects are created in one space, copied to another space if they survive, and finally copied to a permanent space if they are very long-lived. If you hear a garbage collector described as generational or ephemeral, it is usually a multi-space copying collector.

Apart from simplicity and flexibility, the main advantage of this algorithm is that it only spends time on live objects. There is no separate mark phase that must be followed by a sweep or a compaction. Objects are copied immediately during the traversal of live objects, and references to them are patched up by following a forwarding pointer left where each object used to be.
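A two-space copy can be sketched in the style of Cheney's algorithm, where the to-space itself serves as the work queue. This is an illustration under assumptions, not the gc-viz code: each object is modeled as a list of from-space indices, forwarding information is kept in a separate dictionary rather than written into the old object, and `copy_collect` is a hypothetical name.

```python
def copy_collect(from_space, roots):
    """Copy live objects into a new space; return (to_space, new_roots)."""
    to_space = []
    forwarding = {}  # old index -> new index

    def copy(old):
        # Copy an object the first time it is seen; afterwards just forward.
        if old not in forwarding:
            forwarding[old] = len(to_space)
            to_space.append(list(from_space[old]))  # still holds old indices
        return forwarding[old]

    new_roots = [copy(r) for r in roots]

    # Scan the copied objects in order, copying whatever they reference and
    # patching each reference to the new location. Garbage is never visited.
    scan = 0
    while scan < len(to_space):
        to_space[scan] = [copy(ref) for ref in to_space[scan]]
        scan += 1

    return to_space, new_roots
```

With `from_space = [[2], [3], [4, 0], [1], []]` and `roots = [0]`, the result is `([[1], [2, 0], []], [0])`. Objects 1 and 3 are garbage and are never touched, so the work done is proportional to the live data only. The old space can then be reused wholesale for the next collection.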

In the animation, you can see several collections in which almost all of the data is copied from one space to the other. This is a bad situation for the algorithm, and it is one reason people talk about tuning garbage collectors. If you can size your memory and tune your allocations so that most objects are dead before a collection begins, you get both a safe functional programming style and high performance.

(Translator's note: my ability is limited; corrections are welcome.)
