Garbage collector using the mark-sweep algorithm

Source: Internet
Author: User

In all the time I wrote C++ code, I never wrote a garbage collector or implemented my own memory allocator, although I read plenty of articles on the subject. When I write C# code I do not need to release objects myself. I know a garbage collector reclaims the objects on the heap for me, but I never dug into how it is actually implemented. Recently I remembered an article I had once read about a small garbage collector implementation. It was a translation, and I later tracked down the source code, whose author works at Google. Today I decided to analyze this implementation, because studying the work of people more capable than yourself always pays off.

The source is on GitHub, and the repository also leads to the author's own article explaining the implementation: https://github.com/munificent/mark-sweep

For simplicity, we call each block of memory used by a variable an object. Objects on the stack are all disposed of when a function call completes, as the platform's calling convention (ABI) dictates, so stack memory needs no management and the garbage collector manages heap memory only. How does the garbage collector know which objects can be reclaimed? Objects on the heap that cannot be reached, directly or indirectly, from the stack can be reclaimed.

How does it decide that a heap object cannot be reached from the stack? This used to confuse me. Suppose I write this line in a function: "A* p = new A();". When the A object is created, the value of p is recorded somewhere. When the function exits, p is removed from that record, but the memory object it pointed to still exists. When the garbage collector runs, it first marks every memory object reachable from the stack as in use, so that those objects are not freed by mistake in the cleanup phase. Because p has been removed from the record, the object it pointed to cannot be reached from the stack during the mark phase, so it is not marked as in use, and it is freed in the cleanup phase.

That raises another question: if the object p pointed to cannot be reached from the stack, how can it be freed? There is a second path to it. The garbage collector holds a root object, the head of a list through which every allocated memory object can be reached, including objects that no stack variable points to any longer. The garbage collector relies on this path to release memory.

In the author's implementation, two structures matter most. The first is the virtual machine structure, declared as follows:

typedef struct {
  Object* stack[STACK_MAX];
  int stackSize;

  /* The first object in the linked list of all objects on the heap. */
  Object* firstObject;

  /* The total number of currently allocated objects. */
  int numObjects;

  /* The number of objects required to trigger a GC. */
  int maxObjects;
} VM;

The objects reachable from the stack are those that can be reached, directly or indirectly, from the stack array in the VM structure. Each object is of type Object:

typedef struct sObject {
  ObjectType type;
  unsigned char marked;

  /* The next object in the linked list of heap allocated objects. */
  struct sObject* next;

  union {
    /* OBJ_INT */
    int value;

    /* OBJ_PAIR */
    struct {
      struct sObject* head;
      struct sObject* tail;
    };
  };
} Object;

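The next member of the Object structure points to the next object in the list of all allocated objects, which is what lets the collector visit every object. Indirect reachability from the stack comes from the head and tail members of a pair object, each of which can point to another object.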
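As a minimal sketch of how indirect reachability might be traced, assuming the Object layout above: the recursive mark function below follows head and tail, never next, and its marked check stops the recursion when a pair refers back to itself. The helper demoMark and the objects it builds on the C stack are hypothetical, added only for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { OBJ_INT, OBJ_PAIR } ObjectType;

typedef struct sObject {
  ObjectType type;
  unsigned char marked;
  struct sObject* next;  /* all-objects list; not followed while marking */
  union {
    int value;           /* OBJ_INT */
    struct {             /* OBJ_PAIR */
      struct sObject* head;
      struct sObject* tail;
    };
  };
} Object;

/* Mark `object` and everything reachable from it through pair fields. */
void mark(Object* object) {
  /* Already visited: stop here, otherwise a cycle would recurse forever. */
  if (object->marked) return;
  object->marked = 1;

  if (object->type == OBJ_PAIR) {
    mark(object->head);
    mark(object->tail);
  }
}

/* Hypothetical demo: only `outer` is treated as a root. Returns how many
   of the four objects reachable from it end up marked. */
int demoMark(void) {
  Object a = {0}, b = {0}, inner = {0}, outer = {0}, orphan = {0};

  a.type = OBJ_INT;      a.value = 1;
  b.type = OBJ_INT;      b.value = 2;
  orphan.type = OBJ_INT; orphan.value = 3;

  inner.type = OBJ_PAIR; inner.head = &a;     inner.tail = &b;
  outer.type = OBJ_PAIR; outer.head = &inner; outer.tail = &outer; /* cycle */

  mark(&outer);

  assert(orphan.marked == 0);  /* nothing points to it, so it stays unmarked */
  return a.marked + b.marked + inner.marked + outer.marked;
}
```

Here a and b are never named as roots, yet they are marked because outer leads to inner, which leads to them.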

The stack array in the VM structure determines which objects are still referenced. Take the p mentioned above: the object can be reached from the stack right after it is assigned to p, but once p leaves scope and no other variable references the memory object p pointed to, that object can no longer be reached from the stack. Until the garbage collector frees it, however, the object can always be found by walking the list step by step from firstObject in the VM structure.
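The bookkeeping described above can be sketched as follows. This is a minimal illustration under assumptions, not necessarily the author's exact code: initVM, pushInt, and countHeapObjects are names chosen here, and the collector itself is left out. The point it shows is that popping a value removes it from the stack only; the object stays on the list headed by firstObject.

```c
#include <assert.h>
#include <stdlib.h>

#define STACK_MAX 256

typedef enum { OBJ_INT, OBJ_PAIR } ObjectType;

typedef struct sObject {
  ObjectType type;
  unsigned char marked;
  struct sObject* next;
  union {
    int value;
    struct { struct sObject* head; struct sObject* tail; };
  };
} Object;

typedef struct {
  Object* stack[STACK_MAX];
  int stackSize;
  Object* firstObject;
  int numObjects;
  int maxObjects;
} VM;

/* Assumed helper: start with an empty stack and an empty object list. */
void initVM(VM* vm) {
  vm->stackSize = 0;
  vm->firstObject = NULL;
  vm->numObjects = 0;
  vm->maxObjects = 8;  /* arbitrary threshold; unused in this sketch */
}

void push(VM* vm, Object* value) {
  assert(vm->stackSize < STACK_MAX);  /* stack overflow */
  vm->stack[vm->stackSize++] = value;
}

Object* pop(VM* vm) {
  assert(vm->stackSize > 0);          /* stack underflow */
  return vm->stack[--vm->stackSize];
}

/* Allocate an object and thread it onto the front of the all-objects list. */
Object* newObject(VM* vm, ObjectType type) {
  Object* object = malloc(sizeof(Object));
  assert(object != NULL);
  object->type = type;
  object->marked = 0;
  object->next = vm->firstObject;
  vm->firstObject = object;
  vm->numObjects++;
  return object;
}

void pushInt(VM* vm, int value) {
  Object* object = newObject(vm, OBJ_INT);
  object->value = value;
  push(vm, object);
}

/* Count allocated objects by walking the list from firstObject. */
int countHeapObjects(const VM* vm) {
  int count = 0;
  for (const Object* o = vm->firstObject; o != NULL; o = o->next) count++;
  return count;
}
```

After two pushInt calls and one pop, stackSize is 1 while countHeapObjects still reports 2: the popped object is unreachable from the stack but remains on the list until a sweep frees it.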

As the name implies, the mark-sweep algorithm has two phases: mark and sweep (cleanup). This explains why the implementation calls markAll(vm) and then sweep(vm) when the function gc is called.

void gc(VM* vm) {
  int numObjects = vm->numObjects;

  markAll(vm);
  sweep(vm);

  vm->maxObjects = vm->numObjects * 2;

  printf("\ncollected %d objects, %d remaining.\n", numObjects - vm->numObjects,
         vm->numObjects);
}


Each Object has a mark flag that is set to 0 when the object is initialized, meaning it may be reclaimed (which seems odd at first: could a new object be freed by mistake?). At first this felt wrong to me: a newly created object is in use, so surely the flag should be 1. The actual idea is this: initially assume every object can be reclaimed. Then, before reclaiming anything, traverse the stack array and set the flag to 1 on every object reachable from it. That is the mark phase. Cleanup follows, and it traverses not the stack but the list starting at firstObject. An object that nothing on the stack references, directly or indirectly, is still found by walking from firstObject in the cleanup phase. Its flag is 0, so it is freed. An object whose flag is 1 has the flag reset to 0, which returns it to the initial, reclaimable state to await the next GC call.
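The whole cycle can be sketched in one piece. This is a simplified illustration under assumptions rather than the author's code verbatim: initVM and pushInt are helper names chosen here, the maxObjects threshold is omitted, and gc returns the number of freed objects instead of printing it.

```c
#include <assert.h>
#include <stdlib.h>

#define STACK_MAX 256

typedef enum { OBJ_INT, OBJ_PAIR } ObjectType;

typedef struct sObject {
  ObjectType type;
  unsigned char marked;
  struct sObject* next;
  union {
    int value;
    struct { struct sObject* head; struct sObject* tail; };
  };
} Object;

typedef struct {
  Object* stack[STACK_MAX];
  int stackSize;
  Object* firstObject;
  int numObjects;
} VM;

void initVM(VM* vm) {
  vm->stackSize = 0;
  vm->firstObject = NULL;
  vm->numObjects = 0;
}

/* Allocate an int object, link it into the all-objects list, push it. */
void pushInt(VM* vm, int value) {
  Object* object = malloc(sizeof(Object));
  assert(object != NULL);
  assert(vm->stackSize < STACK_MAX);
  object->type = OBJ_INT;
  object->marked = 0;               /* every object starts unmarked */
  object->value = value;
  object->next = vm->firstObject;
  vm->firstObject = object;
  vm->numObjects++;
  vm->stack[vm->stackSize++] = object;
}

void mark(Object* object) {
  if (object->marked) return;       /* also guards against cycles */
  object->marked = 1;
  if (object->type == OBJ_PAIR) {
    mark(object->head);
    mark(object->tail);
  }
}

/* Mark phase: flag everything reachable from the VM stack. */
void markAll(VM* vm) {
  for (int i = 0; i < vm->stackSize; i++) mark(vm->stack[i]);
}

/* Sweep phase: walk the list from firstObject, free unmarked objects,
   and reset the flag on the survivors. */
void sweep(VM* vm) {
  Object** object = &vm->firstObject;
  while (*object != NULL) {
    if (!(*object)->marked) {
      Object* unreached = *object;
      *object = unreached->next;    /* unlink before freeing */
      free(unreached);
      vm->numObjects--;
    } else {
      (*object)->marked = 0;        /* back to the initial state */
      object = &(*object)->next;
    }
  }
}

/* Run both phases and report how many objects were freed. */
int gc(VM* vm) {
  int before = vm->numObjects;
  markAll(vm);
  sweep(vm);
  return before - vm->numObjects;
}
```

Using a pointer to a pointer in sweep lets the loop unlink the head of the list and interior nodes with the same code path.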

