.NET garbage collection

Contents

Introduction
About the garbage collector
Garbage collection algorithm
    Application roots
Implementation
    Phase I: Mark
    Phase II: Compact
Finalization
Optimizing garbage collection performance
    Weak references (WeakReference)
    Generations
Myths about garbage collection

Introduction

Microsoft claims that .NET is a revolutionary programming technology, and many of its features make it a top choice for developers. This article discusses one of the main advantages of the .NET Framework: easy memory and resource management.

About the garbage collector

Every program uses resources of some kind: memory buffers, network connections, database resources, and so on. In an object-oriented environment, every type identifies some resource available to the program. To use any of these resources, memory must be allocated to represent the type.

Accessing a resource follows these steps:

1. Allocate memory for the type that represents the resource.

2. Initialize the memory to set the resource's initial state and make the resource usable.

3. Use the resource by accessing members of the type instance (repeat as needed).

4. Tear down the state of the resource to clean it up.

5. Free the memory.

The garbage collector (GC) in .NET frees developers from tracking memory usage and deciding when to release memory.

The Microsoft .NET CLR (Common Language Runtime) requires that all resources be allocated from the managed heap. You never release objects from the managed heap yourself; objects are freed automatically once the application no longer needs them.

Memory is not infinite, so the garbage collector must perform collections to free memory. The garbage collector's optimizing engine chooses the best time to collect based on the allocations that have been made (the exact criteria are defined by Microsoft). When the garbage collector performs a collection, it first finds the objects in the managed heap that the application no longer uses and then performs the operations needed to reclaim their memory.

For automatic memory management to work, however, the GC must know where the roots are; in other words, it must be able to tell when an object is no longer used by the application. In .NET, the GC learns this through metadata: every data type used in .NET is described by metadata. With the help of metadata, the CLR knows the layout of each object in memory, which helps the GC during a collection. Without this information, the GC would not know where one object ends and the next begins.

Garbage collection algorithm

Application roots

Each application has a set of roots. A root identifies a storage location that refers either to an object on the managed heap or to null.

For example:

- All global and static object pointers in an application.

- All local variable and parameter object pointers on a thread's stack.

- All CPU registers that contain pointers to objects in the managed heap.

- Pointers to the objects in the freachable queue.

The list of active roots is maintained by the just-in-time (JIT) compiler and the CLR, and is made accessible to the garbage collector's algorithm.

Implementation

.NET garbage collection is implemented as a tracing collector; more precisely, the CLR implements a mark/compact collector.

This method consists of the following two phases:

Phase I: Mark

Find the memory that can be reclaimed.

When the GC starts running, it assumes that all objects in the heap are garbage. In other words, it assumes that none of the application's roots refer to any objects in the heap.

Phase I consists of the following steps:

1. The GC identifies the references to live objects, that is, the application roots.

2. It walks the roots and builds a graph of all objects reachable from them.

3. If the GC is about to add an object that is already in the graph, it stops walking down that path. This serves two purposes. First, it improves performance significantly, because no set of objects is walked more than once. Second, it prevents infinite loops when objects form a circular linked list.

Once all the roots have been checked, the garbage collector's graph contains every object that is reachable from the application's roots. Any object that is not in the graph cannot be accessed by the application and is therefore considered garbage.
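The mark steps above can be sketched as a small graph walk. This is a toy model in Python, not the CLR's code: object names, the `references` dictionary, and the `mark` function are all hypothetical stand-ins for real heap objects and pointers.

```python
def mark(roots, references):
    """Return the set of objects reachable from the roots.

    roots:      iterable of object names; None models a null root.
    references: dict mapping each object name to the names it points to.
    """
    reachable = set()
    pending = [root for root in roots if root is not None]
    while pending:
        obj = pending.pop()
        if obj in reachable:
            # Already in the graph: stop walking this path (step 3).
            continue
        reachable.add(obj)
        pending.extend(references.get(obj, ()))
    return reachable


# Hypothetical heap: A and B point at each other (a cycle); nothing reaches D.
heap = {"A": ["B"], "B": ["A", "C"], "C": [], "D": ["C"]}
live = mark(roots=["A", None], references=heap)
garbage = set(heap) - live

print(sorted(live))     # ['A', 'B', 'C']
print(sorted(garbage))  # ['D']
```

The `if obj in reachable` check is what keeps the A/B cycle from looping forever and keeps each object from being visited twice.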

Phase II: Compact

Move all surviving objects to the bottom of the heap, leaving the free space at the top of the heap.

Phase II consists of the following steps:

1. The GC walks the heap linearly, looking for contiguous blocks of garbage objects (now considered free space).

2. The GC then shifts the non-garbage objects down in memory, removing all the gaps in the heap.

3. Moving objects in memory invalidates pointers to them, so the GC must update the application's roots so that they point to the objects' new locations.

4. In addition, if an object contains a pointer to another object, the GC is responsible for correcting that pointer as well.

Once all the garbage has been identified, all the non-garbage objects have been compacted, and all the pointers to non-garbage objects have been corrected, the next-allocation pointer is positioned just after the last non-garbage object, which is where the next new object will be added.
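A minimal sketch of the compaction steps, assuming a heap modeled as a list of records in address order. The record layout (`addr`, `size`, `refs`), the `compact` function, and the sample addresses are illustrative assumptions, not how the CLR actually lays out memory.

```python
def compact(heap, roots, live):
    """Slide live objects to the bottom of the heap and fix up pointers.

    heap:  list of dicts in address order: {"addr", "size", "refs"},
           where "refs" holds the addresses of the objects pointed to.
    roots: dict mapping a root name to an address (or None).
    live:  set of addresses that the mark phase found reachable.
    Returns (new_heap, new_roots, next_free_address).
    """
    # Walk the heap linearly and assign each survivor its new address.
    forwarding = {}
    next_free = 0
    for obj in heap:
        if obj["addr"] in live:
            forwarding[obj["addr"]] = next_free
            next_free += obj["size"]

    # Move the survivors and correct the pointers stored inside them.
    new_heap = [
        {
            "addr": forwarding[obj["addr"]],
            "size": obj["size"],
            "refs": [forwarding[ref] for ref in obj["refs"]],
        }
        for obj in heap
        if obj["addr"] in live
    ]

    # Correct the roots so that they point at the new locations.
    new_roots = {
        name: None if addr is None else forwarding[addr]
        for name, addr in roots.items()
    }
    return new_heap, new_roots, next_free


old_heap = [
    {"addr": 0, "size": 16, "refs": [40]},   # live, points at the object at 40
    {"addr": 16, "size": 24, "refs": []},    # garbage
    {"addr": 40, "size": 8, "refs": []},     # live
    {"addr": 48, "size": 32, "refs": [0]},   # garbage
]
old_roots = {"static": 0, "local": 40, "unused": None}

new_heap, new_roots, next_free = compact(old_heap, old_roots, live={0, 40})
print(new_roots)  # {'static': 0, 'local': 16, 'unused': None}
print(next_free)  # 24
```

The returned `next_free` plays the role of the next-allocation pointer: it sits just past the last surviving object.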

Finalization

The .NET Framework's garbage collector implicitly tracks the lifetime of the objects an application creates, but it is powerless when an object wraps an unmanaged resource such as a file, a window, or a network connection.

Once the application is no longer using such unmanaged resources, they must be released explicitly. The .NET Framework provides the Finalize method on objects for this purpose: when the garbage collector reclaims an object's memory, it must run this method so that the object can clean up its unmanaged resources. Because the default Finalize method does nothing, you must override it if you need explicit cleanup.

It is tempting to regard the Finalize method as just another name for a C++ destructor. Although both are responsible for releasing the resources an object holds, they have very different semantics. In C++, the destructor is called as soon as the object goes out of scope, whereas the Finalize method is called when a garbage collection cleans up the object.

In .NET, finalizers make the work of garbage collection more complicated, because they add considerable extra work before an object can be freed.

Whenever a new object whose type has a Finalize method is allocated on the heap, a pointer to the object is placed in an internal data structure called the finalization queue. When an object is no longer reachable, the GC considers it garbage. The GC then scans the finalization queue for pointers to such objects. Each pointer it finds is removed from the finalization queue and appended to another internal data structure called the freachable queue, which makes the object reachable again, so it is no longer garbage. At this point, the GC has finished identifying garbage. It then compacts the reclaimable memory, and a dedicated runtime thread empties the freachable queue, executing each object's Finalize method.

When the next garbage collection is triggered, it treats the finalized objects as true garbage and simply frees their memory.

So an object that requires finalization dies, then lives again (is resurrected), and then finally dies for good. Avoid the Finalize method unless it is really necessary. Finalize increases memory pressure, because the memory and resources held by the object are not freed until two garbage collections have run. In addition, you cannot control the order in which Finalize methods execute, which can lead to unpredictable results.
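The two-collection lifetime of a finalizable object can be illustrated with a toy simulation. The `ToyGC` class, its method names, and the object names below are hypothetical; the sketch only mirrors the queue movements described above and says nothing about how the CLR implements them.

```python
class ToyGC:
    """Toy model of the finalization queue and the freachable queue."""

    def __init__(self):
        self.heap = set()
        self.finalization_queue = []
        self.freachable_queue = []
        self.log = []

    def allocate(self, name, has_finalizer=False):
        self.heap.add(name)
        if has_finalizer:
            # Objects with a Finalize method are registered at allocation.
            self.finalization_queue.append(name)

    def collect(self, rooted):
        """Run one collection; `rooted` is what the application still reaches."""
        # The freachable queue acts as a root, so its objects stay alive.
        live = set(rooted) | set(self.freachable_queue)
        for name in list(self.finalization_queue):
            if name not in live:
                # Unreachable but finalizable: resurrect it for one more cycle.
                self.finalization_queue.remove(name)
                self.freachable_queue.append(name)
                live.add(name)
        freed = self.heap - live
        self.heap -= freed
        return freed

    def run_finalizer_thread(self):
        # Stands in for the dedicated thread that drains the freachable queue.
        while self.freachable_queue:
            name = self.freachable_queue.pop(0)
            self.log.append("Finalize(" + name + ")")


collector = ToyGC()
collector.allocate("plain")
collector.allocate("file_wrapper", has_finalizer=True)

first = collector.collect(rooted=set())   # only 'plain' is freed
collector.run_finalizer_thread()
second = collector.collect(rooted=set())  # now 'file_wrapper' is freed

print(first, collector.log, second)
```

Neither object is rooted at any point, yet `file_wrapper` survives the first collection: that is the extra memory pressure finalizers introduce.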

Optimizing garbage collection performance

- Weak references (WeakReference)

- Generations

Weak references (WeakReference)

Weak references are a way to improve performance by reducing the memory pressure that large objects place on the managed heap.

When a root points to an object, the reference is called a strong reference. The object cannot be collected, because the application can still reach it.

When an object is referred to only by a weak reference, the object can be collected if memory is needed and the GC runs; a later attempt by the application to access the object will then fail. To access a weakly referenced object, the application must first obtain a strong reference to it. If the application obtains that strong reference before the garbage collector reclaims the object, the GC cannot reclaim the object, because a strong reference to it now exists.
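The access pattern described above (obtain a strong reference, then check whether the object is still there) can be shown with Python's standard `weakref` module. This is an analogy only: Python's `weakref.ref` stands in for .NET's WeakReference, and the timing of collection differs between runtimes. `LargeObject` and `describe` are hypothetical names.

```python
import gc
import weakref


class LargeObject:
    """Stand-in for an object that is expensive to keep in memory."""

    def __init__(self, label):
        self.label = label


def describe(weak):
    # Obtain a strong reference first; only then is the object safe to use.
    target = weak()
    if target is None:
        return "collected"
    return "alive: " + target.label


strong = LargeObject("cache entry")
weak = weakref.ref(strong)
before = describe(weak)  # a strong reference still exists

del strong               # drop the only strong reference
gc.collect()             # request a collection
after = describe(weak)

print(before)
print(after)
```

While `target` holds the object inside `describe`, the object cannot be reclaimed, which is why the check and the use both go through that one strong reference.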

The managed heap contains two internal data structures that manage weak references: a short weak reference table and a long weak reference table.

There are two types of weak references:

- A short weak reference does not track resurrection.

In other words, a short weak reference stops referring to its object as soon as the GC finds the object unreachable, without waiting for the Finalize method to run.

- A long weak reference tracks resurrection.

That is, the GC collects the object referred to in the long weak reference table only when the object's storage is reclaimable. If the object has a Finalize method, that means the Finalize method has been called and the object was not resurrected.

These two tables simply hold pointers to objects allocated in the managed heap. Initially, both tables are empty. When you create a WeakReference object, the object is not allocated from the managed heap. Instead, an empty slot is allocated in one of the weak reference tables: short weak references use the short weak reference table, and long weak references use the long weak reference table.

Let's look at an example of what happens when the GC runs. Figures 1 and 2 show the state of all the internal data structures before and after the GC runs.

Figure 1: Before the GC runs

Figure 2: After the GC runs

The GC performs the following actions when it runs:

1. The GC builds a graph of all reachable objects. In the example above, the graph contains objects B, C, E, and G.

2. The GC scans the short weak reference table. If a pointer in the table refers to an object that is not in the graph, the pointer identifies an unreachable object, and that slot in the short weak reference table is set to null. In the example above, the slot for object D is set to null, because D is not part of the graph.

3. The GC scans the finalization queue. If a pointer in the queue refers to an object that is not in the graph, the pointer identifies an unreachable object, and the pointer is moved from the finalization queue to the freachable queue. At this point, the object is considered reachable again, so it is added to the graph. In the example above, objects A, D, and F are not in the graph but are treated as reachable because they are in the finalization queue. The finalization queue is then emptied.

4. The GC scans the long weak reference table. If a pointer in the table refers to an object that is not in the graph (which now includes the objects referred to by the freachable queue), the pointer identifies an unreachable object, and the slot is set to null. Because objects C and F are both in the graph, neither slot is set to null.

5. The GC compacts memory, squeezing out the holes left by unreachable objects. In the example above, object H is the only object removed from the heap, and its memory is reclaimed.
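The five steps above can be replayed as a short simulation. Because the figures are not reproduced here, the table contents below are inferred from the text and should be read as assumptions: the short table is assumed to hold only D, the long table C and F, and the finalization queue A, D, and F. The `run_collection` function is a hypothetical sketch, not CLR code.

```python
def run_collection(heap, reachable, short_weak, finalization_queue, long_weak):
    """Replay the five steps of one collection on toy data structures."""
    # Step 1: the graph of objects reachable from the roots.
    graph = set(reachable)

    # Step 2: clear short weak references to objects outside the graph.
    short_weak = [obj if obj in graph else None for obj in short_weak]

    # Step 3: move unreachable finalizable objects to the freachable queue,
    # which puts them back into the graph.
    freachable = [obj for obj in finalization_queue if obj not in graph]
    finalization_queue = [obj for obj in finalization_queue if obj in graph]
    graph.update(freachable)

    # Step 4: clear long weak references against the enlarged graph.
    long_weak = [obj if obj in graph else None for obj in long_weak]

    # Step 5: compact; only objects outside the graph are removed.
    collected = [obj for obj in heap if obj not in graph]
    survivors = [obj for obj in heap if obj in graph]

    return {
        "short_weak": short_weak,
        "finalization_queue": finalization_queue,
        "freachable": freachable,
        "long_weak": long_weak,
        "collected": collected,
        "survivors": survivors,
    }


result = run_collection(
    heap=["A", "B", "C", "D", "E", "F", "G", "H"],
    reachable={"B", "C", "E", "G"},
    short_weak=["D"],                    # assumed table contents
    finalization_queue=["A", "D", "F"],  # assumed table contents
    long_weak=["C", "F"],                # assumed table contents
)
print(result["short_weak"])  # [None]
print(result["long_weak"])   # ['C', 'F']
print(result["collected"])   # ['H']
```

The ordering matters: the short table is scanned before the freachable queue revives A, D, and F, while the long table is scanned afterwards. That is why D's short weak reference is cleared but F's long weak reference is kept.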

Generations

Because a garbage collection cannot complete without stopping the entire program, collections can cause pauses of arbitrary length at arbitrary points in a program's execution. These pauses can also prevent a program from responding to events quickly enough to meet the requirements of real-time systems.

A GC feature called generations exists purely to improve performance. A generational GC builds on two facts established by careful analysis of many programs written in a variety of languages:

1. Newly created objects tend to have short lifetimes.

2. The older an object is, the longer it is likely to survive.

A generational collector divides objects into groups by age and collects younger objects more often than older ones. When the managed heap is initialized, it contains no objects. All new objects are added to generation 0 until generation 0 is full, which triggers a garbage collection. Because most objects are short-lived, only a small percentage of young objects survive their first collection. An object that survives its first collection is promoted to generation 1. After the collection, new objects again go into generation 0, and another collection is triggered only when generation 0 fills up again. At that point, all surviving generation 0 objects are compacted and promoted to generation 1. Generation 0 then contains no objects, and all new objects once more go into generation 0.

As a result, objects that "mature" (survive a collection) in their current generation are moved to the next older generation. Generation 2 is the oldest generation supported by the CLR's GC. Objects that survive a collection in generation 2 simply stay in generation 2.

Dividing the heap into generations and collecting and compacting only the younger ones therefore improves the efficiency of the garbage collection algorithm: it reclaims a significant amount of space from the heap and is much faster than examining every object in every generation.

A generational GC can also ensure (or at least try to ensure) that each collection takes less than some maximum amount of time. This helps support real-time behavior and prevents pauses that users would notice.
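The promotion scheme described in this section can be sketched with a toy heap that has three generations. The `GenerationalHeap` class, the `keep` flag (which marks an object as still rooted), and the capacity value are hypothetical. The `collect(up_to)` parameter, which lets a collection include older generations, is an assumption added so that promotion into generation 2 can be shown; the text above only states that generation 0 is collected when it fills up.

```python
class GenerationalHeap:
    """Toy three-generation heap; objects are represented by their names."""

    OLDEST = 2

    def __init__(self, gen0_capacity):
        self.gen0_capacity = gen0_capacity
        self.generations = [[], [], []]  # generation 0, 1, 2
        self.rooted = set()              # names the application still reaches

    def allocate(self, name, keep=False):
        # A full generation 0 triggers a collection before the allocation.
        if len(self.generations[0]) >= self.gen0_capacity:
            self.collect(0)
        self.generations[0].append(name)
        if keep:
            self.rooted.add(name)

    def collect(self, up_to):
        """Collect generations 0..up_to and promote the survivors."""
        # Oldest first, so freshly promoted objects are not examined twice.
        for gen in range(up_to, -1, -1):
            survivors = [o for o in self.generations[gen] if o in self.rooted]
            self.generations[gen] = []
            target = min(gen + 1, self.OLDEST)
            self.generations[target].extend(survivors)


heap = GenerationalHeap(gen0_capacity=3)
heap.allocate("a", keep=True)
heap.allocate("b")
heap.allocate("c")
heap.allocate("d")          # generation 0 is full: 'a' is promoted, 'b' and 'c' are freed
after_gen0 = [list(g) for g in heap.generations]

heap.collect(1)             # 'a' moves to generation 2, 'd' is freed
after_gen1 = [list(g) for g in heap.generations]

heap.collect(2)             # survivors of generation 2 stay in generation 2
after_gen2 = [list(g) for g in heap.generations]

print(after_gen0)  # [['d'], ['a'], []]
print(after_gen1)  # [[], [], ['a']]
print(after_gen2)  # [[], [], ['a']]
```

A generation 0 collection here never looks at generations 1 or 2, which is the source of the speedup: most garbage is found among the youngest objects.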

Myths about garbage collection

GC is obviously slower than manual memory management

Explanation: Not necessarily. Modern garbage collectors appear to run about as fast as manual storage allocators (malloc/free or new/delete). In some specialized programs, garbage collection may not be as fast as a custom allocator designed specifically for that program. On the other hand, the extra code needed to make manual memory management work correctly (for example, explicit reference counting) is often much more expensive than a GC would be.

GC will interrupt the program

Explanation: Because garbage collectors usually stop the entire application while they find and reclaim garbage objects, they can cause pauses long enough for users to notice. With current optimization techniques, however, these noticeable pauses can be avoided.

Manual memory management does not cause disruption

Explanation: Manual memory management does not guarantee performance. It can also cause pauses when large amounts of memory are allocated or freed.

Programs that use GC are large and bloated; GC is not suitable for small programs or systems

Explanation: Although GC is especially advantageous in complex systems, there is no reason to think that a GC introduces significant overhead in programs of any other size.

I've heard that a GC uses twice as much memory

Explanation: This may be true of primitive GCs, but it is not true of garbage collection in general. The data structures used for GC need be no larger than those used for manual memory management.
