Garbage collection in .NET

Last Update:2018-12-03 Source: Internet

Author: User

Cloud (translation) Source: csdn
Directory
- Introduction
- About the garbage collector
- Garbage collection algorithm
  - Application roots
- Implementation
  - Phase I: Mark
  - Phase II: Compact
- Finalization
- Garbage collection performance optimization
  - Weak references
  - Generations
- Garbage collection myths

Introduction
Microsoft claims that .NET is a revolutionary programming technology, and many factors make it the preferred choice for most developers. This article discusses one very important feature of the .NET Framework: the convenience of memory and resource management.

About the Garbage Collector
Every program uses resources of some kind: memory buffers, network connections, database resources, and so on. In fact, in an object-oriented environment, every type identifies some resource available to the program. To use any of these resources, memory must be allocated to represent the type.

To use a resource, a program goes through the following steps:

1. Allocate memory for the type that represents the resource.

2. Initialize the memory to set the resource's initial state and make the resource usable.

3. Use the resource by accessing members of the type instance (repeated as needed).

4. Tear down the resource's state to clean it up.

5. Release the memory.

The garbage collector (GC) in .NET completely frees developers from tracking memory usage and deciding when to release memory.

The Microsoft .NET CLR (Common Language Runtime) requires that all resources be allocated from the managed heap. You never need to release objects in the managed heap yourself: when the application no longer needs them, they are released automatically.

Memory is not infinite, so the garbage collector must perform collections to free memory. The garbage collector's optimizing engine chooses the best time to collect based on the allocations being made (the exact criteria are defined by Microsoft). When the garbage collector performs a collection, it first finds the objects in the managed heap that the application no longer uses, and then performs the operations needed to reclaim their memory.

However, for automatic memory management to work, the GC must know where the roots are. That is, it must be able to tell when an object is no longer in use by the application. In .NET, the GC learns this through metadata. Every data type used in .NET is described by metadata. With the help of metadata, the CLR knows the layout of every object in memory, which helps the GC during garbage collection. Without this information, the GC would not know where one object ends and the next begins.

Garbage collection algorithm
Application roots
Every application has a set of roots. A root identifies a storage location that refers either to an object on the managed heap or to null.

For example:

- All global and static object pointers in an application.

- All local variable and parameter object pointers on a thread's stack.

- All CPU registers that contain pointers to objects in the managed heap.

- Object pointers in the freachable queue.

The list of active roots is maintained by the JIT compiler and the CLR, and is made accessible to the garbage collector's algorithm.

Implementation
In .NET, garbage collection is implemented as a tracing collector. Specifically, the CLR implements a mark/compact collector.

This method consists of the following two phases:

Phase I: Mark
Find the memory that can be reclaimed.

When the GC starts running, it assumes that all objects in the heap are garbage. In other words, it assumes that none of the application's roots refer to any object in the heap.

Phase I includes the following steps:

1. The GC identifies the references to live objects, that is, the application roots.

2. Starting from the roots, it builds a graph of all objects reachable from them.

3. If the GC attempts to add an object that is already in the graph, it stops walking down that path. This serves two purposes. First, it improves performance significantly, because no set of objects is walked more than once. Second, it prevents infinite loops when objects form circular linked lists.

Once all the roots have been checked, the garbage collector's graph contains every object reachable from the application's roots. Any object that is not in the graph cannot be accessed by the application and is therefore considered garbage.
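The mark phase can be sketched as a graph traversal. The following is a toy Python model, not the CLR's implementation: the object names and the `references` mapping are hypothetical, and a real collector walks object layouts described by metadata rather than a dictionary.

```python
from collections import deque

def mark(roots, references):
    """Return the set of objects reachable from the roots.

    `references` maps each object name to the names it points to
    (a hypothetical stand-in for real object fields).
    """
    reachable = set()
    pending = deque(r for r in roots if r is not None)  # null roots are skipped
    while pending:
        obj = pending.popleft()
        if obj in reachable:
            # Already in the graph: stop walking this path. This avoids
            # repeated work and guarantees termination on cycles.
            continue
        reachable.add(obj)
        pending.extend(references.get(obj, ()))
    return reachable

# Hypothetical heap: A and B reference each other (a cycle);
# D and E also form a cycle, but no root reaches them.
heap = {"A": ["B"], "B": ["A", "C"], "C": [], "D": ["E"], "E": ["D"]}
live = mark(roots=["A", None], references=heap)
garbage = set(heap) - live
print(sorted(live), sorted(garbage))
```

Note that D and E reference each other yet are still garbage, because reachability is measured from the roots only.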

Phase II: Compact
Move all live objects to the bottom of the heap, leaving free space at the top.

Phase II includes the following steps:

1. The GC walks the heap linearly, looking for contiguous blocks of garbage objects (now considered free space).

2. The GC then shifts the non-garbage objects down in memory, removing all gaps in the heap.

3. Moving objects in memory invalidates the pointers to them. The GC therefore updates the application's roots so that each object pointer refers to the object's new location.

4. In addition, if an object contains pointers to other objects, the GC is responsible for correcting those pointers as well.

After all the garbage has been identified, all the non-garbage objects have been compacted, and all pointers to non-garbage objects have been fixed up, the pointer just past the last non-garbage object marks the position where the next object will be added.
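The compaction steps can be illustrated with a small simulation. This is a sketch under assumptions: the heap is modeled as a list of records with hypothetical names, addresses, and sizes, and pointers are plain integers. It is not how the CLR represents objects.

```python
def compact(heap, live_addrs, roots):
    """Slide live objects to the bottom of the heap and fix up pointers.

    heap: list of {"name", "addr", "size", "refs"} records, where "refs"
          holds the addresses of other objects (hypothetical layout).
    Returns (new_heap, new_roots, next_alloc_addr).
    """
    ordered = sorted(heap, key=lambda o: o["addr"])
    forwarding = {}   # old address -> new address
    next_free = 0
    # Steps 1-2: walk the heap linearly; live objects slide down over gaps.
    for obj in ordered:
        if obj["addr"] in live_addrs:
            forwarding[obj["addr"]] = next_free
            next_free += obj["size"]
    # Step 4: correct pointers stored inside the surviving objects.
    new_heap = [
        {"name": o["name"], "addr": forwarding[o["addr"]], "size": o["size"],
         "refs": [forwarding[r] for r in o["refs"]]}
        for o in ordered if o["addr"] in live_addrs
    ]
    # Step 3: correct the application roots.
    new_roots = {name: (None if addr is None else forwarding[addr])
                 for name, addr in roots.items()}
    return new_heap, new_roots, next_free

heap = [
    {"name": "A", "addr": 0,  "size": 16, "refs": [40]},
    {"name": "B", "addr": 16, "size": 24, "refs": []},    # garbage
    {"name": "C", "addr": 40, "size": 8,  "refs": []},
    {"name": "D", "addr": 48, "size": 32, "refs": []},    # garbage
    {"name": "E", "addr": 80, "size": 16, "refs": [0]},
]
roots = {"local_a": 0, "static_e": 80}
new_heap, new_roots, next_alloc = compact(heap, {0, 40, 80}, roots)
for obj in new_heap:
    print(obj)
print(new_roots, next_alloc)
```

The returned `next_alloc` plays the role of the pointer past the last surviving object: the next allocation would start there.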

Finalization

The .NET Framework's garbage collector implicitly tracks the lifetime of the objects an application creates, but it is powerless when it comes to objects that wrap unmanaged resources (such as files, windows, or network connections).

Once the application no longer uses those unmanaged resources, they must be released explicitly. The .NET Framework provides the Finalize method on Object for this purpose: before the garbage collector reclaims an object's memory, it must run this method so the object can clean up its unmanaged resources. Because the default Finalize method does nothing, you must override it if cleanup is needed.

It is tempting to think of Finalize as just another name for a C++ destructor. Although both are responsible for releasing the resources held by an object, their semantics are very different. In C++, a destructor runs immediately when the object goes out of scope, whereas Finalize is called only when garbage collection gets around to cleaning up the object.

In .NET, the existence of finalizers makes garbage collection more complicated, because they add a lot of extra work before an object can be released.

Whenever a new object with a Finalize method is allocated on the heap, a pointer to the object is added to an internal data structure called the finalization queue. When the object is no longer reachable, the GC considers it garbage. The GC first scans the finalization queue for pointers to such objects. When a pointer is found, it is removed from the finalization queue and appended to another internal data structure called the freachable queue, so the object is no longer part of the garbage. At this point, the GC has finished identifying the garbage. It then compacts the reclaimable memory, and a dedicated thread empties the freachable queue, executing each object's Finalize method.

The next time the garbage collector runs, it treats the finalized objects as true garbage and simply frees their memory.

In other words, an object that needs finalization first dies, then is resurrected, and then finally dies again. We recommend avoiding the Finalize method unless it is really necessary. Finalize methods increase memory pressure, because the memory and resources used by the object are not released until two garbage collections have run. In addition, you cannot control the order in which Finalize methods are executed, which may lead to unpredictable results.
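The two-collection lifetime of a finalizable object can be modeled in a few lines. The sketch below is a simplified Python simulation with hypothetical names (`ToyHeap`, `run_finalizer_thread`); it ignores references between objects and only tracks which queue each object sits in.

```python
class ToyHeap:
    """Toy model of the finalization and freachable queues."""

    def __init__(self):
        self.objects = set()
        self.roots = set()
        self.finalization_queue = []   # objects that have a Finalize method
        self.freachable_queue = []     # objects waiting for their finalizer
        self.finalized = []            # finalizers that have run, in order

    def allocate(self, name, has_finalizer=False):
        self.objects.add(name)
        self.roots.add(name)
        if has_finalizer:
            self.finalization_queue.append(name)

    def collect(self):
        # The freachable queue acts as a root, like any other root.
        live = set(self.roots) | set(self.freachable_queue)
        for name in list(self.finalization_queue):
            if name not in live:
                # Unreachable but finalizable: move it and resurrect it.
                self.finalization_queue.remove(name)
                self.freachable_queue.append(name)
                live.add(name)
        freed = sorted(self.objects - live)
        self.objects &= live
        return freed

    def run_finalizer_thread(self):
        # Stands in for the dedicated thread that empties the queue.
        while self.freachable_queue:
            self.finalized.append(self.freachable_queue.pop(0))

heap = ToyHeap()
heap.allocate("plain")
heap.allocate("file_wrapper", has_finalizer=True)
heap.roots.clear()                 # the application drops both objects

first = heap.collect()             # only the plain object is freed
heap.run_finalizer_thread()
second = heap.collect()            # now the finalized object is freed
print(first, heap.finalized, second)
```

The object without a finalizer is reclaimed by the first collection, while the finalizable one survives until the second, which is the extra memory pressure described above.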

Optimization of garbage collection performance
- Weak references

- Generations

Weak references
Weak references are a way to improve performance and reduce the pressure that large objects place on the managed heap.

When a root points to an object, the root is said to hold a strong reference to the object, and the object cannot be collected because the application can reach it.

When an object has only a weak reference pointing to it, the object can be collected if memory is requested and the GC runs. A later attempt to access the object will then fail. On the other hand, to use a weakly referenced object, the application must first obtain a strong reference to it. If the application obtains the strong reference before the garbage collector collects the object, the GC cannot collect the object, because a strong reference to it now exists.
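The strong/weak distinction can be demonstrated with Python's standard `weakref` module, used here only as an analogy for .NET's WeakReference. The `LargeBuffer` class is a hypothetical stand-in, and the timing of reclamation is specific to the Python runtime rather than the CLR.

```python
import gc
import weakref

class LargeBuffer:
    """Stand-in for an expensive object that can be recreated on demand."""
    def __init__(self, size):
        self.data = bytearray(size)

strong = LargeBuffer(1024)
weak = weakref.ref(strong)        # does not keep the object alive

# To use the target, first try to obtain a strong reference from the weak one.
recovered = weak()
alive_while_strong = recovered is strong

# Drop every strong reference, then force a collection.
del strong, recovered
gc.collect()
alive_after_collect = weak() is not None

print(alive_while_strong, alive_after_collect)
```

In application code the usual pattern is to call the weak reference, check the result for `None`, and rebuild the object only when it has already been collected.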

The managed heap contains two internal data structures for managing weak references: the short weak reference table and the long weak reference table.

Two types of weak references:

- A short weak reference does not track resurrection.

That is, an object with a short weak reference is collected immediately, without waiting for its Finalize method to run.

- A long weak reference tracks resurrection.

That is, the GC collects an object pointed to by the long weak reference table only after determining that the object's storage is reclaimable. If the object has a Finalize method, that method has been called and the object was not resurrected.

These two tables simply contain pointers to objects allocated in the managed heap. Initially, both tables are empty. Creating a WeakReference object does not allocate data in the managed heap. Instead, an empty slot in one of the weak reference tables is used. Short weak references use the short weak reference table, and long weak references use the long weak reference table.

Let's look at an example of what happens when the GC runs. Figures 1 and 2 show the state of all the internal data structures before and after a collection.

Figure 1: Before GC is run

Figure 2: After GC is run

The following operations are performed when the GC runs:

1. The GC builds a graph of all reachable objects. In the example, the graph contains objects B, C, E, and G.

2. The GC scans the short weak reference table. If a pointer in the table refers to an object that is not in the graph, the pointer identifies an unreachable object, and that slot in the short weak reference table is set to null. In the example, the slot for object D is set to null because D is not part of the graph.

3. The GC scans the finalization queue. If a pointer in the queue refers to an object that is not in the graph, the pointer identifies an unreachable object, and the pointer is moved from the finalization queue to the freachable queue. At this point the object is considered reachable, so it is added to the graph. In the example, objects A, D, and F are not in the graph but are treated as reachable because they are in the finalization queue. The finalization queue is now empty.

4. The GC scans the long weak reference table. If a pointer in the table refers to an object that is not in the graph (which now includes the objects pointed to by entries in the freachable queue), the pointer identifies an unreachable object and the slot is set to null. Because objects C and F are both in the graph, neither slot is set to null.

5. The GC compacts memory, squeezing out the holes left by unreachable objects. In the example, object H is the only object removed from the heap, and its memory is reclaimed.
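The five steps can be replayed on the example's objects. Because the figures are not available here, the table contents below are assumptions limited to what the text states: D in the short weak reference table, C and F in the long weak reference table, and A, D, and F in the finalization queue. The function is an illustrative Python model, not the CLR's algorithm.

```python
def collect(heap, reachable, short_weak, long_weak, finalization_queue):
    """Replay the five collection steps on named objects."""
    # Step 1: graph of objects reachable from the roots.
    graph = set(reachable)
    # Step 2: null out short weak slots whose targets are not in the graph.
    short_weak = [p if p in graph else None for p in short_weak]
    # Step 3: move unreachable finalizable objects to the freachable queue
    # and add them to the graph (they are resurrected).
    freachable = [p for p in finalization_queue if p not in graph]
    finalization_queue = [p for p in finalization_queue if p in graph]
    graph.update(freachable)
    # Step 4: null out long weak slots, using the enlarged graph.
    long_weak = [p if p in graph else None for p in long_weak]
    # Step 5: compact, keeping only objects that are in the graph.
    heap = [o for o in heap if o in graph]
    return heap, short_weak, long_weak, finalization_queue, freachable

heap, short_weak, long_weak, fin_queue, freachable = collect(
    heap=list("ABCDEFGH"),
    reachable={"B", "C", "E", "G"},
    short_weak=["D"],
    long_weak=["C", "F"],
    finalization_queue=["A", "D", "F"],
)
print(heap, short_weak, long_weak, fin_queue, freachable)
```

The order of steps 2 to 4 is what separates the two tables: the short table is scanned before resurrection, so D's slot is cleared, while the long table is scanned afterwards, so F's slot survives.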

Generations
Because a garbage collection cannot complete without stopping the entire program, it can cause pauses of arbitrary length at arbitrary points during program execution. GC pauses can also prevent a program from responding to events quickly enough to meet the requirements of real-time systems.

One feature of the GC, called generations, exists purely to improve performance. A generational GC relies on two facts, established by careful analysis of a large number of programs written in various languages:

1. Newly created objects tend to have short lifetimes.

2. The older an object is, the longer it is likely to survive.

A generational collector divides objects into groups by age and collects older objects less frequently. When initialized, the managed heap contains no objects. All new objects are added to generation 0 until it is full, which triggers a garbage collection. Because most objects are short-lived, only a small percentage of young objects survive their first collection. Once an object survives its first collection, it is promoted to generation 1. After the collection, all new objects go into generation 0, and a collection is triggered again only when generation 0 is full. The objects that survive that collection are compacted and promoted to the next older generation. Generation 0 then contains no objects, and all new objects again go into generation 0.

Thus, objects that mature in their current generation (that is, survive a collection) are moved to the next older generation. Generation 2 is the highest generation supported by the CLR's GC. In later collections, objects that survive in generation 2 simply stay in generation 2.

Therefore, dividing the heap into generations of objects and collecting and compacting only the younger generations improves the efficiency of the garbage collection algorithm: a significant amount of space is reclaimed from the heap, and doing so is much faster than having the collector examine all objects in all generations.

A GC that performs generational collections, each of which is guaranteed (or at least very likely) to take less than a certain maximum amount of time, helps make the runtime suitable for real-time environments and also prevents pauses that users would notice.
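Promotion between generations can be sketched with a small model. The class below is a hypothetical Python simulation: liveness is tracked with an explicit set instead of a root traversal, and the generation 0 capacity is counted in objects rather than bytes.

```python
class GenerationalHeap:
    """Toy three-generation heap with promotion of survivors."""
    MAX_GENERATION = 2

    def __init__(self, gen0_capacity):
        self.gen0_capacity = gen0_capacity
        self.generations = [[], [], []]
        self.live = set()          # stand-in for reachability from the roots

    def allocate(self, name, live=True):
        if len(self.generations[0]) >= self.gen0_capacity:
            self.collect(up_to=0)  # generation 0 is full: collect it
        self.generations[0].append(name)
        if live:
            self.live.add(name)

    def collect(self, up_to=0):
        """Collect generations 0..up_to; older generations are not examined."""
        for gen in range(up_to, -1, -1):   # oldest first: no double promotion
            survivors = [o for o in self.generations[gen] if o in self.live]
            target = min(gen + 1, self.MAX_GENERATION)
            if target == gen:
                self.generations[gen] = survivors       # gen 2 stays in gen 2
            else:
                self.generations[gen] = []
                self.generations[target].extend(survivors)

    def snapshot(self):
        return [list(g) for g in self.generations]

heap = GenerationalHeap(gen0_capacity=3)
heap.allocate("config")
heap.allocate("tmp1", live=False)   # short-lived objects
heap.allocate("tmp2", live=False)
heap.allocate("cache")              # generation 0 is full: triggers a collection
after_first = heap.snapshot()

heap.collect(up_to=1)
after_second = heap.snapshot()

heap.collect(up_to=2)
after_third = heap.snapshot()
print(after_first, after_second, after_third)
```

Most allocations only ever trigger `collect(up_to=0)`, which touches the small young generation; that is where the speedup over a full-heap scan comes from.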

Garbage collection myths
GC is obviously slower than manual memory management

Explanation: Not necessarily. Modern garbage collectors appear to run as fast as manual storage allocators (malloc/free or new/delete). For some special programs, garbage collection may not run as quickly as a custom memory allocator designed specifically for that program. On the other hand, the extra overhead added to make manual memory management work is often much more expensive than a GC.

GC causes program pauses

Explanation: Because garbage collectors often stop the entire application while searching for and collecting garbage objects, they may cause pauses long enough for users to notice. However, such perceptible pauses can be completely avoided.

Manual memory management does not cause pauses

Explanation: Manual memory management does not guarantee performance either. It can also cause pauses when a large amount of memory is allocated or released.

Programs that use GC are large and bloated; GC is not suitable for small programs or systems

Explanation: Although GC shows its advantages most clearly in complex systems, there is no reason to assume that it introduces significant overhead in small programs.

I have heard that GC uses twice as much memory

Explanation: This may be true of primitive GCs, but not of modern garbage collectors. The data structures used for GC are not much larger than those used for manual memory management.
