Principles of the .NET garbage collection mechanism (2)

The previous article introduced the basic principles of .NET garbage collection and the internal mechanism by which finalization (the Finalize method) is implemented. This article covers weak references, generations, multithreaded garbage collection, large object handling, and the performance counters related to garbage collection.

Let's start with weak references, which can reduce the memory pressure caused by large objects.

Weak references

When a root of the program points to an object, the object is reachable and the garbage collector cannot collect it. This is called a strong reference to the object. The counterpart of a strong reference is a weak reference. When only a weak reference to an object exists, the garbage collector may collect the object, yet the program can still access it. How can both be true? Read on.

If only weak references to an object exist and the garbage collector runs, the object is collected; if the program later tries to access the object, the access fails. Conversely, to use a weakly referenced object, the program must first obtain a strong reference to it. If the program obtains that strong reference before the garbage collector collects the object, the garbage collector can no longer collect it. This is a little confusing, so let the code explain:

    void Method()
    {
        // Create a strong reference to a new object
        Object o = new Object();

        // Create a short weak reference to the same object
        WeakReference wr = new WeakReference(o);

        o = null; // Remove the strong reference to the object

        o = wr.Target; // Try to obtain a strong reference from the weak reference
        if (o == null)
        {
            // The object has been collected by the garbage collector
        }
        else
        {
            // The object has not been collected; it can still be used
        }
    }

Why do we need weak references? Because some data structures are easy to create but require a lot of memory. For example, suppose a program needs the names of all folders and files on the user's hard disk. The program can scan the disk the first time it needs the data and afterwards answer queries from the in-memory copy instead of reading the disk every time. This improves the program's performance.

The problem is that this data may be quite large and require a considerable amount of memory. If the user moves on to another part of the program, that memory is no longer needed. You could release the data in code, but if the user then switches back to the feature that needs it, you must rebuild the data from the user's disk. Weak references provide a simple and effective solution for this scenario.

When the user switches to another feature, you can create a weak reference to the data and remove all strong references to it. If the program's memory load is low, no garbage collection is triggered and the weakly referenced data is not collected. When the program needs the data again, it tries to obtain a strong reference from the weak reference; if that succeeds, the program does not need to read the user's disk again.
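The caching pattern just described might look roughly like the following minimal sketch. The class name FileListCache, its BuildCount property, and the buildList delegate are hypothetical names invented for illustration; only WeakReference and its Target property come from the .NET class library.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical cache: holds an expensive-to-build list only weakly, so the
// garbage collector may reclaim it while the user works elsewhere.
public sealed class FileListCache
{
    private readonly Func<List<string>> buildList; // assumed to be expensive (e.g. a disk scan)
    private WeakReference weakList;                // short weak reference to the cached list

    public int BuildCount { get; private set; }    // how often the list had to be (re)built

    public FileListCache(Func<List<string>> buildList)
    {
        this.buildList = buildList;
    }

    public List<string> GetList()
    {
        // Take a strong reference first; from here on the list cannot be collected.
        List<string> list = weakList == null ? null : (List<string>)weakList.Target;
        if (list == null)
        {
            // Never built, or collected in the meantime: rebuild and track it weakly.
            list = buildList();
            BuildCount++;
            weakList = new WeakReference(list);
        }
        return list;
    }
}
```

A caller keeps the returned list in a local variable only while it is working with the data. Once that local goes out of scope, the weak reference is the only thing left pointing at the list, and a later collection is free to reclaim it.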

The WeakReference type provides two constructors:

    WeakReference(Object target);
    WeakReference(Object target, Boolean trackResurrection);

The target parameter is the object that the weak reference tracks. The trackResurrection parameter indicates whether the object should still be tracked after its Finalize method has run; the default is false. For details about object resurrection, see the previous article.

For convenience, a weak reference that does not track resurrection is called a "short weak reference", and one that does is called a "long weak reference". If the object's type does not implement a Finalize method, long and short weak references behave identically. Avoid long weak references whenever possible: they let you use a resurrected object, and the behavior of a resurrected object may be unpredictable.

Once you have a WeakReference to an object, you should usually set all strong references to that object to null. As long as any strong reference exists, the garbage collector cannot collect the object that the weak reference points to.

To use the target of a weak reference, you must first create a strong reference to it. This is simple: write Object a = weakRef.Target; and then check whether a is null. If it is not null, you can go on using the object. If it is null, the object has been collected by the garbage collector and you must obtain it again some other way.

Internal Implementation of weak references

From the description above, you can infer that weakly referenced objects are handled differently from ordinary objects. Normally, if a reachable object holds a reference to another object, the garbage collector cannot collect the referenced object. A WeakReference object is different: the object it refers to may be collected.

To fully understand how weak references work, we need to look at the managed heap again. The managed heap contains two internal data structures whose only purpose is to manage weak references. We can call them the long weak reference table and the short weak reference table; both tables hold pointers to weak reference targets on the managed heap.

When the program starts, both tables are empty. When you create a WeakReference object, an object is not allocated from the managed heap; instead, an empty slot is located in one of the weak reference tables. Short weak references use the short weak reference table, and long weak references use the long weak reference table.

Once an empty slot has been found, its value is set to the address of the weak reference's target object. The two tables are not treated as roots of the application; otherwise the garbage collector could never collect the objects that the tables point to.

Here is what happens when a garbage collection runs:
1. The garbage collector builds a graph of all reachable objects, as described in the previous article.
2. The garbage collector scans the short weak reference table. If a pointer in the table refers to an object that is not in the reachable object graph, the object is identified as garbage and the slot in the short weak reference table is set to null.
3. The garbage collector scans the finalization queue (see the previous article). If an object in the queue is not in the reachable object graph, its pointer is moved from the finalization queue to the freachable queue. The object is then considered reachable again and is no longer garbage.
4. The garbage collector scans the long weak reference table. If a pointer in the table refers to an object that is not in the reachable object graph (which now includes the objects referenced from the freachable queue), the slot in the long weak reference table is set to null.
5. The garbage collector compacts the heap by moving the reachable objects.

Once you understand what the garbage collector does, it is easy to see how weak references work. Reading the Target property of a WeakReference returns the target object pointer stored in the weak reference table. If it is null, the object has been collected.

A short weak reference does not track resurrection: the garbage collector checks whether the object in the short weak reference table is garbage before it scans the finalization queue.

A long weak reference tracks resurrection: the garbage collector sets the pointer in the long weak reference table to null only after it has scanned the finalization queue and determined that the object's memory will really be reclaimed.
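The difference between the two tables can be observed with a small experiment. The following is a sketch under assumptions: the names WeakReferenceKinds, Finalizable, Create, and Demo are made up for this example, and the exact moment at which objects are collected is up to the runtime, so the printed values may vary between runtimes and build configurations.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class WeakReferenceKinds
{
    // Hypothetical type with a finalizer, so that short and long weak
    // references have a chance to behave differently.
    private sealed class Finalizable
    {
        ~Finalizable() { }
    }

    // The object is created in a separate, non-inlined method so that no local
    // variable of the caller keeps it strongly reachable afterwards.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void Create(out WeakReference shortRef, out WeakReference longRef)
    {
        var target = new Finalizable();
        shortRef = new WeakReference(target);                         // does not track resurrection
        longRef = new WeakReference(target, trackResurrection: true); // tracks resurrection
    }

    public static void Demo()
    {
        WeakReference shortRef, longRef;
        Create(out shortRef, out longRef);

        GC.Collect(); // the object is unreachable now and is queued for finalization

        // IsAlive is used instead of Target so that no new strong reference is created.
        Console.WriteLine("short alive after 1st collection: " + shortRef.IsAlive);
        Console.WriteLine("long alive after 1st collection:  " + longRef.IsAlive);

        GC.WaitForPendingFinalizers(); // let the Finalize method run
        GC.Collect();                  // the object's memory can now be reclaimed

        Console.WriteLine("long alive after 2nd collection:  " + longRef.IsAlive);
    }
}
```

If the runtime behaves as the steps above describe, the short reference should already be cleared after the first collection, while the long reference should stay alive until the object has been finalized and a second collection has run.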

Generations

When .NET garbage collection comes up, C or C++ programmers may wonder whether managing memory this way causes performance problems. The GC developers keep tuning the garbage collector to improve its performance. Generations are a mechanism for reducing the performance impact of garbage collection. A generational garbage collector assumes that the following statements are true:

1. The newer an object is, the shorter its lifetime will be.
2. The older an object is, the longer its lifetime will be.
3. Newer objects tend to reference one another.
4. Compacting a portion of the heap is faster than compacting the whole heap.

Many studies have shown that these assumptions hold for a large number of programs. Let's see how they affect the work of the garbage collector.

When the program starts, the managed heap contains no objects. Objects newly added to the heap are in generation 0, as shown in Figure 1. Generation 0 objects are the youngest objects; they have never been examined by the garbage collector.

Figure 1: Generation 0 objects on the managed heap

As more objects are added to the heap, the heap fills up and a garbage collection is triggered. When the garbage collector analyzes the managed heap, it builds the graph of reachable (non-garbage) objects; everything else is garbage (the pale purple blocks in Figure 2). All objects that survive are moved and compacted toward the bottom of the heap. These survivors become generation 1 objects, as shown in Figure 2.

Figure 2: Generation 0 and generation 1 objects on the managed heap

As more objects are allocated, the new objects are placed in generation 0. When generation 0 fills up, another garbage collection is triggered. Surviving generation 1 objects are promoted to generation 2, surviving generation 0 objects are promoted to generation 1, and the survivors are moved and compacted toward the bottom of the heap, as shown in Figure 3.

Figure 3: Generation 0, 1, and 2 objects on the managed heap

Generation 2 is currently the highest generation the garbage collector supports. When another garbage collection occurs, the surviving generation 2 objects remain in generation 2.

Why do generations improve garbage collection performance?

As mentioned above, generational collection improves performance. When the heap fills up, a garbage collection is triggered, and the garbage collector can choose to examine only the objects in generation 0 and ignore the objects in higher generations. Because younger objects tend to have shorter lifetimes, collecting generation 0 alone usually reclaims a considerable amount of memory, and it costs much less than collecting all generations.

This is the simplest optimization that generational collection offers. A generational collector does not need to traverse the entire managed heap: if a root or an object refers to an object in an older generation, the garbage collector can skip that older object and everything it references, which greatly reduces the time needed to build the graph of reachable objects.

If collecting generation 0 does not free enough memory, the garbage collector tries to collect generations 1 and 0. If that is still not enough, it collects generations 2, 1, and 0. The exact algorithm that decides which generations are collected is not fixed; Microsoft will continue to tune it.

Most heaps (such as the C runtime heap) allocate an object wherever they find enough free memory. So if I allocate several objects in a row, their addresses may be several megabytes apart. On the managed heap, however, consecutively allocated objects occupy contiguous memory.

As the assumptions above state, newer objects are likely to reference one another. Because new objects are allocated in contiguous memory, you gain performance from locality of reference. It is then quite likely that all of these objects reside in the CPU's cache, so many operations do not need to go to main memory.

Microsoft's performance tests show that allocation from the managed heap is faster than the standard Win32 HeapAlloc function. These tests also show that a generation 0 collection on a Pentium-class CPU takes less than 1 millisecond. Microsoft's goal is to make a garbage collection take no longer than an ordinary page fault.

Use the System.GC class to control garbage collection

The System.GC type lets developers control the garbage collector directly. You can read the GC.MaxGeneration property to obtain the highest generation the GC supports; currently the value is 2.

You can call the GC.Collect method to force the garbage collector to perform a collection. The Collect method has two overloads:

    void GC.Collect(Int32 generation)
    void GC.Collect()

The first overload lets you specify which generations to collect. You can pass any integer from 0 to GC.MaxGeneration: 0 collects generation 0, 1 collects generations 1 and 0, and 2 collects the entire managed heap. The parameterless overload is equivalent to calling GC.Collect(GC.MaxGeneration), that is, a full collection.
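That a collection of a higher generation also covers the lower ones can be checked with GC.CollectionCount, which reports how many collections of a given generation have occurred so far. The sketch below is illustrative only; CollectDemo and CollectionsTriggered are names chosen for this example.

```csharp
using System;

public static class CollectDemo
{
    // Forces a collection of generation `requested` (and all younger ones) and
    // returns how many collections of generation `observed` were counted meanwhile.
    public static int CollectionsTriggered(int requested, int observed)
    {
        int before = GC.CollectionCount(observed);
        GC.Collect(requested);
        return GC.CollectionCount(observed) - before;
    }

    public static void Demo()
    {
        Console.WriteLine("Highest generation: " + GC.MaxGeneration);

        // Collecting generation 1 is expected to count as a generation 0 collection too.
        Console.WriteLine("Gen 0 collections during Collect(1): " + CollectionsTriggered(1, 0));

        // A full collection is expected to count for every generation.
        Console.WriteLine("Gen 1 collections during Collect(MaxGeneration): "
            + CollectionsTriggered(GC.MaxGeneration, 1));
    }
}
```

The differences can be larger than 1 if the runtime happens to start another collection on its own during the measurement.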

In general, you should not call GC.Collect; it is best to let the garbage collector decide when to collect according to its own algorithm. Still, if you are confident that you know better than the runtime when a collection should happen, you can call Collect. For example, a program might force a collection after saving a data file. Or, if your program has just finished with a large array of 10,000 elements that it no longer needs, you can set the reference to null and force a collection to relieve memory pressure.

GC also provides the WaitForPendingFinalizers method. It simply suspends the calling thread until the freachable queue has been emptied, that is, until the Finalize method of every object in the queue has run.
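A common use of WaitForPendingFinalizers is to make sure that finalizers have finished before the program continues, for example in a test. The following sketch assumes a hypothetical Tracked type whose finalizer only increments a counter; FinalizerWaitDemo, AllocateGarbage, and Run are likewise illustrative names.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

public static class FinalizerWaitDemo
{
    private static int finalized; // number of Finalize methods that have run so far

    // Hypothetical type whose finalizer records that it ran.
    private sealed class Tracked
    {
        ~Tracked() { Interlocked.Increment(ref finalized); }
    }

    // Allocates objects that are unreachable as soon as this method returns.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void AllocateGarbage(int count)
    {
        for (int i = 0; i < count; i++)
        {
            new Tracked();
        }
    }

    // Returns how many finalizers had run once the freachable queue was drained.
    public static int Run(int count)
    {
        AllocateGarbage(count);
        GC.Collect();                  // moves the unreachable objects to the freachable queue
        GC.WaitForPendingFinalizers(); // blocks until their Finalize methods have run
        return Volatile.Read(ref finalized);
    }
}
```

Without the call to WaitForPendingFinalizers, the counter could be read while the finalizer thread is still working through the queue.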

GC also provides two methods that return the generation an object is currently in:

    Int32 GC.GetGeneration(Object obj);
    Int32 GC.GetGeneration(WeakReference wr)

The first overload returns the generation of an ordinary object; the second returns the generation of the target of a weak reference.

The following code helps you understand the meaning of the generation:

    using System;

    class GenerationDemoProgram
    {
        private static void GenerationDemo()
        {
            // Let's see how many generations the GC heap supports (we know it's 2)
            Console.WriteLine("Maximum GC generations: " + GC.MaxGeneration);

            // Create a new GenObj in the heap
            GenObj obj = new GenObj();

            // Since this object is newly created, it should be in generation 0
            obj.DisplayGeneration();        // Displays 0

            // Performing a garbage collection promotes the object's generation
            GC.Collect();
            obj.DisplayGeneration();        // Displays 1

            GC.Collect();
            obj.DisplayGeneration();        // Displays 2

            GC.Collect();
            obj.DisplayGeneration();        // Displays 2 (max generation)

            obj = null;                     // Destroy the strong reference to this object

            GC.Collect(0);                  // Collect objects in generation 0
            GC.WaitForPendingFinalizers();  // We should see nothing

            GC.Collect(1);                  // Collect objects in generation 1
            GC.WaitForPendingFinalizers();  // We should see nothing

            GC.Collect(2);                  // Same as Collect()
            GC.WaitForPendingFinalizers();  // Now, we should see the Finalize method run

            Console.WriteLine("Demo stop: Understanding generations.");
        }

        static void Main()
        {
            GenerationDemo();
        }
    }

    class GenObj
    {
        public void DisplayGeneration()
        {
            Console.WriteLine("My generation is " + GC.GetGeneration(this));
        }

        ~GenObj()
        {
            Console.WriteLine("My Finalize method called");
        }
    }

Multithreaded performance optimizations in the garbage collector

The preceding sections explained the GC algorithm and its optimizations under the assumption of a single thread. In a real program, multiple threads are likely to be running and manipulating objects on the managed heap at the same time. When one thread triggers a garbage collection, all other threads must stop accessing any objects (including object references on their own stacks), because the garbage collector may move objects and thereby change their memory addresses.

Therefore, when the garbage collector starts a collection, all threads executing managed code must be suspended. The runtime has several mechanisms for suspending threads safely so that a collection can take place. I will not go into the internals here, but Microsoft will keep refining these mechanisms to reduce the performance cost of garbage collection.

The following paragraphs describe how the garbage collector works when an application has multiple threads:

Fully interruptible code: When a garbage collection starts, all application threads are suspended. Using tables generated by the just-in-time (JIT) compiler, the garbage collector then determines where each thread stopped, which object references the code is currently accessing, and where those references are stored (in a variable, a CPU register, and so on).

Hijacking: The garbage collector can modify a thread's stack so that the return address points to a special function. When the currently executing method returns, this special function runs and suspends the thread. Changing a thread's execution path in this way is called hijacking. When the collection is complete, the thread resumes and returns to the method that originally called it.

Safe points: When the JIT compiler compiles a method, it can insert code at certain points that checks whether a garbage collection is pending. If so, the thread suspends itself, waits for the collection to complete, and then resumes execution. The locations where the JIT compiler inserts these checks are called "safe points".

Note that hijacking allows a thread that is executing unmanaged code to keep running while a garbage collection takes place. This is safe as long as the unmanaged code does not access objects on the managed heap. If such a thread returns to managed code during a collection, it is hijacked and suspended until the collection is complete.

In addition to the mechanisms just mentioned, the garbage collector has other improvements that speed up object allocation and collection in multithreaded programs.

Synchronization-free allocations: On a multiprocessor system, generation 0 of the managed heap is split into several memory arenas, one per thread. This allows multiple threads to allocate objects at the same time without any thread needing exclusive access to the heap.

Scalable collections: On a multiprocessor system running the server version of the execution engine (MSCorSvr.dll), the managed heap is split into several sections, one per CPU. When a collection is initiated, the collector runs one thread per CPU and each thread collects its own section. The workstation version of the execution engine (MSCorWks.dll) does not support this feature.
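Whether a process is currently running with the server or the workstation flavor of the garbage collector can be checked at run time through the GCSettings class. The sketch below only reports the mode; the mode itself is chosen through runtime configuration, not by this code. GcModeReport and Describe are names chosen for this example.

```csharp
using System;
using System.Runtime;

public static class GcModeReport
{
    // Returns a one-line description of the GC flavor used by this process.
    public static string Describe()
    {
        string mode = GCSettings.IsServerGC ? "server" : "workstation";
        return mode + " GC, " + Environment.ProcessorCount + " logical processor(s)";
    }
}
```

Calling Console.WriteLine(GcModeReport.Describe()) at startup is a simple way to confirm that a deployment uses the intended mode.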

Large object collection

This part is not translated here; a separate article is devoted to it.

Monitoring garbage collection

If the .NET Framework is installed, the Windows performance monitor offers a performance object named .NET CLR Memory. You can select a program from the instance list to observe it.

The counters and their meanings are listed below; each counter name is followed by its description.

# Bytes in all Heaps

Displays the sum of the Gen 0 heap size, Gen 1 heap size, Gen 2 heap size, and Large Object Heap size counters. This counter indicates the memory currently allocated on the garbage collection heaps, in bytes.

# GC Handles

Displays the current number of garbage collection handles in use. Garbage collection handles are handles to resources outside the common language runtime and the managed environment.

# Gen 0 Collections

Displays the number of times generation 0 objects (that is, the youngest, most recently allocated objects) have been collected since the application started.

A generation 0 collection occurs when the available memory in generation 0 is not sufficient to satisfy an allocation request. This counter is incremented at the end of a generation 0 collection. Higher-generation collections include all lower-generation collections, so this counter is also explicitly incremented when a higher-generation (generation 1 or 2) collection occurs.

This counter displays the last observed value. The _Global_ counter value is not accurate and should be ignored.

# Gen 1 Collections

Displays the number of times generation 1 objects have been collected since the application started.

This counter is incremented at the end of a generation 1 collection. Higher-generation collections include all lower-generation collections, so this counter is also explicitly incremented when a higher-generation (generation 2) collection occurs.

This counter displays the last observed value. The _Global_ counter value is not accurate and should be ignored.

# Gen 2 Collections

Displays the number of times generation 2 objects have been collected since the application started. This counter is incremented at the end of a generation 2 collection (also called a full garbage collection).

This counter displays the last observed value. The _Global_ counter value is not accurate and should be ignored.

# Induced GC

Displays the peak number of times a garbage collection was performed because of an explicit call to GC.Collect. It is good practice to let the garbage collector tune the frequency of its collections.

# of Pinned Objects

Displays the number of pinned objects encountered in the last garbage collection. A pinned object is an object that the garbage collector cannot move in memory. This counter tracks pinned objects only in the heaps that are collected. For example, a generation 0 collection enumerates only the pinned objects in the generation 0 heap.

# of Sink Blocks in use

Displays the current number of synchronization blocks in use. Synchronization blocks are per-object data structures allocated to store synchronization information. They hold weak references to managed objects and must be scanned by the garbage collector. Synchronization blocks are not limited to storing synchronization information; they can also store COM interop metadata. This counter indicates performance problems related to heavy use of synchronization primitives.

# Total committed Bytes

Displays the amount of virtual memory, in bytes, currently committed by the garbage collector. Committed memory is physical memory for which space has been reserved in the disk paging file.

# Total reserved Bytes

Displays the amount of virtual memory, in bytes, currently reserved by the garbage collector. Reserved memory is virtual memory space reserved for the application for which no disk or main memory pages have been used yet.

% Time in GC

Displays the percentage of elapsed time that was spent performing garbage collection since the last garbage collection cycle. This counter usually indicates the work done by the garbage collector to collect and compact memory on behalf of the application. It is updated only at the end of every garbage collection. It is not an average; its value reflects the last observed value.

Allocated Bytes/second

Displays the number of bytes allocated per second on the garbage collection heap. This counter is updated at the end of every garbage collection, not at each allocation. It is not an average over time; it displays the difference between the values observed in the last two samples divided by the duration of the sample interval.

Finalization Survivors

Displays the number of garbage-collected objects that survive a collection because they are waiting to be finalized. If these objects hold references to other objects, those objects also survive, but this counter does not count them. The Promoted Finalization-Memory from Gen 0 and Promoted Finalization-Memory from Gen 1 counters represent all the memory that survived because of finalization.

This counter is not cumulative; at the end of every garbage collection it is updated with the count of the survivors of that particular collection only. It indicates the extra overhead the application may incur because of finalization.

Gen 0 heap size

Displays the maximum number of bytes that can be allocated in generation 0. It does not indicate the number of bytes currently allocated in generation 0.

A generation 0 collection occurs when the allocations since the last collection exceed this size. The generation 0 size is tuned by the garbage collector and can change while the application runs. At the end of a generation 0 collection, the size of the generation 0 heap is 0 bytes. This counter displays the size, in bytes, of the allocations that will trigger the next generation 0 collection.

This counter is updated at the end of a garbage collection, not at each allocation.

Gen 0 Promoted Bytes/Sec

Displays the number of bytes per second that are promoted from generation 0 to generation 1. Memory is promoted when it survives a garbage collection. This counter is an indicator of relatively long-lived objects being created per second.

This counter displays the difference between the values observed in the last two samples divided by the duration of the sample interval.

Gen 1 heap size

Displays the current number of bytes in generation 1. This counter does not display the maximum size of generation 1. Objects are not allocated directly in this generation; they are promoted from previous generation 0 collections. This counter is updated at the end of a garbage collection, not at each allocation.

Gen 1 Promoted Bytes/Sec

Displays the number of bytes per second that are promoted from generation 1 to generation 2. This counter does not include objects that are promoted only because they are waiting to be finalized.

Memory is promoted when it survives a garbage collection. Nothing is promoted from generation 2 because it is the oldest generation. This counter is an indicator of very long-lived objects being created per second.

This counter displays the difference between the values observed in the last two samples divided by the duration of the sample interval.

Gen 2 heap size

Displays the current number of bytes in generation 2. Objects are not allocated directly in this generation; they are promoted from generation 1 during previous generation 1 collections. This counter is updated at the end of a garbage collection, not at each allocation.

Large Object Heap size

Displays the current size, in bytes, of the large object heap. The garbage collector treats objects larger than approximately 85,000 bytes as large objects and allocates them directly in a special heap. They are not promoted through the generations. This counter is updated at the end of a garbage collection, not at each allocation.

Promoted Finalization-Memory from Gen 0

Displays the number of bytes of memory that are promoted from generation 0 to generation 1 only because they are waiting to be finalized. This counter is not cumulative; it displays the value observed at the end of the last garbage collection.

Promoted Finalization-Memory from Gen 1

Displays the number of bytes of memory that are promoted from generation 1 to generation 2 only because they are waiting to be finalized. This counter is not cumulative; it displays the value observed at the end of the last garbage collection. If the last garbage collection was a generation 0 collection only, this counter is reset to 0.

Promoted Memory from Gen 0

Displays the number of bytes of memory that survive garbage collection and are promoted from generation 0 to generation 1. This counter does not include objects that are promoted only because they are waiting to be finalized. It is not cumulative; it displays the value observed at the end of the last garbage collection.

Promoted Memory from Gen 1

Displays the number of bytes of memory that survive garbage collection and are promoted from generation 1 to generation 2. This counter does not include objects that are promoted only because they are waiting to be finalized. It is not cumulative; it displays the value observed at the end of the last garbage collection. If the last garbage collection was a generation 0 collection only, this counter is reset to 0.

This table is from MSDN.

End of article. Original: http://msdn.microsoft.com/zh-cn/magazine/bb985011(en-us).aspx
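Some of these figures can also be sampled from inside the process, without the performance monitor. The following sketch uses GC.CollectionCount and GC.GetTotalMemory; treating them as rough counterparts of the "# Gen N Collections" and "# Bytes in all Heaps" counters is an assumption made for this example, and the values are not guaranteed to match what the counters report. GcSnapshot and Take are illustrative names.

```csharp
using System;

public static class GcSnapshot
{
    // Returns a one-line summary of collection counts and the managed heap size.
    public static string Take()
    {
        return string.Format(
            "gen0={0} gen1={1} gen2={2} heapBytes={3}",
            GC.CollectionCount(0),
            GC.CollectionCount(1),
            GC.CollectionCount(2),
            GC.GetTotalMemory(false)); // false: do not force a collection first
    }
}
```

Logging such a snapshot at intervals gives a coarse picture of how often each generation is collected while the application runs.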

Original Author: Jeffrey Richter. Thanks to the original author.
