Making the Garbage Collector Work Efficiently (1)

How to improve the efficiency of garbage collection

This article explains how to use managed memory efficiently. It is not an introduction to the GC itself but a guide to using it well; I assume most readers care more about making good use of the GC than about implementing one themselves. A basic understanding of the GC is required. Jeff Richter wrote two great articles about the GC, and I have translated both. If you do not know the GC at all, read parts 1 and 2 of the garbage collection principles essays first.

I will first focus on the workstation GC (so all the numbers are for the workstation GC). Then I will discuss the differences between the workstation GC and the server GC (sometimes you do not get to choose, and I will explain why later).

Generations:

Objects on the managed heap are divided into three generations to optimize garbage collection performance. Most objects die in generation 0. In a server program, for example, the objects created to process a request die once the request completes. Generation 1 is essentially a buffer between newly allocated objects and long-lived, resident objects. When you look at the performance counters, the number of Gen2 collections should be small compared with the number of Gen0 collections. The number of Gen1 collections matters little, because collecting Gen1 is not much more expensive than collecting Gen0. A Gen2 collection, by contrast, means scanning the entire managed heap, which costs far more.
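As a rough illustration of the counters mentioned above, the sketch below reads the per-generation collection counts from inside the process with `GC.CollectionCount`, `GC.MaxGeneration`, and `GC.GetGeneration`. The class and method names (`GenerationStats`, `Gen2ToGen0Ratio`, and so on) are invented for this example, and it is not a substitute for the performance counters.

```csharp
using System;

public static class GenerationStats
{
    // Returns how many times each generation has been collected so far.
    public static int[] CollectionCounts()
    {
        var counts = new int[GC.MaxGeneration + 1];
        for (int gen = 0; gen <= GC.MaxGeneration; gen++)
        {
            counts[gen] = GC.CollectionCount(gen);
        }
        return counts;
    }

    // Ratio of Gen2 to Gen0 collections; a low value is the healthy case.
    // Gen2 is read first so the ratio cannot exceed 1 if a GC runs in between.
    // Returns 0 when no Gen0 collection has happened yet.
    public static double Gen2ToGen0Ratio()
    {
        int gen2 = GC.CollectionCount(GC.MaxGeneration);
        int gen0 = GC.CollectionCount(0);
        return gen0 == 0 ? 0.0 : (double)gen2 / gen0;
    }

    // A freshly allocated small object starts out in generation 0.
    public static int GenerationOfNewObject()
    {
        var fresh = new object();
        return GC.GetGeneration(fresh);
    }
}
```

A collection of a higher generation also collects the lower ones, so the Gen0 count includes every Gen1 and Gen2 collection; that is why the ratio stays between 0 and 1.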

GC segments:

First, let's look at how the GC obtains memory from the operating system. The GC reserves memory in segments. Each segment is 16 MB (64 MB in server mode). When the execution engine starts, it reserves the initial GC segments: one for small objects and one for large objects. Garbage collection of the large object heap is covered in a separate essay.

Memory inside a segment is committed and decommitted as needed. When the existing segments are used up, a new segment is reserved. Segments that are no longer in use are returned to the operating system after each full garbage collection.

Large objects have their own segments. The garbage collector handles large objects differently from small objects, so the two never share a segment.

Allocation:

What does it cost to allocate an object on the managed heap? Leaving collection aside, there are two things: (1) moving the allocation pointer forward, and (2) clearing the memory for the new object. For an object that implements a Finalize method, a pointer to it must also be placed on the finalization queue.

Note that I said "leaving collection aside". That implies the cost of allocation is proportional to the amount you allocate. The less you allocate, the less work the GC has to do. If you need 15 bytes, ask for 15 bytes; do not round up to 32 bytes the way you might with malloc. There is an allocation threshold, and exceeding it triggers a garbage collection. You want to trigger collections as rarely as possible.

The GC heap also differs from the NT heap in one respect: objects allocated close together in time end up close together on the GC heap, which preserves locality.

Each object allocated on the GC heap carries 8 bytes of overhead: 4 bytes for the sync block and 4 bytes for the method table pointer. These figures are for 32-bit platforms.

Recycling:

First, when is a collection triggered? There are three cases:
1. Allocation exceeds the Gen0 threshold.
2. The GC.Collect() method is called.
3. The operating system sends a low-memory signal to the application.

Case 1 is the most typical trigger. Once enough objects have been allocated, a Gen0 collection is triggered. After each collection, Gen0 is empty. Allocation then resumes, and when Gen0 fills up again the next collection is triggered.
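Case 1 can be observed with a small experiment. The sketch below keeps allocating 1 KB buffers until `GC.CollectionCount(0)` changes, then reports how many buffers it took. The names `Gen0TriggerDemo` and `AllocateUntilGen0Collection` and the 1 KB buffer size are choices made for this example; the actual Gen0 threshold depends on the runtime and the machine, so no particular count should be expected.

```csharp
using System;

public static class Gen0TriggerDemo
{
    // Static sink so every buffer is a real heap allocation that escapes the loop.
    private static byte[] _sink = Array.Empty<byte>();

    // Allocates 1 KB buffers until the Gen0 collection count changes or until
    // maxAllocations is reached. Returns the number of buffers allocated.
    public static long AllocateUntilGen0Collection(long maxAllocations = 1_000_000)
    {
        int before = GC.CollectionCount(0);
        long allocated = 0;
        while (allocated < maxAllocations && GC.CollectionCount(0) == before)
        {
            _sink = new byte[1024];
            allocated++;
        }
        return allocated;
    }
}
```

Calling `Gen0TriggerDemo.AllocateUntilGen0Collection()` a few times in a row gives a feel for how much allocation fits between two Gen0 collections on a given machine. The upper bound only keeps the loop from running indefinitely.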

You should avoid case 2. That is easy: do not call GC.Collect in your code. In general you should not call Collect at all; the BCL is basically the only place that should call it (and only in very limited places). When you call GC.Collect in your program, performance suffers because collections run ahead of schedule, and the garbage collector's own schedule is already tuned for best performance.

Case 3 is caused by other programs running on the operating system, so your program cannot control it. All you can do is optimize your own program or module as much as possible.

Let's talk about what this means. First, the managed heap is part of your program's working set, and it consumes private pages. In the ideal case every object dies in generation 0, so almost everything is reclaimed by Gen0 collections, a full collection never happens, and the GC heap never grows beyond the Gen0 size. In practice that does not happen, so you need to keep the size of the managed heap under control.

Second, you need to keep the time spent in garbage collection under control. That means triggering collections as rarely as possible, and triggering higher-generation collections even more rarely. A higher-generation collection costs much more than a lower-generation one, because it has to scan more objects and it also collects all the lower generations at the same time.

CLRProfiler is a tool for inspecting the GC heap and seeing which objects on the heap are referenced by which other objects. It is very good.

How to organize your data:

1) Value types versus reference types

As you know, value type data typically lives on the stack (or inline in the object that contains it), while reference type objects live on the managed heap. People therefore ask when to use a value type and when to use a reference type. A value type by itself does not trigger garbage collection, but if it is boxed often, boxing is much more expensive than creating a reference type object in the first place. A value type is also copied whenever it is passed as a parameter. On the other hand, if your type has only one small member, making it a reference type adds 4 bytes of pointer overhead plus the sync block and method table overhead. So the choice between a value type and a reference type depends on the type itself and how it is used.
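To make the boxing cost concrete, here is a minimal sketch contrasting a non-generic `ArrayList`, which boxes every `int` it stores, with a generic `List<int>`, which stores the values inline. The class and method names (`BoxingDemo`, `FillBoxed`, `FillUnboxed`, `BoxesAreDistinct`) are invented for illustration.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class BoxingDemo
{
    // ArrayList stores object references, so each Add boxes the int:
    // one extra heap object per element.
    public static ArrayList FillBoxed(int count)
    {
        var list = new ArrayList(count);
        for (int i = 0; i < count; i++)
        {
            list.Add(i); // int -> object, allocates a box
        }
        return list;
    }

    // List<int> keeps the values inline in its backing array: no boxing.
    public static List<int> FillUnboxed(int count)
    {
        var list = new List<int>(count);
        for (int i = 0; i < count; i++)
        {
            list.Add(i);
        }
        return list;
    }

    // Boxing the same value twice produces two separate heap objects.
    public static bool BoxesAreDistinct(int value)
    {
        object first = value;
        object second = value;
        return !ReferenceEquals(first, second);
    }
}
```

With `FillBoxed(n)` the GC has to track n small objects in addition to the list itself; with `FillUnboxed(n)` it tracks only the list and its single backing array.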

2) Reference-rich objects
An object that holds many references puts pressure on both allocation and collection. Each embedded object carries an additional 8 bytes of overhead, and because allocation cost is proportional to the amount allocated, that overhead adds up. Many references also increase the time needed to build the object graph, which makes collection more expensive.

So when you design an object, include only the fields you really need. If a strong reference to another reference type object is not necessary, do not hold one. Try to avoid having long-lived objects reference newly allocated objects.

3) Finalizable objects (objects that implement a Finalize method)
As described in part 1 of the garbage collection principles essays, a finalizable object takes longer to be collected. This extends the lifetime not only of the finalizable object itself but also of every object reachable from it. So if an object must be finalizable, isolate it as much as possible so that it does not reference other objects.
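One way to apply this advice is sketched below, under assumptions of my own: the only finalizable type is a small leaf class that holds nothing but a native pointer, and the larger object that owns it has no finalizer at all. `NativeBuffer` and `Document` are hypothetical names; `Marshal.AllocHGlobal` merely stands in for whatever unmanaged resource needs cleanup.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

// Small leaf object: the only finalizable type, and it references
// no other managed objects.
public sealed class NativeBuffer : IDisposable
{
    private IntPtr _memory;

    public NativeBuffer(int bytes)
    {
        _memory = Marshal.AllocHGlobal(bytes);
    }

    public bool IsReleased => _memory == IntPtr.Zero;

    public void Dispose()
    {
        Release();
        GC.SuppressFinalize(this); // already cleaned up, no finalization needed
    }

    ~NativeBuffer()
    {
        Release(); // safety net if Dispose was never called
    }

    private void Release()
    {
        if (_memory != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_memory);
            _memory = IntPtr.Zero;
        }
    }
}

// Larger object graph: not finalizable itself, so its managed fields
// are not kept alive by finalization.
public sealed class Document : IDisposable
{
    private readonly NativeBuffer _buffer = new NativeBuffer(256);
    private readonly List<string> _lines = new List<string>();

    public void AddLine(string line) => _lines.Add(line);

    public int LineCount => _lines.Count;

    public bool IsDisposed => _buffer.IsReleased;

    public void Dispose() => _buffer.Dispose();
}
```

If a `Document` is dropped without being disposed, only the tiny `NativeBuffer` waits for finalization; the list and its strings can be reclaimed right away. In production code a `SafeHandle`-derived type usually plays the role of `NativeBuffer`.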

4) Object locality
When an object has child objects, it is best to allocate the parent and its children at the same time. Their addresses on the managed heap will then be close together, they will be collected together, and efficiency will be correspondingly higher.

Large objects:

An object of 85,000 bytes or more is allocated on the large object heap (LOH). LOH segments are never compacted; they are only swept, using a free list. However, this is an implementation detail and you should not rely on it. If you allocate a large object and need it not to move, you should pin it.
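Pinning can be done explicitly with `GCHandle`. The helper below is a minimal sketch: `PinningDemo` and `WithPinned` are names made up for this example, and the callback shape is just one convenient way to guarantee that the handle is always freed.

```csharp
using System;
using System.Runtime.InteropServices;

public static class PinningDemo
{
    // Pins the buffer, hands its fixed address to the callback,
    // and always unpins afterwards.
    public static void WithPinned(byte[] buffer, Action<IntPtr> use)
    {
        GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            use(handle.AddrOfPinnedObject());
        }
        finally
        {
            handle.Free(); // unpin so the GC can manage the buffer normally again
        }
    }
}
```

The address passed to the callback is only valid while the handle is allocated, so it should not be stored and used after `WithPinned` returns. Keeping the pinned period short also limits the fragmentation that pinning can cause.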

Large objects are collected only during a Gen2 collection, and Gen2 collections are very expensive. Sometimes you will find that the Gen2 heap size barely changes after a Gen2 collection. That is probably because the large object heap exceeded its threshold and triggered the Gen2 collection.

A good practice is to allocate a large object once and reuse it. If you need a large buffer whose required size varies from one use to the next, allocate one at the largest size you need and keep reusing it. Repeatedly allocating temporary large objects can trigger Gen2 collections, which hurts performance.
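A minimal sketch of this reuse pattern follows. `ReusableBuffer` and `Rent` are hypothetical names for this example; the class keeps a single array and replaces it only when a request exceeds its current length.

```csharp
using System;

public sealed class ReusableBuffer
{
    private byte[] _buffer;

    public ReusableBuffer(int initialCapacity)
    {
        _buffer = new byte[initialCapacity];
    }

    // Returns the cached buffer when it is big enough; allocates a new,
    // larger one only when a bigger size is requested.
    public byte[] Rent(int minimumSize)
    {
        if (_buffer.Length < minimumSize)
        {
            _buffer = new byte[minimumSize];
        }
        return _buffer;
    }
}
```

The returned array may be longer than requested and still holds data from earlier uses, so the caller has to track how many bytes are valid. This sketch is not thread-safe; sizing the initial capacity for the largest expected request means the large allocation happens once.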

This article is a translation.
Original address: http://blogs.msdn.com/b/maoni/archive/2004/06/15/156626.aspx
Original author: Maoni Stephens

 

Related essays:

.NET garbage collection mechanism principle (1)

.NET garbage collection mechanism principle (2)

.NET garbage collection and large object handling
