Garbage Collector basics and performance tips

Contents
    • Too many allocations
    • Too-large allocations
    • Too many pointers
    • Too many roots
    • Too many object writes
    • Too many almost-long-lived objects
    • How finalization affects collection
    • Finalization and performance
    • IDisposable and Dispose
Performance

Now that we have a basic model of how things work, let's consider what can go wrong and make it slow. That will give us a good idea of what to avoid in order to get the best performance out of the collector.

Too many allocations

This is really the most basic thing that can go wrong. Allocating new memory with the garbage collector is very fast. As Figure 2 above shows, all that typically needs to happen is for the allocation pointer to move, creating space for the new object on the "allocated" side; it does not get much faster than that. However, sooner or later a garbage collection has to happen and, all things being equal, it is better for that to happen later rather than sooner. So when you create new objects, make sure that doing so is really necessary and appropriate, even though creating a single object is fast.

This may sound like obvious advice, but it is remarkably easy to forget that one small line of code can trigger many allocations. For example, suppose you are writing a comparison function, your objects have a keywords field, and you want the comparison to be case-insensitive on the keywords in the order given. You cannot simply compare the entire keywords string, because the first keyword might be very short. It is tempting to use String.Split to break the keywords string into pieces and then compare each piece in order using the normal case-insensitive comparison. Sounds great, right?

Well, as it turns out, this is not such a good idea. String.Split creates an array of strings, which means one new string object for every keyword in the original keywords string, plus one more object for the array. Yikes! If you do this in the context of a sort, there will be a lot of comparisons, and your two-line comparison function now creates a very large number of temporary objects. Suddenly the garbage collector is working very hard on your behalf, and even with the cleverest collection scheme there is simply a lot of trash to clean up. It is better to write a comparison function that does not need to allocate at all.
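As a rough illustration of such an allocation-free comparison, here is a minimal sketch. It assumes a modern .NET runtime where ReadOnlySpan<char> is available, and it assumes that keywords are separated by commas; neither assumption comes from the original text. The class name KeywordComparer and its helper NextKeyword are hypothetical.

```csharp
using System;

public static class KeywordComparer
{
    // Compares two comma-separated keyword lists keyword by keyword,
    // ignoring case, without creating any strings or arrays.
    // (The comma separator is an assumption made for this sketch.)
    public static int Compare(string a, string b)
    {
        ReadOnlySpan<char> restA = a.AsSpan();
        ReadOnlySpan<char> restB = b.AsSpan();

        while (true)
        {
            ReadOnlySpan<char> wordA = NextKeyword(ref restA, out bool moreA);
            ReadOnlySpan<char> wordB = NextKeyword(ref restB, out bool moreB);

            int result = wordA.CompareTo(wordB, StringComparison.OrdinalIgnoreCase);
            if (result != 0) return result;

            if (!moreA || !moreB)
            {
                // At least one list is exhausted; the shorter list sorts first.
                return moreA ? 1 : (moreB ? -1 : 0);
            }
        }
    }

    // Slices off the next keyword and advances 'rest' past the separator.
    private static ReadOnlySpan<char> NextKeyword(ref ReadOnlySpan<char> rest, out bool more)
    {
        int comma = rest.IndexOf(',');
        if (comma < 0)
        {
            more = false;
            ReadOnlySpan<char> last = rest;
            rest = ReadOnlySpan<char>.Empty;
            return last;
        }

        more = true;
        ReadOnlySpan<char> word = rest.Slice(0, comma);
        rest = rest.Slice(comma + 1);
        return word;
    }
}
```

Each keyword is viewed as a slice of the original string, so a sort that calls this function many times adds no temporary objects for the collector to clean up.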

Too-large allocations

When working with a traditional allocator such as malloc(), programmers often write code that calls malloc() as few times as possible, because they know that the cost of allocation is comparatively high. This translates into the practice of allocating in chunks, often speculatively allocating objects that might be needed, so that fewer allocations are made in total. The pre-allocated objects are then managed manually from some kind of pool, effectively creating a high-speed custom allocator.

In the managed world, this practice is much less compelling, for several reasons:

First, the cost of an allocation is extremely low. There is no searching for free blocks as with traditional allocators; all that needs to happen is for the boundary between the free and allocated areas to move. The low cost of allocation means that the most compelling reason to pool simply is not present.

Second, if you do choose to pre-allocate, you will of course make more allocations than your immediate needs require, which could in turn force additional garbage collections that might otherwise have been unnecessary.

Finally, the garbage collector cannot reclaim the space of objects that you are recycling manually, because from the global perspective all of those objects, including the ones not currently in use, are still live. You may find that a great deal of memory is wasted keeping ready-to-use but not in-use objects on hand.

This is not to say that pre-allocating is always a bad idea. You might do it to force certain objects to be allocated together initially, for instance. But you will likely find it less compelling as a general strategy than it would be in unmanaged code.

Too many pointers

If you create a data structure that is a large mesh of pointers, you will have two problems. First, there will be a lot of object writes (see Figure 3 below). Second, when it comes time to collect that data structure, the garbage collector has to follow all of those pointers and, if necessary, change them as objects move around. If your data structure is long-lived and does not change much, the collector only needs to visit all of those pointers when a full collection happens (at the gen2 level). However, if you create such a structure on a transitory basis, say as part of processing transactions, you will pay that cost much more often.

Figure 3. Data structure heavy in pointers

Data structures that are heavy in pointers can have other problems too, unrelated to garbage collection time. As discussed earlier, objects are allocated contiguously, in the order of allocation. This is great if you are building a large, possibly complex data structure by, for instance, restoring information from a file. Even though the data types are disparate, all of the objects will be close together in memory, which helps the processor access them quickly. However, as time passes and the data structure is modified, new objects will likely need to be attached to the old ones. Those new objects are created much later, so they are no longer near the original objects in memory. Even when the garbage collector compacts memory, the objects are not reordered; they merely "slide" together to remove the wasted space. The resulting disorder may get so bad over time that you are inclined to make a fresh copy of the whole data structure, all nicely packed, and let the collector reclaim the old, disorderly one in due course.
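The "fresh copy" idea can be sketched for the simplest pointer-based structure, a singly linked list. The Node and ListRepacker types below are hypothetical and exist only for illustration. The sketch relies on the behavior described above, that objects allocated one after another tend to sit next to each other in memory; actual placement is up to the runtime.

```csharp
public sealed class Node
{
    public int Value;
    public Node Next;

    public Node(int value) { Value = value; }
}

public static class ListRepacker
{
    // Builds a brand-new list with the same values in the same order.
    // The new nodes are allocated back to back in a single pass, so they
    // are likely to be neighbors in memory. The old list becomes garbage
    // once nothing refers to it any longer.
    public static Node CopyInOrder(Node head)
    {
        if (head == null) return null;

        Node newHead = new Node(head.Value);
        Node tail = newHead;
        for (Node current = head.Next; current != null; current = current.Next)
        {
            tail.Next = new Node(current.Value);
            tail = tail.Next;
        }
        return newHead;
    }
}
```

Whether such a copy pays off depends on how often the structure is traversed compared with how often it changes, so it is worth measuring before adopting it.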

Too many roots

The garbage collector must of course give roots special treatment at collection time: they always have to be enumerated and duly considered in turn. A gen0 collection can be fast only to the extent that you do not give it a flood of roots to consider. If you create a deeply recursive function that has many object pointers among its local variables, the result can be quite costly. The cost comes not only from having to consider all of those roots, but also from the extra-large number of gen0 objects that those roots may be keeping alive for not very long (discussed below).

Too many object writes

Referring once again to the earlier discussion, remember that every time a managed program modifies an object pointer, the write barrier code is also triggered. This can be bad for two reasons:

First, the cost of the write barrier may be comparable to the cost of what you were trying to do in the first place. If, for instance, you are doing simple operations in some kind of enumerator class, you may find that you need to move some key pointers from the main collection into the enumerator at every step. This is something you may want to avoid, because the write barrier effectively doubles the cost of copying those pointers around, and you may have to do it one or more times per loop on the enumerator.

Second, triggering write barriers is doubly bad if you are writing to older objects. As you modify older objects, you effectively create additional roots to check (discussed above) when the next garbage collection happens. If you modify enough old objects, you effectively negate the usual speed improvement that comes from collecting only the youngest generation.

These two reasons are, of course, in addition to the usual reasons for not doing too many writes in any sort of program. All things being equal, it is better to touch less memory (whether reading or writing) so as to make more economical use of the processor's cache.

Too many almost-long-lived objects

Finally, perhaps the biggest pitfall of the generational garbage collector is the creation of many objects that are neither exactly temporary nor exactly long-lived. These objects can cause a lot of trouble: a gen0 collection (the cheapest kind) will not clean them up because they are still needed, and they may even survive a gen1 collection because they are still in use, but they die soon after that.

The trouble is that once an object has reached the gen2 level, only a full collection can get rid of it, and full collections are costly enough that the garbage collector delays them for as long as reasonably possible. So the result of having many "almost-long-lived" objects is that gen2 tends to grow, potentially at an alarming rate. It may not get cleaned up nearly as quickly as you would like, and when it does get cleaned up, doing so will certainly cost a lot more than you might have wished.

To avoid these kinds of objects, your best lines of defense are:

    • Allocate as few objects as possible, with due attention to the amount of temporary space you are using.
    • Keep the sizes of longer-lived objects to a minimum.
    • Keep as few object pointers on the stack as possible (they are roots).

If you do these things, your gen0 collections are more likely to be highly effective, and gen1 will not grow very fast. As a result, gen1 collections can be done less frequently, and by the time it becomes prudent to do a gen1 collection, your medium-lifetime objects will already be dead and can be recovered cheaply.

If things are going well, then during steady-state operation your gen2 size will not increase at all!
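One way to check whether a workload behaves like this is to compare how often each generation is collected. The sketch below uses the runtime's GC.CollectionCount and GC.MaxGeneration members; the GcCounters helper class itself is hypothetical and is not part of the original article.

```csharp
using System;

public static class GcCounters
{
    // Returns the number of collections that have run so far for each
    // generation, indexed by generation number (0 .. GC.MaxGeneration).
    public static int[] Snapshot()
    {
        int[] counts = new int[GC.MaxGeneration + 1];
        for (int gen = 0; gen <= GC.MaxGeneration; gen++)
        {
            counts[gen] = GC.CollectionCount(gen);
        }
        return counts;
    }
}
```

Taking one snapshot before a representative stretch of work and another after it, then subtracting, shows how many collections of each generation that work caused. In a healthy steady state you would expect the gen0 difference to dominate and the gen2 difference to stay close to zero.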

Finalization

Now that we have covered a few topics with the simplified allocation model, I want to complicate things a little so that we can discuss one more important phenomenon: the cost of finalizers and finalization. Briefly, a finalizer can be present in any class. It is an optional member that the garbage collector promises to call on an otherwise dead object before it reclaims that object's memory. In C#, you use the ~Class syntax to specify the finalizer.

How finalization affects collection

When the garbage collector first encounters an object that is otherwise dead but still needs to be finalized, it must abandon its attempt to reclaim the space for that object at that time. Instead, the object is added to a list of objects needing finalization, and the collector must then ensure that all of the pointers within the object remain valid until finalization is complete. This is basically the same as saying that, from the collector's perspective, every object in need of finalization is like a temporary root object.

Once the collection is complete, the aptly named finalization thread goes through the list of objects needing finalization and invokes their finalizers. When this is done, the objects become dead once again and are collected in the normal way.

Finalization and performance

With this basic understanding of finalization, we can already deduce some very important things:

First, objects that need finalization live longer than objects that do not. In fact, they can live a lot longer. For instance, suppose an object in gen2 needs to be finalized. Finalization will be scheduled, but the object is still in gen2, so it will not be collected again until the next gen2 collection happens. That could be a very long time indeed. In fact, if things are going well it will be a long time, because gen2 collections are costly and we therefore want them to happen very infrequently. Older objects needing finalization may have to wait for dozens, if not hundreds, of gen0 collections before their space is reclaimed.

Second, objects that need finalization cause collateral damage. Because the internal object pointers must remain valid, not only do the objects that directly need finalization linger in memory, but so does everything those objects refer to, directly and indirectly. If a huge tree of objects is anchored by a single object that requires finalization, the entire tree lingers, potentially for a long time, as just discussed. It is therefore important to use finalizers sparingly and to place them on objects that have as few internal object pointers as possible. In the tree example just given, you can easily avoid the problem by moving the resources that need finalization into a separate object and keeping a reference to that object in the root of the tree. With that modest change, only the one object (hopefully a small one) lingers, and the cost of finalization is minimized.
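The tree example can be sketched as follows. The type names NativeHandleHolder and TreeRoot are hypothetical, and the IntPtr handle stands in for whatever resource actually needs finalization; the release step is left as a placeholder because the original does not say what the resource is.

```csharp
using System;
using System.Collections.Generic;

// Small object that owns the only thing needing finalization.
// It holds no references to other managed objects, so while it waits
// for its finalizer it keeps nothing else alive.
public sealed class NativeHandleHolder
{
    private IntPtr handle;

    public NativeHandleHolder(IntPtr handle) { this.handle = handle; }

    public bool IsReleased => handle == IntPtr.Zero;

    ~NativeHandleHolder() { Release(); }

    public void Release()
    {
        // Placeholder: a real implementation would hand the handle back
        // to whatever native API created it.
        handle = IntPtr.Zero;
    }
}

// Root of a potentially huge tree. It has no finalizer of its own, so
// the tree can be reclaimed as soon as it becomes unreachable.
public sealed class TreeRoot
{
    public NativeHandleHolder Resource { get; }
    public List<TreeRoot> Children { get; } = new List<TreeRoot>();

    public TreeRoot(NativeHandleHolder resource) { Resource = resource; }
}
```

With this layout, only the small holder object has to survive until its finalizer has run; the tree that refers to it does not.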

Finally, objects needing finalization create work for the finalizer thread. If your finalization process is complex, the one and only finalizer thread will spend a lot of time performing those steps, which can cause a backlog of work and therefore cause more objects to linger while waiting for finalization. It is therefore vitally important that finalizers do as little work as possible. Remember also that although all object pointers remain valid during finalization, those pointers may lead to objects that have already been finalized and may therefore be less than useful. It is generally safest to avoid following object pointers in finalization code, even though the pointers are valid. A safe, short finalization code path is best.

IDisposable and Dispose

In many cases, objects that would otherwise always need to be finalized can avoid that cost by implementing the IDisposable interface. This interface provides an alternative way to reclaim resources whose lifetime is well known to the programmer, which is actually quite common. Of course, it is better still if your objects use only memory and therefore need no finalization or disposal at all. But if finalization is necessary, and there are many cases where explicit management of your objects is easy and practical, then implementing the IDisposable interface can be a great way to avoid, or at least reduce, finalization costs.

In C#, this pattern can be quite useful:

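    using System;

    class X : IDisposable
    {
        public X()
        {
            // ... initialize resources ...
        }

        ~X()
        {
            ReleaseResources();
        }

        public void Dispose()
        {
            // Do the same cleanup the finalizer would do. C# does not allow
            // Finalize() or ~X() to be called directly, so both paths share
            // a helper method.
            ReleaseResources();

            // No need to finalize later.
            GC.SuppressFinalize(this);
        }

        private void ReleaseResources()
        {
            // ... release resources ...
        }
    }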

Here, a manual call to Dispose removes the need for the collector to keep the object alive and call the finalizer.
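In practice, the manual call is usually written with a C# using statement, which calls Dispose when the block exits. The following is a minimal sketch; TrackedResource and DisposeDemo are hypothetical names, and the IsDisposed flag exists only so that the effect can be observed.

```csharp
using System;

public sealed class TrackedResource : IDisposable
{
    public bool IsDisposed { get; private set; }

    ~TrackedResource() { Cleanup(); }

    public void Dispose()
    {
        Cleanup();
        // The cleanup has already been done, so the finalizer can be skipped.
        GC.SuppressFinalize(this);
    }

    private void Cleanup()
    {
        // Placeholder for releasing the real resource.
        IsDisposed = true;
    }
}

public static class DisposeDemo
{
    public static TrackedResource UseOnce()
    {
        TrackedResource seen;
        using (var resource = new TrackedResource())
        {
            seen = resource;
            // ... work with the resource ...
        } // Dispose() is called here, even if the body throws.
        return seen;
    }
}
```

Because Dispose calls GC.SuppressFinalize, an object released this way does not go through the finalization list at all and can be reclaimed like any other dead object.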

Http://www.microsoft.com/china/MSDN/library/netFramework/netframework/NFdotnetgcbasics.mspx