"C# Advanced Series" 20: Managed Heap and Garbage Collection

Source: Internet
Author: User

Managed Heap Foundation

An object is typically created by the IL instruction newobj, which allocates memory for it; the memory is then initialized by the type's instance constructor.

After the object has been used, any resource state it holds is cleaned up, and the garbage collector frees its memory.

Besides preventing the use of memory that has already been freed, the managed heap also reduces memory leaks. Most types need no resource cleanup at all, and the garbage collector frees their memory automatically.

Some types do need prompt cleanup, for example types that wrap native resources (files, sockets, database connections, and so on). For these classes you can call the Dispose method. (Some classes expose it under another name, such as Close on a database connection.)
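As a rough illustration of prompt cleanup, the sketch below wraps FileStream in using blocks so that Dispose runs as soon as each block exits. DisposeDemo and RoundTrip are names made up for this example, and the scratch file path is an assumption.

```csharp
using System;
using System.IO;
using System.Text;

public static class DisposeDemo
{
    // Writes text to a file and reads it back. Each FileStream is disposed
    // (and its native file handle closed) as soon as its using block exits.
    public static string RoundTrip(string path, string text)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(text);

        using (var writer = new FileStream(path, FileMode.Create, FileAccess.Write))
        {
            writer.Write(bytes, 0, bytes.Length);
        } // writer.Dispose() runs here, even if Write throws

        using (var reader = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            var buffer = new byte[reader.Length];
            int read = reader.Read(buffer, 0, buffer.Length);
            return Encoding.UTF8.GetString(buffer, 0, read);
        }
    }

    public static void Main()
    {
        string path = Path.GetTempFileName(); // scratch file used only for this demo
        try
        {
            Console.WriteLine(RoundTrip(path, "hello"));
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Because the writer is disposed before the reader opens the file, the second FileStream does not have to wait for a garbage collection to release the handle.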

Allocating resources on the managed heap

The CLR requires that all objects are allocated from the managed heap.

When a process is initialized, the CLR reserves a region of address space as the managed heap. The CLR also maintains a pointer, NextObjPtr, which points to where the next object will be allocated in the heap.

When a region fills up with non-garbage objects, the CLR allocates more regions, and this repeats until the entire process address space is full. A 32-bit process can allocate at most 1.5 GB, and a 64-bit process at most 8 TB.

When an object is created, the CLR first calculates the number of bytes required for the fields of its type (including those inherited from base classes), plus the number of bytes required for the object's overhead (the type object pointer and the sync block index).

The CLR then checks whether the region has enough free bytes for the object. If it does, the object is placed at the address NextObjPtr points to, and NextObjPtr is advanced by the object's size to give the address where the next object will be placed.
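The allocation steps above can be modelled with a toy bump-pointer allocator. This is only an illustration, not CLR code: BumpHeap and its members are hypothetical names, and the 8-byte overhead passed in Main is an assumed figure standing in for the type object pointer and sync block index.

```csharp
using System;

// Toy model of the allocation scheme: a single region plus a NextObjPtr offset.
public sealed class BumpHeap
{
    private readonly int _regionSize; // total bytes in the region
    private int _nextObjPtr;          // offset where the next object will be placed

    public BumpHeap(int regionSize) { _regionSize = regionSize; }

    public int NextObjPtr => _nextObjPtr;

    // Returns the offset of the new object, or -1 when the region cannot hold it
    // (the point at which the real CLR would start a garbage collection).
    public int Allocate(int fieldBytes, int overheadBytes)
    {
        int size = fieldBytes + overheadBytes;
        if (size > _regionSize - _nextObjPtr) return -1;

        int address = _nextObjPtr;
        _nextObjPtr += size;          // advance the pointer past the new object
        return address;
    }
}

public static class BumpHeapDemo
{
    public static void Main()
    {
        var heap = new BumpHeap(64);
        Console.WriteLine(heap.Allocate(16, 8)); // 0
        Console.WriteLine(heap.Allocate(16, 8)); // 24
        Console.WriteLine(heap.Allocate(16, 8)); // -1: only 16 bytes are left
    }
}
```

Allocation is just a bounds check and an addition, which is why allocating from the managed heap is so cheap as long as no collection is needed.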

Recycling resources through the garbage collector (GC)

A garbage collection is performed when the CLR finds, while creating an object, that there is not enough memory left to allocate it.

When the CLR performs a garbage collection, it first suspends all threads.

Mark phase: The CLR walks all objects in the heap and sets a bit in each object's sync block index field to 0, meaning that every object is initially considered garbage. It then examines all active roots (reference-type fields, method parameters, and local variables) to see which objects they refer to. For every root that references an object on the heap, the CLR marks that object by setting the bit in its sync block index field to 1, and then checks the object's own fields in the same way. If an object is already marked 1, its fields are not examined again. Objects marked 1 are reachable; objects still marked 0 are unreachable. At this point the CLR knows which objects can be deleted and which cannot.

Compaction phase: The CLR moves the marked objects in the heap so that they sit next to each other in contiguous memory, and updates every root that refers to a moved object to its new address. This reduces the application's working set and improves access performance, and it also frees up a large block of unused memory and eliminates heap fragmentation.

Finally, all threads are resumed.
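The mark and compaction phases can be sketched on a toy object graph. Node and ToyGc are hypothetical names, the Marked flag stands in for the bit in the sync block index, and the "heap" is just a list, so this shows the idea rather than how the CLR actually implements it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Node
{
    public string Name;
    public bool Marked;                           // stands in for the sync block index bit
    public List<Node> Fields = new List<Node>();  // references held by this object
    public Node(string name) { Name = name; }
}

public static class ToyGc
{
    // Mark phase: clear every mark, then mark everything reachable from the roots.
    public static void Mark(IEnumerable<Node> heap, IEnumerable<Node> roots)
    {
        foreach (Node n in heap) n.Marked = false;

        var pending = new Stack<Node>(roots);
        while (pending.Count > 0)
        {
            Node n = pending.Pop();
            if (n.Marked) continue;   // already marked: do not scan its fields again
            n.Marked = true;
            foreach (Node f in n.Fields) pending.Push(f);
        }
    }

    // Compaction phase: keep only marked objects, packed together in their original order.
    public static List<Node> Compact(IEnumerable<Node> heap)
    {
        return heap.Where(n => n.Marked).ToList();
    }

    public static void Main()
    {
        var a = new Node("A"); var b = new Node("B");
        var c = new Node("C"); var d = new Node("D");
        a.Fields.Add(c);
        c.Fields.Add(a);              // a cycle does not confuse the marker
        b.Fields.Add(d);              // B and D are referenced by no root

        var heap = new List<Node> { a, b, c, d };
        Mark(heap, new[] { a });      // A is the only root
        Console.WriteLine(string.Join(",", Compact(heap).Select(n => n.Name))); // A,C
    }
}
```

The early exit on an already marked node is what keeps the walk from looping forever when objects reference each other.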

An object referenced by a static field stays alive until the AppDomain that loaded the type is unloaded. A common cause of memory leaks is a static field that refers to a collection object to which items keep being added. The static field keeps the collection alive, and the collection keeps all of its items alive. You should therefore avoid static fields where possible. (Alternatively, following the rules described above, set the static field to null as soon as it is no longer needed, so that the object it referenced can be collected.)

Timer is a notorious garbage collection pitfall. A timer is supposed to invoke its callback at regular intervals, but by the rules described above, once the timer variable goes out of scope and nothing else references the Timer object, the timer is reclaimed at the next garbage collection and the callback stops being invoked. (So be careful with timers; this is a big trap.)
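One common way around the timer pitfall is to hold the Timer in a field that acts as a root and to dispose it explicitly when it is no longer needed. The sketch below assumes System.Threading.Timer; TimerDemo, Run, and the 50 ms period are choices made for this example only.

```csharp
using System;
using System.Threading;

public static class TimerDemo
{
    // A static field is a root, so the Timer stays reachable and keeps firing.
    private static Timer s_timer;
    private static int s_ticks;

    // Runs the timer for the given time and returns how many callbacks fired.
    public static int Run(int milliseconds)
    {
        s_ticks = 0;
        s_timer = new Timer(_ => Interlocked.Increment(ref s_ticks), null, 0, 50);

        GC.Collect();               // the timer survives because s_timer references it
        Thread.Sleep(milliseconds);

        s_timer.Dispose();          // stop the callbacks explicitly when done
        s_timer = null;
        return Volatile.Read(ref s_ticks);
    }

    public static void Main()
    {
        Console.WriteLine("Ticks: " + Run(500));
    }
}
```

If the Timer were stored only in a local variable that is never used again, the forced collection would be free to reclaim it, which is exactly the trap described above.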

Generation: Improving performance

The CLR's GC is a generation-based garbage collector. It makes the following assumptions about the code:

    • The newer an object is, the shorter its lifetime will be
    • The older an object is, the longer its lifetime will be
    • Collecting part of the heap is faster than collecting the entire heap

Generation 0: Objects newly added to the heap are generation 0 objects; the garbage collector has never examined them.

Generation 1: Generation 0 objects that have been through one garbage collection without being reclaimed are moved together into the generation 1 area during the compaction phase.

Generation 2: Generation 1 objects that have been through another garbage collection without being reclaimed are moved together into the generation 2 area during the compaction phase. There is no generation 3; generation 2 holds every object that has survived two or more garbage collections.

A garbage collection starts when the generation 0 area is full. The collector reclaims generation 0 and also checks whether generation 1 has exceeded its budget. If it has, generation 1 is collected too; if it has not, generation 1 is left alone, even if it contains unreachable objects.
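Promotion between generations can be observed with GC.GetGeneration. The sketch below forces two collections purely for demonstration; GenerationDemo and Observe are made-up names, and the exact generations reported may vary with the runtime and its settings.

```csharp
using System;

public static class GenerationDemo
{
    // Records the generation of one object before and after two forced collections.
    public static int[] Observe()
    {
        var o = new object();
        var seen = new int[3];

        seen[0] = GC.GetGeneration(o);   // a freshly allocated small object
        GC.Collect();
        seen[1] = GC.GetGeneration(o);   // it survived one collection
        GC.Collect();
        seen[2] = GC.GetGeneration(o);   // it survived a second collection

        GC.KeepAlive(o);                 // keep o reachable until this point
        return seen;
    }

    public static void Main()
    {
        Console.WriteLine("Highest generation: " + GC.MaxGeneration);
        Console.WriteLine(string.Join(" -> ", Observe()));
    }
}
```

On a typical desktop CLR the object starts in generation 0 and moves up one generation per surviving collection until it reaches GC.MaxGeneration.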

When the CLR initializes, it selects a memory budget for each of the three generations, which determines when that generation should be collected. The garbage collector is self-tuning, however.

In other words:

If the garbage collector finds that few objects survive a generation 0 collection, it may reduce the generation 0 budget. Collections then occur more often, but each one does less work, which reduces the working set. If no objects survive at all, compaction is skipped altogether.

If the garbage collector finds that many objects survive a generation 0 collection, it may increase the generation 0 budget. Collections then occur less often, but each one reclaims much more memory. (If a collection does not reclaim enough memory, the garbage collector performs a full collection, and if there is still not enough memory, it throws an OutOfMemoryException.)

The description above uses generation 0 as the example; generations 1 and 2 are tuned in the same way.

Garbage collection trigger conditions

The CLR triggers a GC when it detects that generation 0 has exceeded its budget. This is the most common trigger; the others are:

    • Code explicitly calls the static Collect method of System.GC
    • Windows reports a low-memory condition
    • The CLR is unloading an AppDomain
    • The CLR is shutting down
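The first of these extra triggers is easy to observe. The sketch below calls GC.Collect and uses GC.CollectionCount to confirm that a collection happened; CollectDemo and ForcedCollections are names invented for this example.

```csharp
using System;

public static class CollectDemo
{
    // Returns how many generation 0 collections a forced GC.Collect() added.
    public static int ForcedCollections()
    {
        int before = GC.CollectionCount(0);
        GC.Collect();   // explicit trigger: collects all generations
        return GC.CollectionCount(0) - before;
    }

    public static void Main()
    {
        Console.WriteLine("Collections caused: " + ForcedCollections());
    }
}
```

Forcing collections like this is useful for experiments, but in production code it usually works against the collector's self-tuning.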

Large objects

The CLR divides objects into large objects and small objects, with the boundary at 85,000 bytes.

Large objects are not allocated in the same address space as small objects; they are allocated elsewhere in the process address space.

The current version of the GC does not compact large objects, because moving them in memory is too expensive. (This can cause fragmentation.)

Large objects are always treated as generation 2, so you should create large objects only for resources that need to stay alive for a long time. Allocating short-lived large objects causes generation 2 to be collected more frequently, which hurts performance.
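A quick way to see the large object rule is to ask the GC which generation a freshly allocated array belongs to. LargeObjectDemo and GenerationOfNewArray are hypothetical names; the array sizes are chosen to sit on either side of the 85,000-byte boundary.

```csharp
using System;

public static class LargeObjectDemo
{
    // Allocates a byte array of the given length and reports its generation.
    public static int GenerationOfNewArray(int bytes)
    {
        var data = new byte[bytes];
        int generation = GC.GetGeneration(data);
        GC.KeepAlive(data);   // keep the array reachable while it is inspected
        return generation;
    }

    public static void Main()
    {
        Console.WriteLine("1,000 bytes:  generation " + GenerationOfNewArray(1000));
        Console.WriteLine("85,000 bytes: generation " + GenerationOfNewArray(85000));
    }
}
```

The large array is reported as generation 2 immediately, without having survived any collection.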

Garbage collection modes

The CLR chooses a GC mode when it starts, and the mode does not change until the process terminates:

    • Workstation mode
      • This mode tunes the GC for client applications. Collections have low latency, so application threads are suspended only briefly and the user is not kept waiting. In this mode the GC assumes that other applications are running on the machine and avoids consuming too many CPU resources.
    • Server mode
      • This mode tunes the GC for server-side applications, optimizing mainly for throughput and resource utilization. The GC assumes that no other applications are running on the machine and that all CPUs are available to help complete the collection. In this mode the managed heap is split into several sections, one per CPU. When a garbage collection starts, the garbage collector runs one dedicated thread per CPU, and each thread collects its own section concurrently with the other threads. Concurrent collection works well for server applications whose worker threads behave uniformly. This feature requires a multi-CPU computer so that the threads can truly run simultaneously and deliver a performance gain.

Applications run with the workstation GC by default. Server applications that host the CLR, such as ASP.NET and SQL Server, can ask the CLR to load the server GC, but on a single-processor machine the CLR always uses the workstation GC.

Standalone applications can tell the CLR to use server mode by adding the following to the configuration file:

    <configuration>
      <runtime>
        <gcServer enabled="true"/>
      </runtime>
    </configuration>

In addition to these two modes, the GC supports two sub-modes: concurrent (the default) and non-concurrent.

In concurrent mode, the GC has an additional thread that marks objects concurrently while the application runs.

Once the unreachable objects have been found, the GC has to decide whether to compact the heap. In concurrent mode it is more inclined not to compact, which improves performance but increases the application's working set. An application using the concurrent garbage collector therefore consumes more memory than one using the non-concurrent collector.

Add the following configuration to tell the CLR to use non-concurrent mode:

    <configuration>
      <runtime>
        <gcConcurrent enabled="false"/>
      </runtime>
    </configuration>
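To check which mode a process actually ended up with, the System.Runtime.GCSettings class can be queried at run time. In this sketch GcModeDemo and Describe are made-up names; the latency mode is reported as Interactive when concurrent collection is enabled and as Batch when it is disabled.

```csharp
using System;
using System.Runtime;

public static class GcModeDemo
{
    // Summarizes the GC mode chosen when the runtime started.
    public static string Describe()
    {
        string mode = GCSettings.IsServerGC ? "server" : "workstation";
        return mode + " GC, latency mode " + GCSettings.LatencyMode;
    }

    public static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```

Running this before and after changing the configuration file is a simple way to confirm that the settings were picked up.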

Using types that require special cleanup

Most types need only memory, but some also need native resources. For example, the System.IO.FileStream type has to open a file (a native resource) and keep a handle to it.

When an object of a type that wraps a native resource is garbage collected, the GC reclaims the memory the object uses in the managed heap. On its own this would leak the native resource, which the GC knows nothing about, so the CLR provides a mechanism called finalization that lets an object run some code after it has been judged to be garbage but before its memory is reclaimed.

Any type that wraps a native resource (a file, network connection, socket, or mutex) supports finalization. When the CLR determines that such an object is unreachable, the object finalizes itself, releasing the native resource it wraps. The GC then reclaims the object from the managed heap.

The C# syntax looks like a destructor, but it means something different:

    public class Troy
    {
        ~Troy()
        {
            // Code here runs before the object's memory is reclaimed.
            // The compiler wraps it in a try block and calls base.Finalize() in the finally block.
        }
    }

The compiler turns this syntax into a method named Finalize in the IL code.

The Finalize method of an object considered garbage is called only after the garbage collection completes, so the memory of such objects is not reclaimed immediately, because the Finalize method may execute code that accesses the object's fields.

A finalizable object must survive the collection that finds it, which promotes it to another generation, so it lives longer than it otherwise would. This increases memory consumption, so avoid finalization where possible.

How finalization works internally

When a new object is created, memory is allocated for it in the heap. If the object's type defines a Finalize method, a pointer to the object is placed on the finalization list just before the type's instance constructor is called.

The finalization list is an internal data structure controlled by the garbage collector. Each entry in the list points to an object whose Finalize method should be called before the object's memory is reclaimed.

Each time a collection runs, after the mark phase has identified the garbage objects, the GC scans the finalization list. If it finds a reference to a garbage object, the reference is removed from the finalization list and appended to the freachable queue. (At this point the object is no longer considered garbage and its memory cannot be reclaimed; this is known as resurrection.)

The freachable queue is another internal data structure of the garbage collector. Each reference in the queue points to an object whose Finalize method is ready to be called.

The CLR uses a special, high-priority dedicated thread to call Finalize methods, which avoids deadlocks.

While the freachable queue is empty, this thread sleeps. Once entries appear, the thread wakes up, removes each entry from the queue, and calls each object's Finalize method.

After marking, the GC goes on to the compaction phase and promotes the resurrected objects to the next generation.

The finalization thread then empties the freachable queue, executing each object's Finalize method.

At the next garbage collection, the finalization list no longer holds pointers to these objects, so they are now considered true garbage and are released.

Over the whole process, two garbage collections are needed to free the memory, and in practice more than two may be needed because the objects may have been promoted to an older generation.
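The two-collection sequence can be reproduced in a small experiment. Tracked, FinalizationDemo, and Run are hypothetical names, and the sketch assumes that an object created in a separate, non-inlined method has no remaining roots once that method returns; a given runtime may behave differently.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

public sealed class Tracked
{
    public static int FinalizedCount;

    // Compiled into a Finalize method; runs on the finalization thread.
    ~Tracked() { Interlocked.Increment(ref FinalizedCount); }
}

public static class FinalizationDemo
{
    // Kept out of line so that no reference to the object lingers in the caller.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void CreateGarbage() { new Tracked(); }

    // Returns how many Tracked objects were finalized during this call.
    public static int Run()
    {
        int before = Volatile.Read(ref Tracked.FinalizedCount);
        CreateGarbage();

        GC.Collect();                   // first collection: the object moves to the freachable queue
        GC.WaitForPendingFinalizers();  // wait for the finalization thread to drain the queue
        GC.Collect();                   // second collection: the memory can now be reclaimed

        return Volatile.Read(ref Tracked.FinalizedCount) - before;
    }

    public static void Main()
    {
        Console.WriteLine("Finalized: " + Run());
    }
}
```

Calling GC.WaitForPendingFinalizers between the two collections is what guarantees that Finalize has run before the second collection looks at the object again.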

Manually monitoring and controlling the lifetime of an object

The CLR provides each AppDomain with a GC handle table, which lets an application monitor and manually control an object's lifetime. This is fairly advanced and I do not expect to need it for now, so I will come back to it when I do.
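For reference, entries in the GC handle table are created through System.Runtime.InteropServices.GCHandle. The sketch below uses a Normal handle to control a lifetime by keeping an otherwise unreferenced object alive; GCHandleType.Weak is the variant used for monitoring. GcHandleDemo and NormalHandleKeepsAlive are names invented for this example.

```csharp
using System;
using System.Runtime.InteropServices;

public static class GcHandleDemo
{
    // A Normal handle acts as a root: the object survives a collection
    // even though no variable or field references it.
    public static bool NormalHandleKeepsAlive()
    {
        GCHandle handle = GCHandle.Alloc(new object(), GCHandleType.Normal);
        try
        {
            GC.Collect();
            return handle.Target != null;
        }
        finally
        {
            handle.Free();   // release the table entry so the object can be collected
        }
    }

    public static void Main()
    {
        Console.WriteLine("Alive after collection: " + NormalHandleKeepsAlive());
    }
}
```

Forgetting to call Free leaves the entry in the table, which keeps the object alive for the lifetime of the AppDomain.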

P.S.

These last two chapters went really slowly. Partly that is because I did not read over the weekend and some unexpected things came up, and partly because the book has now started on the CLR's core mechanisms; much of this I had never heard of before, and it feels like the difficulty is ramping up.

Along the way my keyboard mysteriously broke, and twice the machine shut down when I had a blog post half written. Only today did I discover that after a forced shutdown and reboot, the browser can restore the half-written post.

