A detailed description of .NET performance improvements

In .NET 4.6, several CLR features are related to performance improvements. Some of them take effect automatically, while others, such as SIMD and async local storage, require changes to the way you write your app.

SIMD

The Mono team has long been proud of its support for SIMD (Single Instruction, Multiple Data). SIMD is a CPU instruction set that can perform the same operation on multiple values at once, up to 8 values at a time. With CLR version 4.6, Windows .NET developers are finally able to use this feature as well.

To see the effect of SIMD, consider this example. Suppose you need to add two arrays element by element, c[i] = a[i] + b[i], to produce a third array. Using SIMD, you can write the code as follows:

    for (int i = 0; i < size; i += Vector<int>.Count)
    {
        Vector<int> v = new Vector<int>(a, i) + new Vector<int>(b, i);
        v.CopyTo(c, i);
    }

Note how the loop increments by Vector<int>.Count, whose value is 4 or 8 depending on the CPU. The .NET JIT compiler generates the appropriate code for the target CPU, adding the values in batches of 4 or 8.
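The loop above assumes the array length is an exact multiple of Vector<int>.Count. Below is a minimal sketch of a complete helper with a scalar loop for the leftover elements; the method name AddArrays is an illustrative assumption, not part of any .NET API:

    using System;
    using System.Numerics;

    static class SimdAdd
    {
        // Adds a and b element by element into c, vectorizing the bulk of the work.
        public static void AddArrays(int[] a, int[] b, int[] c)
        {
            int size = a.Length;
            int i = 0;

            // Process Vector<int>.Count elements (4 or 8, depending on the CPU) per iteration.
            for (; i <= size - Vector<int>.Count; i += Vector<int>.Count)
            {
                Vector<int> v = new Vector<int>(a, i) + new Vector<int>(b, i);
                v.CopyTo(c, i);
            }

            // Scalar loop for elements left over when size is not a multiple of the vector width.
            for (; i < size; i++)
            {
                c[i] = a[i] + b[i];
            }
        }
    }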

This approach looks cumbersome, so Microsoft also offers a range of helper types (a short usage sketch follows the list below), including:

    • Matrix3x2 structure

    • Matrix4x4 structure

    • Plane structure

    • Quaternion structure

    • Vector class

    • Vector<T> structure

    • Vector2 structure

    • Vector3 structure

    • Vector4 structure
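These types live in the System.Numerics namespace. Here is a minimal sketch of using Vector3; the specific values are purely illustrative:

    using System;
    using System.Numerics;

    class HelperTypesDemo
    {
        static void Main()
        {
            // Vector3 exposes SIMD-accelerated math without the manual loop shown earlier.
            var position = new Vector3(1.0f, 2.0f, 3.0f);
            var velocity = new Vector3(0.5f, 0.0f, -0.5f);

            Vector3 next = position + velocity;              // element-wise addition
            float dot = Vector3.Dot(position, velocity);     // dot product

            Console.WriteLine($"next = {next}, dot = {dot}");
        }
    }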

Assembly unloading

Most developers probably don't know this: .NET often loads the same assembly twice. This happens when .NET first loads the IL version of an assembly and then loads the NGen version of the same assembly (that is, the precompiled native image). This is a significant waste of physical memory, especially for large 32-bit applications such as Visual Studio.

In .NET 4.6, once the CLR loads the NGen version of an assembly, it automatically releases the memory occupied by the corresponding IL version.

Garbage collection

Earlier, we discussed the garbage collection latency modes introduced in .NET 4.0. While they are much more reliable than simply stopping the GC for a period of time, for many GC-sensitive scenarios this approach still falls short.

In .NET 4.6, you can temporarily suspend the garbage collector in a more fine-grained way. The new GC.TryStartNoGCRegion method lets you specify how much memory you need in the small object heap and the large object heap.

If there is not enough memory available, the runtime will either return false or block until the GC has freed sufficient memory; you control which behavior by passing a flag to TryStartNoGCRegion. If you successfully enter a no-GC region (during which garbage collection is not allowed), you must call GC.EndNoGCRegion when the region is done.

The official documentation does not indicate whether the method is thread-safe, but given how the GC works, you should avoid having two threads try to change the GC state at the same time.
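A minimal sketch of how this might look in practice, assuming a 16 MB budget chosen purely for illustration:

    using System;
    using System.Runtime;

    class NoGCRegionDemo
    {
        static void Main()
        {
            // Illustrative budget: ask the GC to reserve roughly 16 MB up front so that
            // no collection happens inside the critical section.
            const long budget = 16 * 1024 * 1024;

            if (GC.TryStartNoGCRegion(budget))
            {
                try
                {
                    // Latency-critical work that allocates less than the reserved budget.
                    var buffer = new byte[1024];
                    Console.WriteLine(buffer.Length);
                }
                finally
                {
                    // Leave the no-GC region only if we are still in it; allocating more
                    // than the budget would have ended the region already.
                    if (GCSettings.LatencyMode == GCLatencyMode.NoGCRegion)
                        GC.EndNoGCRegion();
                }
            }
        }
    }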

Another GC improvement is the way it handles pinned objects (that is, objects that cannot be moved after they are allocated). Although the documentation is somewhat vague on this point, when you pin an object in place, the objects adjacent to it often end up effectively fixed as well. Rich Lander wrote in his article:

The GC handles pinned objects in a more optimized way, so it can compact the memory around pinned objects more efficiently. For large-scale applications that make heavy use of pinning, this change will greatly improve performance.
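For context, here is a minimal sketch of how an object typically gets pinned in the first place, using GCHandle; the buffer size and its usage are illustrative assumptions:

    using System;
    using System.Runtime.InteropServices;

    class PinningDemo
    {
        static void Main()
        {
            // While the handle is alive, the GC must not move this array,
            // which constrains how it can compact the memory around it.
            byte[] buffer = new byte[4096];
            GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
            try
            {
                IntPtr address = handle.AddrOfPinnedObject();
                Console.WriteLine($"Buffer pinned at 0x{address.ToInt64():X}");
                // Typically the address would be handed to native code here.
            }
            finally
            {
                handle.Free();   // unpin as soon as possible so the GC can compact again
            }
        }
    }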

The GC is also smarter about how it uses memory in the earlier generations. Rich continues:

The way generation 1 objects are promoted to generation 2 has also been improved to use memory more efficiently. Before allocating new memory for a generation, the GC now tries to use the available free space first. At the same time, a new algorithm is used to place objects into free space regions, so that the space used is much closer to the size of the object than before.

Asynchronous local storage

The last improvement is not directly a performance improvement, but it can lead to optimizations when used effectively. Before asynchronous APIs became widespread, developers could use thread-local storage (TLS) to cache information. TLS acts like a global object for a particular thread, which means you can access contextual information and caches directly without explicitly passing around some kind of context object.

With async/await, thread-local storage becomes essentially useless, because each call to await may resume execution on a different thread. And even if you avoid that, other code may run on your thread and interfere with the data in TLS.

The new version of .NET introduces async local storage (ALS) to solve this problem. ALS is semantically similar to thread-local storage, but it flows correctly across await calls. The feature is exposed through the AsyncLocal<T> generic class, which internally uses the CallContext object to store the data.
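A minimal sketch of AsyncLocal<T> in use; the RequestId field and the delay are illustrative assumptions:

    using System;
    using System.Threading;
    using System.Threading.Tasks;

    class AsyncLocalDemo
    {
        // The AsyncLocal<string> value flows with the async call chain, so the value set
        // before an await is still visible after it, even if the continuation resumes
        // on a different thread pool thread.
        private static readonly AsyncLocal<string> RequestId = new AsyncLocal<string>();

        static async Task HandleRequestAsync(string id)
        {
            RequestId.Value = id;
            await Task.Delay(10);                       // may resume on another thread
            Console.WriteLine($"{RequestId.Value} on thread {Thread.CurrentThread.ManagedThreadId}");
        }

        static void Main()
        {
            // Each logical request keeps its own value, even though the handlers
            // interleave on shared thread pool threads.
            Task.WaitAll(HandleRequestAsync("request-1"), HandleRequestAsync("request-2"));
        }
    }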
