[Classic Article Translation] [Unfinished] [Updated 9-6] Performance Considerations for Run-Time Technologies in the .NET Framework


Original author: Emmanuel Schanzer

 

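Summary:

This article surveys the technologies at work in the managed world and gives a technical explanation of how each one affects performance. It covers garbage collection, the JIT, Remoting, ValueTypes, security, and more.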

 

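Overview:

The .NET run time introduces several advanced technologies aimed at security, ease of development, and performance. As a developer, it is important to understand each of these technologies and to use them effectively in your code. The advanced tools provided by the run time make it easier to build a robust application, but making that application fly is (and always has been) the responsibility of the developer.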

 

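This white paper should give you a deeper understanding of the technologies at work in .NET and help you tune your code for speed. Note that this is not a spec sheet. There is plenty of solid technical information out there already. The goal here is to present the information with a strong tilt toward performance, so it may not answer every technical question you have. If you don't find the answer you are looking for here, I recommend looking further in the MSDN Online Library.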

 

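I am going to cover the following technologies, giving a high-level overview of their purpose and of why they affect performance. After that I will delve into some lower-level implementation details and use sample code to show how to get performance and speed out of each technology.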

Garbage Collection
Thread Pool
The JIT
AppDomains
Security
Remoting
ValueTypes

 

Garbage Collection

=======================

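Basics

Garbage collection (GC) frees the programmer from a common and difficult-to-debug class of errors by releasing the memory of objects that are no longer used. The general path of an object's lifetime is shown below, and it is the same in managed and native code:

            Foo a = new Foo();      // Allocate memory for the object and initialize it
            ...a...                 // Use the object
            delete a;               // Tear down the state of the object, clean up
                                    // Free the memory for the object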

 

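In native code, you need to do all of this yourself. Skipping the allocation or cleanup phase leads to unpredictable behavior that is hard to debug, and forgetting to free memory causes memory leaks. The path for memory allocation in the CLR is very close to the one we have just seen. If we add the GC-specific information, we end up with something that looks very similar:

            Foo a = new Foo();      // Allocate memory for the object and initialize it
            ...a...                 // Use the object (it is strongly reachable)
            a = null;               // The object becomes unreachable (out of scope, nulled, etc.)
                                    // Eventually a collection occurs and the object's
                                    // resources are torn down
                                    // The memory is reclaimed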

 

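Up to the point where the object can be freed, the same steps are taken in both the managed and the native world. In native code, you need to remember to free the object when you are done with it. In managed code, once the object becomes unreachable, the GC can collect it. Of course, if your resource needs special attention to be freed (closing a socket, say), the GC needs your help to handle it correctly. The rule that you write code to clean up a resource before it is freed still applies, and you do so in the form of Dispose() and Finalize() methods. I will talk about the differences between the two later on.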

 

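If you keep a pointer to a resource around, the GC has no way of knowing whether you intend to use that resource in the future. This means that all the rules you used in native code for explicitly freeing objects still apply, but most of the time the GC will handle everything for you. Instead of worrying about memory management one hundred percent of the time, you only have to think about it about five percent of the time.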

 

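The CLR garbage collector is a generational, mark-and-compact collector. It follows several principles that allow it to achieve excellent performance. First, short-lived objects tend to be smaller and are accessed often. The GC divides the allocation graph into several sub-graphs, called generations, which let it spend as little time collecting as possible. Gen 0 contains young, frequently used objects. It also tends to be the smallest generation, and takes roughly 10 milliseconds to collect. Because the GC can ignore the other generations during this collection, it delivers much higher performance. G1 and G2 are for larger, older objects and are collected less often. When a G1 collection occurs, G0 is collected as well. A G2 collection is a full collection, and it is the only time the GC traverses the entire graph. The GC also makes intelligent use of the CPU caches, which lets it tune the memory subsystem for the specific processor on which it runs. This optimization is not easily available with native allocation, and it can help improve the performance of your application.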
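To make the idea of generations concrete, here is a minimal sketch that asks the run time which generation an object is in before and after forced collections. It relies on GC.GetGeneration, GC.Collect, and GC.KeepAlive from System.GC; the GenerationDemo class and its Observe method are hypothetical names introduced only for this illustration. Forcing collections is done here purely for demonstration, and the exact numbers reported can vary with the run time version and build settings.

```csharp
using System;

public static class GenerationDemo
{
    // Returns the generation of one object at three points in time:
    // right after allocation, after a Gen 0 collection, and after a full collection.
    public static int[] Observe()
    {
        object survivor = new object();
        int atBirth = GC.GetGeneration(survivor);

        GC.Collect(0);                       // collect Gen 0 only
        int afterGen0 = GC.GetGeneration(survivor);

        GC.Collect();                        // full collection of all generations
        int afterFull = GC.GetGeneration(survivor);

        GC.KeepAlive(survivor);              // keep the object reachable until here
        return new[] { atBirth, afterGen0, afterFull };
    }
}
```

An object that survives a collection is normally promoted, so the three values usually do not decrease and never exceed GC.MaxGeneration.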

 

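When Does a Collection Happen?

When an allocation is made, the GC checks whether a collection is needed. It looks at the amount of memory that could be collected, the amount of memory remaining, and the size of each generation, and then uses a heuristic to make the decision. Until a collection occurs, object allocation is as fast as in C or C++, or even faster.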
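One way to watch this decision being made is to count collections while allocating. The sketch below assumes GC.CollectionCount, which was added to System.GC in later versions of the Framework than the one this article describes; AllocationDemo and CountGen0Collections are hypothetical names. How many collections are triggered depends on the heuristic and on the machine, so the result is something to observe rather than predict.

```csharp
using System;

public static class AllocationDemo
{
    // Storing into a static field makes each array escape the method, so the
    // allocation really happens on the managed heap.
    private static byte[] _last;

    // Allocates many short-lived arrays and reports how many Gen 0
    // collections the run time chose to perform while doing so.
    public static int CountGen0Collections(int allocations)
    {
        int before = GC.CollectionCount(0);
        for (int i = 0; i < allocations; i++)
        {
            _last = new byte[256];           // the previous array becomes garbage
        }
        return GC.CollectionCount(0) - before;
    }
}
```

Calling CountGen0Collections with a few million allocations typically reports several Gen 0 collections, none of which the program asked for explicitly.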

 

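What Happens When a Collection Occurs?

Let's walk through the steps the garbage collector takes during a collection. The GC maintains a list of roots, which point into the GC heap. If an object is live, there is a root that leads to its location in the heap. Objects in the heap can also point to each other. This graph of pointers (the reachability graph) is what the GC has to search in order to free memory. The order of events is as follows: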

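1. The managed heap keeps all of its allocation space in one contiguous block. When the remaining block is too small to satisfy a request, the GC is triggered.

2. The GC follows each root and every pointer reachable from it, building a list of the objects that are reachable.

3. Every object that cannot be reached from any root is considered collectable and is marked for collection.

4. Removing objects from the reachability graph makes most objects collectable. Some resources, however, need special handling. When you define an object, you can choose to give it a Dispose() method, a Finalize() method, or both. I will discuss the differences between the two, and when to use them, later.

5. The final step of a collection is the memory compaction phase. All objects that are in use are moved into one contiguous block, and all pointers and roots are updated.

6. By compacting the live objects and updating the start address of the free space, the GC keeps the free memory contiguous. If there is now enough space for the allocation, the GC returns control to the program. If there is not, it raises an OutOfMemoryException.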
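The marking steps above can be sketched with a toy object graph. This is only a model of the idea, not the CLR's implementation: Node, ToyCollector, Mark, and Collectable are hypothetical names, and the "heap" is just a list of nodes.

```csharp
using System;
using System.Collections.Generic;

public sealed class Node
{
    public readonly string Name;
    public readonly List<Node> References = new List<Node>();
    public Node(string name) { Name = name; }
}

public static class ToyCollector
{
    // Mark: walk from each root, recording every node that can be reached.
    public static HashSet<Node> Mark(IEnumerable<Node> roots)
    {
        var reachable = new HashSet<Node>();
        var pending = new Stack<Node>(roots);
        while (pending.Count > 0)
        {
            Node current = pending.Pop();
            if (!reachable.Add(current)) continue;   // already visited
            foreach (Node next in current.References) pending.Push(next);
        }
        return reachable;
    }

    // Anything on the heap that was not marked is collectable.
    public static List<Node> Collectable(IEnumerable<Node> heap, HashSet<Node> reachable)
    {
        var dead = new List<Node>();
        foreach (Node n in heap)
        {
            if (!reachable.Contains(n)) dead.Add(n);
        }
        return dead;
    }
}
```

Note that two unreachable nodes that point at each other are still collectable, because the walk starts from the roots and never gets to either of them. A real collector would follow this with the compaction step, which the sketch leaves out.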

            

 

Object Cleanup

Some objects require special handling before their resources can be returned. A few examples of such resources are files, network sockets, or database connections. Simply releasing the memory on the heap isn't going to be enough, since you want these resources closed gracefully. To perform object cleanup, you can write a Dispose() method, a Finalize() method, or both.

A Finalize() method:

  • Is called by the GC
  • Is not guaranteed to be called in any order, or at a predictable time
  • After being called, frees memory after the next GC
  • Keeps all child objects live until the next GC

A Dispose() method:

  • Is called by the programmer
  • Is ordered and scheduled by the programmer
  • Returns resources upon completion of the method

Managed objects that hold only managed resources don't require these methods. Your program will probably use only a few complex resources, and chances are you know what they are and when you need them. If you know both of these things, there's no reason to rely on finalizers, since you can do the cleanup manually. There are several reasons that you want to do this, and they all have to do with the finalizer queue.

In the GC, when an object that has a finalizer is marked collectable, it and any objects it points to are placed in a special queue. A separate thread walks down this queue, calling the Finalize() method of each item in the queue. The programmer has no control over this thread, or over the order of the items placed in the queue. The GC may return control to the program without having finalized any objects in the queue. Those objects may remain in memory, tucked away in the queue, for a long time. Calls to Finalize() are made automatically, and there is no direct performance impact from the call itself. However, the non-deterministic model for finalization can definitely have other indirect consequences:

  • In a scenario where you have resources that need to be released at a specific time, you lose control with finalizers. Say you have a file open, and it needs to be closed for security reasons. Even when you set the object to null, and force a GC immediately, the file will remain open until its Finalize() method is called, and you have no idea when this could happen.
  • N objects that require disposal in a certain order may not be handled correctly.
  • An enormous object and its children may take up far too much memory, require additional collections and hurt performance. These objects may not be collected for a long time.
  • A small object to be finalized may have pointers to large resources that could be freed at any time. These objects will not be freed until the object to be finalized is taken care of, creating unnecessary memory pressure and forcing frequent collections.

The state diagram in Figure 3 illustrates the different paths your object can take in terms of finalization or disposal.

As you can see, finalization adds several steps to the object's lifetime. If you dispose of an object yourself, the object can be collected and the memory returned to you in the next GC. When finalization needs to occur, you have to wait until the actual method gets called. Since you are not given any guarantees about when this happens, you can have a lot of memory tied up and be at the mercy of the finalization queue. This can be extremely problematic if your object is connected to a whole tree of objects, and they all sit in memory until finalization occurs.
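The extra lifetime that finalization adds can be observed with long weak references, which keep reporting an object until its memory is actually reclaimed. The following is a minimal sketch under stated assumptions: PlainObject, FinalizableObject, and FinalizationDelayDemo are hypothetical names, the forced collections exist only to make the effect visible, and results for the plain object can differ on run times that scan the stack conservatively.

```csharp
using System;
using System.Runtime.CompilerServices;

public sealed class PlainObject
{
    public byte[] Child = new byte[1024];
}

public sealed class FinalizableObject
{
    public static int FinalizedCount;
    public byte[] Child = new byte[1024];      // stays in memory with its parent
    ~FinalizableObject() { FinalizedCount++; } // stands in for real cleanup
}

public static class FinalizationDelayDemo
{
    // Objects are created in separate, non-inlined methods so that no local
    // variable in Run() keeps them reachable.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static WeakReference TrackPlain()
    {
        return new WeakReference(new PlainObject(), true);
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    private static WeakReference TrackFinalizable()
    {
        return new WeakReference(new FinalizableObject(), true);
    }

    // Returns: plain object alive after one GC, finalizable object alive after
    // one GC, finalizable object alive after finalization plus a second GC.
    public static bool[] Run()
    {
        WeakReference plain = TrackPlain();
        WeakReference finalizable = TrackFinalizable();

        GC.Collect();
        bool plainAlive = plain.IsAlive;
        bool finalizableAlive = finalizable.IsAlive;

        GC.WaitForPendingFinalizers();
        GC.Collect();
        bool finalizableAliveLater = finalizable.IsAlive;

        return new[] { plainAlive, finalizableAlive, finalizableAliveLater };
    }
}
```

The finalizable object is still in memory after the first collection because it has been moved to the finalization queue; its memory can only be returned by a later collection, once its finalizer has run.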

 

Choosing Which Garbage Collector to Use

The CLR has two different GCs: Workstation (mscorwks.dll) and Server (mscorsvr.dll). When running in Workstation mode, latency is more of a concern than space or efficiency. A server with multiple processors and clients connected over a network can afford some latency, but throughput is now a top priority. Rather than shoehorn both of these scenarios into a single GC scheme, Microsoft has included two garbage collectors that are tailored to each situation.

Server GC:

  • Multiprocessor (MP) Scalable, Parallel
  • One GC thread per CPU
  • Program paused during marking

Workstation GC:

  • Minimizes pauses by running concurrently during full collections

The server GC is designed for maximum throughput, and scales with very high performance. Memory fragmentation on servers is a much more severe problem than on workstations, making garbage collection an attractive proposition. In a uniprocessor scenario, both collectors work the same way: workstation mode, without concurrent collection. On an MP machine, the Workstation GC uses the second processor to run the collection concurrently, minimizing delays while diminishing throughput. The Server GC uses multiple heaps and collection threads to maximize throughput and scale better.

You can choose which GC to use when you host the run time. When you load the run time into a process, you specify what collector to use. Loading the API is discussed in the .NET Framework Developer's Guide. For an example of a simple program that hosts the run time and selects the server GC, take a look at the Appendix.
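To check which collector a process ended up with, later versions of the Framework expose the System.Runtime.GCSettings class; it is not part of the version this article was written for, so treat the sketch below as an assumption about your target run time. GcModeReport and Describe are hypothetical names. Those later versions also allow the server GC to be requested with the gcServer element in the runtime section of the application configuration file, instead of through hosting code.

```csharp
using System;
using System.Runtime;

public static class GcModeReport
{
    // Describes the collector flavor and latency mode of the current process.
    public static string Describe()
    {
        string flavor = GCSettings.IsServerGC ? "Server" : "Workstation";
        return flavor + " GC, latency mode " + GCSettings.LatencyMode;
    }
}
```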

Myth: Garbage Collection Is Always Slower Than Doing It by Hand

Actually, until a collection is called, the GC is a lot faster than doing it by hand in C. This surprises a lot of people, so it's worth some explanation. First of all, notice that finding free space occurs in constant time. Since all free space is contiguous, the GC simply follows the pointer and checks to see if there's enough room. In C, a call to malloc() typically results in a search of a linked list of free blocks. This can be time consuming, especially if your heap is badly fragmented. To make matters worse, several implementations of the C run time lock the heap during this procedure. Once the memory is allocated or used, the list has to be updated. In a garbage-collected environment, allocation is free, and the memory is released during collection. More advanced programmers will reserve large blocks of memory, and handle allocation within that block themselves. The problem with this approach is that memory fragmentation becomes a huge problem for programmers, and it forces them to add a lot of memory-handling logic to their applications. In the end, a garbage collector doesn't add a lot of overhead. Allocation is as fast or faster, and compaction is handled automatically—freeing programmers to focus on their applications.
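The constant-time allocation described above amounts to advancing a pointer through contiguous free space. Here is a toy model of that idea, not the CLR allocator itself: BumpAllocator is a hypothetical class that hands out offsets into an imaginary block and signals with -1 when a real collector would have to run.

```csharp
using System;

public sealed class BumpAllocator
{
    private readonly int _capacity;
    private int _next;                 // offset of the first free byte

    public BumpAllocator(int capacity) { _capacity = capacity; }

    public int FreeBytes { get { return _capacity - _next; } }

    // Returns the offset of the new block, or -1 when there is not enough
    // contiguous space left and a collection would be needed.
    public int Allocate(int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
        if (size > _capacity - _next) return -1;
        int start = _next;
        _next += size;                 // the whole cost: one check and one add
        return start;
    }
}
```

There is no free list to search and none to update, which is why allocation cost does not grow with fragmentation. The price is paid later, when compaction restores the contiguous block.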

In the future, garbage collectors could perform other optimizations that make it even faster. Hot spot identification and better cache usage are possible, and can make enormous speed differences. A smarter GC could pack pages more efficiently, thereby minimizing the number of page fetches that occur during execution. All of these could make a garbage-collected environment faster than doing things by hand.

Some people may wonder why GC isn't available in other environments, like C or C++. The answer is types. Those languages allow casting of pointers to any type, making it extremely difficult to know what a pointer refers to. In a managed environment like the CLR, we can guarantee enough about the pointers to make GC possible. The managed world is also the only place where we can safely stop thread execution to perform a GC: in C++ these operations are either unsafe or very limited.

Tuning for Speed

The biggest worry for a program in the managed world is memory retention. Some of the problems that you'll find in unmanaged environments are not an issue in the managed world: memory leaks and dangling pointers are not much of a problem here. Instead, programmers need to be careful about leaving resources connected when they no longer need them.

The most important heuristic for performance is also the easiest one to learn for programmers who are used to writing native code: keep track of the allocations to make, and free them when you're done. The GC has no way of knowing that you aren't going to use a 20KB string that you built if it's part of an object that's being kept around. Suppose you have this object tucked away in a vector somewhere, and you never intend to use that string again. Setting the field to null will let the GC collect those 20KB later, even if you still need the object for other purposes. If you don't need the object anymore, make sure you're not keeping references to it. (Just like in native code.) For smaller objects, this is less of a problem. Any programmer that's familiar with memory management in native code will have no problem here: all the same common sense rules apply. You just don't have to be so paranoid about them.
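As a minimal sketch of that advice, the class below keeps a small title around for the life of the object but drops its large buffer as soon as it is no longer needed. CachedDocument, ReleaseRawBytes, and RetentionDemo are hypothetical names; the weak reference and forced collections are there only to observe the effect and would not appear in production code.

```csharp
using System;
using System.Runtime.CompilerServices;

public sealed class CachedDocument
{
    public string Title;
    public byte[] RawBytes;   // large, only needed while the document is parsed

    public CachedDocument(string title, int rawSize)
    {
        Title = title;
        RawBytes = new byte[rawSize];
    }

    // Drop the reference once the large buffer is no longer needed.
    public void ReleaseRawBytes()
    {
        RawBytes = null;
    }
}

public static class RetentionDemo
{
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static WeakReference Watch(CachedDocument doc)
    {
        return new WeakReference(doc.RawBytes);
    }

    // Returns whether the buffer was still in memory before and after release.
    public static bool[] Run(CachedDocument doc)
    {
        WeakReference buffer = Watch(doc);

        GC.Collect();
        bool aliveWhileReferenced = buffer.IsAlive;

        doc.ReleaseRawBytes();
        GC.Collect();
        bool aliveAfterRelease = buffer.IsAlive;

        GC.KeepAlive(doc);     // the document itself stays in use throughout
        return new[] { aliveWhileReferenced, aliveAfterRelease };
    }
}
```

While the field still points at the buffer, no collection can reclaim it, however many times one runs. Once the field is cleared, the buffer becomes eligible even though the document is still alive.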

The second important performance concern deals with object cleanup. As I mentioned earlier, finalization has profound impacts on performance. The most common example is that of a managed handler to an unmanaged resource: you need to implement some kind of cleanup method, and this is where performance becomes an issue. If you depend on finalization, you open yourself up to the performance problems I listed earlier. Something else to keep in mind is that the GC is largely unaware of memory pressure in the native world, so you may be using a ton of unmanaged resources just by keeping a pointer around in the managed heap. A single pointer doesn't take up a lot of memory, so it could be a while before a collection is needed. To get around these performance problems, while still playing it safe when it comes to memory retention, you should pick a design pattern to work with for all the objects that require special cleanup.
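Later versions of the Framework added GC.AddMemoryPressure and GC.RemoveMemoryPressure so that a managed wrapper can tell the collector how much native memory it is holding. These calls are not available in the version this article describes, so the following is only a sketch of how the problem can be eased where they exist. NativeBuffer is a hypothetical class that allocates with Marshal.AllocHGlobal.

```csharp
using System;
using System.Runtime.InteropServices;

public sealed class NativeBuffer : IDisposable
{
    private IntPtr _memory;
    private readonly long _size;

    public NativeBuffer(int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
        _memory = Marshal.AllocHGlobal(size);
        _size = size;
        GC.AddMemoryPressure(_size);       // make the GC aware of the native cost
    }

    public bool IsDisposed { get { return _memory == IntPtr.Zero; } }

    public void Dispose()
    {
        Release();
        GC.SuppressFinalize(this);
    }

    ~NativeBuffer() { Release(); }

    private void Release()
    {
        if (_memory == IntPtr.Zero) return; // already released, or never allocated
        Marshal.FreeHGlobal(_memory);
        _memory = IntPtr.Zero;
        GC.RemoveMemoryPressure(_size);     // must mirror the amount added
    }
}
```

Without the pressure calls, the collector would see only a small managed object and could postpone collecting it for a long time, which is exactly the situation the paragraph above warns about.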

The programmer has four options when dealing with object cleanup:

1. Implement Both

This is the recommended design for object cleanup. This is an object with some mix of unmanaged and managed resources. An example would be System.Windows.Forms.Control. This has an unmanaged resource (HWND) and potentially managed resources (DataConnection, etc.). If you are unsure of when you make use of unmanaged resources, you can open the manifest for your program in ILDASM and check for references to native libraries. Another alternative is to use vadump.exe to see what resources are loaded along with your program. Both of these may provide you with insight as to what kind of native resources you use.

The pattern below gives derived classes a single recommended place for their cleanup logic: override Dispose(bool). This provides maximum flexibility, as well as a catch-all in case Dispose() is never called. The combination of maximum speed and flexibility, together with the safety-net approach, makes this the best design to use.

Example:

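    using System;

    public class MyClass : IDisposable
    {
        public void Dispose()
        {
            Dispose(true);
            GC.SuppressFinalize(this);   // cleanup is done, so skip the finalizer queue
        }

        protected virtual void Dispose(bool disposing)
        {
            if (disposing)
            {
                ...                      // only reached from Dispose()
            }
            ...                          // reached from Dispose() and the finalizer
        }

        ~MyClass()
        {
            Dispose(false);
        }
    }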
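From the caller's side, the C# using statement is the usual way to make sure Dispose() is called. The sketch below fills in the pattern with a hypothetical TrackedResource class that merely counts cleanup calls; the _disposed guard is an addition that is not in the example above, included so that calling Dispose() more than once is harmless.

```csharp
using System;

public class TrackedResource : IDisposable
{
    private bool _disposed;
    public int CleanupCalls { get; private set; }

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (_disposed) return;       // make repeated calls harmless
        _disposed = true;
        CleanupCalls++;
        // When disposing is true, managed resources would be released here;
        // unmanaged resources would be released in either case.
    }

    ~TrackedResource() { Dispose(false); }
}

public static class UsingDemo
{
    public static TrackedResource Run()
    {
        TrackedResource resource = new TrackedResource();
        using (resource)
        {
            // work with the resource
        }                            // Dispose() runs here, even if an exception is thrown
        return resource;
    }
}
```

Because Dispose() calls GC.SuppressFinalize, an object cleaned up this way never enters the finalization queue and can be reclaimed by the next collection that reaches it.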

 

2. Implement Dispose() Only

This is when an object has only managed resources, and you want to make sure that its cleanup is deterministic. An example of such an object is System.Web.UI.Control.

Example:

    using System;

    public class MyClass : IDisposable
    {
        public virtual void Dispose()
        {
            ...
        }
    }

 

3. Implement Finalize() Only

This is needed in extremely rare situations, and I strongly recommend against it. The implication of a Finalize() only object is that the programmer has no idea when the object is going to be collected, yet is using a resource complex enough to require special cleanup. This situation should never occur in a well-designed project, and if you find yourself in it you should go back and find out what went wrong.

Example:

    public class MyClass
    {
        ...

        ~MyClass()
        {
            ...
        }
    }

 

4. Implement Neither

This is for a managed object that points only to other managed objects that are neither disposable nor finalizable.

 

Recommendation

The recommendations for dealing with memory management should be familiar: release objects when you're done with them, and keep an eye out for leaving pointers to objects. When it comes to object cleanup, implement both a Finalize() and a Dispose() method for objects with unmanaged resources. This will prevent unexpected behavior later, and enforce good programming practices.

The downside here is that you force people to have to call Dispose(). There is no performance loss here, but some people might find it frustrating to have to think about disposing of their objects. However, I think it's worth the aggravation to use a model that makes sense. Besides, this forces people to be more attentive to the objects they allocate, since they can't blindly trust the GC to always take care of them. For programmers coming from a C or C++ background, forcing a call to Dispose() will probably be beneficial, since it's the kind of thing they are more familiar with.

Dispose() should be supported on objects that hold on to unmanaged resources anywhere in the tree of objects underneath them; however, Finalize() need be placed only on those objects that are specifically holding on to these resources, such as an OS handle or an unmanaged memory allocation. I suggest creating small managed objects as "wrappers" for implementing Finalize() in addition to supporting Dispose(), which would be called by the parent object's Dispose(). Since the parent objects do not have a finalizer, the entire tree of objects will not survive a collection regardless of whether or not Dispose() was called.

A good rule of thumb for finalizers is to use them only on the most primitive object that requires finalization. Suppose I have a large managed resource that includes a database connection: I would make it possible for the connection itself to be finalized, but make the rest of the object disposable. That way I can call Dispose() and free the managed portions of the object immediately, without having to wait for the connection to be finalized. Remember: use Finalize() only where you have to, when you have to.
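A minimal sketch of this wrapper arrangement follows. ConnectionHandle and DataSession are hypothetical names, and the "connection" is simulated with a flag so that the example stays self-contained; a real wrapper would release an OS handle where the comment indicates.

```csharp
using System;

// Small wrapper: the only type in the tree that has a finalizer.
public sealed class ConnectionHandle : IDisposable
{
    public bool IsClosed { get; private set; }

    public void Dispose()
    {
        Close();
        GC.SuppressFinalize(this);
    }

    ~ConnectionHandle() { Close(); }

    private void Close()
    {
        if (IsClosed) return;
        IsClosed = true;          // a real wrapper would release the OS handle here
    }
}

// Large parent: disposable, but deliberately not finalizable.
public sealed class DataSession : IDisposable
{
    private readonly ConnectionHandle _connection = new ConnectionHandle();
    private byte[] _cache = new byte[64 * 1024];

    public ConnectionHandle Connection { get { return _connection; } }
    public bool HasCache { get { return _cache != null; } }

    public void Dispose()
    {
        _connection.Dispose();    // deterministic release of the native resource
        _cache = null;            // let the large managed part go immediately
    }
}
```

If a caller forgets Dispose(), only the small ConnectionHandle waits in the finalization queue; the DataSession and its cache have no finalizer and can be reclaimed right away.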

Note to C and C++ programmers: the destructor syntax in C# creates a finalizer, not a disposal method!

 

Thread Pool

=======================

 

 

JIT

=======================

 

 

AppDomain

=======================

 

 

Security

=======================

 

 

Remoting

=======================

 

 

ValueTypes

=======================

 

 

Original article:

Performance Considerations for Run-Time Technologies in the .NET Framework

http://msdn.microsoft.com/en-us/library/ms973838
