Effective C# Item 32: Prefer Smaller, Cohesive Assemblies



 
       

Item 32: Prefer Smaller, Cohesive Assemblies
This item should really be titled "Build Assemblies That Are the Right Size and Contain a Small Number of Public Types." But that's too wordy, so I titled it based on the most common mistake I see: developers putting everything but the kitchen sink in one assembly. That makes it hard to reuse components and harder to update parts of a system. Many smaller assemblies make it easier to use your classes as binary components.

The title also highlights the importance of cohesion. Cohesion is the degree to which the responsibilities of a single component form a meaningful unit. Cohesive components can be described in a single simple sentence. You can see this in many of the .NET FCL assemblies. Two examples: the System.Collections assembly provides data structures for storing sets of related objects, and the System.Windows.Forms assembly provides classes that model Windows controls. Web forms and Windows Forms are in different assemblies because they are not related. You should be able to describe your own assemblies in the same fashion using one simple sentence. No cheating: "The MyApplication assembly provides everything you need." Yes, that's a single sentence. But it's also lazy, and you probably don't need all of that functionality in My2ndApplication (though you'd probably like to reuse some of it; that "some of it" should be packaged in its own assembly).

You should not create assemblies with only one public class. You do need to find the middle ground. If you go too far and create too many assemblies, you lose some benefits of encapsulation: You lose the benefits of internal types by not packaging related public classes in the same assembly (see Item 33). The JIT compiler can perform more efficient inlining inside an assembly than across assembly boundaries. This means that packaging related types in the same assembly is to your advantage. Your goal is to create the best-sized package for the functionality you are delivering in your component. This goal is easier to achieve with cohesive components: Each component should have one responsibility.

In some sense, an assembly is the binary equivalent of a class. We use classes to encapsulate algorithms and data storage. Only the public interfaces are part of the official contract, so only the public interfaces are visible to users. In the same sense, assemblies provide a binary package for a related set of classes. Only public and protected classes are visible outside an assembly. Utility classes can be internal to the assembly. Yes, they are more visible than private nested classes, but you have a mechanism to share common implementation inside that assembly without exposing that implementation to all users of your classes. Partitioning your application into multiple assemblies encapsulates related types in a single package.
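The visibility rules described above can be sketched in a few lines. This is a minimal illustration, not code from the original item: the names `Common.Data`, `OfflineCache`, and `KeyNormalizer` are hypothetical, and because a single source file compiles into a single assembly, the example uses `Type.IsVisible` to report which types could be reached from outside rather than demonstrating a cross-assembly compile error.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical contents of an assembly named Common.Data.dll.
namespace Common.Data
{
    // Internal utility class: every type in this assembly can call it,
    // but code in other assemblies cannot see it at all.
    internal static class KeyNormalizer
    {
        internal static string Normalize(string key)
        {
            return key.Trim().ToLowerInvariant();
        }
    }

    // Public class: part of the assembly's official contract.
    public sealed class OfflineCache
    {
        private readonly Dictionary<string, string> entries =
            new Dictionary<string, string>();

        public void Store(string key, string value)
        {
            entries[KeyNormalizer.Normalize(key)] = value;
        }

        public bool TryLoad(string key, out string value)
        {
            return entries.TryGetValue(KeyNormalizer.Normalize(key), out value);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var cache = new Common.Data.OfflineCache();
        cache.Store("  User42 ", "draft");

        string value;
        Console.WriteLine(cache.TryLoad("user42", out value));

        // IsVisible reports whether code outside the assembly can access the type.
        Console.WriteLine(typeof(Common.Data.OfflineCache).IsVisible);  // True
        Console.WriteLine(typeof(Common.Data.KeyNormalizer).IsVisible); // False
    }
}
```

If `KeyNormalizer` lived in its own one-class assembly, it would have to be public to remain usable, which is the lost encapsulation benefit the text warns about.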

Using multiple assemblies also makes a number of different deployment options easier. Consider a three-tiered application, in which part of the application runs as a smart client and part of the application runs on the server. You supply some validation rules on the client so that users get feedback as they enter or edit data. You replicate those rules on the server and combine them with other rules to provide more robust validation. The complete set of business rules is implemented at the server, and only a subset is maintained at each client.

Sure, you could reuse the source code and create different assemblies for the client and server-side business rules, but that would complicate your delivery mechanism. That leaves you with two builds and two installations to perform when you update the rules. Instead, separate the client-side validation from the more robust server-side validation by placing them in different assemblies. You are reusing binary objects, packaged in assemblies, rather than reusing object code or source code by compiling it into multiple assemblies.
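One way to picture that split is sketched below, under stated assumptions: the namespaces `Validation.Client` and `Validation.Server` stand in for two separate assemblies, and every type name (`Order`, `IRule`, `PositiveQuantityRule`, `KnownCustomerRule`, and so on) is invented for illustration. The client-side rules are compiled once; the server references that binary and adds its stricter rules on top.

```csharp
using System;
using System.Collections.Generic;
using Validation.Client;
using Validation.Server;

// Hypothetical assembly Validation.Client.dll: deployed to clients AND the server.
namespace Validation.Client
{
    public sealed class Order
    {
        public Order(string customerId, int quantity)
        {
            CustomerId = customerId;
            Quantity = quantity;
        }

        public string CustomerId { get; private set; }
        public int Quantity { get; private set; }
    }

    public interface IRule
    {
        bool IsSatisfiedBy(Order order);
    }

    // A cheap check that gives users immediate feedback.
    public sealed class PositiveQuantityRule : IRule
    {
        public bool IsSatisfiedBy(Order order) { return order.Quantity > 0; }
    }

    public static class ClientRules
    {
        public static List<IRule> Create()
        {
            return new List<IRule> { new PositiveQuantityRule() };
        }
    }

    public static class RuleSet
    {
        public static bool IsValid(Order order, IEnumerable<IRule> rules)
        {
            foreach (IRule rule in rules)
            {
                if (!rule.IsSatisfiedBy(order)) return false;
            }
            return true;
        }
    }
}

// Hypothetical assembly Validation.Server.dll: deployed to the server only.
namespace Validation.Server
{
    // A stricter check that needs data only the server has.
    public sealed class KnownCustomerRule : IRule
    {
        private readonly HashSet<string> known;

        public KnownCustomerRule(IEnumerable<string> knownCustomers)
        {
            known = new HashSet<string>(knownCustomers);
        }

        public bool IsSatisfiedBy(Order order) { return known.Contains(order.CustomerId); }
    }

    public static class ServerRules
    {
        // Complete rule set = the reused client binary's rules + server-only rules.
        public static List<IRule> Create(IEnumerable<string> knownCustomers)
        {
            List<IRule> rules = ClientRules.Create();
            rules.Add(new KnownCustomerRule(knownCustomers));
            return rules;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var order = new Order("guest", 3);
        Console.WriteLine(RuleSet.IsValid(order, ClientRules.Create()));                  // True
        Console.WriteLine(RuleSet.IsValid(order, ServerRules.Create(new[] { "alice" }))); // False
    }
}
```

With this layout, a change to a client-side rule means rebuilding and redeploying one assembly, and the server picks up the same binary rather than a second compilation of the same source.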

An assembly should contain an organized library of related functionality. That's an easy platitude, but it's much harder to implement in practice. The reality is that you might not know beforehand which classes will be distributed to both the server and client portions of a distributed application. Even more likely, the set of server- and client-side functionality will be somewhat fluid; you'll move features between the two locations. By keeping the assemblies small, you'll find it easier to redeploy on both client and server. The assembly is a binary building block for your application. That makes it easier to plug a new component into place in a working application. If you make a mistake, make too many smaller assemblies rather than too few large ones.

I often use Legos as an analogy for assemblies and binary components. You can pull out one Lego and replace it easily; it's a small block. In the same way, you should be able to pull out one assembly and replace it with another assembly that has the same interfaces. The rest of the application should continue as if nothing happened. Follow the Lego analogy a little further. If all your parameters and return values are interfaces, any assembly can be replaced by another that implements the same interfaces (see Item 19).
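The Lego analogy can be made concrete with a small sketch. All names here are hypothetical (`App.Contracts`, `ISettingsStore`, the two store classes, `ThemeReader`); the three namespaces stand in for three assemblies. The calling code depends only on the interface, so either implementation assembly can be swapped in without touching it.

```csharp
using System;
using System.Collections.Generic;
using App;
using App.Contracts;

// Hypothetical assembly App.Contracts.dll: interfaces only.
namespace App.Contracts
{
    public interface ISettingsStore
    {
        void Set(string name, string value);
        string Get(string name, string fallback);
    }
}

// Hypothetical assembly App.Storage.Memory.dll: the original block.
namespace App.Storage.Memory
{
    public sealed class MemorySettingsStore : ISettingsStore
    {
        private readonly Dictionary<string, string> values =
            new Dictionary<string, string>();

        public void Set(string name, string value) { values[name] = value; }

        public string Get(string name, string fallback)
        {
            string value;
            return values.TryGetValue(name, out value) ? value : fallback;
        }
    }
}

// Hypothetical assembly App.Storage.Relaxed.dll: the replacement block.
namespace App.Storage.Relaxed
{
    public sealed class CaseInsensitiveSettingsStore : ISettingsStore
    {
        private readonly Dictionary<string, string> values =
            new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

        public void Set(string name, string value) { values[name] = value; }

        public string Get(string name, string fallback)
        {
            string value;
            return values.TryGetValue(name, out value) ? value : fallback;
        }
    }
}

// Application code: parameters and return values use the interface only.
namespace App
{
    public static class ThemeReader
    {
        public static string Describe(ISettingsStore store)
        {
            return "theme=" + store.Get("theme", "light");
        }
    }
}

public static class Program
{
    public static void Main()
    {
        ISettingsStore original = new App.Storage.Memory.MemorySettingsStore();
        original.Set("theme", "dark");
        Console.WriteLine(ThemeReader.Describe(original));    // theme=dark

        ISettingsStore replacement = new App.Storage.Relaxed.CaseInsensitiveSettingsStore();
        replacement.Set("THEME", "dark");
        Console.WriteLine(ThemeReader.Describe(replacement)); // theme=dark
    }
}
```

`ThemeReader` never names a concrete store type, so replacing one storage assembly with the other requires no change to it.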

Smaller assemblies also let you amortize the cost of application startup. The larger an assembly is, the more work the CPU does to load the assembly and convert the necessary IL into machine instructions. Only the routines called at startup are JITed, but the entire assembly gets loaded and the CLR creates stubs for every method in the assembly.

Time to take a break and make sure we don't go to extremes. This item is about making sure that you don't create single monolithic programs, but that you build systems of binary, reusable components. You can take this advice too far. Some costs are associated with a large program built on too many small assemblies. You will incur a performance penalty when program flow crosses assembly boundaries. The CLR loader has a little more work to do to load many assemblies and turn IL into machine instructions, particularly resolving function addresses.

Extra security checks also are done across assembly boundaries. All code from the same assembly has the same level of trust (not necessarily the same access rights, but the same trust level). The CLR performs some security checks whenever code flow crosses an assembly boundary. The fewer times your program flow crosses assembly boundaries, the more efficient it will be.

None of these performance concerns should dissuade you from breaking up assemblies that are too large. The performance penalties are minor. C# and .NET were designed with components in mind, and the greater flexibility is usually worth the price.

So how do you decide how much code or how many classes go in one assembly? More important, how do you decide which code goes in an assembly? It depends greatly on the specific application, so there is not one answer. Here's my recommendation: Start by looking at all your public classes. Combine public classes with common base classes into assemblies. Then add the utility classes necessary to provide all the functionality associated with the public classes in that same assembly. Package related public interfaces into their own assemblies. As a final step, look for classes that are used horizontally across your application. Those are candidates for a broad-based utility assembly that contains your application's utility library.

The end result is that you create a component with a single related set of public classes and the utility classes necessary to support it. You create an assembly that is small enough to get the benefits of easy updates and easier reuse, while still minimizing the costs associated with multiple assemblies. Well-designed, cohesive components can be described in one simple sentence. For example, "Common.Storage.dll manages the offline data cache and all user settings" describes a component with low cohesion. Instead, make two components: "Common.Data.dll manages the offline data cache. Common.Settings.dll manages user settings." When you've split those up, you might need a third component: "Common.EncryptedStorage.dll manages file system IO for encrypted local storage." You can update any of those three components independently.

Small is a relative term. Mscorlib.dll is roughly 2MB; System.Web.RegularExpressions.dll is merely 56KB. But both satisfy the core design goal of a small, reusable assembly: They contain a related set of classes and interfaces. The difference in absolute size has to do with the difference in functionality: mscorlib.dll contains all the low-level classes you need in every application. System.Web.RegularExpressions.dll is very specific; it contains only those classes needed to support regular expressions in Web controls. You will create both kinds of components: small, focused assemblies for one specific feature and larger, broad-based assemblies that contain common functionality. In either case, make them as small as is reasonable, but not smaller.

 
