Effective C# Item 6: Distinguish Between Value Types and Reference Types


Item 6: Distinguish Between Value Types and Reference Types

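Value types or reference types? Structs or classes? When should you use each? This isn't C++, where you define every type as a value type and can then create references to it. Nor is it Java, where every type is a reference type. When you create a type, you must decide how all of its instances will behave. It is an important decision to get right the first time. Once you have made it, you have to live with the consequences, because changing it later can break a lot of code in subtle ways. Choosing struct or class when you design the type is simple; updating every client of the type after you change it takes far more work.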

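This is not a simple either-or choice. The right choice depends on how you expect the new type to be used. Value types are not polymorphic; they are better suited to storing the data your application manipulates. Reference types can be polymorphic and should be used to define your application's behavior. Consider the responsibilities you expect the new type to have, and decide from those which kind of type to create. Structs store data. Classes define behavior.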

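The distinction between value types and reference types was added to .NET and C# because of common problems in C++ and Java. In C++, all parameters and return values are passed by value. Passing by value is very efficient, but it suffers from one problem: partial copying (sometimes called slicing the object). If you use a derived object where a base object is expected, only the base portion gets copied, and all knowledge that a derived object was ever there is lost. Even calls to virtual functions go to the base class version.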

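The Java language responded by more or less removing value types. In Java, all user-defined types are reference types, and all parameters and return values are passed by reference. This strategy has the advantage of consistency, but it costs performance. Let's face it: some types are not polymorphic, and they were never meant to be. Java programmers pay for a heap allocation and an eventual garbage collection for every variable. They also pay extra time to dereference every variable, because all variables are references. In C#, you declare a value type with struct or a reference type with class. Value types should be small and lightweight. Reference types form your class hierarchy. This item examines different uses of a type so that you understand the distinctions between value types and reference types.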

To start, here is a type used as the return value of a method:

private MyData _myData;
public MyData Foo()
{
 return _myData;
}
// call it:
MyData v = Foo();
TotalSum += v.Value;

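If MyData is a value type, the return value is copied into the storage for v, and v is on the stack. However, if MyData is a reference type, you have exported a reference to an internal variable, and you have violated the principle of encapsulation (see Item 23).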

Or, consider this variant:

private MyData _myData;
public MyData Foo()
{
 return _myData.Clone( ) as MyData;
}

// call it:
MyData v = Foo();
TotalSum += v.Value;

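Now v is a copy of the original _myData. If MyData is a reference type, two objects are created on the heap. You no longer have the problem of exposing internal data; instead, you have created an extra object on the heap. If v is a local variable, it quickly becomes garbage, and Clone forces you to use runtime type checking. All in all, it is inefficient.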

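Types used to export data through public methods and properties should be value types. That is not to say that every type returned from a public member should be a value type. The earlier snippet assumed that MyData stores values, and that its responsibility is to store those values.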

But consider this alternative snippet:
private MyType _myType;
public IMyInterface Foo()
{
 return _myType as IMyInterface;
}

// call it:
IMyInterface iMe = Foo();
iMe.DoWork( );

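The _myType variable is still returned from the Foo method. This time, however, instead of accessing the data inside the returned value, the caller uses the object to invoke a method through a defined interface. You are accessing the MyType object not for its data contents but for its behavior. That behavior is expressed through IMyInterface, which many different types can implement. For this example, MyType should be a reference type, not a value type. MyType's responsibilities revolve around its behavior, not its data members.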

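That simple snippet starts to show the distinction: value types store values, and reference types define behavior. Now look a little deeper at how these types are stored in memory and at the performance considerations that follow from the storage model. Consider this class: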

public class C
{
  private MyType _a = new MyType( );
  private MyType _b = new MyType( );

  // Remaining implementation removed.
}

C var = new C();

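How many objects are created? How big are they? It depends. If MyType is a value type, you have made one heap allocation, and its size is twice the size of MyType. However, if MyType is a reference type, you have made three heap allocations: one for the C object, which is 8 bytes (two references, assuming 32-bit pointers), and two more for the MyType objects contained in the C object. The difference arises because value types are stored inline in an object, whereas reference types are not. Each variable of a reference type holds only a reference, and the data it refers to requires a separate allocation.
To drive this point home, consider this allocation: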

MyType [] var = new MyType[ 100 ];

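If MyType is a value type, one allocation of 100 times the size of a MyType object occurs. However, if MyType is a reference type, only one allocation has occurred so far, and every element of the array is null. By the time you have initialized each element of the array, you will have performed 101 allocations, and 101 allocations take more time than one. Allocating a large number of reference types fragments the heap and slows the program down. If you are creating types that are meant to store data values, value types are the way to go.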

The decision to make a value type or a reference type is an important one. Turning a value type into a class is a far-reaching change. Consider this type:

public struct Employee
{
  private string  _name;
  private int     _ID;
  private decimal _salary;

  // Properties elided

  public void Pay( BankAccount b )
  {
    b.Balance += _salary;
  }
}

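This fairly simple type contains one method that lets you pay your employees. Time passes, and the system runs fairly well. Then you decide there are different classes of employees: salespeople get commissions, and managers get bonuses. You decide to change the Employee type into a class: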

public class Employee
{
  private string  _name;
  private int     _ID;
  private decimal _salary;

  // Properties elided

  public virtual void Pay( BankAccount b )
  {
    b.Balance += _salary;
  }
}

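That breaks much of the existing code that uses your Employee struct. Return by value becomes return by reference. Parameters that were passed by value are now passed by reference. The behavior of the following snippet changes drastically: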

Employee e1 = Employees.Find( "CEO" );
e1.Salary += Bonus; // Add one time bonus.
e1.Pay( CEOBankAccount );

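What was a one-time bump in pay to add a bonus has become a permanent raise. Where a copy by value was used, a reference is now in place. The compiler happily makes the change for you, and the CEO is probably happy too. The CFO, on the other hand, will report the bug.
You cannot change your mind about value and reference types after the fact: it changes behavior.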

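This problem occurred because the Employee type no longer follows the guidelines for a value type.
In addition to storing the data elements that define an employee, you have added a responsibility to it: in this example, paying the employee. Responsibilities are the domain of classes. Classes can easily define polymorphic implementations of common responsibilities; structs cannot, and should be limited to storing values.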

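The .NET documentation recommends that you consider the size of a type as a determining factor when choosing between value types and reference types. In reality, a much better factor is how the type is used. Simple structures and plain data carriers are excellent candidates for value types. It is true that value types are more efficient in terms of memory management: there is less heap fragmentation, less garbage, and less indirection.
(Translator's note: "garbage", here and earlier, means objects on the heap that are dead: the program can no longer reach them, and they are only waiting to be reclaimed by the garbage collector. In .NET, "garbage" generally refers to such objects. It is worth reading about the memory management model of the .NET garbage collector.)
More important, value types are copied when they are returned from methods or properties, so there is no danger of exposing references to internal structures. But you pay in terms of features. Value types have very limited support for common object-oriented techniques. You cannot create object hierarchies of value types, and you should consider all value types as though they were sealed. You can create value types that implement interfaces, but that requires boxing, which Item 17 shows causes performance degradation. Think of value types as storage containers, not as objects in the OO sense.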

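You will create more reference types than value types. If you answer yes to all of the following questions, you should create a value type. Compare them with the earlier Employee example: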

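1. Is this type's principal responsibility data storage?
2. Is its public interface defined entirely by properties that access or modify its data members?
3. Am I confident that this type will never have subclasses?
4. Am I confident that this type will never be treated polymorphically?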

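Build low-level data storage types as value types, and build the behavior of your application using reference types.
You get the safety of copying the data that is exported from your class objects. You get the memory usage benefits that come with inline value storage. And you can use standard object-oriented techniques to create the logic of your application. When in doubt about the expected use, use a reference type.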

=================================
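Translator's summary: this item is a bit long and took more time than expected. I had planned to finish it in two or three hours after work, since I had already translated part of it yesterday, but I still ended up working until 11 p.m.
One last remark: this item never really explains what a reference type or a value type is. Of course, a type declared with class is a reference type, and a type declared with struct is a value type. Other kinds of types deserve attention too: what kind of type is an enum? A delegate? An event?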

Item 6: Distinguish Between Value Types and Reference Types
Value types or reference types? Structs or classes? When should you use each? This isn't C++, in which you define all types as value types and can create references to them. This isn't Java, in which everything is a reference type. You must decide how all instances of your type will behave when you create it. It's an important decision to get right the first time. You must live with the consequences of your decision because changing later can cause quite a bit of code to break in subtle ways. It's a simple matter of choosing the struct or class keyword when you create the type, but it's much more work to update all the clients using your type if you change it later.

It's not as simple as preferring one over the other. The right choice depends on how you expect to use the new type. Value types are not polymorphic. They are better suited to storing the data that your application manipulates. Reference types can be polymorphic and should be used to define the behavior of your application. Consider the expected responsibilities of your new type, and from those responsibilities, decide which type to create. Structs store data. Classes define behavior.

The distinction between value types and reference types was added to .NET and C# because of common problems that occurred in C++ and Java. In C++, all parameters and return values were passed by value. Passing by value is very efficient, but it suffers from one problem: partial copying (sometimes called slicing the object). If you use a derived object where a base object is expected, only the base portion of the object gets copied. You have effectively lost all knowledge that a derived object was ever there. Even calls to virtual functions are sent to the base class version.

The Java language responded by more or less removing value types from the language. All user-defined types are reference types. In the Java language, all parameters and return values are passed by reference. This strategy has the advantage of being consistent, but it's a drain on performance. Let's face it, some types are not polymorphic; they were not intended to be. Java programmers pay a heap allocation and an eventual garbage collection for every variable. They also pay an extra time cost to dereference every variable. All variables are references. In C#, you declare whether a new type should be a value type or a reference type using the struct or class keywords. Value types should be small, lightweight types. Reference types form your class hierarchy. This section examines different uses for a type so that you understand all the distinctions between value types and reference types.

To start, this type is used as the return value from a method:

private MyData _myData;
public MyData Foo()
{
 return _myData;
}

// call it:
MyData v = Foo();
TotalSum += v.Value;

 

If MyData is a value type, the return value gets copied into the storage for v. Furthermore, v is on the stack. However, if MyData is a reference type, you've exported a reference to an internal variable. You've violated the principle of encapsulation (see Item 23).
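The difference between returning a copy and returning a reference can be seen in a minimal sketch. The names PointValue, PointRef, and Holder are hypothetical and stand in for the two possible definitions of MyData; they do not come from the original text.

```csharp
using System;

// Hypothetical stand-ins for MyData: one struct, one class.
public struct PointValue { public int Value; }
public class PointRef { public int Value; }

public class Holder
{
    private PointValue _valueData = new PointValue { Value = 1 };
    private PointRef _refData = new PointRef { Value = 1 };

    // Returns a copy of the struct; the caller cannot reach _valueData.
    public PointValue GetValue() { return _valueData; }

    // Returns a reference to the internal object; the caller can modify it.
    public PointRef GetRef() { return _refData; }

    public int InternalValue { get { return _valueData.Value; } }
    public int InternalRef { get { return _refData.Value; } }
}

public static class ReturnDemo
{
    public static void Main()
    {
        Holder h = new Holder();

        PointValue v = h.GetValue();
        v.Value = 42;                         // changes only the caller's copy
        Console.WriteLine(h.InternalValue);   // prints 1

        PointRef r = h.GetRef();
        r.Value = 42;                         // changes the holder's internal object
        Console.WriteLine(h.InternalRef);     // prints 42
    }
}
```

With the struct, the holder's state is untouched by the caller. With the class, the caller has silently modified private state, which is the encapsulation problem the paragraph describes.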

Or, consider this variant:

private MyData _myData;
public MyData Foo()
{
 return _myData.Clone( ) as MyData;
}

// call it:
MyData v = Foo();
TotalSum += v.Value;

 

Now, v is a copy of the original _myData. If MyData is a reference type, two objects are created on the heap. You don't have the problem of exposing internal data. Instead, you've created an extra object on the heap. If v is a local variable, it quickly becomes garbage, and Clone forces you to use runtime type checking. All in all, it's inefficient.

Types that are used to export data through public methods and properties should be value types. But that's not to say that every type returned from a public member should be a value type. There was an assumption in the earlier code snippet that MyData stores values. Its responsibility is to store those values.

But, consider this alternative code snippet:

private MyType _myType;
public IMyInterface Foo()
{
 return _myType as IMyInterface;
}

// call it:
IMyInterface iMe = Foo();
iMe.DoWork( );

 

The _myType variable is still returned from the Foo method. But this time, instead of accessing the data inside the returned value, the object is accessed to invoke a method through a defined interface. You're accessing the MyType object not for its data contents, but for its behavior. That behavior is expressed through the IMyInterface, which can be implemented by multiple different types. For this example, MyType should be a reference type, not a value type. MyType's responsibilities revolve around its behavior, not its data members.

That simple code snippet starts to show you the distinction: Value types store values, and reference types define behavior. Now look a little deeper at how those types are stored in memory and the performance considerations related to the storage models. Consider this class:

public class C
{
  private MyType _a = new MyType( );
  private MyType _b = new MyType( );

  // Remaining implementation removed.
}

C var = new C();

 

How many objects are created? How big are they? It depends. If MyType is a value type, you've made one allocation. The size of that allocation is twice the size of MyType. However, if MyType is a reference type, you've made three allocations: one for the C object, which is 8 bytes (two references, assuming 32-bit pointers), and two more for the MyType objects that are contained in a C object. The difference results because value types are stored inline in an object, whereas reference types are not. Each variable of a reference type holds a reference, and the storage requires extra allocation.

To drive this point home, consider this allocation:

MyType [] var = new MyType[ 100 ];

 

If MyType is a value type, one allocation of 100 times the size of a MyType object occurs. However, if MyType is a reference type, one allocation just occurred. Every element of the array is null. When you initialize each element in the array, you will have performed 101 allocations, and 101 allocations take more time than 1 allocation. Allocating a large number of reference types fragments the heap and slows you down. If you are creating types that are meant to store data values, value types are the way to go.
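A small sketch of the array case follows. ValueItem and RefItem are hypothetical names for the struct and class versions of MyType. The sketch shows the observable difference (inline, usable elements versus null references) rather than measuring allocations directly.

```csharp
using System;

// Hypothetical struct and class versions of MyType.
public struct ValueItem { public int Data; }
public class RefItem { public int Data; }

public static class ArrayDemo
{
    // Counts how many slots of a reference-type array are still null.
    public static int CountNulls(RefItem[] items)
    {
        int nulls = 0;
        foreach (RefItem item in items)
        {
            if (item == null) nulls++;
        }
        return nulls;
    }

    public static void Main()
    {
        // One allocation: all 100 structs live inline in the array.
        ValueItem[] values = new ValueItem[100];
        values[0].Data = 7;                   // usable immediately
        Console.WriteLine(values[0].Data);    // prints 7

        // One allocation so far: 100 null references.
        RefItem[] refs = new RefItem[100];
        Console.WriteLine(CountNulls(refs));  // prints 100

        // 100 further allocations, one per element.
        for (int i = 0; i < refs.Length; i++)
        {
            refs[i] = new RefItem();
        }
        Console.WriteLine(CountNulls(refs));  // prints 0
    }
}
```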

The decision to make a value type or a reference type is an important one. It is a far-reaching change to turn a value type into a class type. Consider this type:

public struct Employee
{
  private string  _name;
  private int     _ID;
  private decimal _salary;

  // Properties elided

  public void Pay( BankAccount b )
  {
    b.Balance += _salary;
  }
}

 

This fairly simple type contains one method to let you pay your employees. Time passes, and the system runs fairly well. Then you decide that there are different classes of Employees: Salespeople get commissions, and managers get bonuses. You decide to change the Employee type into a class:

public class Employee
{
  private string  _name;
  private int     _ID;
  private decimal _salary;

  // Properties elided

  public virtual void Pay( BankAccount b )
  {
    b.Balance += _salary;
  }
}

 

That breaks much of the existing code that uses your Employee struct. Return by value becomes return by reference. Parameters that were passed by value are now passed by reference. The behavior of this little snippet changed drastically:

Employee e1 = Employees.Find( "CEO" );
e1.Salary += Bonus; // Add one time bonus.
e1.Pay( CEOBankAccount );

 

What was a one-time bump in pay to add a bonus just became a permanent raise. Where a copy by value had been used, a reference is now in place. The compiler happily makes the changes for you. The CEO is probably happy, too. The CFO, on the other hand, will report the bug. You just can't change your mind about value and reference types after the fact: It changes behavior.
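The change in behavior can be reproduced with a short sketch. EmployeeValue and EmployeeRef are hypothetical, simplified versions of the struct and class forms of Employee, and a Dictionary stands in for the Employees collection, whose real implementation the text does not show.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified struct and class versions of Employee.
public struct EmployeeValue { public decimal Salary; }
public class EmployeeRef { public decimal Salary; }

public static class BonusDemo
{
    // Struct version: the lookup returns a copy, so the stored salary is unchanged.
    public static decimal StoredSalaryAfterBonusStruct()
    {
        var staff = new Dictionary<string, EmployeeValue>();
        staff["CEO"] = new EmployeeValue { Salary = 1000m };

        EmployeeValue e1 = staff["CEO"];   // copy of the stored value
        e1.Salary += 500m;                 // one-time bonus on the copy only
        return staff["CEO"].Salary;
    }

    // Class version: the lookup returns a reference, so the stored salary changes.
    public static decimal StoredSalaryAfterBonusClass()
    {
        var staff = new Dictionary<string, EmployeeRef>();
        staff["CEO"] = new EmployeeRef { Salary = 1000m };

        EmployeeRef e1 = staff["CEO"];     // reference to the stored object
        e1.Salary += 500m;                 // the "bonus" is now permanent
        return staff["CEO"].Salary;
    }

    public static void Main()
    {
        Console.WriteLine(StoredSalaryAfterBonusStruct());  // prints 1000
        Console.WriteLine(StoredSalaryAfterBonusClass());   // prints 1500
    }
}
```

The calling code is identical in both methods apart from the type name, which is why the compiler accepts the change without complaint.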

This problem occurred because the Employee type no longer follows the guidelines for a value type. In addition to storing the data elements that define an employee, you've added responsibilities: in this example, paying the employee. Responsibilities are the domain of class types. Classes can define polymorphic implementations of common responsibilities easily; structs cannot and should be limited to storing values.

The documentation for .NET recommends that you consider the size of a type as a determining factor between value types and reference types. In reality, a much better factor is the use of the type. Types that are simple structures or data carriers are excellent candidates for value types. It's true that value types are more efficient in terms of memory management: There is less heap fragmentation, less garbage, and less indirection. More important, value types are copied when they are returned from methods or properties. There is no danger of exposing references to internal structures. But you pay in terms of features. Value types have very limited support for common object-oriented techniques. You cannot create object hierarchies of value types. You should consider all value types as though they were sealed. You can create value types that implement interfaces, but that requires boxing, which Item 17 shows causes performance degradation. Think of value types as storage containers, not objects in the OO sense.
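The boxing cost mentioned above has a visible side effect as well as a performance cost: converting a struct to an interface type copies it into a new heap object. The sketch below illustrates that, using a hypothetical ICounter interface and Counter struct that are not part of the original text.

```csharp
using System;

// Hypothetical interface and struct, used only to illustrate boxing.
public interface ICounter
{
    void Increment();
    int Count { get; }
}

public struct Counter : ICounter
{
    private int _count;
    public void Increment() { _count++; }
    public int Count { get { return _count; } }
}

public static class BoxingDemo
{
    public static void Main()
    {
        Counter c = new Counter();

        // Converting the struct to the interface type boxes it:
        // a copy is placed in a new object on the heap.
        ICounter boxed = c;
        boxed.Increment();                 // modifies the boxed copy

        Console.WriteLine(c.Count);        // prints 0
        Console.WriteLine(boxed.Count);    // prints 1
    }
}
```

The original variable and the boxed copy drift apart, which is one more reason to treat value types as simple storage rather than as polymorphic objects.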

You'll create more reference types than value types. If you answer yes to all these questions, you should create a value type. Compare these to the previous Employee example:

Is this type's principal responsibility data storage?

Is its public interface defined entirely by properties that access or modify its data members?

Am I confident that this type will never have subclasses?

Am I confident that this type will never be treated polymorphically?

Build low-level data storage types as value types. Build the behavior of your application using reference types. You get the safety of copying data that gets exported from your class objects. You get the memory usage benefits that come with stack-based and inline value storage, and you can utilize standard object-oriented techniques to create the logic of your application. When in doubt about the expected use, use a reference type.

 

 
