Effective C# Item 11: Prefer foreach Loops

Item 11: Prefer foreach Loops

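The C# foreach statement is more than a variation on the do, while, or for loops: it generates the best iteration code for any collection you have. Its definition is tied to the collection interfaces in the .NET Framework, and the compiler generates the best code for the particular collection. When you iterate a collection, use foreach instead of the other looping constructs. Examine these three loops: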

int [] foo = new int[100];

// Loop 1:
foreach ( int i in foo)
  Console.WriteLine( i.ToString( ));

// Loop 2:
for ( int index = 0;  index < foo.Length;  index++ )
  Console.WriteLine( foo[index].ToString( ));

// Loop 3:
int len = foo.Length;
for ( int index = 0;  index < len;  index++ )
  Console.WriteLine( foo[index].ToString( ));

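For current C# compilers (version 1.1 and up), loop 1 is best. At the very least it is less typing, which raises your personal productivity. (The C# 1.0 compiler generated much slower code for loop 1, so loop 2 was best in that version.) Loop 3, which most C and C++ programmers would consider the most efficient, is the worst. Hoisting the Length property out of the loop hinders the JIT compiler's ability to remove the range check from the loop.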

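C# code runs in a safe, managed environment. Every memory location is checked, including array indexes. Taking a few liberties, the actual code for loop 3 looks something like this: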

// Loop 3, as generated by compiler:
int len = foo.Length;
for ( int index = 0;  index < len;  index++ )
{
  if ( index < foo.Length )
    Console.WriteLine( foo[index].ToString( ));
  else
    throw new IndexOutOfRangeException( );
}

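The JIT compiler does not appreciate this kind of help. Your attempt to hoist the Length property out of the loop only made the JIT compiler do more work and generate slower code. One of the CLR's guarantees is that you cannot write code that accesses memory your variables do not own. Before each array element is accessed, the runtime tests the index against the actual array bounds (not your len variable). You have turned one bounds check into two.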

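You still pay for an array index check on every iteration of the loop, and you pay twice. Loops 1 and 2 are faster because the JIT compiler can verify that the loop bounds are safe. Whenever the loop bound is not the array's length, the bounds check is performed on every iteration. (Translator's note: the JIT compiler mentioned here is the one that compiles IL into native code, not the compiler that turns C# or other source code into IL. Unsafe code can bypass these checks and run faster.)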

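The original C# compiler generated very slow code for foreach over arrays because boxing was involved. Boxing is covered at length in Item 17. Arrays are type safe, and foreach now generates different IL for arrays than for other collections. The array version no longer uses the IEnumerator interface, which would require boxing and unboxing: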

IEnumerator it = foo.GetEnumerator( );
while( it.MoveNext( ))
{
  int i = (int) it.Current; // box and unbox here.
  Console.WriteLine( i.ToString( ) );
}

Instead, the foreach statement generates this construct for arrays:

for ( int index = 0;  index < foo.Length;  index++ )
  Console.WriteLine( foo[index].ToString( ));

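(Translator's note: mind the difference between arrays and collections. An array is a single contiguous block of memory allocated once; a collection can grow and be modified dynamically, using whatever internal storage its implementation chooses. The jagged arrays that C# supports are a compromise between the two.)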

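foreach always generates the best code. You do not need to remember which looping construct is most efficient: foreach and the compiler take care of it for you.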

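If efficiency is not enough for you, consider language interop. Some people in the world (yes, the ones using other programming languages) firmly believe that array indexes start at 1, not 0. No matter how hard we try, we will not break them of the habit. The .NET team tried. As a result, this is the kind of initialization code you have to write in C# to get an array that starts at something other than 0: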

// Create a single dimension array.
// Its range is [ 1 .. 5 ]
Array test = Array.CreateInstance( typeof( int ),
new int[ ]{ 5 }, new int[ ]{ 1 });

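This code should be enough to make anybody cringe (translator's note: it does make me cringe a little). But some people are stubborn; try as you might, they will start counting at 1. Luckily, this is one of those problems you can foist off on the compiler. Iterate the test array using foreach: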
foreach( int j in test )
  Console.WriteLine ( j );

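The foreach statement knows how to check the array's upper and lower bounds, so you do not have to, and it is just as fast as a hand-coded for loop no matter which lower bound someone chooses.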

foreach gives you the same benefits for multidimensional arrays. Suppose you are creating a chessboard. You would write these two fragments:

private Square[,] _theBoard = new Square[ 8, 8 ];

// elsewhere in code:
for ( int i = 0; i < _theBoard.GetLength( 0 ); i++ )
  for( int j = 0; j < _theBoard.GetLength( 1 ); j++ )
    _theBoard[ i, j ].PaintSquare( );

Instead, you can paint the board this simply:
foreach( Square sq in _theBoard )
  sq.PaintSquare( );
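(Translator's note: I do not favor this approach. It hides the logical relationship between the array's rows and columns. The iteration is row-major, so if that is not the order you need, this loop is a poor fit.)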

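The foreach statement generates the proper code to iterate across all dimensions of the array. If you build a 3D chessboard in the future, the foreach loop works unchanged, while the other loop needs this modification: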
for ( int i = 0; i < _theBoard.GetLength( 0 ); i++ )
  for( int j = 0; j < _theBoard.GetLength( 1 ); j++ )
    for( int k = 0; k < _theBoard.GetLength( 2 ); k++ )
      _theBoard[ i, j, k ].PaintSquare( );
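(Translator's note: this looks like a lot of code, but I think any programmer can see at a glance that it loops over a three-dimensional array, whereas nobody can tell at a glance what the foreach version is doing. That is my personal view. It depends on how you look at it, of course; you could equally call this an advantage of foreach.)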

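In fact, the foreach loop also works on a multidimensional array whose lower bound differs in each dimension. (Translator's note: this is not the same thing as a jagged array, which is an array of arrays.) I do not want to write that kind of code, even as an example. But when someone else writes that kind of collection, foreach can handle it.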

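foreach also gives you flexibility: if you later find that you need to change the underlying data structure from an array, it keeps as much of your code intact as possible. We started this discussion with a simple array: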

int [] foo = new int[100];

Suppose that, at some later point, you realize you need capabilities that the array class does not provide. You can simply change the array to an ArrayList:

// Set the initial size:
ArrayList foo = new ArrayList( 100 );

Any hand-coded for loops are broken:
int sum = 0;
for ( int index = 0;
  // won't compile: ArrayList uses Count, not Length
  index < foo.Length;
  index++ )
  // won't compile: foo[ index ] is object, not int.
  sum += foo[ index ];

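The foreach loop, however, compiles to different code depending on the object it operates on and casts each element to the proper type automatically. No changes are needed. This is not limited to the standard collection classes, either; any collection type can be used with foreach.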

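If your collection follows the .NET environment's rules, users of your type can iterate it with foreach. For the foreach statement to treat a class as a collection type, the class must have one of several properties: a public GetEnumerator() method makes a collection class; explicitly implementing the IEnumerable interface creates a collection type; implementing the IEnumerator interface creates a collection type. foreach works with any of them.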

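foreach has one more benefit, concerning resource management. The IEnumerable interface contains one method: GetEnumerator(). On an enumerable type, the foreach statement generates the following code, with some optimizations: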
IEnumerator it = foo.GetEnumerator( ) as IEnumerator;
using ( IDisposable disp = it as IDisposable )
{
  while ( it.MoveNext( ))
  {
    int elem = ( int ) it.Current;
    sum += elem;
  }
}

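If the compiler can determine for certain whether the enumerator implements IDisposable, it automatically optimizes the code in the finally clause. What matters more for you is that, no matter what, foreach generates correct code.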

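foreach is a very versatile statement. It generates the right code for the upper and lower bounds of arrays, iterates multidimensional arrays, coerces elements to the proper type (using the most efficient construct), and, most important of all, generates the most efficient looping construct. It is the best way to iterate collections. With it, the code you write is more likely to last (translator's note: that is, it will not need large changes because of errors) and is simpler to write in the first place. It is a small productivity gain, but it adds up over time.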

=========================

Item 11: Prefer foreach Loops
The C# foreach statement is more than just a variation of the do, while, or for loops. It generates the best iteration code for any collection you have. Its definition is tied to the collection interfaces in the .NET Framework, and the C# compiler generates the best code for the particular type of collection. When you iterate collections, use foreach instead of other looping constructs. Examine these three loops:

int [] foo = new int[100];

// Loop 1:
foreach ( int i in foo)
  Console.WriteLine( i.ToString( ));

// Loop 2:
for ( int index = 0;
  index < foo.Length;
  index++ )
  Console.WriteLine( foo[index].ToString( ));

// Loop 3:
int len = foo.Length;
for ( int index = 0;
  index < len;
  index++ )
  Console.WriteLine( foo[index].ToString( ));

 

For the current and future C# compilers (version 1.1 and up), loop 1 is best. It's even less typing, so your personal productivity goes up. (The C# 1.0 compiler produced much slower code for loop 1, so loop 2 is best in that version.) Loop 3, the construct most C and C++ programmers would view as most efficient, is the worst option. By hoisting the Length property access out of the loop, you make a change that hinders the JIT compiler's chance to remove range checking inside the loop.

C# code runs in a safe, managed environment. Every memory location is checked, including array indexes. Taking a few liberties, the actual code for loop 3 is something like this:

// Loop 3, as generated by compiler:
int len = foo.Length;
for ( int index = 0;
  index < len;
  index++ )
{
  if ( index < foo.Length )
    Console.WriteLine( foo[index].ToString( ));
  else
    throw new IndexOutOfRangeException( );
}

 

The JIT compiler just doesn't like you trying to help it this way. Your attempt to hoist the Length property access out of the loop just made the JIT compiler do more work to generate even slower code. One of the CLR guarantees is that you cannot write code that overruns the memory that your variables own. The runtime generates a test of the actual array bounds (not your len variable) before accessing each particular array element. You get one bounds check for the price of two.

You still pay to check the array index on every iteration of the loop, and you do so twice. The reason loops 1 and 2 are faster is that the C# compiler and the JIT compiler can verify that the bounds of the loop are guaranteed to be safe. Anytime the loop variable is not the length of the array, the bounds check is performed on each iteration.
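The claim about the three loop shapes is easy to measure rather than take on faith. The following is a minimal sketch, assuming a hypothetical LoopTiming class and an arbitrary array size; the method names are illustrative and not from the original text. How the three loops rank depends on the runtime and JIT version in use, so treat the timings as something to observe on your own machine.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper class: the names here are illustrative only.
public static class LoopTiming
{
    // Loop 1: foreach over the array.
    public static long SumForeach(int[] data)
    {
        long sum = 0;
        foreach (int value in data)
            sum += value;
        return sum;
    }

    // Loop 2: for loop that tests against data.Length directly.
    public static long SumForLength(int[] data)
    {
        long sum = 0;
        for (int index = 0; index < data.Length; index++)
            sum += data[index];
        return sum;
    }

    // Loop 3: for loop with the length hoisted into a local variable.
    public static long SumForHoisted(int[] data)
    {
        long sum = 0;
        int len = data.Length;
        for (int index = 0; index < len; index++)
            sum += data[index];
        return sum;
    }

    public static void Main()
    {
        // Arbitrary size, chosen only so the loops take measurable time.
        int[] data = new int[10000000];
        for (int i = 0; i < data.Length; i++)
            data[i] = i % 7;

        Stopwatch watch = Stopwatch.StartNew();
        long a = SumForeach(data);
        Console.WriteLine("foreach:       {0} ms (sum {1})", watch.ElapsedMilliseconds, a);

        watch = Stopwatch.StartNew();
        long b = SumForLength(data);
        Console.WriteLine("for (Length):  {0} ms (sum {1})", watch.ElapsedMilliseconds, b);

        watch = Stopwatch.StartNew();
        long c = SumForHoisted(data);
        Console.WriteLine("for (hoisted): {0} ms (sum {1})", watch.ElapsedMilliseconds, c);
    }
}
```

All three methods compute the same sum; only the loop shape differs. A single timed pass like this includes JIT warm-up, so running each method more than once gives steadier numbers.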

The reason that foreach and arrays generated very slow code in the original C# compiler concerns boxing, which is covered extensively in Item 17. Arrays are type safe. foreach now generates different IL for arrays than other collections. The array version does not use the IEnumerator interface, which would require boxing and unboxing:

IEnumerator it = foo.GetEnumerator( );
while( it.MoveNext( ))
{
  int i = (int) it.Current; // box and unbox here.
  Console.WriteLine( i.ToString( ) );
}

 

Instead, the foreach statement generates this construct for arrays:

for ( int index = 0;
  index < foo.Length;
  index++ )
  Console.WriteLine( foo[index].ToString( ));

 

foreach always generates the best code. You don't need to remember which construct generates the most efficient looping construct: foreach and the compiler will do it for you.

If efficiency isn't enough for you, consider language interop. Some folks in the world (yes, most of them use other programming languages) strongly believe that index variables start at 1, not 0. No matter how much we try, we won't break them of this habit. The .NET team tried. You have to write this kind of initialization in C# to get an array that starts at something other than 0:

// Create a single dimension array.
// Its range is [ 1 .. 5 ]
Array test = Array.CreateInstance( typeof( int ),
new int[ ]{ 5 }, new int[ ]{ 1 });

 

This code should be enough to make anybody cringe and just write arrays that start at 0. But some people are stubborn. Try as you might, they will start counting at 1. Luckily, this is one of those problems that you can foist off on the compiler. Iterate the test array using foreach:

foreach( int j in test )
  Console.WriteLine ( j );

 

The foreach statement knows how to check the upper and lower bounds on the array, so you don't have to, and it's just as fast as a hand-coded for loop, no matter what different lower bound someone decides to use.
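To see what foreach saves you here, compare it with the for loop you would otherwise have to write against a one-based array. This is a small sketch; the LowerBoundDemo class and its method names are made up for illustration, while Array.CreateInstance, GetLowerBound, GetUpperBound, GetValue, and SetValue are the standard System.Array members.

```csharp
using System;

// Hypothetical demo class: the names are illustrative only.
public static class LowerBoundDemo
{
    // Builds a one-dimensional int array whose valid indexes are [1 .. 5]
    // and fills it with 10, 20, 30, 40, 50.
    public static Array CreateOneBased()
    {
        Array test = Array.CreateInstance(typeof(int),
            new int[] { 5 }, new int[] { 1 });
        for (int index = test.GetLowerBound(0); index <= test.GetUpperBound(0); index++)
            test.SetValue(index * 10, index);
        return test;
    }

    // Hand-coded loop: must ask the array for both bounds.
    public static int SumWithFor(Array values)
    {
        int sum = 0;
        for (int index = values.GetLowerBound(0); index <= values.GetUpperBound(0); index++)
            sum += (int)values.GetValue(index);
        return sum;
    }

    // foreach: no bounds appear in the code at all.
    public static int SumWithForeach(Array values)
    {
        int sum = 0;
        foreach (int value in values)
            sum += value;
        return sum;
    }

    public static void Main()
    {
        Array test = CreateOneBased();
        Console.WriteLine(SumWithFor(test));      // 150
        Console.WriteLine(SumWithForeach(test));  // 150
    }
}
```

A for loop written as `for (int i = 0; i < test.Length; i++)` would throw on this array, because index 0 is out of range.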

foreach adds other language benefits for you. The loop variable is read-only: You can't replace the objects in a collection using foreach. Also, there is explicit casting to the correct type. If the collection contains the wrong type of objects, the iteration throws an exception.
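The casting behavior can be shown with a short sketch. The CastDemo class and its methods are hypothetical names used only for this illustration; the point is that the per-element cast in `foreach (int value in items)` fails at run time with an InvalidCastException when an element has the wrong type.

```csharp
using System;
using System.Collections;

// Hypothetical demo class: the names are illustrative only.
public static class CastDemo
{
    public static int SumInts(IEnumerable items)
    {
        int sum = 0;
        // Each element is cast to int as it is read.
        // Assigning to 'value' inside the loop would not compile,
        // because the loop variable is read-only.
        foreach (int value in items)
            sum += value;
        return sum;
    }

    // Returns true when iterating the collection as ints throws.
    public static bool ThrowsOnWrongType(IEnumerable items)
    {
        try
        {
            SumInts(items);
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static void Main()
    {
        ArrayList good = new ArrayList();
        good.Add(1);
        good.Add(2);

        ArrayList mixed = new ArrayList();
        mixed.Add(1);
        mixed.Add("two");   // wrong element type

        Console.WriteLine(SumInts(good));            // 3
        Console.WriteLine(ThrowsOnWrongType(mixed)); // True
    }
}
```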

foreach gives you similar benefits for multidimensional arrays. Suppose that you are creating a chess board. You would write these two fragments:

private Square[,] _theBoard = new Square[ 8, 8 ];

// elsewhere in code:
for ( int i = 0; i < _theBoard.GetLength( 0 ); i++ )
  for( int j = 0; j < _theBoard.GetLength( 1 ); j++ )
    _theBoard[ i, j ].PaintSquare( );

 

Instead, you can simplify painting the board this way:

foreach( Square sq in _theBoard )
  sq.PaintSquare( );

 

The foreach statement generates the proper code to iterate across all dimensions in the array. If you make a 3D chessboard in the future, the foreach loop just works. The other loop needs modification:

for ( int i = 0; i < _theBoard.GetLength( 0 ); i++ )
  for( int j = 0; j < _theBoard.GetLength( 1 ); j++ )
    for( int k = 0; k < _theBoard.GetLength( 2 ); k++ )
      _theBoard[ i, j, k ].PaintSquare( );

 

In fact, the foreach loop would work on a multidimensional array that had different lower bounds in each direction. I don't want to write that kind of code, even as an example. But when someone else codes that kind of collection, foreach can handle it.
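For readers who do want to see that case, here is a minimal sketch of a two-dimensional array with a different lower bound in each dimension. The MultiDimDemo class, the bounds, and the cell labels are assumptions chosen for illustration. It also shows the order in which foreach visits the elements: the rightmost index varies fastest, which is worth knowing if your code depends on row and column order.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class: the names and bounds are illustrative only.
public static class MultiDimDemo
{
    // 2 rows starting at index 1, 3 columns starting at index 10.
    // Each cell stores its own "row,col" label.
    public static Array CreateGrid()
    {
        Array grid = Array.CreateInstance(typeof(string),
            new int[] { 2, 3 }, new int[] { 1, 10 });
        for (int row = grid.GetLowerBound(0); row <= grid.GetUpperBound(0); row++)
            for (int col = grid.GetLowerBound(1); col <= grid.GetUpperBound(1); col++)
                grid.SetValue(row + "," + col, row, col);
        return grid;
    }

    // Records the order in which foreach visits the cells.
    public static List<string> VisitOrder(Array grid)
    {
        List<string> order = new List<string>();
        foreach (string cell in grid)
            order.Add(cell);
        return order;
    }

    public static void Main()
    {
        List<string> order = VisitOrder(CreateGrid());
        // Prints: 1,10 1,11 1,12 2,10 2,11 2,12
        Console.WriteLine(string.Join(" ", order.ToArray()));
    }
}
```

The foreach loop needs no knowledge of either lower bound, while the nested for loops must query both bounds of each dimension.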

foreach also gives you the flexibility to keep much of the code intact if you find later that you need to change the underlying data structure from an array. We started this discussion with a simple array:

int [] foo = new int[100];

 

Suppose that, at some later point, you realize that you need capabilities that are not easily handled by the array class. You can simply change the array to an ArrayList:

// Set the initial size:
ArrayList foo = new ArrayList( 100 );

 

Any hand-coded for loops are broken:

int sum = 0;
for ( int index = 0;
  // won't compile: ArrayList uses Count, not Length
  index < foo.Length;
  index++ )
  // won't compile: foo[ index ] is object, not int.
  sum += foo[ index ];

 

However, the foreach loop compiles to different code that automatically casts each operand to the proper type. No changes are needed. It's not just changing to standard collections classes, either; any collection type can be used with foreach.
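The sketch below puts the two declarations side by side, under the assumption of a hypothetical ContainerSwapDemo class with sample values 1, 2, 3. The foreach loop is written identically in both methods; only the declaration and the way the container is filled change.

```csharp
using System;
using System.Collections;

// Hypothetical demo class: the names and values are illustrative only.
public static class ContainerSwapDemo
{
    public static int SumArray()
    {
        int[] foo = new int[] { 1, 2, 3 };

        int sum = 0;
        foreach (int value in foo)   // same loop text as below
            sum += value;
        return sum;
    }

    public static int SumArrayList()
    {
        // Set the initial capacity, then add the same values.
        ArrayList foo = new ArrayList(100);
        foo.Add(1);
        foo.Add(2);
        foo.Add(3);

        int sum = 0;
        foreach (int value in foo)   // same loop text as above
            sum += value;
        return sum;
    }

    public static void Main()
    {
        Console.WriteLine(SumArray());      // 6
        Console.WriteLine(SumArrayList());  // 6
    }
}
```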

Users of your types can use foreach to iterate across members if you support the .NET environment's rules for a collection. For the foreach statement to consider it a collection type, a class must have one of a number of properties. The presence of a public GetEnumerator() method makes a collection class. Explicitly implementing the IEnumerable interface creates a collection type. Implementing the IEnumerator interface creates a collection type. foreach works with any of them.
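As an illustration of the first rule, here is a sketch of a type that foreach accepts purely because it has a public GetEnumerator() method. The Countdown class is hypothetical and does not implement IEnumerable; it uses an iterator method (yield return), which requires C# 2.0 or later and is an assumption beyond the original text.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type: note that it implements no collection interface.
public class Countdown
{
    private readonly int _start;

    public Countdown(int start)
    {
        _start = start;
    }

    // The public GetEnumerator() method is what foreach looks for.
    public IEnumerator<int> GetEnumerator()
    {
        for (int value = _start; value > 0; value--)
            yield return value;
    }
}

// Hypothetical demo class: the names are illustrative only.
public static class PatternDemo
{
    public static List<int> Collect(Countdown countdown)
    {
        List<int> values = new List<int>();
        foreach (int value in countdown)
            values.Add(value);
        return values;
    }

    public static void Main()
    {
        // Prints: 3 2 1
        Console.WriteLine(string.Join(" ", Collect(new Countdown(3))));
    }
}
```

Implementing IEnumerable as well is usually still worthwhile, since library methods that accept a collection ask for the interface rather than the method pattern.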

foreach has one added benefit regarding resource management. The IEnumerable interface contains one method: GetEnumerator(). The foreach statement on an enumerable type generates the following, with some optimizations:

IEnumerator it = foo.GetEnumerator( ) as IEnumerator;
using ( IDisposable disp = it as IDisposable )
{
  while ( it.MoveNext( ))
  {
    int elem = ( int ) it.Current;
    sum += elem;
  }
}

 

The compiler automatically optimizes the code in the finally clause if it can determine for certain whether the enumerator implements IDisposable. But for you, it's more important to see that, no matter what, foreach generates correct code.
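The cleanup behavior can be observed with a small sketch. TrackedEnumerator and TrackedCollection are hypothetical types written for this illustration; the enumerator records whether Dispose() was called, and the loop exits early with break to show that cleanup still runs.

```csharp
using System;
using System.Collections;

// Hypothetical enumerator that yields 1..count and records disposal.
public class TrackedEnumerator : IEnumerator, IDisposable
{
    private readonly int _count;
    private int _current;
    public bool Disposed;

    public TrackedEnumerator(int count)
    {
        _count = count;
    }

    public object Current { get { return _current; } }

    public bool MoveNext()
    {
        _current++;
        return _current <= _count;
    }

    public void Reset() { _current = 0; }

    public void Dispose() { Disposed = true; }
}

// Hypothetical collection that remembers the last enumerator it handed out.
public class TrackedCollection : IEnumerable
{
    public TrackedEnumerator LastEnumerator;

    public IEnumerator GetEnumerator()
    {
        LastEnumerator = new TrackedEnumerator(5);
        return LastEnumerator;
    }
}

public static class DisposeDemo
{
    public static bool DisposedAfterEarlyExit()
    {
        TrackedCollection items = new TrackedCollection();
        foreach (int value in items)
        {
            if (value == 2)
                break;   // leave the loop before the enumerator is exhausted
        }
        return items.LastEnumerator.Disposed;
    }

    public static void Main()
    {
        Console.WriteLine(DisposedAfterEarlyExit());   // True
    }
}
```

A hand-written while loop over MoveNext() would need its own try/finally to give the same guarantee.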

foreach is a very versatile statement. It generates the right code for upper and lower bounds in arrays, iterates multidimensional arrays, coerces the operands into the proper type (using the most efficient construct), and, on top of that, generates the most efficient looping constructs. It's the best way to iterate collections. With it, you'll create code that is more likely to last, and it's simpler for you to write in the first place. It's a small productivity improvement, but it adds up over time.
 
