Effective C# Item 31: Prefer Small, Simple Functions (Translation)

Item 31: Prefer Small, Simple Functions

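As experienced programmers, whatever language we used before C#, we have all picked up practices for writing efficient code. Sometimes the habits that served us well in an earlier environment work against us in .NET. This is especially true when you try to hand-optimize code. Those efforts often prevent the JIT compiler from applying its most effective optimizations, so the extra work you do in the name of performance actually produces slower code. You are better off writing the clearest code you can and leaving the rest to the JIT compiler. One of the most common examples of premature optimization is building one long, complicated function in the hope of avoiding function calls. In practice, hoisting function logic into loop bodies this way hurts .NET programs. That runs against intuition, so let's look at the details.

The introduction to this chapter gives a simplified account of how the JIT compiler works. The .NET runtime invokes the JIT compiler to translate the IL produced by the C# compiler into machine code. That work is spread across the lifetime of the running program. Rather than compiling the whole application at startup, the CLR invokes the JIT compiler one function at a time. This keeps the startup cost reasonable, and it also keeps the application from stalling later because a large amount of code is waiting to be compiled. Functions that are never called are never JIT-compiled. You can reduce the amount of unneeded code that gets compiled by factoring your code into more, smaller functions rather than fewer, larger ones. Consider this contrived example: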

public string BuildMsg( bool takeFirstPath )
{
  StringBuilder msg = new StringBuilder( );
  if ( takeFirstPath )
  {
    msg.Append( "A problem occurred." );
    msg.Append( "\nThis is a problem." );
    msg.Append( "imagine much more text" );
  } else
  {
    msg.Append( "This path is not so bad." );
    msg.Append( "\nIt is only a minor inconvenience." );
    msg.Append( "Add more detailed diagnostics here." );
  }
  return msg.ToString( );
}

The first time BuildMsg is called, both branches are compiled, even though only one is needed. Now suppose you wrote the code this way:

public string BuildMsg( bool takeFirstPath )
{
  if ( takeFirstPath )
  {
    return FirstPath( );
  } else
  {
    return SecondPath( );
  }
}

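Because the body of each branch has been factored into its own small function, the JIT compiler compiles each one only when it is needed, rather than on the first call to BuildMsg. Admittedly, this example is contrived and the difference here is small. But think about how often you write something heavier: an if statement with 20 or more statements in each branch. You pay to JIT both branches the first time the function is entered. If one branch is an unlikely error condition, you incur a cost you could easily have avoided. Smaller functions mean the JIT compiler compiles only the logic it needs, not long stretches of code that will not be used right away. For long switch statements whose case bodies are written inline, the savings multiply when each case body is moved into its own function.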

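Smaller, simpler functions also make enregistration easier for the JIT compiler. Enregistration is the process of choosing which local variables can be kept in registers rather than on the stack. Creating fewer local variables gives the JIT compiler a better chance of placing the best candidates in registers. Simple control flow also affects how well the JIT compiler can enregister variables. If a function has one loop, the loop variable will probably be enregistered. When a function contains several loops, however, the JIT compiler has to make some hard choices about which variables to enregister. Simpler is better: a small, simple function is likely to have only a few variables, which makes it easy for the JIT compiler to optimize register use.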

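The JIT compiler also decides which methods to inline. Inlining means substituting the body of a function for the call to it. Consider this example: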

// readonly name property:
private string _name;
public string Name
{
  get
  {
    return _name;
  }
}

// access:
string val = Obj.Name;

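The body of the property accessor contains fewer instructions than the code needed to call a function: saving register state, running the method prologue and epilogue, and storing the return value. If there were arguments, pushing them onto the stack would add even more work. Far fewer machine instructions would be generated if you wrote this instead: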

string val = Obj._name;

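Of course, you would not do that, because you know better than to create public data members (see Item 1). The JIT compiler understands that you want both efficiency and elegance, so it inlines the property accessor. The JIT compiler inlines a method when replacing the call with the function body is a win for speed, size, or both. The standard does not define exact rules for inlining, and any existing implementation may change in the future. Besides, inlining functions is not your job. The C# language provides no keyword for hinting to the compiler that a method should be inlined, and the C# compiler gives the JIT compiler no inlining hints either. What you can do is keep your code as clear as possible so that the JIT compiler can easily make the best decision. The recommendation should sound familiar by now: smaller methods are better candidates for inlining. Remember, though, that virtual methods and functions containing try/catch blocks cannot be inlined, however small they are.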

Inlining modifies the rule that code is JIT-compiled only when it is about to run. Consider the Name property access again:

string val = "Default Name";
if ( Obj != null )
  val = Obj.Name;

If the JIT compiler inlines the property accessor, it has to JIT that code when the containing method is called.

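It is not your responsibility to work out the best machine-level representation of your algorithms. The C# compiler and the JIT compiler do that for you together. The C# compiler generates IL for each method, and the JIT compiler translates that IL into machine instructions on the target machine. Do not worry too much about the exact rules the JIT compiler applies in each case; they will change over time as better algorithms are developed. Instead, concentrate on expressing your algorithms in a way that lets the tools in the environment do their best work. Fortunately, those rules (translator's note: the rules the JIT works by) agree with what is already good software development practice. One more time: use small, simple functions.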

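Remember that your C# code becomes machine-executable instructions in two steps. The C# compiler generates IL, delivered in assemblies. The JIT compiler then generates machine code as needed, one method at a time (or one group of methods, when inlining is involved). Small functions make it much easier for the JIT compiler to amortize that cost. Small functions are also more likely to be inlining candidates. Size is not the whole story: simple control flow matters just as much. Fewer control branches inside a function make it easier for the JIT compiler to enregister variables. Writing clear code is not just good practice; it is also how you create code that runs more efficiently.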

================================
   

Item 31: Prefer Small, Simple Functions
As experienced programmers, in whatever language we favored before C#, we internalized several practices for developing more efficient code. Sometimes what worked in our previous environment is counterproductive in the .NET environment. This is very true when you try to hand-optimize algorithms for the C# compiler. Your actions often prevent the JIT compiler from more effective optimizations. Your extra work, in the name of performance, actually generates slower code. You're better off writing the clearest code you can create. Let the JIT compiler do the rest. One of the most common examples of premature optimizations causing problems is when you create longer, more complicated functions in the hopes of avoiding function calls. Practices such as hoisting function logic into the bodies of loops actually harm the performance of your .NET applications. It's counterintuitive, so let's go over all the details.

This chapter's introduction contains a simplified discussion of how the JIT compiler performs its work. The .NET runtime invokes the JIT compiler to translate the IL generated by the C# compiler into machine code. This task is amortized across the lifetime of your program's execution. Instead of JITing your entire application when it starts, the CLR invokes the JITer on a function-by-function basis. This minimizes the startup cost to a reasonable level, yet keeps the application from becoming unresponsive later when more code needs to be JITed. Functions that do not ever get called do not get JITed. You can minimize the amount of extraneous code that gets JITed by factoring code into more, smaller functions rather than fewer larger functions. Consider this rather contrived example:

public string BuildMsg( bool takeFirstPath )
{
  StringBuilder msg = new StringBuilder( );
  if ( takeFirstPath )
  {
    msg.Append( "A problem occurred." );
    msg.Append( "\nThis is a problem." );
    msg.Append( "imagine much more text" );
  } else
  {
    msg.Append( "This path is not so bad." );
    msg.Append( "\nIt is only a minor inconvenience." );
    msg.Append( "Add more detailed diagnostics here." );
  }
  return msg.ToString( );
}

 

The first time BuildMsg gets called, both paths are JITed. Only one is needed. But suppose you rewrote the function this way:

public string BuildMsg( bool takeFirstPath )
{
  if ( takeFirstPath )
  {
    return FirstPath( );
  } else
  {
    return SecondPath( );
  }
}
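
The rewritten BuildMsg calls two helpers, FirstPath and SecondPath, that the text does not show. A minimal sketch of what they might look like, assuming they simply carry the two branch bodies from the first version (the class name MessageBuilder and both helper bodies are illustrative, not taken from the book):

```csharp
using System;
using System.Text;

public class MessageBuilder
{
    // Small dispatcher: it contains only the test and two calls,
    // so JIT-compiling it does not drag in either message body.
    public string BuildMsg(bool takeFirstPath)
    {
        if (takeFirstPath)
        {
            return FirstPath();
        }
        else
        {
            return SecondPath();
        }
    }

    // Hypothetical helper: the body of the original "if" branch.
    private string FirstPath()
    {
        StringBuilder msg = new StringBuilder();
        msg.Append("A problem occurred.");
        msg.Append("\nThis is a problem.");
        msg.Append("imagine much more text");
        return msg.ToString();
    }

    // Hypothetical helper: the body of the original "else" branch.
    private string SecondPath()
    {
        StringBuilder msg = new StringBuilder();
        msg.Append("This path is not so bad.");
        msg.Append("\nIt is only a minor inconvenience.");
        msg.Append("Add more detailed diagnostics here.");
        return msg.ToString();
    }
}
```

Calling new MessageBuilder().BuildMsg(false) exercises only SecondPath; whether FirstPath is ever compiled then depends on whether some later call takes that branch, and on the inlining decisions of the particular runtime.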

 

Because the body of each clause has been factored into its own function, that function can be JITed on demand rather than the first time BuildMsg is called. Yes, this example is contrived for space, and it won't make much difference. But consider how often you write more extensive examples: an if statement with 20 or more statements in both branches of the if statement. You'll pay to JIT both clauses the first time the function is entered. If one clause is an unlikely error condition, you'll incur a cost that you could easily avoid. Smaller functions mean that the JIT compiler compiles the logic that's needed, not lengthy sequences of code that won't be used immediately. The JIT cost savings multiply for long switch statements that define the body of each case inline: move each case body into its own function, and only the cases actually reached get JITed.
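
The same factoring applies to a switch statement. The following is only a sketch under assumed names (Severity, Reporter, and the Describe* helpers are hypothetical): the switch keeps nothing but the dispatch, and each case body lives in its own small function.

```csharp
using System;

// Hypothetical enum used only for this illustration.
public enum Severity { Info, Warning, Error }

public static class Reporter
{
    // The switch holds only the dispatch; each case body is a separate
    // function that the runtime can compile when that case is first hit.
    public static string Describe(Severity level)
    {
        switch (level)
        {
            case Severity.Info:
                return DescribeInfo();
            case Severity.Warning:
                return DescribeWarning();
            case Severity.Error:
                return DescribeError();
            default:
                throw new ArgumentOutOfRangeException(nameof(level));
        }
    }

    // In real code each of these would hold the lengthy case body.
    private static string DescribeInfo()
    {
        return "Everything is fine.";
    }

    private static string DescribeWarning()
    {
        return "Something looks unusual.";
    }

    private static string DescribeError()
    {
        return "A problem occurred.";
    }
}
```

With bodies this short the helpers may well be inlined, so the benefit the text describes would only show up once the case bodies are long enough to matter.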

Smaller and simpler functions make it easier for the JIT compiler to support enregistration. Enregistration is the process of selecting which local variables can be stored in registers rather than on the stack. Creating fewer local variables gives the JIT compiler a better chance to find the best candidates for enregistration. The simplicity of the control flow also affects how well the JIT compiler can enregister variables. If a function has one loop, that loop variable will likely be enregistered. However, the JIT compiler must make some tough choices about enregistering loop variables when you create a function with several loops. Simpler is better. A smaller function is more likely to contain fewer local variables and make it easier for the JIT compiler to optimize the use of the registers.
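
As an illustration of the "one loop per function" idea, here is a sketch that splits two passes over an array into two small functions. The names Stats, Sum, Max, and Summarize are invented for this example, and whether a given variable really ends up in a register depends on the JIT version and the target CPU; the code only shows the shape that gives the compiler the easiest job.

```csharp
using System;

public static class Stats
{
    // One loop and two locals (sum, i): few candidates compete for registers.
    public static long Sum(int[] values)
    {
        long sum = 0;
        for (int i = 0; i < values.Length; i++)
        {
            sum += values[i];
        }
        return sum;
    }

    // Again one loop and two locals (max, i).
    public static int Max(int[] values)
    {
        if (values.Length == 0)
        {
            throw new ArgumentException("values must not be empty", nameof(values));
        }
        int max = values[0];
        for (int i = 1; i < values.Length; i++)
        {
            if (values[i] > max)
            {
                max = values[i];
            }
        }
        return max;
    }

    // The caller stays free of loops and loop variables altogether.
    public static string Summarize(int[] values)
    {
        return "sum=" + Sum(values) + ", max=" + Max(values);
    }
}
```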

The JIT compiler also makes decisions about inlining methods. Inlining means to substitute the body of a function for the function call. Consider this example:

// readonly name property:
private string _name;
public string Name
{
  get
  {
    return _name;
  }
}

// access:
string val = Obj.Name;

 

The body of the property accessor contains fewer instructions than the code necessary to call the function: saving register states, executing method prologue and epilogue code, and storing the function return value. There would be even more work if arguments needed to be pushed on the stack as well. There would be far fewer machine instructions if you were to write this:

string val = Obj._name;

 

Of course, you would never do that because you know better than to create public data members (see Item 1). The JIT compiler understands your need for both efficiency and elegance, so it inlines the property accessor. The JIT compiler inlines methods when the speed or size benefits (or both) make it advantageous to replace a function call with the body of the called function. The standard does not define the exact rules for inlining, and any implementation could change in the future. Moreover, it's not your responsibility to inline functions. The C# language does not even provide you with a keyword to give a hint to the compiler that a method should be inlined. In fact, the C# compiler does not provide any hints to the JIT compiler regarding inlining. All you can do is ensure that your code is as clear as possible, to make it easier for the JIT compiler to make the best decision possible. The recommendation should be getting familiar by now: Smaller methods are better candidates for inlining. But remember that even small functions that are virtual or that contain try/catch blocks cannot be inlined.
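
The statement about keywords still holds, but it may help to know that the runtime library has a MethodImplAttribute in System.Runtime.CompilerServices. MethodImplOptions.NoInlining has long been available, and MethodImplOptions.AggressiveInlining was added in .NET Framework 4.5, after this item was written. Both are attributes rather than language keywords, and AggressiveInlining is a request that the JIT may decline. The sketch below uses a hypothetical Person class to show the three situations; the plain property needs no attribute at all.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical class used only for this illustration.
public class Person
{
    private readonly string _name;

    public Person(string name)
    {
        _name = name;
    }

    // Small, non-virtual accessor: a natural inlining candidate as written.
    public string Name
    {
        get { return _name; }
    }

    // Asks the JIT to inline if it can; the JIT is free to refuse.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public int NameLength()
    {
        return _name.Length;
    }

    // Tells the JIT never to inline this method, which can be handy
    // when you want a call to stay visible in a profiler or stack trace.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public string NameNotInlined()
    {
        return _name;
    }
}
```

In keeping with the advice of this item, the attribute-free property is the default choice; the attributes are for measured, exceptional cases.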

Inlining modifies the principle that code gets JITed when it will be executed. Consider accessing the name property again:

string val = "Default Name";
if ( Obj != null )
  val = Obj.Name;

 

If the JIT compiler inlines the property accessor, it must JIT that code when the containing method is called.

It's not your responsibility to determine the best machine-level representation of your algorithms. The C# compiler and the JIT compiler together do that for you. The C# compiler generates the IL for each method, and the JIT compiler translates that IL into machine code on the destination machine. You should not be too concerned about the exact rules the JIT compiler uses in all cases; those will change over time as better algorithms are developed. Instead, you should be concerned about expressing your algorithms in a manner that makes it easiest for the tools in the environment to do the best job they can. Luckily, those rules are consistent with the rules you already follow for good software-development practices. One more time: smaller and simpler functions.

Remember that translating your C# code into machine-executable code is a two-step process. The C# compiler generates IL that gets delivered in assemblies. The JIT compiler generates machine code for each method (or group of methods, when inlining is involved), as needed. Small functions make it much easier for the JIT compiler to amortize that cost. Small functions are also more likely to be candidates for inlining. It's not just smallness: Simpler control flow matters just as much. Fewer control branches inside functions make it easier for the JIT compiler to enregister variables. It's not just good practice to write clearer code; it's how you create more efficient code at runtime.
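
If you want to see the on-demand cost for yourself, one rough approach is to time the first call to a method against a later call. This is only a sketch: JitTiming, TimeCall, Run, and Work are invented names, the first measurement also includes compiling TimeCall and anything else touched for the first time, and precompiled images or tiered compilation on newer runtimes can change the picture considerably.

```csharp
using System;
using System.Diagnostics;
using System.Text;

public static class JitTiming
{
    // Runs the action once and returns the elapsed Stopwatch ticks.
    public static long TimeCall(Action action)
    {
        Stopwatch watch = Stopwatch.StartNew();
        action();
        watch.Stop();
        return watch.ElapsedTicks;
    }

    // Call this from your program's entry point.
    public static void Run()
    {
        long first = TimeCall(Work);   // typically includes JIT-compiling Work
        long second = TimeCall(Work);  // Work already has machine code
        Console.WriteLine("first call:  " + first + " ticks");
        Console.WriteLine("second call: " + second + " ticks");
    }

    // Hypothetical workload standing in for a real method.
    private static void Work()
    {
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < 100; i++)
        {
            text.Append(i);
        }
    }
}
```

Treat the numbers as a rough indication only; a single run is noisy, and a proper benchmark would repeat the measurement many times.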
 
   
