Effective C# Item 34: Create Large-Grain Web APIs (translation)


Item 34: Create Large-Grain Web APIs

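How costly and inconvenient a communication protocol is determines how you should use the medium. You communicate differently over the phone, by fax, by letter, and by email. Think back to the last time you ordered from a catalog. When you order by phone, you answer a series of questions from the salesperson:

"Can I have your first item?"

"Item number 123-456."

"How many would you like to order?"

"Three."

The questions continue until the salesperson has all the information: your billing address, credit card information, shipping address, and anything else needed to complete the transaction. This back-and-forth over the phone is reassuring. You never talk to yourself for long stretches, and you never sit through a long silence wondering whether the salesperson is still there.

Compare that with ordering by fax. You fill out the whole order form and send the complete document to the company. One document, one transmission. You do not fill in a product number, fax it, add your address, fax again, add your credit card number, and fax again.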

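This illustrates a common pitfall of a poorly defined web method interface. Whether you use a web service or .NET Remoting, remember that the most expensive part of the operation is transferring objects between two remote machines. Do not build a remote API by simply repackaging the interface you use on the local machine. It works, but it is very inefficient. It is like using the phone-call style to handle an order that should go by fax. Your application spends most of its time waiting on the network after each piece of data it sends down the channel. The finer-grained the API, the larger the share of time the application spends waiting for data to come back from the server.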

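Instead, build web-based interfaces around serializing documents or sets of objects between client and server. Your remote communication should work like the order form you fax: the client should be able to work for extended periods without contacting the server. Then, once all the information has been filled in, the client submits the whole document to the server in one step. The server's responses work the same way: when information arrives from the server, the client has everything it needs to complete the task at hand.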

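Staying with the customer-order metaphor, suppose we design a customer order-processing system consisting of a central server and desktop clients that access information over the network. One class in the system is the customer class. If you ignore transport issues, the customer class might look like this, letting client code retrieve or modify the name, shipping address, and account information: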

public class Customer
{
  public Customer( )
  {
  }

  // Properties to access and modify customer fields:
  public string Name
  {
    // get and set details elided.
  }

  public Address shippingAddr
  {
    // get and set details elided.
  }

  public Account creditCardInfo
  {
    // get and set details elided.
  }
}

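This customer class does not contain the kind of API that should be called remotely. Calling a remote customer produces excessive traffic between client and server: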

// create customer on the server.
Customer c = new Server.Customer( );
// round trip to set the name.
c.Name = dlg.Name.Text;
// round trip to set the addr.
c.shippingAddr = dlg.Addr;
// round trip to set the cc card.
c.creditCardInfo = dlg.credit;

Instead, create a complete Customer object locally and send it to the server once all the fields have been filled in:

// create customer on the client.
Customer c = new Customer( );
// Set local copy
c.Name = dlg.Name.Text;
// set the local addr.
c.shippingAddr = dlg.Addr;
// set the local cc card.
c.creditCardInfo = dlg.credit;
// send the finished object to the server. (one trip)
Server.AddCustomer( c );

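The customer example illustrates the idea simply and clearly: transfer whole objects back and forth between client and server. But to write efficient code, you need to extend this simple example so that it includes the right set of related objects. In a remote request, setting a single property of an object is too fine a grain (translator's note: "grain" here means the amount of information carried in one interaction). Yet a single customer instance may not be the right grain for each transfer between client and server either.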

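To bring this example closer to the problems you meet in real designs, we make a few assumptions about the system. The software supports a large online vendor with 1 million customers. Assume it is a major catalog ordering house and that each customer placed, on average, 15 orders in the last year. Each telephone operator uses one machine during a shift and must look up or create customer records whenever he or she answers a call. Your design task is to determine the most efficient set of objects to transfer between client and server.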

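You can start by eliminating some obvious choices. Retrieving every customer and every order is clearly prohibitive: 1 million customers and 15 million order records are far too much data to send to each client. You would simply trade one bottleneck for another. Instead of bombarding the server with a request for every possible data update, you would send one request for 15 million objects. It is only one transaction, but a very inefficient one.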

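Instead, consider how best to retrieve a set of objects that closely approximates the data an operator will need over the next few minutes. An operator answers a call and deals with one customer. During the call, the operator might add or remove orders, change orders, or modify the customer's account information. The obvious choice is to retrieve one customer together with all of that customer's orders. The server method might look like this: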

public OrderData FindOrders( string customerName )
{
  // Search for the customer by name.
  // Find all orders by that customer.
}

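But is that right? Orders that have already been shipped and received by the customer are almost certainly not needed on the client machine. A better answer is to retrieve only the open orders for the requested customer. The server method would change to something like this: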

public OrderData FindOpenOrders( string customerName )
{
  // Search for the customer by name.
  // Find all orders by that customer.
  // Filter out those that have already
  // been received.
}

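The client machine still has to make a new request for every phone call. Is there a way to optimize the communication beyond including orders in the customer download? We add some more assumptions about the business process to give you further ideas. Suppose the call center is partitioned so that each work group takes calls from only one area code. Now you can revise the design and optimize the communication considerably.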

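At the start of a shift, each operator retrieves the up-to-date customer and order information for that one area code. After each call, the client application sends the modified data back to the server, and the server responds with all changes made since this client last requested data. As a result, after every call the operator sends any changes made and receives all changes made by the other operators in the same group. This design means one transaction per call, and each operator should always have the right set of data on hand when answering a call. The server would then have two methods like these: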

public CustomerSet RetrieveCustomerData(
  AreaCode theAreaCode )
{
  // Find all customers for a given area code.
  // Foreach customer in that area code:
    // Find all orders by that customer.
    // Filter out those that have already
    // been received.
  // Return the result.
}

public CustomerSet UpdateCustomer( CustomerData
  updates, DateTime lastUpdate, AreaCode theAreaCode )
{
  // First, save any updates, marking each update
  // with the current time.

  // Next, get the updates:
  // Find all customers for a given area code.
  // Foreach customer in that area code:
    // Find all orders by that customer that have been
    // updated since the last time. Add those to the result.
  // Return the result.
}

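But this may still waste some bandwidth. The last design is most efficient when every known customer calls every day. That is probably not the case. If it is, your company has customer service problems that software cannot solve.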

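How can you further limit the size of each transfer without increasing the number of transactions or the latency of the service rep's response to a customer? You can make some assumptions about which customers in the database are likely to call. You track some statistics and find that customers who have gone six months without an order are very unlikely to order again. So you stop retrieving those customers and their orders at the start of the day, which shrinks the initial transfer. You also find that customers who call shortly after placing an order are usually asking about that last order. So you change the order list sent to the client to include only the most recent order instead of all orders. This need not change the signatures of the server methods, but it shrinks the packets sent to the client.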

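The point of this hypothetical discussion is to get you thinking about communication between remote machines: you want to minimize both the frequency and the size of the transactions between machines. These two goals conflict, and you must trade one off against the other. You should end up near the middle of the two extremes, erring toward fewer, larger transactions.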
============================
   

Item 34: Create Large-Grain Web APIs
The cost and inconvenience of a communication protocol dictates how you should use the medium. You communicate differently using the phone, fax, letters, and email. Think back on the last time you ordered from a catalog. When you order by phone, you engage in a question-and-answer session with the sales staff:

"Can I have your first item?"

"Item number 123-456."

"How many would you like?"

"Three."

This conversation continues until the sales staff has your entire order, your billing address, your credit-card information, your shipping address, and any other information necessary to complete the transaction. It's comforting on the phone to have this back-and-forth discussion. You never give long soliloquies with no feedback. You never endure long periods of silence wondering if the salesperson is still there.

Contrast that with ordering by fax. You fill out the entire document and fax the completed document to the company. One document, one transaction. You do not fill out one product line, fax it, add your address, fax again, add your credit number, and fax again.

This illustrates the common pitfalls of a poorly defined web method interface. Whether you use a web service or .NET Remoting, you must remember that the most expensive part of the operation comes when you transfer objects between distant machines. You must stop creating remote APIs that are simply a repackaging of the same local interfaces that you use. It works, but it reeks of inefficiency. It's using the phone call metaphor to process your catalog request via fax. Your application waits for the network each time you make a round trip to pass a new piece of information through the pipe. The more granular the API is, the higher the percentage of time your application spends waiting for data to return from the server.
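To put rough numbers on that waiting time, here is a back-of-the-envelope sketch. It assumes a deliberately simple model in which total time is round trips times latency plus payload divided by bandwidth. The class name, method name, and all figures are illustrative assumptions, not measurements from the text.

```csharp
using System;

// Hypothetical cost model (an assumption for illustration only):
// total time = roundTrips * latency + payloadBytes / bandwidth.
public static class TransferCost
{
    public static double Seconds(int roundTrips, double latencySeconds,
        double payloadBytes, double bytesPerSecond)
    {
        return roundTrips * latencySeconds + payloadBytes / bytesPerSecond;
    }
}
```

With an assumed 50 ms latency and a 1,000,000 bytes-per-second link, `TransferCost.Seconds(4, 0.05, 1000, 1000000)` comes to about 0.201 seconds, while the same 1,000 bytes sent in one call, `TransferCost.Seconds(1, 0.05, 1000, 1000000)`, comes to about 0.051 seconds. Under this model the latency of the extra round trips, not the payload, dominates.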

Instead, create web-based interfaces based on serializing documents or sets of objects between client and server. Your remote communications should work like the order form you fax to the catalog company: The client machine should be capable of working for extended periods of time without contacting the server. Then, when all the information to complete the transaction is filled in, the client can send the entire document to the server. The server's responses work the same way: When information gets sent from the server to the client, the client receives all the information necessary to complete all the tasks at hand.

Sticking with the customer order metaphor, we'll design a customer order-processing system that consists of a central server and desktop clients accessing information via web services. One class in the system is the customer class. If you ignore the transport issues, the customer class might look something like this, which allows client code to retrieve or modify the name, shipping address, and account information:

public class Customer
{
  public Customer( )
  {
  }

  // Properties to access and modify customer fields:
  public string Name
  {
    // get and set details elided.
  }

  public Address shippingAddr
  {
    // get and set details elided.
  }

  public Account creditCardInfo
  {
    // get and set details elided.
  }
}

 

The customer class does not contain the kind of API that should be called remotely. Calling a remote customer results in excessive traffic between the client and the server:

// create customer on the server.
Customer c = new Server.Customer( );
// round trip to set the name.
c.Name = dlg.Name.Text;
// round trip to set the addr.
c.shippingAddr = dlg.Addr;
// round trip to set the cc card.
c.creditCardInfo = dlg.credit;

 

Instead, you would create a local Customer object and transfer the Customer to the server after all the fields have been set:

// create customer on the client.
Customer c = new Customer( );
// Set local copy
c.Name = dlg.Name.Text;
// set the local addr.
c.shippingAddr = dlg.Addr;
// set the local cc card.
c.creditCardInfo = dlg.credit;
// send the finished object to the server. (one trip)
Server.AddCustomer( c );
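The difference between the two snippets above can be made visible with a small in-process sketch. `FakeServer` and `CustomerDto` are hypothetical stand-ins, not part of the original design: every public method on `FakeServer` represents one network round trip and simply counts it. Only the name `AddCustomer` is taken from the text; the fine-grained setters are assumptions that mimic remote property access.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical data-transfer object: plain data, no behavior.
public class CustomerDto
{
    public string Name { get; set; }
    public string ShippingAddr { get; set; }
    public string CreditCardInfo { get; set; }
}

// Hypothetical stand-in for the remote server. Each public method
// models one network round trip and increments the counter.
public class FakeServer
{
    private readonly Dictionary<int, CustomerDto> customers =
        new Dictionary<int, CustomerDto>();
    private int nextId = 1;

    public int RoundTrips { get; private set; }

    // Fine-grained API: one call to create, then one call per field.
    public int CreateCustomer()
    {
        RoundTrips++;
        int id = nextId++;
        customers[id] = new CustomerDto();
        return id;
    }

    public void SetName(int id, string name)
    {
        RoundTrips++;
        customers[id].Name = name;
    }

    public void SetShippingAddr(int id, string addr)
    {
        RoundTrips++;
        customers[id].ShippingAddr = addr;
    }

    public void SetCreditCardInfo(int id, string cc)
    {
        RoundTrips++;
        customers[id].CreditCardInfo = cc;
    }

    // Large-grain API: the whole document arrives in one call.
    public int AddCustomer(CustomerDto c)
    {
        RoundTrips++;
        int id = nextId++;
        customers[id] = c;
        return id;
    }

    public CustomerDto Lookup(int id)
    {
        return customers[id];
    }
}
```

Creating a customer through the fine-grained methods costs four counted trips (create plus three setters); filling in a `CustomerDto` locally and calling `AddCustomer` costs one.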

 

The customer example illustrates an obvious and simple example: transfer entire objects back and forth between client and server. But to write efficient programs, you need to extend that simple example to include the right set of related objects. Making remote invocations to set a single property of an object is too fine a granularity. But one customer might not be the right granularity for transactions between the client and server, either.

To extend this example into the real-world design issues you'll encounter in your programs, we'll make a few assumptions about the system. This software system supports a major online vendor with more than 1 million customers. Imagine that it is a major catalog ordering house and that each customer has, on average, 15 orders in the last year. Each telephone operator uses one machine during the shift and must look up or create customer records whenever he or she answers the phone. Your design task is to determine the most efficient set of objects to transfer between client machines and the server.

You can begin by eliminating some obvious choices. Retrieving every customer and every order is clearly prohibitive: 1 million customers and 15 million order records are just too much data to bring to each client. You've simply traded one bottleneck for another. Instead of constantly bombarding your server with every possible data update, you send the server a request for more than 15 million objects. Sure, it's only one transaction, but it's a very inefficient transaction.

Instead, consider how you can best retrieve a set of objects that can constitute a good approximation of the set of data that an operator must use for the next several minutes. An operator will answer the phone and be interacting with one customer. During the course of the phone call, that operator might add or remove orders, change orders, or modify a customer's account information. The obvious choice is to retrieve one customer, with all orders that have been placed by that customer. The server method would be something like this:

public OrderData FindOrders( string customerName )
{
  // Search for the customer by name.
  // Find all orders by that customer.
}

 

Or is that right? Orders that have been shipped and received by the customer are almost certainly not needed at the client machine. A better answer is to retrieve only the open orders for the requested customer. The server method would change to something like this:

public OrderData FindOpenOrders( string customerName )
{
  // Search for the customer by name.
  // Find all orders by that customer.
  // Filter out those that have already
  // been received.
}
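One way the filtering step could be filled in is sketched below. It is an assumption-laden illustration: the `Order` type, its `Received` flag, the in-memory `OrderStore`, and the `List<Order>` return type (in place of the `OrderData` type named above, whose shape the text does not show) are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical order record; the Received flag is an assumption.
public class Order
{
    public string CustomerName { get; set; }
    public string Item { get; set; }
    public bool Received { get; set; }
}

// Hypothetical in-memory stand-in for the server's order storage.
public class OrderStore
{
    private readonly List<Order> orders;

    public OrderStore(IEnumerable<Order> orders)
    {
        this.orders = orders.ToList();
    }

    // Returns only the named customer's orders that have not yet
    // been received, so closed orders never cross the wire.
    public List<Order> FindOpenOrders(string customerName)
    {
        return orders
            .Where(o => o.CustomerName == customerName)
            .Where(o => !o.Received)
            .ToList();
    }
}
```

Filtering on the server keeps the request count at one per call while trimming the response to what the operator is likely to need.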

 

You are still making the client machine create a new request for each customer phone call. Are there ways to optimize this communication channel more than including orders in the customer download? We'll add a few more assumptions on the business processes to give you some more ideas. Suppose that the call center is partitioned so that each working team receives calls from only one area code. Now you can modify your design to optimize the communication quite a bit more.

Each operator would retrieve the updated customer and order information for that one area code at the start of the shift. After each call, the client application would push the modified data back to the server, and the server would respond with all changes since the last time this client machine asked for data. The end result is that after every phone call, the operator sends any changes made and retrieves all changes made by any other operator in the same work group. This design means that there is one transaction per phone call, and each operator should always have the right set of data available when he or she answers a call. Now the server contains two methods that would look something like this:

public CustomerSet RetrieveCustomerData(
  AreaCode theAreaCode )
{
  // Find all customers for a given area code.
  // Foreach customer in that area code:
    // Find all orders by that customer.
    // Filter out those that have already
    // been received.
  // Return the result.
}

public CustomerSet UpdateCustomer( CustomerData
  updates, DateTime lastUpdate, AreaCode theAreaCode )
{
  // First, save any updates, marking each update
  // with the current time.

  // Next, get the updates:
  // Find all customers for a given area code.
  // Foreach customer in that area code:
    // Find all orders by that customer that have been
    // updated since the last time. Add those to the result.
  // Return the result.
}
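The save-then-return-changes exchange can be sketched in memory as follows. Everything here is hypothetical: `OrderRecord`, `SyncServer`, the string area code, and the explicit `now` parameter (passed in so the behavior is deterministic) are assumptions chosen for illustration, and the types differ from the `CustomerSet` and `CustomerData` types named above, whose shapes the text does not show.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record; LastModified is stamped by the server.
public class OrderRecord
{
    public int Id { get; set; }
    public string AreaCode { get; set; }
    public string Status { get; set; }
    public DateTime LastModified { get; set; }
}

// Hypothetical in-memory server implementing one exchange per call.
public class SyncServer
{
    private readonly Dictionary<int, OrderRecord> store =
        new Dictionary<int, OrderRecord>();

    public List<OrderRecord> UpdateOrders(IEnumerable<OrderRecord> updates,
        DateTime lastUpdate, string areaCode, DateTime now)
    {
        // First, save the client's changes, marking each with the current time.
        foreach (OrderRecord u in updates)
        {
            u.LastModified = now;
            store[u.Id] = u;
        }

        // Next, return every record in the area code changed after
        // lastUpdate. This includes the records just saved.
        return store.Values
            .Where(o => o.AreaCode == areaCode && o.LastModified > lastUpdate)
            .OrderBy(o => o.Id)
            .ToList();
    }
}
```

A client would remember the `now` value of its last exchange and pass it as `lastUpdate` next time, so each call carries only what changed in between.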

 

But you might still be wasting some bandwidth. Your last design works best when every known customer calls every day. That's probably not true. If it is, your company has customer service problems that are far outside of the scope of a software program.

How can we further limit the size of each transaction without increasing the number of transactions or the latency of the service rep's responsiveness to a customer? You can make some assumptions about which customers in the database are going to place calls. You track some statistics and find that if customers go six months without ordering, they are very unlikely to order again. So you stop retrieving those customers and their orders at the beginning of the day. That shrinks the size of the initial transaction. You also find that any customer who calls shortly after placing an order is usually inquiring about the last order. So you modify the list of orders sent down to the client to include only the last order rather than all orders. This would not change the signatures of the server methods, but it would shrink the size of the packets sent back to the client.
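The two pruning rules just described can be sketched as a single filter. The `CustomerOrders` type, the `DownloadPruner` class, and the choice to drop customers with no orders at all are assumptions made for illustration. The six-month cutoff and the "last order only" rule come from the paragraph above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape: a customer name plus the dates of their orders.
public class CustomerOrders
{
    public string Name { get; set; }
    public List<DateTime> OrderDates { get; set; }
}

public static class DownloadPruner
{
    // Keeps only customers whose latest order falls within the last
    // six months, and sends only that latest order for each of them.
    public static List<CustomerOrders> Prune(
        IEnumerable<CustomerOrders> all, DateTime today)
    {
        DateTime cutoff = today.AddMonths(-6);
        return all
            .Where(c => c.OrderDates.Count > 0 && c.OrderDates.Max() >= cutoff)
            .Select(c => new CustomerOrders
            {
                Name = c.Name,
                OrderDates = new List<DateTime> { c.OrderDates.Max() }
            })
            .ToList();
    }
}
```

Because the pruning happens inside the server, a method that already returns a customer set could apply it without any change to its signature.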

This hypothetical discussion focused on getting you to think about the communication between remote machines: You want to minimize both the frequency and the size of the transactions sent between machines. Those two goals are at odds, and you need to make trade-offs between them. You should end up close to the center of the two extremes, but err toward the side of fewer, larger transactions.
 
   
