Author: Andreas Grabner. Translator: Zhu Yongguang. Source: http://www.infoq.com/cn/articles/SharePoint-Andreas-Grabner
The SharePoint object model allows external applications or hosted web parts to query, edit, and create content stored in the SharePoint content database. Many blog posts, knowledge base articles, and best-practice guides discuss how to use the object model in different scenarios.
The most common use cases revolve around displaying and editing SharePoint lists. Unfortunately, this is also where we see many performance problems, because the SharePoint object model is not always used in the most performant way.
Case 1: How many items are stored in a SharePoint list?
There are multiple ways to answer this question. One example I have seen many times is as follows:
int noOfItems = SPContext.Current.List.Items.Count;
This code tells us the number of items in the list, but to get that number it retrieves every item of the list from the content database. The following shows what the object model does internally when the code above accesses the Count property:
This is not a problem for small lists, where the query is still reasonably fast. It becomes a problem, however, once the list grows by several factors, or when custom code has never been tested against realistic data volumes.
For this scenario Microsoft provides another property on SPList called ItemCount. The correct code is:
int noOfItems = SPContext.Current.List.ItemCount;
In this case SharePoint only queries a single record in the Lists table of the content database. The number of items in a list is stored there redundantly, so this information can be obtained without querying the whole AllUserData table (where all SharePoint list items are stored).
Case 2: Use SPList to display items in a list
The SharePoint object model provides multiple ways to iterate over the items of a SharePoint list. I once saw the following approach in a production SharePoint application. It may work fine on a developer machine or against a very small list, but once it runs against a list with more than a few hundred items it causes a serious performance problem. Let's look at the code snippet, which is used in a web part to fetch the first 100 items from the SharePoint list of the current context:
SPList activeList = SPContext.Current.List;
for(int i=0;i<100 && i<activeList.Items.Count;i++) {
SPListItem listItem = activeList.Items[i];
htmlWriter.Write(listItem["Title"]);
}
Assume the list contains at least 100 items. How many times does this code hit the database to fetch the Title values of the first 100 list items? You may be surprised: analyzing the execution of the code above and looking at the database activity shows a total of 200 calls to the database:
The reason is that every loop iteration requests a new SPListItemCollection object when it accesses the Items property. The Items property is not cached, so each access fetches all items from the database again. The following shows the first loop iteration:
Correct Method
The correct approach is to store the return value of the Items property in an SPListItemCollection variable. That way the database is queried only once, and we iterate over the result set held by the collection object. The modified sample code looks like this:
SPListItemCollection items = SPContext.Current.List.Items;
for(int i=0;i<100 && i<items.Count;i++) {
SPListItem listItem = items[i];
htmlWriter.Write(listItem["Title"]);
}
You can also use a foreach loop, which compiles down to similar code and makes use of the IEnumerable implementation of the Items collection. The following shows how the new loop is processed internally:
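A minimal sketch of that foreach variant, again fetching the collection only once into a local variable before iterating:
// The collection is fetched from the database once; the loop then works on the in-memory result set.
SPListItemCollection items = SPContext.Current.List.Items;
foreach (SPListItem listItem in items)
{
htmlWriter.Write(listItem["Title"]);
}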
Case 3: Use SPQuery and SPView to request only the data you actually need
One of the biggest performance problems we see in any kind of application that processes data from a database is that too much data is accessed. Requesting more information than the current use case actually needs causes additional:
- Database query overhead to collect the requested information
- Communication overhead between the database and the application
- Memory overhead in both the database and the application
Looking back at the two previous cases, you will find that the executed SQL statement always selects all items of the requested SharePoint list. You can see that the SELECT clause is written as: SELECT TOP 2147483648 ...
Limit the number of returned rows
When you access items of a SharePoint list, you can use the SPQuery.RowLimit property to limit the size of the result set.
The following is an example:
SPQuery query = new SPQuery();
query.RowLimit = 100;
SPListItemCollection items = SPContext.Current.List.GetItems(query);
for (int itemIx=0;itemIx<items.Count;itemIx++) {
SPListItem listItem = items[itemIx];
}
Passing this SPQuery object to the SPList.GetItems method produces a query with the following SELECT clause.
In the previous example we limited the number of items that are fetched, but we still requested all columns defined on the SharePoint list. That is fine if you really want to display all columns to the end user, or if you need all of them for a calculation. In most cases, however, we only need a handful of columns rather than all of them.
Restrict the retrieved columns
There are two ways to restrict the columns that are retrieved from the database:
- Use a SharePoint view: SPView
- Use the SPQuery.ViewFields property
The sample code above can therefore be modified in either of the following two ways:
SPQuery query = new SPQuery(SPContext.Current.CurrentView.View);
Or
SPQuery query = new SPQuery();
query.ViewFields = "<FieldRef Name=\"Title\" />"; // only retrieve the Title column
In both cases the SELECT clause only contains the fields that are defined in the SharePoint view or referenced in the ViewFields property, respectively:
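Putting the pieces of this case together, a minimal sketch (assuming, as in the earlier snippets, that only the Title column is needed) could look like this:
SPQuery query = new SPQuery();
query.RowLimit = 100;                             // only fetch the first 100 items
query.ViewFields = "<FieldRef Name=\"Title\" />"; // only fetch the Title column
SPListItemCollection items = SPContext.Current.List.GetItems(query);
foreach (SPListItem listItem in items)
{
htmlWriter.Write(listItem["Title"]);
}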
Case 4: Use SPQuery to page through SharePoint list items
A SharePoint list can contain thousands of items. We have all heard that, for good list performance, a list should not grow beyond roughly 2000 items, and performance does indeed suffer once that limit is exceeded. There are ways to work around the limit, for example by using indexed columns and views.
Beyond that, how the list data is accessed matters just as much. As the previous cases showed, accessing only the data you actually need greatly reduces the load on the SharePoint content database. The SharePoint object model also provides some additional features for accessing list items.
Data paging is one of these techniques; we already know it from data grids in rich client and web applications. Paging lets end users navigate the data conveniently and, when implemented correctly, reduces the load on the underlying database.
The SPQuery object provides the ListItemCollectionPosition property, which lets you set the starting position of a query page, while RowLimit defines how many items are fetched per page. Let's look at some sample code:
SPQuery query = new SPQuery();
query.RowLimit = 10; // that is our page size
do
{
SPListItemCollection items = SPContext.Current.List.GetItems(query);
// do something with the page result
// set the position cursor for the next iteration
query.ListItemCollectionPosition = items.ListItemCollectionPosition;
} while (query.ListItemCollectionPosition != null);
SPList.GetItems executes this query, and each call to GetItems returns only 10 items. The ListItemCollectionPosition property exposed by SPListItemCollection acts like a cursor on the SharePoint list and defines the starting point of the next page when iterating page by page. The following shows the database activity:
A closer look at the SQL statement shows that it combines SELECT TOP with a WHERE clause to fetch the items of a single page:
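In a web part you typically render only one page per request instead of looping over all pages at once. A minimal sketch of that pattern, assuming a hypothetical pagingInfo string that is round-tripped between requests (for example via the query string or view state):
// Render a single page of 10 items; pagingInfo is null or empty for the first page.
SPQuery query = new SPQuery();
query.RowLimit = 10;
if (!String.IsNullOrEmpty(pagingInfo))
{
query.ListItemCollectionPosition = new SPListItemCollectionPosition(pagingInfo);
}
SPListItemCollection items = SPContext.Current.List.GetItems(query);
foreach (SPListItem listItem in items)
{
htmlWriter.Write(listItem["Title"]);
}
// Remember the position of the next page (null when this was the last page),
// e.g. in view state or a "next page" link, for the following request.
string nextPagingInfo =
(items.ListItemCollectionPosition != null) ? items.ListItemCollectionPosition.PagingInfo : null;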
Case 5: Update a large number of SharePoint list items
The previous cases focused on reading items stored in SharePoint lists. Now let's look at better ways to update or add items. The SharePoint object model offers several interfaces here, so there are multiple approaches to choose from.
The most obvious way to add or update items in a SharePoint list is SPListItem.Update. You obtain an SPListItem either by querying an existing item or by adding a new one with SPListItemCollection.Add.
Let's take a look at the following example:
for (int itemIx=0;itemIx<100;itemIx++) {
SPListItem newItem = items.Add();
// fill all the individual fields with values
newItem.Update();
}
Analyzing this code shows that every call to the Update method ends up in the internal method SPListItem.AddOrUpdateItem, which in turn calls a stored procedure to do the actual work:
We can see that adding 100 items to my list this way takes 4.5 seconds.
Replace a single update with batch update
If you have to update a large number of items, we strongly recommend not calling the Update method on each item individually. Instead, use the batch update function ProcessBatchData provided by SPWeb.
ProcessBatchData executes batch methods defined in an XML format. There is a good article explaining how to use batch updates. Using batch updates, the example above can be implemented as follows:
StringBuilder query = new StringBuilder();
for (int itemIx=0;itemIx<100;itemIx++) {
// listGuid is the GUID of the target list and someValue the value for the
// Title field; both are assumed to be defined elsewhere.
query.AppendFormat("<Method ID=\"{0}\">" +
"<SetList>{1}</SetList>" +
"<SetVar Name=\"ID\">New</SetVar>" +
"<SetVar Name=\"Cmd\">Save</SetVar>" +
"<SetVar Name=\"{3}Title\">{2}</SetVar>" +
"</Method>", itemIx, listGuid, someValue, "urn:schemas-microsoft-com:office:office#");
}
SPContext.Current.Web.ProcessBatchData(
String.Format("<?xml version=\"1.0\" encoding=\"UTF-8\"?>" +
"<ows:Batch OnError=\"Return\">{0}</ows:Batch>", query.ToString()));
Adding the same 100 items through ProcessBatchData and analyzing the internals shows how much time the update now takes:
Comparing the two update methods shows that batch updates give us a huge performance improvement:
Note:
Batch updates are only recommended when a large number of updates have to be executed, and you should consider the cost of creating the batch update XML:
- Make sure to use a StringBuilder instead of concatenating individual string objects.
- Split the work into several ProcessBatchData calls so that the generated XML stays small enough to avoid out-of-memory exceptions. When I ran the example above with 50,000 batch updates, I ran into an OutOfMemoryException (see the sketch below).
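A minimal sketch of such chunking, assuming a hypothetical BuildMethodXml helper that produces one Method element per item in the same format as the example above, a totalItems count, and a chunk size chosen purely for illustration:
const int chunkSize = 1000;   // illustrative value; tune it for your environment
for (int start = 0; start < totalItems; start += chunkSize)
{
StringBuilder query = new StringBuilder();
int end = Math.Min(start + chunkSize, totalItems);
for (int itemIx = start; itemIx < end; itemIx++)
{
// BuildMethodXml is a hypothetical helper returning one <Method> element
query.Append(BuildMethodXml(itemIx, listGuid, someValue));
}
SPContext.Current.Web.ProcessBatchData(
String.Format("<?xml version=\"1.0\" encoding=\"UTF-8\"?>" +
"<ows:Batch OnError=\"Return\">{0}</ows:Batch>", query.ToString()));
}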
Case 6: Which are my slowest lists, how are they used, and why are they slow?
We know that SharePoint list performance degrades as the number of stored items grows, and that it also depends on how the list is filtered when it is displayed. You can find many articles and blog posts about the limit of 2000 items per list.
To make the right decisions for good performance, you first need to understand how your lists are currently used and where the performance problems are.
There are several ways to gather access statistics for your SharePoint application. You can analyze the IIS log files or use the SharePoint usage reporting feature.
The simplest way to monitor list performance is to analyze the HTTP response times of the URLs of each SharePoint list and view. A SharePoint URL has a format similar to http://servername/site/{listname}/{viewname}.aspx.
For the analysis we can group requests by these two tokens. I use the business transaction feature of dynaTrace to group the captured PurePaths based on regular expressions.
The results show us which lists and views are most frequently used and how they behave.
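As a rough sketch of that grouping idea applied to plain IIS log data (requestUrl is a hypothetical string taken from a log line, and the pattern assumes the URL format described above):
using System.Text.RegularExpressions;

// Group request URLs by list name and view name.
Regex listViewPattern = new Regex(@"/(?<list>[^/]+)/(?<view>[^/]+)\.aspx$", RegexOptions.IgnoreCase);
Match match = listViewPattern.Match(requestUrl);
if (match.Success)
{
string key = match.Groups["list"].Value + "/" + match.Groups["view"].Value;
// aggregate hit counts and response times per key ...
}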
Analyzing HTTP requests, however, only gives accurate numbers for pages that display a single list or view. Beyond that, we can analyze list usage by custom web parts or custom pages, which tend to access more than a single list or view and are also likely to access lists with specific filters applied.
We can also analyze the interaction with the SharePoint object model, for example the use of SPRequest.RenderViewAsHtml (used to render lists and views) and the accesses to SPList and SPView. The following shows usage and performance metrics of SPRequest method calls:
The figure above shows the internal GUIDs of the lists; every list and view is uniquely identified by a GUID. One way to find the real list name is to paste the GUID into the URL that opens the list and view settings. Here is an example:
http://servername/_layouts/listedit.aspx?List={GUID} (the GUID must be URL-encoded)
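A small sketch of building that URL in code (serverUrl is a placeholder for your site URL; Guid.ToString("B") produces the braced GUID, which is then URL-encoded):
// list is an SPList; HttpUtility comes from System.Web.
string listEditUrl = serverUrl + "/_layouts/listedit.aspx?List=" +
HttpUtility.UrlEncode(list.ID.ToString("B"));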
Another way is to open the content database and query the AllLists table, which contains the GUIDs and the list names.
Why is the list slow?
Now that we know which lists and views are accessed most often, we can focus on those that degrade performance. To improve the end-user experience we should concentrate on the most frequently accessed lists rather than those that are used only occasionally.
A list can be slow for many reasons:
- The list view displays too many items.
- The list stores too many items without filtered or indexed columns.
- Custom web parts access the list data inefficiently.
Conclusion
The SharePoint object model provides an easy and flexible way to extend SharePoint applications. The framework offers different mechanisms to access and modify data stored in SharePoint lists, but not every possible approach is advisable for every use case. Understanding what the SharePoint object model does internally helps us build SharePoint applications that run faster and scale better.
About the author
Andreas Grabner works as a technology strategist at dynaTrace Software. As part of the R&D team he influences dynaTrace product decisions and works closely with key customers to implement performance management solutions across the application lifecycle. He has 10 years of architecture and development experience in the Java and .NET space.