Writing Web applications with ASP.NET is unbelievably easy. It is so easy, in fact, that many developers don't take the time to structure their applications for good performance. In this article, I'm going to present 10 tips for writing high-performance Web applications. I'm not limiting my comments to ASP.NET applications, because they are just one subset of Web applications. This article is not intended to be the definitive guide to performance-tuning Web applications; an entire book could not easily cover that. Consider this article a good place to start.
Before becoming a workaholic, I used to do a lot of rock climbing. Prior to any big climb, I would review the route in the guidebook and read the recommendations made by people who had visited before. But no matter how good the guidebook is, you need real climbing experience before attempting a particularly challenging climb. Similarly, you only really learn how to write high-performance Web applications when you are faced with fixing performance problems or running a high-throughput site.
My personal experience comes from having worked as an infrastructure program manager on the ASP.NET team at Microsoft, during which I ran and managed www.asp.net and helped architect Community Server, which is the next version of several well-known ASP.NET applications (ASP.NET Forums, .Text, and nGallery combined into one platform). I'm sure that some of the techniques that have helped me will help you as well.
You should think about separating your application into several logical tiers. You may have heard of the term 3-tier (or n-tier) physical architecture. These are usually prescribed architectural patterns that physically divide functionality across processes and/or hardware. When the system needs to scale, more hardware can easily be added. However, there is a performance cost associated with process and machine hops, so they should be avoided. So, whenever possible, run the ASP.NET pages and their associated components together in the same application.
Because of the separation of code and the boundaries between tiers, using Web services or remoting will decrease performance by 20 percent or more.
The data tier is a bit of a different beast, since it is usually better to have hardware dedicated to the database. However, the cost of a process hop to the database is still high, so performance on the data tier is the first place to look when optimizing your code.
Before diving into fixing performance problems in your application, make sure you profile the application to find out exactly where the problems lie. Key performance counters, such as the one that indicates the percentage of time spent performing garbage collections, are also very useful for finding out where an application is spending the majority of its time. Yet the places where time is spent are often quite unintuitive.
This article describes two types of performance improvements: large optimizations, such as using the ASP.NET cache, and small optimizations that repeat themselves. These small optimizations are sometimes the most interesting: a small change to code that is called thousands and thousands of times adds up. With a large optimization, you might see a big jump in overall performance. With a small one, you might shave only a few milliseconds off a given request, but summed across all the requests in a day, the improvement can be significant.
Data-tier performance
When it comes to performance-tuning an application, there is a simple litmus test you can use to prioritize the work: does the code access the database? If so, how often? Note that the same test could be applied to code that uses Web services or remoting, but this article does not cover those.
If a database request is required in a particular code path and you see other areas, such as string manipulation, that you want to optimize first, stop and perform the litmus test. Unless you have a truly severe performance problem elsewhere, your time is better spent optimizing the time spent in the database, the amount of data returned, and how often you make round trips to and from the database.
With that general information in mind, let's look at ten tips that can help improve the performance of your application. I'll start with the changes that can make the biggest difference.
Tip 1-Return multiple result sets
Review your database code to see if you have request paths that go to the database more than once. Each of those round trips decreases the number of requests per second your application can serve. By returning multiple result sets in a single database request, you can cut the total time spent communicating with the database. You'll also make your system more scalable, since you reduce the work the database server has to do managing requests.
Although you can return multiple result sets using dynamic SQL, I prefer to use stored procedures. It is arguable whether business logic should reside in a stored procedure, but I believe that if logic in a stored procedure can constrain the data returned (reducing the size of the dataset, the time spent on the network, and the need to sift through the data at the logic tier), it's worth doing.
Using a SqlCommand instance and its ExecuteReader method to populate strongly typed business classes, you can move the result set pointer forward by calling NextResult. Figure 1 shows a sample of populating several ArrayLists with typed classes. Returning only the data you need from the database will additionally decrease memory allocations on your server.
Figure 1 Extracting multiple result sets from a DataReader
// read the first resultset
reader = command.ExecuteReader();

// read the data from that resultset
while (reader.Read()) {
    suppliers.Add(PopulateSupplierFromIDataReader(reader));
}

// read the next resultset
reader.NextResult();

// read the data from that second resultset
while (reader.Read()) {
    products.Add(PopulateProductFromIDataReader(reader));
}
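For context, here is a minimal sketch of how the fragment in Figure 1 might be wired up end to end. The stored procedure name, the connectionString variable, and the Populate... helpers are illustrative assumptions rather than code from the article.

// A sketch: one stored procedure call returning suppliers and products in a single round trip
// (requires System.Collections, System.Data, and System.Data.SqlClient)
ArrayList suppliers = new ArrayList();
ArrayList products = new ArrayList();

using (SqlConnection connection = new SqlConnection(connectionString))
{
    SqlCommand command = new SqlCommand("getSuppliersAndProducts", connection);
    command.CommandType = CommandType.StoredProcedure;
    connection.Open();

    using (SqlDataReader reader = command.ExecuteReader())
    {
        // first resultset: suppliers
        while (reader.Read())
            suppliers.Add(PopulateSupplierFromIDataReader(reader));

        // second resultset: products
        reader.NextResult();
        while (reader.Read())
            products.Add(PopulateProductFromIDataReader(reader));
    }
}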
Tip 2-Paging data access
The ASP.NET DataGrid exposes a wonderful capability: data paging support. When paging is enabled in the DataGrid, a fixed number of records is shown at a time, and a paging UI is displayed at the bottom of the DataGrid for navigating through the records. The paging UI allows you to move backwards and forwards through the displayed data.
However, there is a slight wrinkle. Paging with the DataGrid requires that all of the data be bound to the grid. That is, your data tier returns all of the data, and the DataGrid then filters out the records to display based on the current page. If 100,000 records are returned as you page through the DataGrid, 99,975 records are thrown away on each request (assuming a page size of 25). As the number of records grows, the performance of the application suffers, since more and more data must be sent on each request.
One good approach to writing better paging code is to use stored procedures. Figure 2 shows a sample stored procedure that pages through the Orders table in the Northwind database. In a nutshell, all you do is pass in the page index and the page size; the appropriate result set is calculated and then returned.
Figure 2 Paging through the Orders Table
CREATE PROCEDURE northwind_OrdersPaged
(
@PageIndex int,
@PageSize int
)
As
BEGIN
DECLARE @PageLowerBound int
DECLARE @PageUpperBound int
DECLARE @RowsToReturn int
--First set the row count
SET @RowsToReturn = @PageSize * (@PageIndex + 1)
SET ROWCOUNT @RowsToReturn
--Set the page bounds
SET @PageLowerBound = @PageSize * @PageIndex
SET @PageUpperBound = @PageLowerBound + @PageSize + 1
--Create a temp table to store the select results
CREATE TABLE #PageIndex
(
IndexID int IDENTITY (1, 1) not NULL,
OrderID int
)
--Insert into the temp table
INSERT into #PageIndex (OrderID)
SELECT
OrderID
From
Orders
ORDER BY
OrderID DESC
--Return Total count
SELECT COUNT (OrderID) from Orders
--Return paged results
SELECT
O.*
From
Orders O,
#PageIndex PageIndex
WHERE
O.OrderID = PageIndex.OrderID AND
PageIndex.IndexID > @PageLowerBound AND
PageIndex.IndexID < @PageUpperBound
ORDER BY
PageIndex.IndexID
END
In Community Server, we wrote a paging server control to do all of the data paging. As you'll see, I'm using the idea discussed in Tip 1: two result sets are returned from one stored procedure, the total number of records and the requested data.
The total number of records returned can vary depending on the query being executed. For example, a WHERE clause can be used to constrain the data returned. The paging logic must know the total number of records to be returned in order to calculate the total number of pages to show in the paging UI. For example, if there are 1,000,000 total records and a WHERE clause filters this down to 1,000 records, the paging logic needs to know that total in order to render the paging UI correctly.
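To make the two result sets concrete, here is a minimal sketch of calling the Figure 2 stored procedure from C#. Only the procedure name and its parameters come from Figure 2; the orders collection, the PopulateOrderFromIDataReader helper, and the page variables are illustrative assumptions.

// Calling the paging stored procedure: the first resultset is the total count, the second is the requested page
using (SqlConnection connection = new SqlConnection(connectionString))
{
    SqlCommand command = new SqlCommand("northwind_OrdersPaged", connection);
    command.CommandType = CommandType.StoredProcedure;
    command.Parameters.AddWithValue("@PageIndex", pageIndex);
    command.Parameters.AddWithValue("@PageSize", pageSize);
    connection.Open();

    using (SqlDataReader reader = command.ExecuteReader())
    {
        // total record count, used to compute how many pages the paging UI should show
        reader.Read();
        int totalRecords = reader.GetInt32(0);
        int totalPages = (totalRecords + pageSize - 1) / pageSize;

        // the single page of orders that was actually requested
        reader.NextResult();
        while (reader.Read())
            orders.Add(PopulateOrderFromIDataReader(reader));
    }
}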
Tip 3-Connection pooling
Setting up the TCP connection between your Web application and SQL Server can be an expensive operation. Developers at Microsoft have been able to take advantage of connection pooling for some time now, allowing them to reuse connections to the database. Rather than setting up a new TCP connection on each request, a new connection is set up only when one is not available in the connection pool. When the connection is closed, it is returned to the pool, where it remains connected to the database, as opposed to completely tearing down that TCP connection.
Of course you need to watch out for leaking connections. Always close your connections when you are finished with them. I repeat: no matter what anyone says about garbage collection within the Microsoft .NET Framework, always call Close or Dispose explicitly on your connection when you are finished with it. Do not trust the common language runtime (CLR) to clean up and close your connection for you at a predetermined time. The CLR will eventually destroy the class and force the connection closed, but you have no guarantee when the garbage collection on the object will actually happen.
To use connection pooling optimally, there are a couple of rules to live by. First, open the connection, do the work, and then close the connection. It's okay to open and close the connection multiple times on each request if you have to (applying Tip 1 is better) rather than keeping the connection open and passing it around through different methods. Second, use the same connection string (and the same thread identity if you are using integrated authentication). If you don't use the same connection string, for example by customizing it based on the logged-in user, you won't get the optimization that connection pooling provides. And if you use integrated authentication while impersonating a large set of users, your pooling will also be much less effective. The .NET CLR data performance counters can be very useful when attempting to track down any performance issues related to connection pooling.
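As a minimal sketch of the open-do-the-work-close rule, a C# using block guarantees the connection is returned to the pool even if an exception occurs; the query text and the connectionString variable are illustrative assumptions.

// Open late, close early: Dispose returns the connection to the pool rather than tearing it down
using (SqlConnection connection = new SqlConnection(connectionString))
using (SqlCommand command = new SqlCommand("SELECT COUNT(*) FROM Orders", connection))
{
    connection.Open();
    int orderCount = (int)command.ExecuteScalar();
    // use orderCount here; do not hold the open connection or pass it around between methods
}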
Whenever your application connects to a resource running in another process, such as a database, you should optimize by focusing on the time spent connecting to the resource, the time spent sending or retrieving data, and the number of round trips. Optimizing any kind of process hop in your application is the first place to start to achieve better performance.
The application tier contains the logic that connects to your data tier and transforms the data into meaningful class instances and business processes. In Community Server, for example, this is where you populate a Forums or Threads collection, apply business rules such as permissions, and, most importantly, where the caching logic is performed.
Tip 4-ASP.NET Cache API
One of the very first things you should do before writing a line of application code is architect the application tier to maximize and exploit the ASP.NET Cache feature.
If your components are designed to run in an ASP.NET application, simply include a reference to System.Web.dll in your application project. When you need access to the cache, use the HttpRuntime.Cache property (the same object is also accessible through Page.Cache and HttpContext.Cache).
There are several rules for caching data. First, if the data can be used more than once, it's a good candidate for caching. Second, if the data is general rather than specific to a given request or user, it's also a good candidate. If the data is user- or request-specific but is long-lived, it can still be cached, though it may be used less frequently. Third, an often overlooked rule is that you can sometimes cache too much. Generally, on an x86 machine, you want to run a process with no more than 800MB of private bytes in order to reduce the chance of an out-of-memory error, so caching should have an upper bound. In other words, you may be able to reuse the result of a computation, but if that computation takes 10 parameters, you might attempt to cache on 10 permutations, which may get you into trouble. One of the most common support calls for ASP.NET is out-of-memory errors caused by overcaching, especially of large datasets.
There are a few great features of the cache that you need to know about. First, the cache implements a least-recently-used algorithm, allowing ASP.NET to force a cache purge, automatically removing unused items from the cache, if memory is running low. Second, the cache supports expiration dependencies that can force invalidation; these include time, keys, and files. Time is often used, but with ASP.NET 2.0 a new and more powerful invalidation type is being introduced: database cache invalidation. This means items in the cache are automatically removed when data in the database changes. For more information on database cache invalidation, see Dino Esposito's Cutting Edge column in the July 2004 issue of MSDN Magazine.
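A minimal sketch of the usual check-then-populate pattern with HttpRuntime.Cache, using a five-minute absolute expiration; the cache key and the LoadSuppliersFromDatabase helper are illustrative assumptions.

// Look in the ASP.NET cache first; on a miss, load the data and cache it for five minutes
// (requires System, System.Collections, System.Web, and System.Web.Caching)
ArrayList suppliers = HttpRuntime.Cache["Suppliers"] as ArrayList;
if (suppliers == null)
{
    suppliers = LoadSuppliersFromDatabase();   // hypothetical data-tier call
    HttpRuntime.Cache.Insert(
        "Suppliers",                           // cache key
        suppliers,                             // data to cache
        null,                                  // no CacheDependency
        DateTime.Now.AddMinutes(5),            // absolute expiration
        Cache.NoSlidingExpiration);            // no sliding expiration
}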
Tip 5-Per-request caching
Earlier in the article, I mentioned that small improvements to frequently traversed code paths can lead to big overall performance gains. One of these small improvements is definitely my favorite, and I call it per-request caching.
Whereas the Cache API is designed to cache data for a long period of time, or until some condition is met, per-request caching simply means caching the data only for the duration of the request. A particular code path is accessed frequently on each request, but the data only needs to be fetched, applied, modified, or updated once. This sounds fairly theoretical, so let's consider a concrete example.
In the Community Server forums application, each server control used on a page requires personalization data to determine which skin to use, which style sheet to use, and other personalization data. Some of this data can be cached for a long time, but some of it, such as the skin used for the controls, is fetched once on each request and then reused several times during the execution of that request.
To accomplish per-request caching, use the ASP.NET HttpContext. An HttpContext instance is created with every request and is accessible anywhere during that request through the HttpContext.Current property. The HttpContext class has a special Items collection property; objects and data added to this Items collection are cached only for the duration of the request. Just as you can use the Cache to store frequently accessed data, you can use HttpContext.Items to store data that you'll use only on a per-request basis. The logic behind it is simple: the data is added to the HttpContext.Items collection when it doesn't exist, and on subsequent lookups the data found in HttpContext.Items is simply returned.
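Here is a minimal sketch of that get-or-add pattern; the item key and the GetUserSkin data-tier call are illustrative assumptions.

// Per-request caching with HttpContext.Items (requires System.Web)
public static string GetCurrentUserSkin()
{
    const string key = "UserSkin";
    HttpContext context = HttpContext.Current;

    // if an earlier control on this page already fetched the skin, reuse it
    string skin = context.Items[key] as string;
    if (skin == null)
    {
        skin = GetUserSkin(context.User.Identity.Name);  // hypothetical data-tier call
        context.Items[key] = skin;                       // lives only for the duration of this request
    }
    return skin;
}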
Tip 6-Background processing
The path through your code should be as fast as possible, right? There may be times when you find that a task performed on each request, or once every n requests, requires considerable resources. Sending out e-mails or parsing and validating incoming data are some examples.
When tearing apart ASP.NET Forums 1.0 and rebuilding what became Community Server, we found that the code path for adding a new post was painfully slow. Each time a new post was added, the application first needed to ensure that there were no duplicate posts, then it had to parse the post using a bad-word filter, parse the post for emoticons, tokenize and index the post, add the post to the appropriate queues when requested, validate any attachments, and, finally, immediately send e-mail notifications to all subscribers. Clearly, that is a lot of work.
It turned out that most of the time was spent in the indexing logic and in sending the e-mails. Indexing a post was a time-consuming operation, and it was discovered that the built-in System.Web.Mail functionality would connect to an SMTP server and send the e-mails serially. As the number of subscribers to a particular post or topic area increased, the AddPost function took longer and longer to run.
Indexing and sending e-mail don't need to be performed on each request. Ideally, we wanted to batch this work together, indexing 25 posts at a time or sending all the e-mails once every five minutes. We decided to reuse code I had previously written to prototype database cache invalidation, the work that eventually made it into Visual Studio 2005.
The Timer class in the System.Threading namespace is wonderfully useful, but not very well known in the .NET Framework, at least among Web developers. Once created, the Timer will invoke the specified callback on a thread from the ThreadPool at a configurable interval. This means you can set up code to execute without an incoming request to your ASP.NET application, an ideal situation for background processing. You can do work such as indexing or sending e-mail in this background process, too.
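Here is a minimal sketch of wiring up such a timer; ProcessQueuedWork is a hypothetical stand-in for the batched indexing and e-mail work, and Start would typically be called once, for example from Application_Start in Global.asax.

// Background processing with System.Threading.Timer (requires System and System.Threading)
public static class BackgroundWork
{
    // keep a reference to the timer so it is not garbage collected
    private static Timer timer;

    public static void Start()
    {
        // invoke the callback on a ThreadPool thread every five minutes
        timer = new Timer(OnTimer, null, TimeSpan.FromMinutes(5), TimeSpan.FromMinutes(5));
    }

    private static void OnTimer(object state)
    {
        ProcessQueuedWork();
    }

    private static void ProcessQueuedWork()
    {
        // hypothetical: index queued posts in batches, send pending e-mail notifications
    }
}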
There are a couple of problems with this technique, though. If your application domain unloads, the timer instance will stop firing its events. In addition, since the CLR has a hard gate on the number of threads per process, you can get into a situation on a heavily loaded server where timers may not have threads on which to run and may be somewhat delayed. ASP.NET tries to minimize the chances of this happening by reserving a certain number of free threads in the process and using only a portion of the total threads for request processing. However, if you have lots of asynchronous work, this can be an issue.
There is not enough room here to walk through the full code, but you can download a digestible sample from www.rob-howard.net. Check out the slides and demos from my Blackbelt TechEd 2004 presentation.
Tip 7-Page output caching and proxy servers
ASP.NET is your presentation tier (or it should be); it consists of pages, user controls, server controls (HttpHandlers and HttpModules), and the content that they generate. If you have an ASP.NET page that generates output, whether HTML, XML, images, or any other data, and this code generates the same output for each request, you have a great candidate for page output caching.
Add this line to the top of the page: <%@ OutputCache Duration="60" VaryByParam="None" %>
This effectively generates the output for the page once and then reuses it multiple times for up to 60 seconds, at which point the page re-executes and the output is once again added to the ASP.NET cache. This behavior can also be accomplished using some lower-level programmatic APIs. There are several configurable settings for the output cache, such as the VaryByParam attribute just described. VaryByParam is required, and it lets you specify HTTP GET or HTTP POST parameters that vary the cache entries. For example, default.aspx?Report=1 or default.aspx?Report=2 could be output-cached by simply setting VaryByParam="Report". Additional parameters can be named by specifying a semicolon-separated list.
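The lower-level programmatic APIs mentioned above are exposed through Response.Cache (the HttpCachePolicy class). The following sketch, placed in a page's code-behind (for example in Page_Load), approximates the directive shown earlier; treating it as equivalent in every detail is an assumption.

// Programmatic output caching from code-behind
Response.Cache.SetCacheability(HttpCacheability.Public);    // cacheable by clients and downstream proxies
Response.Cache.SetExpires(DateTime.Now.AddSeconds(60));     // cache for 60 seconds
Response.Cache.SetValidUntilExpires(true);                  // keep the cached copy until it expires
Response.Cache.VaryByParams["Report"] = true;               // vary the cached output by the Report parameter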
What many people don't realize is that when output caching is used, the ASP.NET page also generates a set of HTTP headers that downstream caching servers understand, such as those used by Microsoft Internet Security and Acceleration Server or by Akamai. When the HTTP cache headers are set, the documents can be cached on these network resources, and client requests can be satisfied without having to go back to the origin server.
Using page output caching, then, does not make your application more efficient, but it can reduce the load on your server, since downstream caching technology caches the documents. Of course, this can only be anonymous content; once it goes downstream, you won't see those requests anymore and can no longer perform authentication to prevent access to it.
Tip 8-Run IIS 6.0 (if only for kernel caching)
If you're not running IIS 6.0 (Windows Server 2003), you're missing out on some great performance enhancements in the Microsoft Web server. In Tip 7, I discussed output caching. In IIS 5.0, a request comes through IIS and then into ASP.NET. When caching is involved, an HttpModule in ASP.NET receives the request and returns the contents from the cache.
If you're using IIS 6.0, there is a nice little feature called kernel caching that doesn't require any code changes to ASP.NET. When a request is output-cached by ASP.NET, the IIS kernel cache receives a copy of the cached data. When a request comes in from the network driver, a kernel-level driver (with no context switch to user mode) receives the request and, if it is cached, flushes the cached data to the response and completes execution. This means that when you use kernel-mode caching with IIS and the ASP.NET output cache, you'll see unbelievable performance results. During the Visual Studio 2005 development of ASP.NET, I was the program manager responsible for ASP.NET performance. The developers did the work, but I got to see all the reports on a daily basis. The kernel-mode caching results were always the most interesting. The most common characteristic was that the network was saturated with requests and responses while IIS was running at only about five percent CPU utilization. It was amazing! There are certainly other reasons to use IIS 6.0, but kernel-mode caching is an obvious one.
Tip 9-Use gzip compression
While not necessarily a server performance tip (since you might see an increase in CPU utilization), using gzip compression can decrease the number of bytes sent by your server. This gives the perception of faster pages and also cuts down on bandwidth usage. Depending on the data sent, how well it can be compressed, and whether the client browsers support it (IIS will only send gzip-compressed content to clients that support gzip compression, such as Internet Explorer 6.0 and Firefox), your server can serve more requests per second. In fact, just about any time you can decrease the amount of data returned, you will increase requests per second.
The good news is that gzip compression is built into IIS 6.0 and it performs much better than the gzip compression used in IIS 5.0. Unfortunately, when attempting to turn on gzip compression in IIS 6.0, you may not be able to locate the setting in the IIS properties dialog. The IIS team built excellent gzip capabilities into the server but neglected to include an administrative UI for enabling it. To enable gzip compression, you have to dig into the XML configuration settings of IIS 6.0 (which isn't for the faint of heart). Incidentally, thanks go to Scott Forsyth of OrcsWeb, who helped me work through this issue on the www.asp.net servers hosted at OrcsWeb.
Rather than detailing the steps in this article, read Brad Wilson's article, IIS6 Compression. There is also a Knowledge Base article on enabling compression for ASPX, Enable ASPX Compression in IIS. You should note, however, that dynamic compression and kernel caching are mutually exclusive in IIS 6.0 due to implementation details.
Tip 10-Server control view state
View state is a fancy name for ASP.NET storing some state data in a hidden form field within the generated page. When the page is posted back to the server, the server can parse, validate, and apply this view state data back to the page's tree of controls. View state is a very powerful capability because it allows state to be persisted with the client without requiring cookies or server memory to save it. Many ASP.NET server controls use view state to persist settings made during interactions with elements on the page, for example, saving the current page that is being displayed when paging through data.
There are a number of drawbacks to the use of view state, however. First, it increases the total payload of the page, both when it is served and when it is posted back. There is also additional overhead incurred when serializing or deserializing view state data that is posted back to the server. Lastly, view state increases the memory allocations on the server.
Several server controls tend to make excessive use of view state even in cases where it is not needed, the most well known of which is the DataGrid. The default behavior is for the ViewState property to be enabled, but if you don't need it, you can turn it off at the control or page level. Within a control, you simply set the EnableViewState property to false; to set it for an entire page, use the following directive:
<%@ Page EnableViewState="false" %>
If you are not doing postbacks in a page, or are always regenerating the controls on a page on each request, you should disable view state at the page level.
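As a small sketch, view state can also be disabled for an individual control from code-behind; the DataGrid ID used here is a hypothetical assumption.

// Disable view state for a single control, for example in Page_Init
ordersGrid.EnableViewState = false;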
I've offered you some tips that I've found useful for writing high-performance ASP.NET applications. As I mentioned at the beginning of this article, this is more a preliminary guide than the last word on ASP.NET performance. (For more information on improving the performance of ASP.NET applications, see Improving ASP.NET Performance.) The best way to solve a particular performance problem can only be found through your own experience. However, these tips should give you some good guidance on your journey. In software development, there are very few absolutes; every application is unique.