When writing Web-based applications that deliver dynamic information to users, you often observe that many users visit a particular page while the dynamic information on it rarely changes.
If dynamically generated Web pages are frequently requested and consume a large amount of system resources to build, how can you improve their response time?
The following factors affect the system in this context and must be reconciled when considering a solution to the problem:
1. Generating dynamic Web pages consumes a wide variety of system resources. When a Web server receives a page request, it typically must retrieve the requested information from an external data source, such as a database or Web service. Access to these sources usually goes through a limited pool of resources, such as database connections, sockets, or file descriptors. Because Web servers handle many concurrent requests, contention for these shared resources can delay page requests until resources become available. Even after the external data source returns a result, that result must still be converted to HTML for display.
2. An obvious way to make the system faster is to buy more hardware. This approach is tempting because hardware is cheap (or so the vendors say) and the program does not have to change. On the other hand, adding hardware improves performance only up to a physical limit. Network constraints, such as data transfer rates and latency, make these physical limits even more noticeable.
3. The second way to make the system faster is to reduce the work the system must do. This approach demands more effort from developers, but it can improve performance dramatically. The following sections explore the challenges it poses.
For frequently accessed, resource-intensive pages, caching (page caching and data caching) is the typical remedy. The following issues must be considered when applying it:
1. How long to cache
2. Whether to cache the full page or only parts of it
3. The variable factors of a page, such as parameters, language, and device, that cause its content to differ between requests
Use page caching for dynamic Web pages that are accessed frequently but change infrequently.
The basic structure of the page cache is relatively simple: the Web server maintains a local data store that contains pre-generated pages (see Figure 1).
Figure 1: Basic configuration of the page cache
The following sequence diagrams illustrate why page caching improves performance. The first (see Figure 2) shows the initial case in which the desired page is not yet cached (a so-called cache miss). Here the Web server must access the database, generate the HTML page, store it in the cache, and then return it to the client browser. Note that this is slightly slower than not caching at all, because of the additional steps of checking the cache and storing the generated page.
Neither of these steps should take long compared to database access and HTML generation. Still, because a miss incurs this extra work, each cached page must be hit at least once after the miss that created it for the cache to pay off.
Figure 2: Sequence of Cache misses (when the page is not in the cache)
In the cache-hit scenario shown in Figure 3, the page is already in the cache. A cache hit saves cycles by skipping database access, page generation, and page storage.
Figure 3: Sequence of Cache Hits (when the page is in the cache)
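The miss-then-hit sequence in Figures 2 and 3 can be sketched in a few lines. This is a language-neutral illustration in Python, not ASP.NET code; the names (`generate_page`, `get_page`) and the in-memory dict are illustrative stand-ins for the real page generator and cache store:

```python
# Counter only exists to make the miss/hit behavior visible.
generation_count = 0

def generate_page(url):
    """Stand-in for the expensive steps: query the database, render HTML."""
    global generation_count
    generation_count += 1
    return f"<html><body>content for {url}</body></html>"

cache = {}  # local data store of pre-generated pages, keyed by URL

def get_page(url):
    if url in cache:            # cache hit: skip database access and rendering
        return cache[url]
    page = generate_page(url)   # cache miss: do the expensive work once...
    cache[url] = page           # ...store the result in the cache...
    return page                 # ...and return it to the client

first = get_page("/weather")    # miss: page is generated and stored
second = get_page("/weather")   # hit: same page served from the cache
```

The first request pays the extra lookup-and-store cost of a miss; every later request for the same URL skips generation entirely.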
Caching strategy is a broad topic that cannot be covered in full within a single pattern. It is still worth discussing the most important considerations when implementing a solution that includes Page Cache.
The Page Cache solution involves the following key mechanisms:
1. Page (or page fragment) storage
2. Page lookup
3. Cache refresh (expiration)
The following paragraphs discuss these mechanisms in turn.
The page cache must store pre-generated pages so that the system can retrieve them quickly. You also want to store as many pages as possible to increase the chance of cache hits. Storage choices trade off speed, size, and cost: a smaller cache can reside in memory and be very fast, while a larger disk-based cache offers plenty of room but is slower.
To find the best balance of speed and size, you must carefully determine which pages are cached. Ideally, you should cache only pages that are frequently accessed and ignore pages that are seldom used.
The next important decision is how large the cached fragments should be. Storing the full page makes a hit very fast, because the system retrieves the page from the cache and sends it to the client immediately, without further work. However, if some parts of a page change frequently while others do not, storing full pages can waste a great deal of storage. Storing smaller fragments increases the chance of cache hits, but adds bookkeeping overhead and the CPU cost of assembling pages from fragments.
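As a rough illustration of this trade-off, the sketch below (hypothetical names; the counters exist only to make the behavior observable) caches a stable header fragment while regenerating a volatile fragment on every request, so each page view pays the assembly cost but reuses the stable part:

```python
header_renders = 0
quote_renders = 0

def render_header():
    """Stable fragment: worth caching."""
    global header_renders
    header_renders += 1
    return "<header>Site banner</header>"

def render_quotes():
    """Volatile fragment: must stay fresh, so it is not cached."""
    global quote_renders
    quote_renders += 1
    return f"<div>quote update #{quote_renders}</div>"

fragment_cache = {}

def get_fragment(key, renderer, cacheable):
    if not cacheable:
        return renderer()
    if key not in fragment_cache:
        fragment_cache[key] = renderer()
    return fragment_cache[key]

def get_page():
    # Assembly happens on every request (extra CPU), but the
    # stable header is rendered only once.
    return (get_fragment("header", render_header, cacheable=True)
            + get_fragment("quotes", render_quotes, cacheable=False))

get_page()
get_page()
```

After two page views, the header has been rendered once and the quotes twice, which is exactly the partial-caching effect full-page storage cannot provide.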
It is also important to consider how the system finds pages in the cache. The simplest lookup uses the URL: if the page does not depend on any other factors, it can be retrieved simply by comparing the requested URL with the URLs of the stored pages. However, this situation is rare. Almost all dynamic pages are built from parameters, so the system may have to store multiple instances of a page, one per parameter combination. Vary-By-Parameter Caching implements this kind of cache, where page content depends on parameters.
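One common way to implement such parameter-dependent lookup is to fold only the parameters the page actually varies by into the cache key. A minimal sketch, with an assumed `cache_key` helper (not part of any real framework API):

```python
def cache_key(url, params, vary_by):
    """Build a cache key from the URL plus only the parameters the page
    varies by. Sorting makes the key order-independent, so
    ?zip=10001&lang=en and ?lang=en&zip=10001 map to the same entry."""
    relevant = sorted((k, v) for k, v in params.items() if k in vary_by)
    return url + "?" + "&".join(f"{k}={v}" for k, v in relevant)

# Same ZIP code -> same cache entry, even though the session differs
# and the parameters arrive in a different order.
k1 = cache_key("/weather", {"zip": "10001", "session": "abc"}, vary_by={"zip"})
k2 = cache_key("/weather", {"session": "xyz", "zip": "10001"}, vary_by={"zip"})
# Different ZIP code -> a separate cached instance of the page.
k3 = cache_key("/weather", {"zip": "94103"}, vary_by={"zip"})
```

Ignoring irrelevant parameters (like the session ID here) keeps the number of stored page instances down while still serving each ZIP code its own page.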
It is also important to consider how long the system keeps items in the cache. Caching pages for a fixed time is the simplest method, but it is not always sufficient; you can address this by tying cache invalidation to external events. Some cache policies also pre-generate pages during low-traffic periods. This can be very effective if traffic patterns are predictable and pages can be kept long enough to avoid refreshes during peak traffic.
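The fixed-duration policy can be sketched as cache entries stamped with an absolute expiration time. All names here are illustrative; the `now` parameter is passed explicitly only to make the behavior easy to demonstrate:

```python
import time

# cache maps URL -> (page, absolute expiration time)
cache = {}

def get_page(url, generate, ttl=60, now=None):
    now = time.time() if now is None else now
    entry = cache.get(url)
    if entry is not None and now < entry[1]:  # hit: still within its lifetime
        return entry[0]
    page = generate(url)                      # miss or expired: rebuild the page
    cache[url] = (page, now + ttl)            # valid until a fixed point in time
    return page

builds = []
def build(url):
    builds.append(url)
    return f"<html>forecast for {url}</html>"

get_page("/weather", build, ttl=60, now=0)    # miss: page built and cached
get_page("/weather", build, ttl=60, now=30)   # hit: served from the cache
get_page("/weather", build, ttl=60, now=61)   # expired: page rebuilt
```

Three requests cause only two page builds: the request at second 30 lands inside the 60-second lifetime, while the one at second 61 does not.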
Page Cache has the following advantages and disadvantages:
Advantages:
1. It saves the CPU cycles required to generate pages. With a large number of concurrent users, this shortens response times and increases the scalability of the Web server.
2. It eliminates unnecessary round trips to the database or other external data sources. This is particularly important because such sources typically allow only a limited number of concurrent connections, which all concurrent page requests must share from a resource pool. Frequent access to external data sources can quickly grind the Web server to a halt through resource contention.
3. It frees client connections sooner. Each concurrent connection from a client browser to the Web server consumes limited resources; the longer a page request takes to process, the longer those connection resources are held.
4. It supports concurrent access by many page requests. Because the page cache is primarily a read-only resource, multithreaded access is fairly easy to handle, avoiding the resource contention that occurs when the system reaches out to an external data source. The only part that must be synchronized is the cache update, which is why the update frequency is the most critical factor for good performance.
5. It improves the availability of the application. If the system must access an external data source to generate a page, it depends on that source being available. With a page cache, the system can deliver a cached page to the client even when the external source is down; the data may not be up to date, but it is probably better than no data at all.
Disadvantages:
1. The information displayed may be out of date. If the cache refresh mechanism is configured incorrectly, the Web site may display stale data, which can be misleading or even harmful.
2. It consumes CPU and storage (RAM or disk) resources. Caching pages that are rarely viewed, or setting a refresh interval that is too short, increases overhead and can actually lower server performance. As with all performance measures, base the settings on actual measurements and a thorough analysis of performance indicators. The downsides of a hasty decision, such as caching every page, can outweigh the benefits.
3. It increases the complexity of the system and makes it harder to test and debug. In most cases, you should develop and test the application without caching and then enable caching during the performance-tuning phase.
4. It requires attention to additional security issues, which are often overlooked with caching. When a Web server processes concurrent requests for confidential information from multiple users, it is important that these requests never get mixed up. Because the page cache is a global entity, a misconfigured page cache may deliver a page originally generated for one user to another user's browser.
5. It can produce inconsistent response times for dynamic pages. Although delivering a page quickly 99% of the time is certainly better than delivering it slowly every time, a cache policy tuned only for hit ratio, with poor cache-miss performance, can cause occasional, unpredictable slow responses or timeouts.
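The security risk in point 4 above is usually avoided by either refusing to cache private pages or making the user identity part of the cache key, so one user's cached page can never be served to another. A minimal, purely illustrative sketch of that key rule (not any real framework's API):

```python
def key_for(url, user, private):
    """Cache-key rule: pages with per-user confidential content get the
    user identity folded into the key; public pages share one entry."""
    return (url, user) if private else (url, None)

# Private page: alice and bob get distinct cache entries.
alice_inbox = key_for("/inbox", "alice", private=True)
bob_inbox = key_for("/inbox", "bob", private=True)

# Public page: everyone shares the same cached copy.
alice_news = key_for("/news", "alice", private=False)
bob_news = key_for("/news", "bob", private=False)
```

Keying private pages per user trades away most of the cache's benefit for those pages, which is why many systems simply mark such responses as non-cacheable instead.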
Implementing Page Cache in ASP.NET Using Absolute Expiration
You are building a Web application in ASP.NET and want to cache pages to improve performance. You have evaluated the alternative approaches presented in Page Cache.
Page caching increases request-response throughput by caching the content generated by dynamic Web pages. Page caching is enabled by default in ASP.NET, but the output of any given response is not cached unless a valid expiration policy is defined. You can define an expiration policy either through the low-level OutputCache API or through the higher-level @OutputCache directive.
When page caching is enabled, the first GET request to a page creates a page cache entry. That entry is then used to answer subsequent GET or HEAD requests until the cached response expires.
The page cache honors the page's expiration policy. If a page is cached with an expiration policy of 60 seconds, it is removed from the output cache after 60 seconds. If another request arrives after that time, the page code executes again and the cache is refreshed. This policy is called "absolute expiration": the page is valid until a specific point in time.
The following example shows how a response is cached with the @OutputCache directive:
<%@ OutputCache Duration="60" VaryByParam="None" %>
The example page displays the time at which the response was generated. To see the output cache in action, request the page and note the generation time in the response. Then refresh the page: the time does not change, which means the second response was served from the cache.
The following line of code activates page caching for the response:
<%@ OutputCache Duration="60" VaryByParam="None" %>
This directive states that the page should be cached for 60 seconds and that the cached page does not vary with any GET or POST parameters. Requests received within the first 60 seconds are served from the cache. After 60 seconds, the page is removed from the cache, and the next request caches the page again.
Using absolute expiration with the ASP.NET page cache has the following advantages and disadvantages:
Advantages:
1. It is by far the simplest way to cache pages in ASP.NET. If you analyze the usage patterns of your Web application to determine which pages to cache, absolute expiration is often sufficient, and it is certainly a good start. Also consider how variable the dynamic content on a page is. For example, a weather page might use an expiration policy of 60 minutes, because the weather does not change that quickly, whereas a page displaying stock quotes might not be cached at all. To choose the right expiration time, you must know which pages are viewed most frequently and understand how variable their data is.
2. You can set different expiration policies for different pages. That way you cache only frequently accessed pages, without wasting cache space on pages that are rarely requested, and you can refresh pages whose data changes more often than others'.
Disadvantages:
1. Dynamic content on a cached page can become stale, because expiration is based on time, not on content. In the example described earlier, the page displays the time down to the second; because the page is rebuilt only every 60 seconds, the seconds field is stale for all but the instant the page was built. In this example the stale portion is tiny, but if you display time-sensitive, accuracy-critical financial quotes, consider a caching strategy that guarantees stale data is never shown.
2. This policy does not account for parameters passed to Web pages. Dynamic pages are often driven by parameters; a weather page, for example, might take a ZIP code as a parameter. Unless you create a separate page and URL for each of thousands of ZIP codes (for example, roughly 42,000 ZIP codes in the United States), you cannot cache such a page with absolute expiration alone. Vary-By-Parameter Caching solves this problem.
3. Absolute expiration applies only when the entire page stays unchanged. In many applications, most of a Web page changes infrequently (which makes it an excellent candidate for caching) but is combined with parts that change often (which should not be cached). Because absolute expiration caches only whole pages, it cannot exploit such partial stability. In these cases, Page Fragment Caching, which caches parts of a page, may be a better choice. HTML frames offer another way to simulate page fragments, but frames have known browser problems, such as navigation and printing issues.
4. Cached pages cannot be refreshed on demand. A page remains in the cache until it expires or the server is restarted, which makes testing awkward. It is also a problem when data changes rarely but, when it does change, the update must not be delayed. For example, updating the weather forecast every two hours may usually be sufficient; if a hurricane is approaching, however, you do not want to wait two hours to update the forecast.
5. You must modify the code in each page to change its expiration policy. Because the policy can only be changed in code, there is no mechanism to turn caching off for the entire application.
6. Storing Web pages in the cache consumes space on the server. The small page in the earlier example needs little space, but as page content grows and more pages are cached, the Web server must provide correspondingly more storage.
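The refresh limitation in point 4 above is what explicit invalidation addresses: removing an entry before its absolute expiration when an external event, such as the approaching hurricane, demands it. The sketch below is generic, not an ASP.NET API; the names and the two-hour lifetime are illustrative:

```python
# cache maps URL -> (page, absolute expiration time)
cache = {}

def put(url, page, now, ttl=7200):
    """Cache a page for a fixed lifetime (default two hours)."""
    cache[url] = (page, now + ttl)

def get(url, now):
    """Return the cached page if present and not expired, else None."""
    entry = cache.get(url)
    if entry is not None and now < entry[1]:
        return entry[0]
    return None

def invalidate(url):
    """Event-driven removal: drop the entry now instead of waiting
    for the absolute expiration to arrive."""
    cache.pop(url, None)

put("/forecast", "<html>sunny</html>", now=0)
invalidate("/forecast")  # hurricane warning: do not wait two hours
```

After the explicit invalidation, the next request misses the cache and rebuilds the page with fresh data, even though the two-hour lifetime had not elapsed.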
The following pattern describes another implementation of Page Cache:
Sliding Expiration Caching
For related page cache design and implementation policies, see the following patterns:
Page Data Caching
Page Fragment Caching
Source: Microsoft .NET design patterns, Chapter 3, Web Presentation Patterns, pattern cluster detail: Page Cache (page caching).