"Web" yslow optimization rule (iii) Add cache control Header

Source: Internet
Author: User

Add Expires and Cache-Control headers

Expires and Cache-Control are HTTP cache control headers. They mainly affect how the client issues requests and how the server responds.

Much of this article draws on HTTP: The Definitive Guide; if you spot any problems, please point them out.

I. Basic concepts of caching

Here, cache refers specifically to a web cache. When a web request arrives at a cache that holds a local copy that has not expired, the data or document can be served locally, which makes it possible to:

1. Reduce redundant data transfer, which lowers server traffic and load.

2. Ease network bottlenecks, so pages load faster without requiring more bandwidth.

3. Reduce demand on the origin server, so it can respond faster and avoid overload during traffic peaks.

4. Reduce the latency caused by distance.

A cache can be dedicated to a single user, or it can be shared by thousands of users. A dedicated cache is called a private cache; a shared cache is called a public cache. Private caches are common: most browsers have one built in, and it shortens the load time of repeat requests by keeping frequently used documents on the disk and in the memory of the user's computer. A shared cache is implemented by a caching proxy server, which accepts requests from multiple users and serves cached content to all of them, thereby reducing redundant traffic.

For an HTTP request that involves cache control, the caching process consists of 7 steps:

1. Receive. The cache reads the incoming request message from the network.
2. Parse. The cache parses the message and extracts the URL and the HTTP request headers.
3. Lookup. The cache checks whether a local copy is available; if not, it fetches a copy and stores it locally.
4. Freshness check. The cache checks whether the cached copy is fresh enough; if not, it asks the server whether there is an update.
5. Build response. The cache builds a response message from the new headers and the cached body.
6. Send. The cache sends the response back to the client over the network.
7. Log. Optionally, the cache creates a log entry.
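
The seven steps above can be sketched as a toy in-memory cache. This is only a minimal sketch under stated assumptions: TinyCache, CachedEntry, and fetch_from_origin are hypothetical names invented for illustration, and a stale copy is simply fetched again instead of being revalidated with a conditional request.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class CachedEntry:
    body: bytes
    stored_at: float  # when the copy was stored (seconds since the epoch)
    max_age: int      # freshness lifetime in seconds


@dataclass
class TinyCache:
    # Hypothetical origin callback: takes a URL, returns (body, max_age).
    fetch_from_origin: Callable[[str], Tuple[bytes, int]]
    store: Dict[str, CachedEntry] = field(default_factory=dict)
    log: List[Tuple[str, str]] = field(default_factory=list)

    def handle(self, raw_request: str, now: float = None) -> dict:
        now = time.time() if now is None else now
        # Steps 1-2: receive the message and parse the request line.
        method, url, _version = raw_request.splitlines()[0].split()
        # Step 3: look for a local copy.
        entry = self.store.get(url)
        # Step 4: freshness check; fetch and store a copy if missing or stale.
        if entry is not None and now - entry.stored_at < entry.max_age:
            source = "cache"
        else:
            body, max_age = self.fetch_from_origin(url)
            entry = CachedEntry(body, now, max_age)
            self.store[url] = entry
            source = "origin"
        # Step 5: build the response from new headers and the cached body.
        age = int(now - entry.stored_at)
        response = {"status": 200, "headers": {"Age": str(age)}, "body": entry.body}
        # Step 7: log (done before returning, i.e. before step 6, sending).
        self.log.append((url, source))
        return response
```

Passing an explicit now value makes the freshness check easy to exercise: a request made 30 seconds after a copy with a 60-second lifetime was stored is answered from the cache, while one made 100 seconds later goes back to the origin.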

II. Cache-related headers

The cache-related headers are mainly Cache-Control, Expires, If-Modified-Since, and If-None-Match; the latter two are used mainly for client revalidation.

1. The Cache-Control and Expires headers

Cache-Control is one of the most important cache-related headers. Note that it can appear in either the request header or the response header, and its meaning differs between the two, so the first thing to establish is which one it appears in. Take the CSDN BBS home page as an example: when a browser visits bbs.csdn.net, the captured request header includes Cache-Control: max-age=0. In a request, this tells caches that the client will not accept a cached copy unless it is first revalidated with the origin server.

The corresponding response header contains max-age=0, private, and must-revalidate. Here max-age=0 means the cached copy becomes stale immediately, private means the response may only be stored in a private cache, and must-revalidate means the cache must revalidate with the origin server before using a stale copy. Looking closely at the request and response headers, we also find another cache-related header in the request, If-None-Match, while the response contains an ETag header. These two headers are explained in detail later; first, a brief explanation of cache expiration.

With the Cache-Control and Expires headers, HTTP lets the origin server attach an expiration time to each document, much like the shelf life of food. Until the cached document expires, the cache can build responses directly from its copy without contacting the server each time (unless the request contains headers that prevent this). Cache-Control: max-age=xxx specifies the lifetime of the document in seconds, while Expires specifies the time at which the document expires. The difference is that the lifetime is a maximum period of use counted from the moment the document was generated, whereas the expiration time is an absolute point in time.

Because Expires specifies an absolute date rather than a number of seconds, and many servers have unsynchronized or incorrect clocks, relying on the Expires header is not recommended. If both max-age and Expires are present in a response, max-age takes precedence and Expires is ignored.
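
The precedence of max-age over Expires can be illustrated with a small helper. This is a minimal sketch, not a complete implementation of HTTP freshness rules: freshness_lifetime is a hypothetical function name, headers is assumed to be a plain dict, and response_time is assumed to be a timezone-aware datetime for when the response was generated.

```python
import re
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers, response_time):
    """Return the freshness lifetime in seconds, or None if none is given."""
    cache_control = headers.get("Cache-Control", "")
    # max-age wins whenever it is present, even if Expires is also set.
    match = re.search(r"(?:^|[,\s])max-age=(\d+)", cache_control, re.IGNORECASE)
    if match:
        return int(match.group(1))
    expires = headers.get("Expires")
    if expires:
        try:
            expires_at = parsedate_to_datetime(expires)
        except (TypeError, ValueError):
            return 0  # an unparsable date is treated as already expired
        # Absolute expiry time minus generation time gives a lifetime.
        return max(0, int((expires_at - response_time).total_seconds()))
    return None
```

With a generation time of 07:00:00 GMT and Expires set to 07:28:00 GMT on the same day, the helper returns 1680 seconds; if Cache-Control: max-age=600 is also present, it returns 600 and never looks at Expires.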

The values of Cache-Control can be grouped by function as follows:

a. Cache type:

public (the response may be stored by shared caches, such as caching proxy servers)
private (the response may not be stored by a shared caching proxy; it may only be stored by a single user's private cache, such as a browser)

b. Cache behavior:

no-store: the cache should not keep any copy of the document. A cache typically forwards a no-store response to the client and then deletes the object.
no-cache: a response marked no-cache may actually be stored in the local cache, but it must not be served to clients until it has been revalidated with the origin server.
must-revalidate: the cache must not serve a stale copy of this object without first revalidating it with the origin server. If the origin server is unavailable when the cache needs to revalidate, the cache must return a 504 Gateway Timeout error.
max-age: as mentioned above, this directive specifies the maximum time, in seconds, for which the document stays fresh after it is generated. A response may also include an s-maxage directive, which applies only to shared caches.

c. Client behavior:

min-fresh: the client wants a response that will remain fresh for at least the specified number of seconds from now.
max-stale: the client is willing to accept a response that has exceeded its freshness lifetime.

As these directives are seldom used, they are not discussed further here.
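
To make the directive categories concrete, the sketch below parses a Cache-Control value and applies the public/private, no-store, and s-maxage rules described above. The function names are hypothetical, and the parser is deliberately naive: it assumes no directive argument contains a comma inside quotes.

```python
def parse_cache_control(value):
    """Split a Cache-Control value into a {directive: argument} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        # Directives without an argument (e.g. private) map to None.
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def may_store(directives, shared_cache):
    """Decide whether a cache of the given kind may keep a copy."""
    if "no-store" in directives:
        return False
    if shared_cache and "private" in directives:
        return False
    return True


def lifetime_for(directives, shared_cache):
    """Pick the lifetime in seconds; s-maxage applies only to shared caches."""
    if shared_cache and directives.get("s-maxage") is not None:
        return int(directives["s-maxage"])
    if directives.get("max-age") is not None:
        return int(directives["max-age"])
    return None
```

For the value private, max-age=0, must-revalidate, a browser cache may store the response while a shared proxy may not; for public, max-age=60, s-maxage=600, a shared cache would use 600 seconds and a private cache 60.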

2. Client revalidation with If-Modified-Since and If-None-Match

HTTP's conditional requests make revalidation efficient: the cache sends a conditional GET request to the origin server, and the server returns the object body only if the document differs from the copy already in the cache.

HTTP defines 5 conditional request headers; the 2 most useful for cache revalidation are If-Modified-Since and If-None-Match.

A. If-Modified-Since: one of the most commonly used revalidation headers. If the document has been modified since the specified date, the request is carried out normally and the new document is returned to the cache with new headers, including a new expiration time. This header is usually used together with the Last-Modified header. Take a CSDN blog as an example, looking at the request for a static resource such as a CSS file.

The first request does not contain an If-Modified-Since header. The server's response contains the last modification time of the document in a Last-Modified header, and the client uses this time as the If-Modified-Since value of subsequent requests.

The second request includes the If-Modified-Since request header.

For this and later requests, because the document has not been modified, the server returns a 304 Not Modified response and the cache builds the response from its cached copy. The 304 response does not need to contain headers such as Content-Type, Content-Encoding, and Transfer-Encoding, because the document is not retransmitted.
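
The server side of this exchange can be sketched as follows. This is an illustrative sketch rather than any particular server's behavior: respond_to_conditional_get is a hypothetical name, request_headers is assumed to be a plain dict, and last_modified is assumed to be a timezone-aware UTC datetime.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond_to_conditional_get(request_headers, last_modified, body):
    """Return (status, headers, body) for a GET that may be conditional."""
    # HTTP dates have one-second resolution, so drop microseconds.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    since = request_headers.get("If-Modified-Since")
    if since:
        try:
            since_dt = parsedate_to_datetime(since)
        except (TypeError, ValueError):
            since_dt = None  # ignore a date that cannot be parsed
        # Compare as dates, not as strings; skip naive (zone-less) values.
        if since_dt is not None and since_dt.tzinfo is not None:
            if last_modified <= since_dt:
                return 304, headers, b""  # not modified: no body is sent
    return 200, headers, body
```

A first request without If-Modified-Since gets a 200 with a Last-Modified header; echoing that value back in If-Modified-Since yields a 304 with an empty body as long as the document is unchanged.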

B. If-None-Match: entity tag revalidation. If-Modified-Since works well for revalidating static documents, but in many cases revalidating by last modification date alone is not enough, because:

1. Some documents are rewritten periodically with the same data, so the modification time changes although the content does not.
2. Some documents are modified, but the changes are not important enough to require updating every cache.
3. Some servers cannot accurately determine the last modification time of a document, or do not support If-Modified-Since correctly (for example, some servers compare the value as a string instead of comparing dates).
4. For real-time monitoring applications whose documents change at sub-second intervals, the 1 s granularity is too coarse and finer control is needed.

HTTP allows conditional GET requests based on the entity tag (ETag), which can be a version number, serial number, fingerprint, or checksum of the document; a request may list more than one ETag. Take the bbs.csdn.net home page as an example: the first time the document is requested, the server's HTTP response contains the document's ETag (which could be, for example, an MD5 digest of the document).

In subsequent requests, the client sends this value in an If-None-Match request header.

If the ETag matches, the server returns a 304 Not Modified response; otherwise, it returns the new document with a new ETag.
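
ETag revalidation can be sketched in a few lines. This is a minimal illustration with hypothetical function names; it derives the ETag from a SHA-256 digest of the body (the MD5 digest mentioned above would work the same way), and it assumes that no ETag contains a comma.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from a digest of the document body."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def _opaque(tag: str) -> str:
    """Strip whitespace and the weak-validator prefix W/ for comparison."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def etag_matches(if_none_match: str, current_etag: str) -> bool:
    """True if any ETag listed in If-None-Match equals the current one."""
    if if_none_match.strip() == "*":
        return True
    candidates = [_opaque(t) for t in if_none_match.split(",")]
    return _opaque(current_etag) in candidates


def respond(request_headers, body):
    """Return (status, headers, body), answering 304 when the ETag matches."""
    etag = make_etag(body)
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None and etag_matches(if_none_match, etag):
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body
```

Because the tag is computed from the content, a document that is rewritten with identical data keeps the same ETag and still revalidates with a 304, which is exactly the first case where modification dates fall short.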

III. Suggestions on the YSlow optimization rule

The optimization principles in YSlow are:

A. For static content, set a far-future expiration date with Expires. For example, on CSDN the expiration date of a static CSS file is the current time plus one week (that is, it expires one week later).
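
A one-week expiration like the one described could be generated as follows. This is a sketch only: static_cache_headers is a hypothetical helper, now is assumed to be a timezone-aware datetime, and sending max-age alongside Expires is an assumption of this example rather than something the CSDN response is known to do.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

ONE_WEEK = timedelta(weeks=1)


def static_cache_headers(now, lifetime=ONE_WEEK):
    """Build Expires and Cache-Control headers for a static resource."""
    # Expires must be an absolute HTTP date expressed in GMT.
    expires_at = now.astimezone(timezone.utc).replace(microsecond=0) + lifetime
    return {
        "Expires": format_datetime(expires_at, usegmt=True),
        # The same lifetime expressed relatively, in seconds.
        "Cache-Control": f"max-age={int(lifetime.total_seconds())}",
    }
```

Emitting both headers keeps older clients that only understand Expires covered, while clients that understand max-age use the relative value and are unaffected by clock differences.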

The disadvantage of this approach is that if the document is modified before it expires, the browser keeps using its cached copy, because the date set in Expires has not yet been reached. This causes problems, such as the client's UI not being updated in time. The usual practice is to add an identifier, such as a version number, to each static resource and to update the references to the resource whenever the document changes, which ensures that the client always uses the latest version rather than a stale cached copy.
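
One common way to produce such an identifier is to derive it from the file content, so the reference changes exactly when the document does. The helper below is a hypothetical sketch; the query parameter name v and the 8-character digest prefix are arbitrary choices made for illustration.

```python
import hashlib


def versioned_url(path: str, content: bytes) -> str:
    """Append a short content-derived version identifier to a static URL."""
    version = hashlib.sha256(content).hexdigest()[:8]
    return f"{path}?v={version}"
```

Pages would then reference the value returned by versioned_url for each stylesheet or script. Unchanged files keep their URL and stay cached for the full expiration period, while a modified file gets a new URL that the browser has never cached.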

B. For dynamic content, set an appropriate Cache-Control policy. This is where the pitfalls lie: what is the right policy? Since there is no universal setting, we will not go into detail here. Choose a caching strategy that suits the specific server and application.
