Front-end Study Notes: HTTP Caching
Original article: https://developers.google.com/web/fundamentals/performance/optimizing-content-efficiency/http-caching?hl=zh-cn
Caching and reusing previously fetched resources is a key aspect of performance optimization.
Every browser ships with its own HTTP cache implementation. All you need to do is make sure that each server response provides the correct HTTP header directives to tell the browser when it can cache the response and for how long.
HTTP headers exchanged between the browser and the server:
When the server returns a response, it also sends a set of HTTP headers that describe the response's content type, length, caching directives, validation token, and so on. For example, the server might return a 1024-byte response, instruct the client to cache it for up to 120 seconds, and provide a validation token ("x234dff") that can be used to check whether the resource has been modified once the response has expired.
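The exchange described above can be written out as plain headers. The following is a minimal sketch, not a real server: the function name `build_response_headers` and the placeholder body are assumptions made for illustration, while the values (1024 bytes, 120 seconds, "x234dff") come from the example.

```python
def build_response_headers(body: bytes, max_age: int, etag: str) -> dict:
    """Return the caching-related headers a server might send with `body`.

    Hypothetical helper for illustration; a real server sets many more headers.
    """
    return {
        "Content-Length": str(len(body)),
        "Cache-Control": f"max-age={max_age}",
        # ETag values are sent as quoted strings.
        "ETag": f'"{etag}"',
    }


body = b"x" * 1024  # placeholder 1024-byte payload
headers = build_response_headers(body, max_age=120, etag="x234dff")
for name, value in headers.items():
    print(f"{name}: {value}")
```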
Validating cached responses with ETags
- The server uses the ETag HTTP header to pass a validation token.
- The validation token enables efficient update checks: no data is transferred if the resource has not changed.
The role of the validation token:
Assume that 120 seconds have passed since the resource was first fetched and the browser initiates a new request for it. First, the browser checks its local cache and finds the previous response. Unfortunately, that response has expired, so the browser cannot use it. At this point the browser could simply send a new request and fetch a complete new response. However, that is inefficient: if the resource has not changed, there is no reason to download exactly the same information that is already in the cache.
This is exactly the problem that the validation token, specified in the ETag header, is designed to solve. The server generates and returns an arbitrary token, typically a hash or some other fingerprint of the file's contents. The client does not need to know how the fingerprint is generated; it only needs to send it back to the server on the next request. If the fingerprint is still the same, the resource has not changed and the download can be skipped.
Continuing the example, the client automatically provides the ETag token in the "If-None-Match" HTTP request header. The server checks the token against the current resource. If the resource has not changed, the server returns a "304 Not Modified" response, which tells the browser that the cached response has not changed and can be reused for another 120 seconds. Note that the response does not need to be downloaded again, which saves time and bandwidth.
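The revalidation flow can be sketched on the server side as follows. This is a simplified illustration under stated assumptions: the names `make_etag` and `respond` are hypothetical, the truncated SHA-256 digest is just one possible fingerprint, and a real If-None-Match header may carry several tags or weak validators, which this sketch ignores.

```python
import hashlib
from typing import Optional, Tuple


def make_etag(content: bytes) -> str:
    """Fingerprint the content; a truncated SHA-256 digest is one possible choice."""
    return '"' + hashlib.sha256(content).hexdigest()[:16] + '"'


def respond(content: bytes, if_none_match: Optional[str] = None) -> Tuple[int, dict, bytes]:
    """Return (status, headers, body) for a request to a resource."""
    etag = make_etag(content)
    headers = {"ETag": etag, "Cache-Control": "max-age=120"}
    if if_none_match == etag:
        # The client's copy is still current: send no body.
        return 304, headers, b""
    return 200, headers, content


resource = b"body { color: black; }"

# First request: full response with an ETag.
status, headers, body = respond(resource)

# Revalidation after expiry: the resource is unchanged, so no body is sent.
status2, _, body2 = respond(resource, headers["ETag"])

# Revalidation after the resource changed: full response again.
status3, _, body3 = respond(b"body { color: navy; }", headers["ETag"])

print(status, status2, status3)
```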
As a web developer, how do you take advantage of efficient revalidation? The browser does all the work for you: it automatically detects whether a validation token was previously specified, appends the token to outgoing requests, and updates the cache timestamps as necessary based on the response received from the server. The only thing left to do is ensure that the server provides the necessary ETag tokens. Check your server documentation for the necessary configuration flags.
Tip: the HTML5 Boilerplate project contains sample configuration files for all the most popular servers, with detailed comments for each configuration flag and setting. Find your server in the list, look for the appropriate settings, and copy them or confirm that your server is configured with the recommended settings.
Cache-Control
- Each resource can define its caching policy through the Cache-Control HTTP header.
- Cache-Control directives control who can cache the response, under which conditions, and for how long.
From a performance optimization perspective, the best request is one that does not need to communicate with the server at all: a local copy of the response eliminates all network latency and avoids data charges for the transfer. To achieve this, the HTTP specification allows the server to return Cache-Control directives that control how, and for how long, the browser and other intermediate caches can cache the response.
Note: The Cache-Control header was defined as part of the HTTP/1.1 specification and supersedes earlier headers (for example, Expires) that were used to define response caching policies. All modern browsers support Cache-Control, so using it is enough.
Directives:
"no-cache" and "no-store"
"no-cache" indicates that the returned response cannot be used to satisfy a subsequent request to the same URL without first checking with the server whether the response has changed. As a result, if a proper validation token (ETag) is present, no-cache incurs a round trip to validate the cached response, but the download is avoided if the resource has not changed.
By contrast, "no-store" is much simpler. It forbids the browser and all intermediate caches from storing any version of the returned response, for example, one that contains private personal or banking data. Every time the user requests this asset, a request is sent to the server and the complete response is downloaded.
"public" and "private"
If the response is marked as "public", it can be cached even if it has HTTP authentication associated with it, and even when the response status code is not normally cacheable. Most of the time "public" is not necessary, because explicit caching information (such as "max-age") already indicates that the response is cacheable.
By contrast, the browser can cache "private" responses. However, these responses are typically intended for a single user, so no intermediate cache is allowed to cache them. For example, a user's browser can cache an HTML page that contains the user's private information, but a CDN cannot.
"max-age"
This directive specifies the maximum time, in seconds, that the fetched response may be reused, counted from the time of the request. For example, "max-age=60" indicates that the response can be cached and reused for the next 60 seconds.
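The freshness rule behind "max-age" can be sketched as a simple comparison. This is a simplified model: the class `CachedResponse` and its fields are hypothetical names, timestamps are plain numbers in seconds, and refinements a real cache applies (such as the Age header added by intermediate caches) are left out.

```python
from dataclasses import dataclass


@dataclass
class CachedResponse:
    body: bytes
    stored_at: float  # time of the request, in seconds (same clock as `now`)
    max_age: int      # value of the max-age directive, in seconds

    def is_fresh(self, now: float) -> bool:
        """A response is fresh while its age is below max-age."""
        return (now - self.stored_at) < self.max_age


entry = CachedResponse(b"...", stored_at=1000.0, max_age=60)
print(entry.is_fresh(1030.0))  # 30 seconds old: True
print(entry.is_fresh(1060.0))  # 60 seconds old: False, must be revalidated or refetched
```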
Defining the optimal Cache-Control policy
Use the decision tree (shown in the original article) to determine the optimal caching policy for a particular resource, or set of resources, that your application uses. Ideally, you should aim to cache as many responses as possible on the client, for as long as possible, and to provide a validation token for each response so that revalidation is efficient.
Cache-Control directives | Explanation
max-age=86400 | The response can be cached by the browser and by any intermediate caches (that is, it is "public") for up to 1 day (60 seconds x 60 minutes x 24 hours).
private, max-age=600 | The response can be cached only by the client's browser, for up to 10 minutes (60 seconds x 10 minutes).
no-store | The response may not be cached. It must be fetched in full on every request.
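To make the table concrete, here is a rough sketch that parses a Cache-Control value and reports who may store the response. The function names `parse_cache_control` and `describe` are hypothetical, and the parser is deliberately naive: it splits on commas and does not handle quoted arguments that themselves contain commas.

```python
def parse_cache_control(value: str) -> dict:
    """Split a Cache-Control header value into {directive: argument or None}."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def describe(value: str) -> dict:
    """Summarize who may store the response and for how long."""
    d = parse_cache_control(value)
    storable = "no-store" not in d
    max_age = d.get("max-age")
    return {
        "browser_may_store": storable,
        # "private" restricts storage to the end user's browser.
        "shared_cache_may_store": storable and "private" not in d,
        "max_age_seconds": int(max_age) if max_age is not None else None,
    }


for example in ("max-age=86400", "private, max-age=600", "no-store"):
    print(example, "->", describe(example))
```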
According to HTTP Archive, among the top 300,000 websites (by Alexa rank), nearly half of all downloaded responses can be cached by the browser, which is a significant saving for repeat page views and visits. Of course, that does not mean that 50% of the resources in your particular application can be cached. Some sites can cache more than 90% of their resources, while others may have a lot of private or time-sensitive data that cannot be cached at all.
Audit your pages to identify which resources can be cached, and ensure that they return appropriate Cache-Control and ETag headers.
Invalidating and updating cached responses
- Locally cached responses are used until the resource expires.
- You can embed a fingerprint of the file's contents in the URL to force the client to update to a new version of the response.
- For best performance, each application needs to define its own cache hierarchy.
All HTTP requests that the browser makes are first routed to the browser cache to check whether a valid cached response can be used to fulfill the request. If there is a match, the response is read from the cache, which eliminates both the network latency and the data charges that the transfer would incur.
However, what if you want to update or invalidate a cached response? For example, suppose you have told visitors to cache a CSS stylesheet for up to 24 hours (max-age=86400), but your designer has just committed an update that you would like to make available to all users. How do you notify all the visitors who hold what is now a "stale" cached copy of the CSS to update their caches? You cannot, at least not without changing the URL of the resource.
After the browser caches the response, the cached version is used until it is no longer fresh, as determined by max-age or Expires, or until it is evicted from the cache for some other reason, for example, because the user clears the browser cache. As a result, different users might end up using different versions of the file when the page is constructed: users who have just fetched the resource use the new version, while users who cached an earlier (but still valid) copy use the older version of the response.
So how do you get the best of both worlds: client-side caching and quick updates? You can change the URL of the resource whenever its content changes, which forces the user to download the new response. Typically, you do this by embedding a fingerprint of the file, or a version number, in its file name, for example, style.x234dff.css.
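Embedding a content fingerprint in a file name can be sketched as follows. The helper name `fingerprinted_name`, the choice of SHA-256, and the 8-character digest length are assumptions for illustration; in practice a build tool usually performs this step.

```python
import hashlib
from pathlib import PurePosixPath


def fingerprinted_name(filename: str, content: bytes, length: int = 8) -> str:
    """Insert a short content hash before the file extension.

    For example, "style.css" becomes "style.<hash>.css".
    """
    digest = hashlib.sha256(content).hexdigest()[:length]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


old_css = b"body { color: black; }"
new_css = b"body { color: navy; }"

old_name = fingerprinted_name("style.css", old_css)
new_name = fingerprinted_name("style.css", new_css)

# The name changes only when the content changes, so a long max-age is safe.
print(old_name)
print(new_name)
```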
Because the caching policy can be defined per resource, you can define a "cache hierarchy" that controls not only how long each response is cached, but also how quickly visitors see new versions. To illustrate, consider an example page made up of an HTML document, a CSS file, a JavaScript file, and an image:
- The HTML is marked "no-cache", which means that the browser always revalidates the document on each request and fetches the latest version if the content has changed. In addition, fingerprints are embedded in the URLs of the CSS and JavaScript assets referenced in the HTML markup: if the content of those files changes, the HTML of the page changes as well, and a new copy of the HTML response is downloaded.
- The CSS may be cached by the browser and by intermediate caches (for example, a CDN), and is set to expire in 1 year. Note that the one-year "far-future expiration" is safe to use because the file fingerprint is embedded in the file name: when the CSS is updated, the URL changes as well.
- The JavaScript is also set to expire in 1 year, but is marked as private, perhaps because it contains some private user data that the CDN should not cache.
- The image is cached without a version or unique fingerprint and is set to expire in 1 day.
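The hierarchy in the list above could be expressed as a per-type policy table. This is only a sketch: the mapping by file extension, the function name `cache_control_for`, and the "no-cache" fallback for unknown types are assumptions, not something the original article prescribes.

```python
from pathlib import PurePosixPath

ONE_DAY = 60 * 60 * 24    # 86400 seconds
ONE_YEAR = ONE_DAY * 365  # 31536000 seconds

# Policies mirroring the example: HTML is always revalidated, fingerprinted
# CSS and JavaScript are cached for a year, images for a day.
POLICIES = {
    ".html": "no-cache",
    ".css": f"max-age={ONE_YEAR}",           # browser and CDN may cache
    ".js": f"private, max-age={ONE_YEAR}",   # browser only
    ".jpg": f"max-age={ONE_DAY}",
}


def cache_control_for(path: str) -> str:
    """Pick a Cache-Control value by extension (assumed fallback: no-cache)."""
    suffix = PurePosixPath(path).suffix.lower()
    return POLICIES.get(suffix, "no-cache")


for path in ("index.html", "style.x234dff.css", "script.x234dff.js", "photo.jpg"):
    print(path, "->", cache_control_for(path))
```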
Combining ETag, Cache-Control, and unique URLs gives you the best of all worlds: long-lived expiration times, control over where the response can be cached, and on-demand updates.
Caching checklist
There is no single best caching policy. Depending on your traffic patterns, the type of data served, and your application's specific requirements for data freshness, you need to define and configure the appropriate settings for each resource, as well as the overall "cache hierarchy".
When you work out your caching strategy, keep the following tips and techniques in mind:
- Use consistent URLs: If you serve the same content from different URLs, that content is fetched and stored multiple times. Note: URLs are case sensitive.
- Ensure that the server provides a validation token (ETag): With validation tokens, the same bytes do not need to be transferred again when the resource on the server has not changed.
- Identify which resources can be cached by intermediate caches: Resources whose responses are identical for all users are good candidates for caching by a CDN and other intermediate caches.
- Determine the optimal cache lifetime for each resource: Different resources may have different freshness requirements. Audit each resource and determine the appropriate max-age for it.
- Determine the best cache hierarchy for your site: Combining resource URLs that contain content fingerprints with short or no-cache lifetimes for HTML documents lets you control how quickly clients pick up updates.
- Minimize churn: Some resources are updated more frequently than others. If a particular part of a resource (for example, a JavaScript function or a set of CSS styles) is updated often, consider delivering that code as a separate file. That way the rest of the content (for example, library code that rarely changes) can still be served from the cache whenever an update is fetched, which minimizes the amount of downloaded content.