The following figure illustrates what an HTTP/1.1 cache policy is in the most concise way, although the amount of information it carries is limited.
Cache and cache policies
A web cache, or proxy cache, is a special kind of HTTP proxy server. The cache reduces the transmission of redundant data, relieves bandwidth bottlenecks, and reduces distance latency.
A cache policy determines when the cache is used, and how the client, the proxy cache, and the server work together to transfer data correctly and quickly.
Before introducing cache policies, we need to clarify three concepts:
(A) cache hit
(B) cache miss
(C) cache re-verification hit
Which of these three situations occurs depends on whether the cached copy is still valid. When the cache is valid, we certainly want to obtain the data from it. How do we determine whether the cache is valid? That is the next topic: freshness detection. The two important headers are Expires and Cache-Control: max-age. Expires comes from HTTP/1.0+ and Cache-Control comes from HTTP/1.1. Cache-Control: max-age defines the document's validity period as a relative time in seconds, for example Cache-Control: max-age=3600; Expires specifies an absolute time. Relative time is much more reliable than absolute time, because absolute time depends on the clock settings of the computers involved. Both headers are often set together anyway, because some clients do not support HTTP/1.1 and cannot recognize Cache-Control; this is a compatibility measure. When both Expires and Cache-Control: max-age are present, Cache-Control takes priority over Expires.
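The precedence rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name freshness_lifetime is hypothetical, headers are assumed to arrive as a plain dict with exactly these key spellings, and the Expires lifetime is measured against the response's Date header.

```python
import re
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers):
    """Return the freshness lifetime in seconds, or None if none is given.

    Hypothetical helper: `headers` is assumed to be a plain dict.
    """
    cache_control = headers.get("Cache-Control", "")
    # Cache-Control: max-age (relative time) takes priority over Expires.
    match = re.search(r"(?<![\w-])max-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    if match:
        return int(match.group(1))

    # Fall back to Expires (absolute time), measured against Date.
    expires, date = headers.get("Expires"), headers.get("Date")
    if expires and date:
        try:
            delta = parsedate_to_datetime(expires) - parsedate_to_datetime(date)
        except (TypeError, ValueError):
            return 0  # an unparseable date is treated as already expired
        return max(0, int(delta.total_seconds()))
    return None
```

With both headers present, the max-age value is returned and Expires is never consulted.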
We also need to be clear about one more thing. While a document has not expired it is fresh, that is, the cache is treated as consistent with the data on the origin server. That much is accepted. But does an expired Expires or Cache-Control header by itself mean that the cached copy has diverged from the origin server's data and is invalid? Obviously not. When a document expires, it only means the cache should ask the origin server whether the document has changed. If the content has not changed, the cache only needs to obtain a new expiration time, mark the cached copy as valid again, and return the cached data to the client. This is a cache revalidation hit. If the content has changed, the cache needs to fetch the new data, replace the old copy, and send the new data to the client. This is a cache miss. This process, in which the cache queries the origin server, is called "server revalidation".
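A rough sketch of how the three outcomes relate to each other, assuming a hypothetical cache entry that records when it was stored and how long it stays fresh, and a hypothetical callable origin_unchanged that stands in for the conditional request to the origin server:

```python
def classify(entry, now, origin_unchanged):
    """Classify a lookup as 'cache hit', 'revalidation hit' or 'cache miss'.

    `entry` is None or a dict with the hypothetical keys 'stored_at' and
    'lifetime' (both in seconds); `origin_unchanged` is a callable that
    returns True when the origin reports the document has not changed.
    """
    if entry is None:
        return "cache miss"  # nothing stored, fetch from the origin
    if now - entry["stored_at"] < entry["lifetime"]:
        return "cache hit"  # still fresh, serve directly
    if origin_unchanged():
        entry["stored_at"] = now  # refreshed: the copy is valid again
        return "revalidation hit"
    return "cache miss"  # content changed, the new document must be fetched
```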
The most useful headers for server revalidation are If-Modified-Since: <date> and If-None-Match: <tags>.
If-Modified-Since and the Last-Modified server response header
When the cache needs to revalidate a cached document, it sends a request containing an If-Modified-Since header that carries the date on which the cached copy was last modified. If the content has been modified since that date, the last modification date will differ, and the origin server returns the new document together with a new expiration time. Otherwise it returns a 304 Not Modified response, which has no document body but does carry a new expiration time.
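The server side of this exchange might look roughly like the following. It is a minimal sketch, not a real server API: conditional_get and its parameters are hypothetical, last_modified is assumed to be a timezone-aware UTC datetime, and the "new expiration time" is represented by an assumed max_age value.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get(request_headers, last_modified, body, max_age=60):
    """Return (status, headers, body) for a GET that may be conditional."""
    response_headers = {
        "Last-Modified": format_datetime(last_modified, usegmt=True),
        "Cache-Control": f"max-age={max_age}",  # the new expiration time
    }
    since_raw = request_headers.get("If-Modified-Since")
    if since_raw:
        try:
            since = parsedate_to_datetime(since_raw)
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
        except (TypeError, ValueError):
            since = None  # unparseable date: ignore the condition
        # HTTP dates have 1 s resolution, so drop microseconds before comparing.
        if since is not None and last_modified.replace(microsecond=0) <= since:
            return 304, response_headers, b""  # not modified, no body
    return 200, response_headers, body


modified = datetime(2015, 10, 21, 7, 28, tzinfo=timezone.utc)
status, _, _ = conditional_get(
    {"If-Modified-Since": "Wed, 21 Oct 2015 07:28:00 GMT"}, modified, b"<html></html>"
)
```

Note that the dates are compared as dates, not as strings.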
If-None-Match: entity tag revalidation
If-None-Match exists because If-Modified-Since, which judges only by the document's last modification time, is not always sufficient. The following situations are all reasonable requirements:
1. Some documents are rewritten periodically while the data stays the same. The content of the document has not changed, but its modification time has.
2. Some documents have been modified, but the change is not important enough to require every cache to update its copy.
3. Some servers cannot accurately determine the last modification time of a document, or do not support If-Modified-Since correctly (for example, some servers compare the date as a string instead of comparing dates).
4. For real-time monitoring applications where documents change at sub-second intervals, a granularity of 1 s is too coarse and finer-grained control is needed.
An ETag (entity tag) is used for revalidation and can serve as the document's version number, serial number, fingerprint, or checksum; a single If-None-Match header can carry multiple ETags. When the document is accessed for the first time, the server's response includes an ETag header, and on later requests the client sends the latest ETag it holds in If-None-Match. If the ETag matches, the server responds with 304 Not Modified. Otherwise, the server returns the new document and a new ETag.
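As an illustration, the matching step could be sketched as below. The function names are hypothetical, the comparison ignores the W/ weak prefix, and the header is split naively on commas, which assumes the tags themselves contain no commas.

```python
def etag_matches(if_none_match, current_etag):
    """Return True if current_etag appears in an If-None-Match value."""
    if if_none_match.strip() == "*":
        return True  # "*" matches any current representation

    def strip_weak(tag):
        tag = tag.strip()
        return tag[2:] if tag.startswith("W/") else tag

    # Assumption: tags contain no commas, so a plain split is enough here.
    candidates = [strip_weak(tag) for tag in if_none_match.split(",")]
    return strip_weak(current_etag) in candidates


def respond(request_headers, current_etag, body):
    """Return (status, headers, body) for an ETag-based conditional GET."""
    headers = {"ETag": current_etag}
    condition = request_headers.get("If-None-Match")
    if condition and etag_matches(condition, current_etag):
        return 304, headers, b""
    return 200, headers, body
```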
Other cache headers: Cache-Control and Pragma
Cache-Control
When Cache-Control is part of the request header, its value can be: max-age, max-stale, min-fresh, no-cache, no-store, no-transform, only-if-cached.
max-age: the client accepts a cached response only while its age does not exceed the specified number of seconds, so within that window the origin server does not need to be contacted again. For example, Cache-Control: max-age=5 means a copy cached within the last 5 seconds can be used without accessing the server again;
max-stale: the client is willing to accept a response that has exceeded its freshness lifetime, on the condition that it has been stale for no longer than the max-stale value;
min-fresh: the client accepts a cached response only if its freshness lifetime is at least its current age plus the min-fresh value, that is, it will remain fresh for at least min-fresh more seconds;
no-cache: this does not mean the response is never cached. It can be cached, but the cache must revalidate it with the origin server every time before serving it to the client;
no-store: the cache must not store any part of the request or of the response to it;
no-transform: the original statement in the RFC is "the 'no-transform' request directive indicates that an intermediary (whether or not it implements a cache) MUST NOT transform the payload". When this directive is specified, the payload must not be modified at any stage. To be honest, I do not fully understand it, but my impression is that a proxy may otherwise modify the HTTP message. This may be the case and remains to be verified;
only-if-cached: the client wants the response to come only from the cache. There are therefore two possible responses: data from the cache, or a 504 response;
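The age-related request directives above can be combined into one decision. The following is a simplified sketch: both function names are hypothetical, age and lifetime are given in seconds, and no-store, no-transform and only-if-cached are deliberately left out.

```python
def parse_cache_control(value):
    """Parse a Cache-Control value into {directive: argument or None}."""
    directives = {}
    for part in value.split(","):
        name, _, arg = part.strip().partition("=")
        if name:
            directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def usable_without_revalidation(age, lifetime, request_cache_control):
    """Decide whether a stored response may be served as is."""
    d = parse_cache_control(request_cache_control)
    if "no-cache" in d:
        return False  # must revalidate with the origin first
    if d.get("max-age") is not None and age > int(d["max-age"]):
        return False  # stored copy is older than the client accepts
    if d.get("min-fresh") is not None:
        # must stay fresh for at least min-fresh more seconds
        return lifetime >= age + int(d["min-fresh"])
    if lifetime > age:
        return True  # still fresh
    if "max-stale" in d:
        limit = d["max-stale"]  # None means any staleness is accepted
        return limit is None or age - lifetime <= int(limit)
    return False
```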
When Cache-Control is part of the response header, its value can be: must-revalidate, no-cache, no-store, no-transform, public, private, proxy-revalidate, max-age, s-maxage.
must-revalidate: once the cached response has become stale, the cache may use it only after successful revalidation with the origin server; if the origin server cannot be reached, the cache responds with 504;
public: any cache, including shared proxy caches, may store the server's response;
private: the response is intended for a single user and must not be stored by a shared proxy cache;
proxy-revalidate: like must-revalidate, it requires revalidation with the origin server, but it does not apply to private caches;
s-maxage: same as max-age, but it applies only to shared caches;
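From the point of view of a shared (proxy) cache, the response directives above could be read roughly as follows. This is a sketch under assumptions: shared_cache_policy and the keys of its result are hypothetical names, and arguments such as the field list of private="..." are ignored.

```python
def shared_cache_policy(response_cache_control):
    """Summarise what a shared cache may do with a response."""
    d = {}
    for part in response_cache_control.lower().split(","):
        name, _, arg = part.strip().partition("=")
        if name:
            d[name] = arg

    # s-maxage applies only to shared caches and overrides max-age there.
    lifetime = d.get("s-maxage") or d.get("max-age")
    return {
        # private responses are for one user; no-store forbids storing at all
        "storable": "private" not in d and "no-store" not in d,
        "lifetime": int(lifetime) if lifetime else None,
        # no-cache: revalidate before every reuse
        "always_revalidate": "no-cache" in d,
        # once stale, never serve without successful revalidation
        "strict_when_stale": "must-revalidate" in d or "proxy-revalidate" in d,
    }
```

A private cache would differ in two places: it may store private responses, and it ignores both s-maxage and proxy-revalidate.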
Pragma
Pragma: no-cache has the same effect as Cache-Control: no-cache. Pragma: no-cache exists for compatibility with HTTP/1.0, while Cache-Control: no-cache was introduced by HTTP/1.1. Therefore Pragma: no-cache can be used with both HTTP/1.0 and HTTP/1.1, while Cache-Control: no-cache can only be used with HTTP/1.1.
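A small sketch of how a cache might treat the two headers together. It assumes the request headers are available as a plain dict, the function name is hypothetical, and Pragma is consulted only when Cache-Control is absent, so that an HTTP/1.0 client sending only Pragma is still honoured.

```python
def request_demands_revalidation(headers):
    """Return True if the request asks for no-cache behaviour."""

    def has_no_cache(value):
        return "no-cache" in [part.strip().lower() for part in value.split(",")]

    cache_control = headers.get("Cache-Control")
    if cache_control is not None:
        # HTTP/1.1 client: Cache-Control is authoritative, Pragma is ignored.
        return has_no_cache(cache_control)
    # HTTP/1.0 client: fall back to Pragma.
    return has_no_cache(headers.get("Pragma", ""))
```

A client that wants to be understood by both old and new caches can simply send both headers.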
Everything discussed so far is HTTP/1.1 policy. What happens when these policies meet an older server or client? One principle applies: never return incorrect information, so correctness is guaranteed even at the cost of efficiency.
Leaving the details aside, the cache policy can be summarized as the most effective and most symmetric exchange of information among client, cache, and server. "Most effective" covers both time and accuracy. Time efficiency comes from maximizing use of the cache to reduce communication cost: when detecting freshness only header information needs to be sent, and the document body is sent only when necessary, which keeps the amount of transmitted data down. Accuracy is likewise ensured by freshness detection. "Most symmetric" means that for each header the server returns, the client must have a corresponding header with which to send that information back in a later request.
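The symmetry between response and request headers can be made concrete with a short sketch. The helper name conditional_headers is hypothetical and the stored response headers are assumed to be a plain dict: ETag is echoed back as If-None-Match, and Last-Modified as If-Modified-Since.

```python
def conditional_headers(stored_response_headers):
    """Build revalidation request headers from a stored response."""
    request_headers = {}
    etag = stored_response_headers.get("ETag")
    if etag:
        request_headers["If-None-Match"] = etag  # ETag -> If-None-Match
    last_modified = stored_response_headers.get("Last-Modified")
    if last_modified:
        # Last-Modified -> If-Modified-Since
        request_headers["If-Modified-Since"] = last_modified
    return request_headers
```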
HTTP/1.1 Cache Policy