HTTP caching mechanism [translate]

Source: Internet
Author: User
Tags: http, authentication, browser, cache

This article is translated from https://developer.mozilla.org/en-US/docs/Web/HTTP/Caching. It is mainly for personal records and sharing; if you find errors, please do not hesitate to point them out. Thank you!

The performance of Web sites and applications can be significantly improved by reusing previously fetched resources. Web caches reduce latency and network traffic and thus lessen the time needed to display a representation of a resource. By making use of HTTP caching, Web sites become more responsive.

By reusing acquired resources, you can dramatically improve the performance of your Web sites and applications. Because Web caching reduces latency and network traffic, it shortens the time it takes to present a resource. By using the HTTP caching mechanism, Web sites can achieve faster and more flexible responses.

Different kinds of caches

Caching is a technique that stores a copy of a given resource and serves it back when requested. When a Web cache has a requested resource in its store, it intercepts the request and returns its copy instead of re-downloading from the originating server. This achieves several goals: it eases the load on the server, which doesn't need to serve all clients itself, and it improves performance by being closer to the client, i.e., it takes less time to transmit the resource back. For a Web site, it is a major component in achieving high performance. On the other side, it has to be configured properly, as not all resources stay identical forever: it is important to cache a resource only until it changes, not longer.

Caching is a technique that saves a copy of a resource and serves it directly on the next request. When a request is made, the Web cache checks whether it already holds a copy of the requested resource (previously requested and cached); if so, the cache intercepts the request and returns its copy directly, so the resource does not have to be downloaded again from the origin server. Caching serves several purposes: it reduces server load (the server does not have to serve every client every time) and improves performance (because the cache is closer to the client, it can provide the copy directly and save transmission time). For Web sites, caching is a major component of achieving high performance. On the other hand, caching must be configured properly, because not all resources stay unchanged forever: a resource should be cached only until it changes.

There are several kinds of caches: these can be grouped into two main categories, private or shared caches. A shared cache is a cache that stores responses for reuse by more than one user. A private cache is dedicated to a single user. This page will mostly talk about browser and proxy caches, but there are also gateway caches, CDNs, reverse proxy caches and load balancers that are deployed on Web servers for better reliability, performance and scaling of Web sites and Web applications.

The different types of caches can be broadly grouped into two categories: private caches and shared caches. Copies of resources stored in a shared cache are available to all users (for example, different browsers on different machines), whereas a private cache is dedicated to an individual user (different users keep different private copies). This article discusses only browser caches and proxy caches, but there are many other types, such as gateway caches, CDNs, reverse proxy caches and load balancers (deployed in front of Web servers to provide better reliability, performance and scalability).

Private browser caches

A private cache is dedicated to a single user. You might have seen "caching" in your browser's settings already. A browser cache holds all documents downloaded via HTTP by the user. This cache is used to make visited documents available for back/forward navigation, saving, viewing-as-source, etc. without requiring an additional trip to the server. It likewise improves offline browsing of cached content.

A private cache is dedicated to a single user; you can usually find a "cache" option in your browser settings. The browser cache holds all documents the user downloads over HTTP, which makes them available for back/forward navigation, saving, viewing source, and so on, without another trip to the server. Likewise, caching improves offline browsing of cached documents and resources.

Shared proxy caches

A shared cache is a cache that stores responses to be reused by more than one user. For example, an ISP or your company might have set up a Web proxy as part of its local network infrastructure to serve many users, so that popular resources are reused a number of times, reducing network traffic and latency.

The responses stored in a shared cache are available to multiple users. For example, an ISP or your company might set up a proxy for its local network that caches popular resources requested by different users when they access external sites; when another user later requests the same resource, the cached copy is reused instead of being requested from the origin server again. This reduces network traffic and latency.

Targets of caching operations

HTTP caching is optional, but reusing a cached resource is usually desirable. However, common HTTP caches are typically limited to caching responses to GET and may decline other methods. The primary cache key consists of the request method and target URI (oftentimes only the URI is used, as only GET requests are caching targets). Common forms of caching entries are:

Although HTTP caching is optional, reusing cached resources is usually desirable. HTTP caches typically cache only responses to GET requests (other methods are normally not cached), and the primary cache key consists of the request method and the target URI (often only the URI, because generally only GET requests are cached). Common cache entries are:

    • Successful results of a retrieval request: a 200 (OK) response to a GET request containing a resource like HTML documents, images or files.
    • Result data of a successful retrieval request: a GET response with status code 200 (the result may contain resource data such as an HTML document, image, or file).
    • Permanent redirects: a 301 (Moved Permanently) response.
    • Permanent redirect: a response with status code 301 (Moved Permanently).
    • Error responses: a 404 (Not Found) result page.
    • Error response, document not found: a response with status code 404 (Not Found).
    • Incomplete results: a 206 (Partial Content) response.
    • Incomplete result data: a response with status code 206 (Partial Content), returned for a request that carries a Range header, which is used to fetch only part of a document.
    • Responses other than GET if something suitable for use as a cache key is defined.
    • Results of other non-GET requests (if something suitable for use as a cache key is defined).
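As a rough illustration of the primary cache key described above (request method plus target URI), here is a minimal Python sketch. The function name `primary_cache_key` and the normalization choices (lowercasing the scheme and host, dropping the fragment) are assumptions made for this example, not something the article prescribes; real caches apply their own normalization rules.

```python
from urllib.parse import urlsplit, urlunsplit


def primary_cache_key(method: str, uri: str) -> tuple[str, str]:
    """Build a (method, URI) key for a cache lookup.

    Assumption for this sketch: scheme and host are lowercased and the
    fragment is dropped, since fragments are never sent to the server.
    """
    parts = urlsplit(uri)
    normalized = urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path or "/", parts.query, "")
    )
    return (method.upper(), normalized)


# A tiny in-memory store keyed by the primary cache key (hypothetical).
store: dict[tuple[str, str], bytes] = {}
store[primary_cache_key("GET", "http://example.com/logo.png")] = b"...image bytes..."
```

Because most caches only store GET responses, the method part of the key is often constant in practice, which is why the URI alone is frequently used.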

A cache entry might also consist of multiple stored responses differentiated by a secondary key, if the request is the target of content negotiation. For more details, see the information about the Vary header below.

Controlling caching

The Cache-Control header

The Cache-Control HTTP/1.1 general-header field is used to specify directives for caching mechanisms in both requests and responses. Use this header to define your caching policies with the variety of directives it provides.
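To make the shape of the header concrete, the following Python sketch splits a Cache-Control value into its directives. The helper name `parse_cache_control` is hypothetical, and the parser is deliberately simplified: it assumes no directive argument contains a comma inside a quoted string.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control header value into {directive: argument}.

    Directives without an argument (e.g. no-store) map to None.
    Simplification: quoted arguments containing commas are not handled.
    """
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        # Directive names are case-insensitive, so normalize to lowercase.
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


policy = parse_cache_control("public, max-age=31536000")
```

Here `policy` maps `"public"` to `None` and `"max-age"` to the string `"31536000"`; converting the argument to a number is left to the caller.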

No cache storage at all

The cache should not store anything about the client request or server response. A request is sent to the server and a full response is downloaded each and every time.

Cache-Control: no-store
Cache-Control: no-cache, no-store, must-revalidate

No caching

A cache will send the request to the origin server for validation before releasing a cached copy.

Cache-Control: no-cache

Private and public caches

The "public" directive indicates that the response may be cached by any cache. This can be useful if pages with HTTP authentication, or response status codes that aren't normally cacheable, should now be cached. On the other hand, "private" indicates that the response is intended for a single user only and must not be stored by a shared cache. A private browser cache may store the response in this case.

Cache-Control: private
Cache-Control: public
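The storage rules for no-store and private can be summarized in a few lines. This is only a sketch under stated assumptions: the function `may_store` is a hypothetical helper, it receives the directive names already parsed into a set, and it ignores every other condition a real cache checks (request method, status code, authentication, and so on).

```python
def may_store(directives: set[str], shared_cache: bool) -> bool:
    """Decide whether a cache is allowed to store a response.

    Only the two directives discussed in the text are considered:
    - no-store forbids storage in any cache;
    - private forbids storage in shared caches only.
    """
    if "no-store" in directives:
        return False
    if shared_cache and "private" in directives:
        return False
    return True


# A browser cache (private) may keep a "private" response, a proxy may not.
browser_ok = may_store({"private"}, shared_cache=False)
proxy_ok = may_store({"private"}, shared_cache=True)
```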
Expiration

The most important directive here is "max-age=<seconds>", which is the maximum amount of time a resource will be considered fresh. Contrary to Expires, this directive is relative to the time of the request. For the files in the application that will not change, you can usually add aggressive caching. This includes static files such as images, CSS files and JavaScript files, for example.

For more details, see also the freshness section below.

Cache-Control: max-age=31536000

Validation

When using the "must-revalidate" directive, the cache must verify the status of stale resources before using them, and expired ones should not be used. For more details, see the Validation section below.

Cache-Control: must-revalidate

The Pragma header

Pragma is an HTTP/1.0 header. It is not specified for HTTP responses and is therefore not a reliable replacement for the general HTTP/1.1 Cache-Control header, although it does behave the same as Cache-Control: no-cache if the Cache-Control header field is omitted in a request. Use Pragma only for backwards compatibility with HTTP/1.0 clients.

Freshness

Once a resource is stored in a cache, it could theoretically be served by the cache forever. Caches have finite storage, so items are periodically removed from storage. This process is called cache eviction. On the other side, some resources may change on the server, so the cache should be updated. As HTTP is a client-server protocol, servers can't contact caches and clients when a resource changes; they have to communicate an expiration time for the resource. Before this expiration time, the resource is fresh; after the expiration time, the resource is stale. Eviction algorithms often privilege fresh resources over stale resources. Note that a stale resource is not evicted or ignored; when the cache receives a request for a stale resource, it forwards this request with an If-None-Match header to check whether it is in fact still fresh. If so, the server returns a 304 (Not Modified) header without sending the body of the requested resource, saving some bandwidth.

Example of this process with a shared cache proxy: (diagram not included in this copy)

The freshness lifetime is calculated based on several headers. If a "Cache-Control: max-age=N" header is specified, then the freshness lifetime is equal to N. If this header is not present, which is very often the case, it is checked whether an Expires header is present. If an Expires header exists, then its value minus the value of the Date header determines the freshness lifetime. Finally, if neither header is present, look for a Last-Modified header. If this header is present, then the cache's freshness lifetime is equal to the value of the Date header minus the value of the Last-Modified header, divided by 10.
The expiration time is computed as follows:

expirationTime = responseTime + freshnessLifetime - currentAge

where responseTime is the time at which the response was received according to the browser.
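The lookup order described above (max-age, then Expires minus Date, then one tenth of Date minus Last-Modified) and the expiration formula can be sketched in Python as follows. The function names are hypothetical, header names are assumed to be lowercased already, and malformed date values are not handled; treat it as an illustration of the arithmetic rather than a complete implementation.

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def freshness_lifetime(headers: dict[str, str]) -> Optional[timedelta]:
    """Return the freshness lifetime, or None if no header allows computing it."""
    # 1. Cache-Control: max-age=N wins if present.
    match = re.search(r"(?:^|,)\s*max-age=(\d+)", headers.get("cache-control", ""), re.IGNORECASE)
    if match:
        return timedelta(seconds=int(match.group(1)))
    date = parsedate_to_datetime(headers["date"]) if "date" in headers else None
    # 2. Otherwise Expires minus Date.
    if date is not None and "expires" in headers:
        return parsedate_to_datetime(headers["expires"]) - date
    # 3. Otherwise the heuristic: (Date - Last-Modified) / 10.
    if date is not None and "last-modified" in headers:
        return (date - parsedate_to_datetime(headers["last-modified"])) / 10
    return None


def expiration_time(response_time: datetime, lifetime: timedelta, current_age: timedelta) -> datetime:
    """expirationTime = responseTime + freshnessLifetime - currentAge"""
    return response_time + lifetime - current_age


heuristic = freshness_lifetime({
    "date": "Wed, 21 Oct 2015 07:28:00 GMT",
    "last-modified": "Wed, 21 Oct 2015 07:18:00 GMT",
})
```

In the usage example the resource was last modified ten minutes before the response date, so the heuristic yields a lifetime of one minute.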

Revved Resources

The more we use cached resources, the better the responsiveness and the performance of a Web site will be. To optimize this, good practices recommend setting expiration times as far in the future as possible. This is possible for resources that are updated regularly or often, but is problematic for resources that are rarely and infrequently updated. They are the resources that would benefit the most from caching, yet this makes them very difficult to update. This is typical of the technical resources included and linked from each Web page: JavaScript and CSS files change infrequently, but when they change you want them to be updated quickly.

Web developers invented a technique that Steve Souders called revving [1]. Infrequently updated files are named in a specific way: in their URL, usually in the filename, a revision (or version) number is added. That way each new revision of the resource is considered a resource on its own that never changes and that can have an expiration time very far in the future, usually one year or even more. In order to get the new versions, all the links to them must be changed. That is the drawback of this method: additional complexity, which is usually taken care of by the tool chain used by Web developers. When the infrequently variable resources change, they induce an additional change to often variable resources. When these are read, the new versions of the others are also read.

This technique has an additional benefit: updating two cached resources at the same time will not lead to the situation where the out-dated version of one resource is used in combination with the new version of the other one. This is very important when Web sites have CSS stylesheets or JS scripts that have mutual dependencies, i.e., they depend on each other because they refer to the same HTML elements.

The revision version added to revved resources doesn't need to be a classical revision string like 1.1.3, or even a monotonically growing series of numbers. It can be anything that prevents collisions, like a hash or a date.
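One way to rev a file with a content hash is sketched below in Python. The helper `revved_name`, the choice of SHA-256 and the eight-character digest prefix are assumptions made for this example; build tools each have their own naming scheme.

```python
import hashlib
from pathlib import PurePosixPath


def revved_name(path: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a short content hash before the file extension.

    The same content always yields the same name, and any change to the
    content yields a new URL, so the old URL can be cached "forever".
    """
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


old_url = revved_name("static/app.js", b"console.log('v1');")
new_url = revved_name("static/app.js", b"console.log('v2');")
```

The HTML that links to the script is then regenerated with the new URL, which is the "additional change to often variable resources" mentioned above.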

Cache validation

Revalidation is triggered when the user presses the reload button. It is also triggered under normal browsing if the cached response includes the "Cache-Control: must-revalidate" header. Another factor is the cache validation preferences in the Advanced->Cache Preferences panel. There is an option to force a validation each time a document is loaded.

When a cached document's expiration time has been reached, it is either validated or fetched again. Validation can only occur if the server provided either a strong validator or a weak validator.

ETags

The ETag response header is an opaque-to-the-user-agent value that can be used as a strong validator. That means that an HTTP user agent, such as the browser, does not know what this string represents and can't predict what its value would be. If the ETag header was part of the response for a resource, the client can issue an If-None-Match header in future requests in order to validate the cached resource.

The Last-Modified response header can be used as a weak validator. It is considered weak because it only has 1-second resolution. If the Last-Modified header is present in a response, then the client can issue an If-Modified-Since request header to validate the cached document.

When a validation request is made, the server can either ignore the validation request and respond with a normal 200 OK, or it can return 304 Not Modified (with an empty body) to instruct the browser to use its cached copy. The latter response can also include headers that update the expiration time of the cached document.
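From the server's point of view, the choice between 200 and 304 can be sketched like this in Python. The function `conditional_status` is a hypothetical helper: request header names are assumed to be lowercased, the ETag comparison ignores the weak `W/` prefix, and entity tags containing commas are not handled.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_status(request_headers: dict[str, str], etag: str, last_modified: datetime) -> int:
    """Return 304 if the client's validator still matches, else 200."""

    def opaque(tag: str) -> str:
        # Compare tags without the weak-validator prefix.
        return tag[2:] if tag.startswith("W/") else tag

    if_none_match = request_headers.get("if-none-match")
    if if_none_match is not None:
        # When If-None-Match is present, If-Modified-Since is not consulted.
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if "*" in candidates or opaque(etag) in [opaque(t) for t in candidates]:
            return 304
        return 200
    if_modified_since = request_headers.get("if-modified-since")
    if if_modified_since is not None:
        # Last-Modified only has 1-second resolution, so drop microseconds.
        if last_modified.replace(microsecond=0) <= parsedate_to_datetime(if_modified_since):
            return 304
    return 200


modified = datetime(2015, 10, 21, 7, 28, tzinfo=timezone.utc)
status = conditional_status({"if-none-match": '"abc"'}, '"abc"', modified)
```

With a 304 the server sends headers only, and the cache keeps serving the body it already has.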

Varying Responses

The Vary HTTP response header determines how to match future request headers to decide whether a cached response can be used rather than requesting a fresh one from the origin server.

When a cache receives a request that can be satisfied by a cached response that has a Vary header field, it must not use that cached response unless all header fields nominated by the Vary header match in both the original (cached) request and the new request.

This can be useful for serving content dynamically, for example. When using the Vary: User-Agent header, caching servers should consider the user agent when deciding whether to serve the page from cache. If you are serving different content to mobile users, it can help you avoid the situation where a cache mistakenly serves a desktop version of your site to your mobile users. In addition, it can help Google and other search engines to discover the mobile version of a page, and might also tell them that no cloaking is intended.

Vary: User-Agent

Because the User-Agent header value is different ("varies") for mobile and desktop clients, caches will not be used to serve mobile content mistakenly to desktop users or vice versa.
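The matching rule for Vary can be written down in a few lines of Python. This is a sketch, assuming request headers are held in dictionaries with lowercased names and compared as exact strings; the function name `vary_matches` is made up for the example.

```python
def vary_matches(vary: str, stored_request: dict[str, str], new_request: dict[str, str]) -> bool:
    """Check whether a stored response may be reused for a new request.

    Every header nominated by Vary must have the same value in the
    request that produced the stored response and in the new request.
    """
    names = [name.strip().lower() for name in vary.split(",") if name.strip()]
    if "*" in names:
        # "Vary: *" never matches; the response must be fetched again.
        return False
    return all(stored_request.get(name) == new_request.get(name) for name in names)


desktop = {"user-agent": "ExampleDesktopBrowser/1.0"}
mobile = {"user-agent": "ExampleMobileBrowser/1.0"}
reusable = vary_matches("User-Agent", desktop, mobile)
```

Here `reusable` is false, so the response cached for the desktop client is not served to the mobile client.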

See also
    • RFC 7234: Hypertext Transfer Protocol (HTTP/1.1): Caching
    • Caching Tutorial, by Mark Nottingham
    • HTTP Caching, by Ilya Grigorik
    • REDbot, a tool to check your cache-related HTTP headers.
