HTTP caching mechanism

Source: Internet
Author: User
Tags: http, 200, browser cache

In Web development, caching is a recurring topic. This article introduces the definition, role, and classification of caches, and the mechanisms and principles behind them. I hope it is helpful; if you find a mistake, corrections are welcome.

What is Web caching

According to the explanation on MDN, caching is a technique that stores a copy of a given resource and serves that copy when the resource is requested again. When a Web cache finds that a requested resource is already in its store, it intercepts the request and returns the stored copy instead of downloading the resource again from the origin server.

The role of caching

  1. Reduce network bandwidth consumption and operating costs.

  2. Reduce server load. Once a validity period is set for a resource, the user can reuse the local copy, which reduces requests to the origin server and indirectly lowers its load. Search engine crawlers can also lower their crawl frequency based on the expiration information, which further reduces server load.

  3. Reduce network latency, speed up page loading, and improve the user experience.

Cache type

  1. Database Cache

In large-scale Web applications the database is queried frequently and can easily become overloaded. A common approach is to keep the result of the first query in memory, so that subsequent identical queries are served directly from memory instead of querying the database again, which improves response times. Common database caching solutions are memcached and Redis; the differences between the two are not discussed here.
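The "query once, then serve from memory" idea can be sketched in a few lines of Python. This is only an illustration: `query_database` and `QueryCache` are hypothetical names, a plain dictionary stands in for memcached or Redis, and the 60-second time to live is an arbitrary assumption.

```python
import time


def query_database(user_id):
    """Hypothetical stand-in for a slow database query (not a real driver call)."""
    time.sleep(0.01)  # simulate query latency
    return {"id": user_id, "name": f"user-{user_id}"}


class QueryCache:
    """Minimal in-memory cache: look in memory first, fall back to the loader."""

    def __init__(self, loader, ttl_seconds=60.0):
        self._loader = loader
        self._ttl = ttl_seconds
        self._store = {}  # key -> (expires_at, value)
        self.hits = 0
        self.misses = 0

    def get(self, key):
        now = time.monotonic()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            # Entry exists and has not expired: serve it from memory.
            self.hits += 1
            return entry[1]
        # Missing or expired: query the backing store and remember the result.
        self.misses += 1
        value = self._loader(key)
        self._store[key] = (now + self._ttl, value)
        return value


cache = QueryCache(query_database, ttl_seconds=60.0)
first = cache.get(42)   # goes to the "database"
second = cache.get(42)  # served from memory
```

Only the first call reaches the backing store; the second is answered from the dictionary until the entry's time to live runs out.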

  2. Server-Side Caching

Proxy server cache: when the proxy server forwards a response from the origin server, it saves a copy of the resource. It can be understood as a shared cache, in which a stored response can be reused by multiple users. The advantage of a cache server is that the same resource does not have to be fetched from the origin server multiple times: clients get the resource from the cache server, and the origin server does not have to process the same request repeatedly.

CDN cache: a form of gateway cache, also called a reverse proxy cache. From the browser's point of view, the entire CDN is the origin server.

  3. Client Cache

Also known as a browser cache or private cache. The browser cache works according to a set of rules agreed with the server; typically it checks once per session whether the cached copy is fresh enough. These rules are defined in HTTP headers and in meta tags in HTML pages. Note that the http-equiv attribute of the meta tag is supported by only some browsers and is not honored by cache proxies, because proxies do not parse the HTML content itself. The browser sets aside space on your hard drive to store copies of resources for you alone. This cache supports back/forward navigation through visited documents, saving pages, viewing source, and similar features without sending another request to the server. It can also make cached content available for offline browsing. If the same image is needed again while you browse, for example when going back or forward, it can be loaded from the browser cache and appears immediately.

Caching policies

1. Cache Storage Policy

The Cache-Control response header directives public, private, no-cache, max-age, and no-store determine whether the client may cache an HTTP response. The first four allow the response to be stored locally; no-store forbids the client from storing any of the response. Note that no-cache does not mean "do not cache": the response may be stored, but it must be revalidated with the origin server before each reuse. Even after a response has been stored locally, the browser does not use it directly; it first determines whether the cached copy has expired (see below).
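The storage rules above can be sketched as a small decision function. This is a simplified illustration, not a complete implementation of the HTTP caching rules: `parse_cache_control` and `storage_decision` are hypothetical helper names, and only the directives mentioned in this section are considered.

```python
def parse_cache_control(header_value):
    """Parse a Cache-Control value into a dict of lowercase directive names."""
    directives = {}
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        # Directives without a value (e.g. no-store) map to None.
        directives[name.strip().lower()] = value.strip().strip('"') or None
    return directives


def storage_decision(header_value, shared_cache=False):
    """Describe what a cache may do with a response carrying this header."""
    directives = parse_cache_control(header_value)
    if "no-store" in directives:
        return "do not store"
    if shared_cache and "private" in directives:
        # private responses are meant for a single user's browser cache only.
        return "do not store"
    if "no-cache" in directives:
        # Stored, but never reused without checking with the origin server.
        return "store, revalidate before every reuse"
    return "store, reuse while fresh"
```

For example, `storage_decision("no-cache")` returns `"store, revalidate before every reuse"`, which reflects the point that no-cache still allows the response to be stored.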

2. Cache Expiration Policy

Cache-Control and Expires indicate how long the current resource remains valid, and control whether the client uses the data in the browser cache directly or sends a new request to the server. The client uses these fields to decide whether locally cached data has expired (by comparing max-age with the age of the response, or Expires with Date), and therefore whether to request the resource again. Note that Cache-Control max-age takes precedence over Expires: when both are present, the former overrides the latter. If Cache-Control is absent, the client checks for Expires; if Expires is also absent, a common heuristic sets the cache lifetime to the Date value minus the Last-Modified value, divided by 10. An expired entry is not useless: expiry only tells the client not to use the local copy without checking. The client sends a request to the origin server for confirmation, and if the file has not been modified, the cached copy continues to be used (see below).
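The precedence described above (max-age, then Expires, then the Last-Modified heuristic) can be sketched as follows. The function names are hypothetical, headers are assumed to be passed as a dictionary with lowercase names, and malformed dates are not handled; real caches apply further rules that are omitted here.

```python
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers):
    """Return the freshness lifetime in seconds, or None if none can be derived."""
    # 1. Cache-Control: max-age takes precedence over everything else.
    for part in headers.get("cache-control", "").split(","):
        name, _, value = part.strip().partition("=")
        if name.strip().lower() == "max-age" and value.strip().isdigit():
            return int(value.strip())
    date = headers.get("date")
    # 2. Otherwise use Expires minus Date.
    if date and "expires" in headers:
        delta = parsedate_to_datetime(headers["expires"]) - parsedate_to_datetime(date)
        return max(0, int(delta.total_seconds()))
    # 3. Otherwise fall back to the heuristic: (Date - Last-Modified) / 10.
    if date and "last-modified" in headers:
        delta = parsedate_to_datetime(date) - parsedate_to_datetime(headers["last-modified"])
        return max(0, int(delta.total_seconds() / 10))
    return None


def is_fresh(headers, current_age_seconds):
    """A stored response is fresh while its age is below its freshness lifetime."""
    lifetime = freshness_lifetime(headers)
    return lifetime is not None and current_age_seconds < lifetime
```

With a Date of `Wed, 21 Oct 2015 07:28:00 GMT` and a Last-Modified exactly one day earlier, the heuristic yields 86400 / 10 = 8640 seconds.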

3. Cache Check Policy

After the client detects that the cached data has expired, or the user refreshes the page, the browser sends an HTTP request to the origin server. The server does not return the full response right away; it first uses Last-Modified and ETag to determine whether the resource has changed.

Last-Modified / If-Modified-Since

Last-Modified: a weak cache validator that indicates when the resource in this response was last modified. When the Web server responds to a request, it tells the browser the resource's last modification time.

If-Modified-Since: when a cached resource has expired and its response carried a Last-Modified header, the browser requests the resource again with an If-Modified-Since header set to that Last-Modified value. When the Web server receives the request, it compares the If-Modified-Since value with the last modification time of the requested resource. If the last modification time is newer, the resource has changed, and the server responds with HTTP 200 and the entire resource in the response body. If the last modification time is not newer, the resource has not changed, and the server responds with HTTP 304 (no body, which saves bandwidth), telling the browser to continue using its cached copy.
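The server-side comparison can be sketched like this. It is a minimal illustration under stated assumptions: `respond_if_modified_since` is a hypothetical function, not part of any framework, and it receives the request headers as a dictionary and the resource's modification time as a timezone-aware datetime.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond_if_modified_since(request_headers, last_modified, body):
    """Return (status, headers, body) for a GET with an optional If-Modified-Since."""
    # HTTP dates are expressed in GMT with one-second resolution.
    last_modified = last_modified.astimezone(timezone.utc).replace(microsecond=0)
    response_headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}

    since = request_headers.get("If-Modified-Since")
    if since is not None:
        try:
            since_dt = parsedate_to_datetime(since)
        except (TypeError, ValueError):
            since_dt = None  # unparseable date: treat the header as absent
        if since_dt is not None and since_dt.tzinfo is not None:
            if last_modified <= since_dt:
                # Not modified since the client's copy: no body is sent.
                return 304, response_headers, b""
    # Modified, or no usable validator: send the full resource.
    return 200, response_headers, body
```

A first request without the header gets 200 and a Last-Modified value; repeating the request with that value in If-Modified-Since gets 304 and an empty body.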

ETag / If-None-Match

ETag: a strong cache validator. When the Web server responds to a request, it tells the browser the identifier that uniquely identifies the current version of the resource on the server. In Apache, the ETag value has by default been derived from the file's inode number (INode), size, and last modified time (MTime); newer Apache versions use only the size and last modified time by default.

If-None-Match: when a cached resource has expired (as determined by the max-age in Cache-Control) and its response carried an ETag header, the browser requests the resource again with an If-None-Match header set to that ETag value. When the Web server receives the request, it compares the If-None-Match value with the current validator of the requested resource and decides whether to return 200 or 304.
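The ETag comparison can be sketched in the same way. The example assumes a content-hash ETag, which is only one possible scheme (servers are free to generate ETags however they like), and `make_etag` and `respond_if_none_match` are hypothetical names.

```python
import hashlib


def make_etag(body):
    """Build a quoted ETag from a hash of the content (one possible scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond_if_none_match(request_headers, body):
    """Return (status, headers, body) for a GET with an optional If-None-Match."""
    etag = make_etag(body)
    response_headers = {"ETag": etag}

    header = request_headers.get("If-None-Match")
    if header is not None:
        candidates = [c.strip() for c in header.split(",")]
        # If-None-Match uses weak comparison, so a W/ prefix is ignored.
        normalized = [c[2:] if c.startswith("W/") else c for c in candidates]
        if "*" in candidates or etag in normalized:
            # The client's copy matches the current version: no body is sent.
            return 304, response_headers, b""
    return 200, response_headers, body
```

If the content changes, the hash changes, the ETag sent by the client no longer matches, and the server returns 200 with the new body.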

Last-Modified and ETag

If Last-Modified is enough to let the browser know whether its cached copy is fresh, why is an ETag (entity tag) needed? The ETag was introduced in HTTP/1.1 mainly to solve some problems that are hard to solve with Last-Modified:

Last-Modified is only accurate to the second. If a file is modified several times within one second, Last-Modified cannot reflect each modification.

Some files are regenerated periodically. Their content may not change at all, but Last-Modified does, so the cached copy cannot be reused.

The server may not obtain the file modification time accurately, or its clock may be inconsistent with that of a proxy server.

An ETag is an identifier for a resource that is generated automatically by the server or by the developer, and it allows more precise cache control. Last-Modified and ETag can be used together. The server checks the ETag first. Some servers, when the ETag matches, go on to compare Last-Modified before deciding whether to return 304; under the HTTP specification (RFC 7232), a server ignores If-Modified-Since when the request contains If-None-Match.
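Two of the Last-Modified limitations discussed in this section can be illustrated with a short sketch. The timestamps and contents are made-up values, and the content-hash ETag is again only an assumed scheme.

```python
import hashlib
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def last_modified_header(mtime):
    """Format a modification time as an HTTP date (one-second resolution)."""
    utc = mtime.astimezone(timezone.utc).replace(microsecond=0)
    return format_datetime(utc, usegmt=True)


def content_etag(body):
    """Quoted ETag derived from the content (an assumed scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


# Case 1: two different versions written within the same second.
v1_time = datetime(2015, 10, 21, 7, 28, 0, 100000, tzinfo=timezone.utc)
v2_time = datetime(2015, 10, 21, 7, 28, 0, 900000, tzinfo=timezone.utc)
same_last_modified = last_modified_header(v1_time) == last_modified_header(v2_time)
different_etag = content_etag(b"version 1") != content_etag(b"version 2")

# Case 2: the file is regenerated an hour later with identical content.
v3_time = v2_time + timedelta(hours=1)
changed_last_modified = last_modified_header(v2_time) != last_modified_header(v3_time)
same_etag = content_etag(b"version 2") == content_etag(b"version 2")
```

In the first case Last-Modified cannot tell the two versions apart while the ETag can; in the second, Last-Modified changes although the content, and therefore the ETag, stays the same.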

Summary

Reusing previously fetched resources can significantly improve the performance of websites and applications. Web caching reduces latency and network traffic, which in turn shortens the time it takes to display a resource. With the help of HTTP caching, Web sites become more responsive.
