Thoroughly Understanding the Browser Caching Mechanism

Source: Internet
Author: User
Tags: browser cache

Overview

The browser caching mechanism is what we usually call the HTTP caching mechanism: it works on the basis of cache identifiers carried in HTTP messages. So before analyzing the browser caching mechanism, we first give a brief introduction to HTTP messages, of which there are two kinds: request messages and response messages. An HTTP request message has the format: request line, HTTP headers (general headers, request headers, entity headers), request message body (typically present only for requests such as POST).

An HTTP response message has the format: status line, HTTP headers (general headers, response headers, entity headers), response message body.

Note: general headers are the header fields supported by both request and response messages, namely Cache-Control, Connection, Date, Pragma, Transfer-Encoding, Upgrade, and Via. Entity headers describe the entity (the message body), namely Allow, Content-Base, Content-Encoding, Content-Language, Content-Length, Content-Location, Content-MD5, Content-Range, Content-Type, ETag, Expires, Last-Modified, and extension-header. Purely for ease of understanding, this article groups the general headers, the request/response headers, and the entity headers together as the HTTP headers.
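
To make the message layout concrete, here is a minimal sketch that splits a response into status line, headers, and body, and sorts header names into the groups listed in the note. The raw response text and the function names are made up for illustration; real HTTP parsing has more edge cases than this handles.

```python
# Hypothetical raw response, used only for illustration.
RAW_RESPONSE = (
    "HTTP/1.1 200 OK\r\n"
    "Date: Mon, 01 Jan 2024 00:00:00 GMT\r\n"
    "Cache-Control: max-age=600\r\n"
    "Content-Type: text/html\r\n"
    'ETag: "abc123"\r\n'
    "\r\n"
    "<html>hello</html>"
)

# Header groups as listed in the note (compared case-insensitively).
GENERAL_HEADERS = {
    "cache-control", "connection", "date", "pragma",
    "transfer-encoding", "upgrade", "via",
}
ENTITY_HEADERS = {
    "allow", "content-base", "content-encoding", "content-language",
    "content-length", "content-location", "content-md5", "content-range",
    "content-type", "etag", "expires", "last-modified",
}


def parse_response(raw: str):
    """Split a raw response into (status line, headers dict, body)."""
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        # Split at the first colon only; values such as dates contain colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


def classify(name: str) -> str:
    """Return the group a header name belongs to in this article's terms."""
    lowered = name.lower()
    if lowered in GENERAL_HEADERS:
        return "general"
    if lowered in ENTITY_HEADERS:
        return "entity"
    return "request/response"
```

Calling `parse_response(RAW_RESPONSE)` gives the three parts of the message, and `classify` shows that the cache-related fields discussed later (Cache-Control, Expires, ETag, Last-Modified) all live in the headers.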

We will not explain these concepts at length here; this is only a brief introduction, and interested readers can study them further on their own.

Caching Process Analysis

The browser and the server communicate in request-response mode: the browser sends an HTTP request and the server responds to it. When the browser makes a request for the first time and receives the result, it uses the cache identifiers in the HTTP headers of the response to decide whether to cache the result. If so, it stores both the result and the cache identifiers in the browser cache.

From this process we can see two things:

Each time the browser makes a request, it first looks in the browser cache for the result of that request and its cache identifiers.

Each time the browser receives a response, it stores the result and the cache identifiers in the browser cache.

These two points are the key to the browser caching mechanism: they ensure that every request both reads from and writes to the cache. Once we understand the rules by which the browser uses its cache, everything else falls into place, and the rest of this article analyzes those rules in detail. To make this easier to follow, we divide the caching process into two parts according to whether a new HTTP request has to be sent to the server: forced caching and negotiated caching.

Forced Caching

Forced caching is the process of looking up the request result in the browser cache and deciding, according to that result's caching rules, whether to use the cached result. There are three main cases (the negotiated caching process is not analyzed yet):

Neither the cached result nor the cache identifiers exist: the forced cache fails, and the request goes directly to the server (the same as a first request).

The cached result and cache identifiers exist, but the result has expired: the forced cache fails, and the negotiated cache is used (analyzed later).

The cached result and cache identifiers exist, and the result has not expired: the forced cache takes effect, and the cached result is returned directly.
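
The three cases can be sketched as a single lookup function. This is only an illustration: the `CacheEntry` type, the function name, and the returned labels are hypothetical, and expiry is simplified to a lifetime in seconds.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class CacheEntry:
    body: str         # the cached request result
    stored_at: float  # time (in seconds) at which the result was cached
    max_age: float    # how long the result stays valid, in seconds


def forced_cache_lookup(
    cache: Dict[str, CacheEntry], url: str, now: float
) -> Tuple[str, Optional[str]]:
    """Return (action, body) for the three forced-caching cases."""
    entry = cache.get(url)
    if entry is None:
        # Case 1: nothing cached, so go straight to the server.
        return ("request-server", None)
    if now - entry.stored_at >= entry.max_age:
        # Case 2: cached but expired, so fall back to the negotiated cache.
        return ("negotiate", None)
    # Case 3: cached and still valid, so use the result directly.
    return ("use-cache", entry.body)
```

Only the third case avoids contacting the server entirely, which is why forced caching is the cheapest path when it applies.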

So what are the caching rules that govern forced caching?

When the browser sends a request to the server, the server returns the caching rules in the HTTP headers of the response message, together with the request result. The fields that control forced caching are Expires and Cache-Control, and Cache-Control has higher priority than Expires.

Expires

Expires is an HTTP/1.0 field that controls web page caching. Its value is the time, returned by the server, at which the cached request result expires. When the request is made again, the cached result is used directly if the client's current time is earlier than the Expires value.
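
The comparison described here can be sketched in a few lines. The function name is hypothetical and the header value is just a sample date; `parsedate_to_datetime` from the standard library is used to read the HTTP date.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def fresh_by_expires(expires_header: str, client_now: datetime) -> bool:
    """True if the client's clock is still earlier than the Expires time."""
    expires_at = parsedate_to_datetime(expires_header)
    if expires_at.tzinfo is None:
        # Treat a date without zone information as GMT (an assumption here).
        expires_at = expires_at.replace(tzinfo=timezone.utc)
    return client_now < expires_at


# Sample value; any HTTP-date would do.
SAMPLE_EXPIRES = "Wed, 21 Oct 2015 07:28:00 GMT"
```

Note that the result depends entirely on the `client_now` argument: the same header gives a different answer when the client's clock reads a different time.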

Expires is an HTTP/1.0 field, but browsers now use HTTP/1.1 by default. So is page caching in HTTP/1.1 still controlled by Expires?

In HTTP/1.1, Expires has been superseded by Cache-Control. The reason is that Expires works by comparing the client's time with the time returned by the server, so if the two disagree for some reason (for example, a misconfigured time zone, or an inaccurate clock on either side), the forced cache fails outright and loses its purpose. So how does Cache-Control handle this?

Cache-Control

In HTTP/1.1, Cache-Control is the most important rule and is mainly used to control web page caching. Its main values are:

public: the content may be cached by everyone (both the client and proxy servers).

private: the content may be cached only by the client, not by shared caches such as proxy servers. It is often described as the Cache-Control default, although the HTTP specification defines no default directive.

no-cache: the client caches the content, but whether the cache may be used has to be verified through the negotiated cache.

no-store: the content is not cached at all, so neither the forced cache nor the negotiated cache is used.

max-age=xxx (xxx is a number): the cached content expires after xxx seconds.
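
As a rough illustration of how these values are read, the sketch below parses a Cache-Control header into directives and summarizes the resulting behavior. The function names are hypothetical, and splitting on commas is a simplification that ignores quoted values containing commas.

```python
def parse_cache_control(value: str) -> dict:
    """Parse a Cache-Control value into {directive: argument or True}."""
    directives = {}
    for part in value.split(","):  # simplification: no quoted commas
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else True
    return directives


def describe(directives: dict) -> str:
    """Summarize the caching behavior, checking the strictest rule first."""
    if "no-store" in directives:
        return "do not cache at all"
    if "no-cache" in directives:
        return "cache, but revalidate with the server before each use"
    if "max-age" in directives:
        return f"fresh for {int(directives['max-age'])} seconds"
    return "no explicit freshness rule"
```

For example, `describe(parse_cache_control("public, max-age=600"))` reports a 600-second lifetime, while a header containing `no-store` wins over anything else in the same header.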

Next, consider an example response that carries both an Expires header and Cache-Control: max-age=600.

From this example we can see:

The Expires time in the HTTP response message is an absolute value.

The Cache-Control value in the HTTP response message is max-age=600, which is a relative value.

Because Cache-Control has higher priority than Expires, caching follows the Cache-Control value: if the request is made again within 600 seconds, the cached result is used directly and the forced cache takes effect.
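
The precedence rule can be sketched as follows. The function name and the header dictionary (with lowercase keys) are assumptions made for this example, not part of any browser API.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def freshness_lifetime(headers: dict, response_time: datetime) -> Optional[float]:
    """Seconds the response stays fresh, or None if no rule is present.

    max-age (relative) is checked first; Expires (absolute) is only
    consulted when max-age is absent.
    """
    for part in headers.get("cache-control", "").split(","):
        name, sep, arg = part.strip().partition("=")
        if name.lower() == "max-age" and sep:
            return float(arg)
    expires = headers.get("expires")
    if expires:
        expires_at = parsedate_to_datetime(expires)
        if expires_at.tzinfo is None:
            expires_at = expires_at.replace(tzinfo=timezone.utc)
        return (expires_at - response_time).total_seconds()
    return None
```

With both headers present, the function returns 600 seconds regardless of what Expires says; Expires only matters for responses that carry no max-age.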

Note: when we cannot be sure that the client's time is synchronized with the server's time, Cache-Control is a better choice than Expires, so when both are present only Cache-Control takes effect.

Now that we understand the forced caching process, let us take the question further:

Where does the browser store its cache, and how can we tell in the browser whether the forced cache took effect?


Taking requests to my blog as an example: in the browser's network panel, a request whose status code is shown in gray used the forced cache, and its Size value indicates where the cache lives, either from memory cache or from disk cache.

So what do from memory cache and from disk cache mean? When is from disk cache used, and when is from memory cache used?

From memory cache means the cache in memory was used, and from disk cache means the cache on the hard disk was used. The browser reads caches in the order memory -> disk.

I have stated the conclusion directly, but I suspect many readers will not follow it yet, so let us analyze how the cache is read in detail, again using my blog as the example:
Visit https://heyingye.github.io/ -> 200 -> close the blog tab -> reopen https://heyingye.github.io/ -> (from disk cache) -> refresh -> (from memory cache)

The process is as follows:

Visit https://heyingye.github.io/

Close the blog's tab

Reopen https://heyingye.github.io/

Refresh

At this point some readers may ask: in the last step, the refresh, don't from disk cache and from memory cache both appear at the same time?

To answer this, we need to understand the memory cache (from memory cache) and the hard disk cache (from disk cache), as follows:

Memory cache (from memory cache): the memory cache has two characteristics, fast reads and a limited lifetime:

Fast reads: the memory cache stores the compiled and parsed file directly in the process's memory, occupying some of the process's memory resources so that the file can be read quickly the next time it is used.

Limited lifetime: once the process is closed, the process's memory is cleared.

Hard disk cache (from disk cache): the hard disk cache writes the cache directly to a file on the hard disk. Reading it requires I/O on that file and then reparsing the cached content, so reads are more complex and slower than from the memory cache.
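
The read order memory -> disk and the visit sequence above can be modeled with a toy two-tier cache. Everything here is a simplification for illustration: the class and method names are hypothetical, the "disk" is just a second dictionary, and copying a disk hit into memory is an assumption of this sketch rather than documented browser behavior.

```python
class TwoTierCache:
    """Toy model: a memory tier that is lost on close, plus a disk tier."""

    def __init__(self) -> None:
        self.memory: dict = {}  # cleared when the process is closed
        self.disk: dict = {}    # stand-in for files on the hard disk

    def store(self, url: str, body: str) -> None:
        """Cache a freshly fetched result in both tiers."""
        self.memory[url] = body
        self.disk[url] = body

    def close_process(self) -> None:
        """Closing the process empties the memory tier only."""
        self.memory.clear()

    def lookup(self, url: str):
        """Read in the order memory -> disk; return (source, body)."""
        if url in self.memory:
            return "from memory cache", self.memory[url]
        if url in self.disk:
            body = self.disk[url]
            self.memory[url] = body  # assumption: keep it in memory after loading
            return "from disk cache", body
        return "miss", None
```

Storing a resource, calling `close_process()`, and then looking it up twice reproduces the sequence from the blog example: the first lookup is served from disk, the second from memory.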

In the browser, files such as JS and images are placed directly into the memory cache once they have been parsed, so refreshing the page only needs to read them from memory (from memory cache). CSS files, by contrast, are stored in files on the hard disk, so every page render has to read them from the hard disk cache (from disk cache).

Negotiated Caching

Negotiated caching is the process in which, after the forced cache has expired, the browser sends a request to the server carrying the cache identifiers, and the server decides, based on those identifiers, whether the cache may be used. There are two main cases:

The negotiated cache takes effect: the server returns 304.

The negotiated cache fails: the server returns 200 together with the requested result.

Similarly, the negotiated-cache identifiers are returned to the browser in the HTTP headers of the response message, together with the request result. The fields that control negotiated caching are Last-Modified/If-Modified-Since and ETag/If-None-Match, and ETag/If-None-Match has higher priority than Last-Modified/If-Modified-Since.

Last-Modified/If-Modified-Since

Last-Modified is a response header: when the server responds to a request, it returns the time at which the resource file was last modified on the server.

If-Modified-Since is a request header: when the client makes the request again, it carries the Last-Modified value returned by the previous response, telling the server when the resource was last modified as of that response. When the server receives the request and finds the If-Modified-Since header, it compares that value with the resource's last modification time on the server. If the server's last modification time is later than the If-Modified-Since value, the server returns the resource again with status code 200; otherwise it returns 304, meaning the resource has not changed and the cached file can continue to be used.
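
The server-side comparison can be sketched like this. The function name is hypothetical and only the status code is returned; a real server would also send the headers and body.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def check_if_modified_since(
    if_modified_since: Optional[str], last_modified: datetime
) -> int:
    """Return 200 if the resource changed after the given time, else 304."""
    if if_modified_since is None:
        return 200  # no validator sent, so return the full resource
    since = parsedate_to_datetime(if_modified_since)
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop sub-second precision.
    if last_modified.replace(microsecond=0) > since:
        return 200
    return 304
```

Because the comparison works at one-second resolution, a change made within the same second as the previous response would go unnoticed in this scheme.
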

ETag/If-None-Match

ETag is a response header: when the server responds to a request, it returns a unique identifier (generated by the server) for the current resource file.

If-None-Match is a request header: when the client makes the request again, it carries the ETag value returned by the previous response, telling the server the unique identifier the resource had as of that response. When the server receives the request and finds the If-None-Match header, it compares that value with the ETag of the resource on the server. If they match, the server returns 304, meaning the resource has not changed and the cached file can continue to be used. If they do not match, the server returns the resource file again with status code 200.
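
A minimal sketch of this exchange is shown below. Deriving the ETag from a hash of the content is an assumption made for the example (servers are free to generate ETags in other ways), and the function names are hypothetical.

```python
import hashlib
from typing import Optional, Tuple


def make_etag(body: bytes) -> str:
    """Assumed scheme: a quoted, truncated SHA-256 of the content."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, str, bytes]:
    """Return (status, ETag, body) for the current resource content."""
    etag = make_etag(body)
    if if_none_match == etag:
        # Identifiers match: resource unchanged, send no body.
        return 304, etag, b""
    # No validator, or the resource changed: send the full resource.
    return 200, etag, body
```

The first request, sent without If-None-Match, gets a 200 and an ETag. Repeating the request with that ETag yields 304 as long as the content is unchanged, and 200 with a new ETag once it changes.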

Note: ETag/If-None-Match has higher priority than Last-Modified/If-Modified-Since; when both are present, only ETag/If-None-Match takes effect.

Summary

Forced caching takes precedence over negotiated caching. If the forced cache (Expires and Cache-Control) takes effect, the cache is used directly. If it does not, the negotiated cache (Last-Modified/If-Modified-Since and ETag/If-None-Match) is used, and the server decides whether the cache may be used. If the negotiated cache fails, the cached result for that request is invalid: the server returns 200, and the new request result is fetched and stored in the browser cache. If it takes effect, the server returns 304 and the browser continues to use the cache.
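
The whole flow can be tied together in one sketch. All names here are hypothetical, time is a plain number of seconds, the server is a stand-in function, and only ETag is used as the validator to keep the example short.

```python
def fake_server(url: str, request_headers: dict):
    """Stand-in server: one fixed resource version with a 600-second lifetime."""
    current_etag = '"v1"'
    headers = {"max_age": 600, "etag": current_etag}
    if request_headers.get("If-None-Match") == current_etag:
        return 304, headers, ""
    return 200, headers, "payload"


def fetch(url: str, cache: dict, now: float, server=fake_server):
    """Return (source, body): forced cache first, then negotiation."""
    entry = cache.get(url)
    # 1. Forced cache: still fresh, so no request is sent at all.
    if entry is not None and now < entry["fresh_until"]:
        return "forced cache", entry["body"]
    # 2. Negotiated cache: send the stored identifier to the server.
    request_headers = {}
    if entry is not None:
        request_headers["If-None-Match"] = entry["etag"]
    status, headers, body = server(url, request_headers)
    if status == 304 and entry is not None:
        # Server confirmed the cached copy; refresh its lifetime and reuse it.
        entry["fresh_until"] = now + headers["max_age"]
        return "negotiated cache (304)", entry["body"]
    # 3. Cache miss or changed resource: store the new result and identifier.
    cache[url] = {
        "body": body,
        "etag": headers["etag"],
        "fresh_until": now + headers["max_age"],
    }
    return "server (200)", body
```

Starting from an empty cache, a call at time 0 goes to the server, a call at time 100 is answered by the forced cache, and a call at time 700 (after the 600-second lifetime) is revalidated and answered with 304.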

