Thoroughly understand the HTTP caching mechanism and principles

Source: Internet
Author: User
Preface

HTTP caching is an important means of web performance optimization. For anyone engaged in web development it is a basic part of the knowledge system, and for anyone aspiring to become a front-end architect it is a required skill.
Many front-end developers, however, know only that the browser caches the static files it requests; why the files are cached and how the cache takes effect is less clear to them.
Here I will try to introduce the HTTP caching mechanism systematically in simple, clear language, in the hope of helping readers understand front-end caching correctly.
Before introducing HTTP caching, here is a brief introduction to HTTP messages as background.

An HTTP message is the block of data exchanged when a browser communicates with a server.
When the browser requests data from the server it sends a request message; when the server returns data to the browser it sends a response message.
A message consists of two main parts:
1. The header, which carries attributes: additional information such as cookies and cache information. The cache-related rules are contained in the header.
2. The body, which carries the data that the HTTP request actually wants to transmit.

Cache rule parsing

To make this easier to understand, assume that the browser has a cache database for storing cached information.
When the client requests data for the first time, the cache database holds no corresponding entry, so the client must request the server; once the server responds, the data is stored in the cache database.
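The first-request behavior just described can be sketched in a few lines of Python. This is only an illustration: `CacheDatabase` and `fake_server` are hypothetical names, the "server" is a local function rather than a real network call, and no expiry rules are applied yet.

```python
from typing import Callable, Dict


class CacheDatabase:
    """Toy stand-in for the browser's cache database, keyed by URL."""

    def __init__(self, fetch_from_server: Callable[[str], bytes]) -> None:
        # fetch_from_server is a hypothetical callable standing in for the network.
        self._fetch_from_server = fetch_from_server
        self._entries: Dict[str, bytes] = {}
        self.server_requests = 0

    def get(self, url: str) -> bytes:
        if url not in self._entries:
            # First request: nothing is cached, so ask the server and store the result.
            self.server_requests += 1
            self._entries[url] = self._fetch_from_server(url)
        return self._entries[url]


def fake_server(url: str) -> bytes:
    """Pretend origin server that returns a body for any URL."""
    return f"body of {url}".encode()


cache = CacheDatabase(fake_server)
first = cache.get("/app.js")   # goes to the server and fills the cache
second = cache.get("/app.js")  # served from the cache database
print(cache.server_requests)   # 1
```

A real browser cache also stores the response headers alongside the body, because the caching rules discussed in the rest of the article are read from them.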

HTTP caching has many rules. Based on whether a request must be sent to the server again, I divide them into two categories: forced caching and comparison caching.
Before introducing the two categories in detail, let's get a simple picture of each through a sequence diagram.

When cached data already exists and only forced caching applies, the request proceeds as follows:

When cached data already exists and only comparison caching applies, the request proceeds as follows:

Readers who are not familiar with the caching mechanism may ask: with comparison caching, a request has to be sent to the server whether or not the cache ends up being used, so what is the cache for?
Let's set this question aside for now; the answer will become clear once each caching rule has been described in detail.

We can see the difference between the two categories: when forced caching takes effect, no interaction with the server is needed, whereas comparison caching requires interaction with the server whether or not it takes effect.
The two categories can coexist, and forced caching takes priority over comparison caching: when the forced caching rule takes effect, the cache is used directly and the comparison caching rule is not executed.

Forced caching

As described above, forced caching lets the browser use cached data directly as long as that data has not expired. So how does the browser determine whether cached data has expired?
When there is no cached data and the browser requests it from the server, the server returns the data together with the cache rules, and the cache rule information is contained in the response header.

For forced caching, two response header fields specify the expiry rule: Expires and Cache-Control.
In the Chrome developer tools, the entry for a network request makes it easy to see when forced caching has taken effect.

Expires

The value of Expires is an expiry time returned by the server: if the next request is made before that time, the cached data is used directly.
However, Expires dates from HTTP 1.0. Browsers now use HTTP 1.1 by default, where Cache-Control takes precedence whenever both are present, so Expires mostly serves as a fallback.
Another problem is that the expiry time is generated by the server, while the client's clock may differ from the server's, which can cause cache hits to be misjudged.
So HTTP 1.1 uses Cache-Control instead.

Cache-Control

Cache-Control is the most important rule. Common values are private, public, no-cache, max-age, and no-store. For a browser, a response that specifies neither private nor public behaves as if it were private.
private: only the client may cache the response.
public: both the client and proxy servers may cache the response (front-end developers can treat public and private as equivalent).
max-age=xxx: the cached content expires after xxx seconds.
no-cache: the cached data must be validated with comparison caching before it is used (described later).
no-store: nothing is cached, so neither forced caching nor comparison caching is triggered (for front-end development, the more caching the better, so this one is rarely wanted).
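To make the directives above concrete, here is a minimal Python sketch that splits a Cache-Control value into its directives and summarizes what they mean for a cache. The function names `parse_cache_control` and `describe` are my own, and the parser is deliberately simplified: it assumes no directive value contains a comma inside quotes.

```python
from typing import Dict, Optional


def parse_cache_control(header_value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: argument or None}.

    Simplification: splitting on "," would break a quoted value that
    itself contains a comma; a full parser has to handle that case.
    """
    directives: Dict[str, Optional[str]] = {}
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') if sep else None
    return directives


def describe(header_value: str) -> str:
    """Summarize the effect of a Cache-Control value, checking the strictest rule first."""
    directives = parse_cache_control(header_value)
    if "no-store" in directives:
        return "do not store"
    if "no-cache" in directives:
        return "store, but validate with the server before every use"
    if directives.get("max-age") is not None:
        return f"fresh for {int(directives['max-age'])} seconds"
    return "no explicit lifetime"


print(parse_cache_control("public, max-age=300"))
print(describe("max-age=31536000"))  # fresh for 31536000 seconds
print(describe("no-cache"))          # store, but validate with the server before every use
```

Checking no-store before no-cache, and both before max-age, mirrors the idea that the most restrictive directive decides what the cache may do.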

As an example:

In the figure, Cache-Control specifies only max-age, so the response is treated as private, and the cache lifetime is 31,536,000 seconds (365 days).
That is, if this data is requested again within 365 days, it is taken straight from the cache database and used directly.

Comparison caching

Comparison caching, as the name suggests, requires a comparison to determine whether the cache can be used.
When the browser requests data for the first time, the server returns a cache identifier along with the data, and the client stores both in the cache database.
When the data is requested again, the client sends the stored cache identifier to the server, which uses it to make a judgment. If the judgment succeeds, the server returns a 304 status code to tell the client that the comparison succeeded and the cached data can be used.

First visit:

Visit again:

Comparing the two screenshots, we can clearly see that when comparison caching takes effect the status code is 304, and both the message size and the request time are greatly reduced.
The reason is that after comparing the identifiers the server returns only the header, using the status code to tell the client to use its cache, and no longer needs to send the message body.

For comparison caching, the key thing to understand is how the cache identifier is passed, which happens between the request header and the response header.
There are two kinds of identifier, which we introduce in turn.

Last-Modified / If-Modified-Since

Last-Modified:
When the server responds to a request, this field tells the browser when the resource was last modified.

If-Modified-Since:
When the server is requested again, this field tells the server the last modification time that it returned with the previous response.
When the server receives a request containing If-Modified-Since, it compares the value with the last modification time of the requested resource.
If the resource's last modification time is later than If-Modified-Since, the resource has been changed, so the server responds with the full resource content and status code 200.
If the resource's last modification time is earlier than or equal to If-Modified-Since, the resource has not been modified, so the server responds with HTTP 304, telling the browser to continue using its saved cache.
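The server-side comparison just described might look roughly like this in Python. It is a sketch under stated assumptions, not how any particular server implements it: `respond_if_modified_since` is a hypothetical function, the date used is arbitrary, and the resource's modification time is assumed to be a UTC datetime.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Dict, Optional, Tuple


def respond_if_modified_since(
    last_modified: datetime,
    if_modified_since: Optional[str],
    body: bytes,
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a GET, honoring If-Modified-Since."""
    # HTTP dates have one-second resolution, so drop the microseconds.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    if if_modified_since is not None:
        try:
            client_time = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            client_time = None  # unparseable date: ignore the header
        if client_time is not None:
            if client_time.tzinfo is None:
                client_time = client_time.replace(tzinfo=timezone.utc)
            if last_modified <= client_time:
                # Not modified since the client's copy: send headers only.
                return 304, headers, b""
    return 200, headers, body


page = b"<html>...</html>"
modified = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)  # arbitrary example time

status_first, headers_first, _ = respond_if_modified_since(modified, None, page)
status_again, _, body_again = respond_if_modified_since(
    modified, headers_first["Last-Modified"], page
)
print(status_first, status_again)  # 200 304
```

On the second call the client echoes back the Last-Modified value it received, the times are equal, and the server answers 304 with an empty body.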

ETag / If-None-Match (takes priority over Last-Modified / If-Modified-Since)

ETag:
When the server responds to a request, this field tells the browser the unique identifier of the current resource on the server (the generation rule is decided by the server).

If-None-Match:
When the server is requested again, this field tells the server the unique identifier of the data cached on the client.
When the server receives a request containing If-None-Match, it compares the value with the unique identifier of the requested resource.
If they differ, the resource has been changed, so the server responds with the full resource content and status code 200.
If they are the same, the resource has not been modified, so the server responds with HTTP 304, telling the browser to continue using its saved cache.
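Here is a small Python sketch of the identifier comparison. Since the generation rule is up to the server, the choice of an MD5 digest of the body is purely an assumption for illustration, and `make_etag` and `respond_if_none_match` are hypothetical names. Weak validators (the `W/` prefix) are ignored to keep the example short.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Assumed generation rule: a quoted MD5 hex digest of the body."""
    return '"' + hashlib.md5(body).hexdigest() + '"'


def respond_if_none_match(
    body: bytes, if_none_match: Optional[str]
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body), honoring If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match is not None:
        # The header may list several identifiers separated by commas.
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if "*" in candidates or etag in candidates:
            # Same identifier: the client's copy is still valid.
            return 304, headers, b""
    return 200, headers, body


css_v1 = b"body { color: red }"
css_v2 = b"body { color: blue }"

status, headers, _ = respond_if_none_match(css_v1, None)
status_same, _, _ = respond_if_none_match(css_v1, headers["ETag"])
status_changed, _, _ = respond_if_none_match(css_v2, headers["ETag"])
print(status, status_same, status_changed)  # 200 304 200
```

Because the identifier is derived from the content, any change to the file produces a different value and the comparison fails, which is what triggers the full 200 response.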

Summary

For forced caching, the server tells the browser a cache lifetime. Within that lifetime, the next request uses the cache directly; outside it, the comparison caching strategy is applied.
For comparison caching, the ETag and Last-Modified values in the cached information are sent to the server with the request for validation; when the server returns a 304 status code, the browser uses the cache directly.
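Putting the two rules together, the browser's decision could be sketched as follows. This is a simplified model rather than a description of any real browser: `CacheEntry`, `plan_request`, and `after_response` are hypothetical names, and only max-age is considered for forced caching.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class CacheEntry:
    stored_at: float                     # client clock, in seconds, when the response was stored
    max_age: Optional[int] = None        # from Cache-Control: max-age, if present
    etag: Optional[str] = None           # from the ETag response header
    last_modified: Optional[str] = None  # from the Last-Modified response header


def plan_request(entry: Optional[CacheEntry], now: float) -> Tuple[str, Dict[str, str]]:
    """Decide what to do for a URL: returns (action, request headers to add)."""
    if entry is None:
        return "fetch", {}
    # Forced caching has priority: while the entry is fresh, skip the server entirely.
    if entry.max_age is not None and now - entry.stored_at < entry.max_age:
        return "use-cache", {}
    # Otherwise fall back to comparison caching, if we have an identifier to send.
    headers: Dict[str, str] = {}
    if entry.etag:
        headers["If-None-Match"] = entry.etag
    if entry.last_modified:
        headers["If-Modified-Since"] = entry.last_modified
    return ("revalidate" if headers else "fetch"), headers


def after_response(status: int) -> str:
    """304 means the cached copy is still good; anything else replaces it."""
    return "use-cache" if status == 304 else "store-new-response"


entry = CacheEntry(stored_at=0.0, max_age=300, etag='"v1"')
print(plan_request(entry, now=100.0))  # ('use-cache', {})
print(plan_request(entry, now=400.0))  # ('revalidate', {'If-None-Match': '"v1"'})
print(after_response(304))             # use-cache
```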

When the browser makes its first request:

When the browser requests again:

If there are mistakes in this article, I hope readers will bear with me and point them out so that I can correct them.

HTTP cache-related concepts

HTTP request header and response header information

Request headers: the data the browser sends to the server, such as which resource it is requesting.
Response headers: the data the server sends back to the browser, telling the browser who the server is and what it wants the browser to do. For example: I am nginx; the resource I am giving you is correct (200) or missing (404); I want you to cache it for this long.

Common request headers:

Accept: text/html,image/*                     Content types the browser can accept
Accept-Charset: ISO-8859-1                    Character encodings the browser can accept
Accept-Encoding: gzip,compress                Compression encodings the browser can accept
Accept-Language: en-US,zh-CN                  Languages and regions the browser can accept
Host: www.lks.cn:80                           Host and port the browser is requesting
If-Modified-Since: Tue, 18:23:51 GMT          Time at which the page was cached
Referer: http://www.lks.cn/index.html         Page the request comes from
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)    Browser information
Cookie:                                       Information stored earlier at the server's request, sent back by the browser
Connection: close (1.0) / keep-alive (1.1)    Connection behavior of the HTTP version in use
Date: Tue 18:23:51 GMT                        Time of the request
Allow: GET                                    Request method; GET is common, as is POST
Keep-Alive: 5                                 How long the connection is kept open, here 5
Connection: keep-alive                        Whether the connection is persistent
Cache-Control: max-age=300                    Maximum cache time of 300 s
Common response headers:
Location: http://www.lks.cn/index.html        Controls which page the browser displays
Server: Apache / nginx                        Type of server
Content-Encoding: gzip                        Compression encoding used by the server
Content-Length: 80                            Length in bytes of the content the server sends
Content-Language: zh-CN                       Language and region of the content the server sends
Content-Type: image/jpeg; charset=UTF-8       Type and character encoding of the content the server sends
Last-Modified: Tue, June 18:23:51 GMT         Time the resource was last modified on the server
Refresh: 1;url=http://www.lks.cn              Tells the browser to go to the given URL after 1 second
Content-Disposition: attachment; filename=lks.jpg    Tells the browser to open the file as a download
Transfer-Encoding: chunked                    The server sends the data to the client in chunks
Set-Cookie: SS=Q0=5LB_NQ; Path=/search        Cookie information sent by the server
Expires: -1                                   Expiry time of the resource, used by the browser to cache data; an invalid date such as -1 is treated as already expired
Cache-Control: no-cache                       Tells the browser that it must revalidate with the server whether or not it has cached data
Pragma: no-cache                              The server tells the browser not to cache the page
Connection: close / keep-alive                Connection behavior of the HTTP version in use
Date: Tue, 18:23:51 GMT                       Time of the response
ETag: "ihfdgkdgnp98 HDFG"                     Identifier of the resource entity (a unique identifier, similar to an MD5 value: when the file is modified, the value changes)
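To see where these headers sit in an actual message, the following Python sketch splits a raw response into status line, headers, and body, then picks out the cache-related headers. The response bytes are made up for the example, and `split_response` and `CACHE_HEADERS` are hypothetical names; real HTTP parsing has more edge cases (repeated headers, for instance) than this handles.

```python
from typing import Dict, Tuple

# Header names (lowercased) that carry caching rules in a response.
CACHE_HEADERS = ("cache-control", "expires", "pragma", "last-modified", "etag")


def split_response(raw: bytes) -> Tuple[str, Dict[str, str], bytes]:
    """Split a raw HTTP response into (status line, headers, body)."""
    # The header section ends at the first blank line.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    status_line, header_lines = lines[0], lines[1:]
    headers: Dict[str, str] = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


# A made-up response, for illustration only.
raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Cache-Control: max-age=300\r\n"
    b'ETag: "abc123"\r\n'
    b"\r\n"
    b"<html></html>"
)

status_line, headers, body = split_response(raw)
cache_rules = {name: value for name, value in headers.items() if name in CACHE_HEADERS}
print(status_line)
print(cache_rules)
```

Header names are lowercased before the lookup because HTTP header names are case-insensitive.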
Explanation of caching-related headers:

Expires

A GMT time that tells the browser it can trust and use the corresponding cached copy until that date. The drawback is that the client's date may be inaccurate, which can cause the cache to be misjudged.

Pragma: no-cache

This is a general header from HTTP 1.0. It acts the same as Cache-Control: no-cache in HTTP 1.1.

Last-Modified

A GMT time that gives the last modification time of the requested entity. It is used to verify that the browser's cached copy can still be trusted. Two conditional request headers are associated with it:

If-Modified-Since
This is the more common of the two; it is meaningful only with the GET method (and HEAD). If the entity has not been modified since the specified time, the server returns 304; otherwise it returns a regular GET response (for example, 200). Returning 304 for an unmodified static file is a good thing, because the browser only goes back to the server to check for changes and does not download the data again as it would with a 200.

If-Unmodified-Since:
If the entity has not been modified since the specified time, the request is carried out; if it has been modified, the server returns 412 Precondition Failed and the action of the method is abandoned (this is typically used with methods other than GET).

Cache-Control (general header in HTTP 1.1)

public
Appears only in response headers. It tells the browser that the response may be cached unconditionally.

private
Appears only in response headers. It tells the browser to cache the response only for a single user. A field name can also be specified, for example private="username".

no-cache
a) In a request header: tells caches to go back to the server to fetch the data and validate their cached copy (if any).
b) In a response header: tells the browser that it must revalidate with the server whether or not it has cached data. If the server confirms that nothing has changed, the cached data can be used.

no-store
Tells the browser not to cache the response under any circumstances.

max-age
a) In a request header: makes the cache that answers the request check its copy against this value. The cache compares the age of its copy with max-age; if the age exceeds max-age, validation with the server is forced, to ensure that a fresh response is returned. Its function is essentially the same as the traditional Expires. The difference is that Expires is based on a specific date, so once the cache's own clock is inaccurate the result may be wrong, whereas max-age does not have this problem.
max-age also takes priority over Expires.
b) In a response header: same as above.
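The difference between a fixed date and a relative lifetime can be shown with a short Python sketch. It assumes a client whose clock runs two hours ahead of the server's; the times, the skew, and the function names `fresh_by_expires` and `fresh_by_max_age` are all made up for illustration, and the age calculation is much simpler than what a real cache does.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def fresh_by_expires(expires_header: str, client_now: datetime) -> bool:
    """Expires: compare an absolute server-generated date with the client clock."""
    return client_now < parsedate_to_datetime(expires_header)


def fresh_by_max_age(max_age: int, stored_at: datetime, client_now: datetime) -> bool:
    """max-age: compare elapsed time measured entirely on the client clock."""
    return (client_now - stored_at).total_seconds() < max_age


server_now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)  # arbitrary example time
client_now = server_now + timedelta(hours=2)  # assumed skew: client clock is 2 hours fast

# The server intends the response to be cacheable for five minutes, expressed both ways.
expires_header = format_datetime(server_now + timedelta(minutes=5), usegmt=True)
max_age = 300

stored_at = client_now                               # the client stores the response now
one_minute_later = client_now + timedelta(minutes=1)

print(fresh_by_expires(expires_header, one_minute_later))              # False
print(fresh_by_max_age(max_age, stored_at, one_minute_later))          # True
```

One minute after storing the response, the Expires check already reports it as stale because the client clock is past the server's date, while the max-age check is unaffected since both of its timestamps come from the same clock.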
