Web caching: strong cache and negotiated cache

Source: Internet
Author: User
Tags http 200 browser cache

Many articles on the Internet cover Web caching; this one summarizes the main points.

Why use caching

Caching is generally used for static resources such as CSS, JS, and images, for the following reasons:

    • Faster requests: serving content from the local browser cache, or from the cache server (such as a CDN node) closest to the user, significantly speeds up page loading without affecting how the site behaves.

    • Bandwidth savings: cached files need less request bandwidth, and in some cases no network request at all.

    • Reduced server pressure: under a large number of concurrent user requests the server's capacity is limited. Placing static resources on multiple nodes in the network spreads the load and relieves the server.

Cache classification

Caches are divided into server-side caches (for example Nginx or Apache) and client-side caches (for example a Web browser).
A common server-side cache is the CDN cache; the client-side cache refers to the browser cache.

Browser caching mechanism in detail

1. Cache types

The browser cache is divided into the strong cache and the negotiated cache:
1 Strong cache: when the browser loads a resource, it uses certain HTTP headers stored with that resource to decide whether the strong cache is hit. On a hit, the browser reads the resource directly from its own cache and sends no request to the server. For example, if a page references a CSS file whose cache configuration produces a strong-cache hit, the browser loads the CSS straight from the cache and no request reaches the Web server.
2 Negotiated cache: when the strong cache is not hit, the browser must send a request to the server, and the server uses other HTTP headers of the resource to check whether the negotiated cache is hit. On a hit, the server answers 304 without the resource's data, telling the client that its cached copy is still valid, and the browser loads the resource from its own cache. On a miss, the server returns the resource (200) and the browser updates its local cache.

The difference between the two: a strong-cache hit sends no request to the server, whereas the negotiated cache always sends one.
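
The two-stage decision described above can be sketched in a few lines of Python. This is only an illustrative model, not how any particular browser is implemented: the `CachedEntry` class, the `load_resource` function, and the idea of passing the server's current ETag and body in as arguments are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class CachedEntry:
    body: bytes
    stored_at: float  # client clock (seconds) when the copy was stored
    max_age: float    # freshness lifetime in seconds
    etag: str         # validator returned by the server


def load_resource(entry, now, server_etag, server_body):
    """Return (source, body, request_sent) for one load of a cached resource."""
    # Strong cache: the copy is still fresh, so no request is sent at all.
    if now < entry.stored_at + entry.max_age:
        return "strong cache", entry.body, False

    # Negotiated cache: a request is sent and the server compares validators.
    if server_etag == entry.etag:
        # 304 Not Modified: no body comes back, the local copy is reused.
        entry.stored_at = now
        return "304 negotiated cache", entry.body, True

    # 200 OK: the resource changed, so the local copy is replaced.
    entry.body, entry.etag, entry.stored_at = server_body, server_etag, now
    return "200 from server", server_body, True
```

The third element of the result makes the difference explicit: it is `False` only on a strong-cache hit, where nothing is sent to the server.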

How to set the cache

1 Controlling the cache with an HTML meta tag (not defined by the HTTP protocol)
<meta http-equiv="Pragma" content="no-cache">
This tag tells the browser not to cache the current page, so every visit fetches it from the server again. It is simple to use, but only some browsers honor it, and caching proxies ignore it entirely because a proxy does not parse the HTML content.
2 Controlling the cache with HTTP headers
Caching is controlled through the headers Expires (strong cache), Cache-Control (strong cache), Last-Modified/If-Modified-Since (negotiated cache), and ETag/If-None-Match (negotiated cache), described in detail below.

1) Expires is an HTTP/1.0 header that gives the resource's expiration time. It is an absolute time returned by the server as a GMT-formatted string, such as: Expires: Thu, 2016 23:55:55 GMT

Condition for reading from the cache: current time (client's) < cache expiration time (server's)

Disadvantage: Expires is the older strong-cache header. Because the server returns an absolute time, a large difference between the client clock and the server clock (for example, unsynchronized clocks or a wrongly configured time zone) makes the result unreliable. For this reason HTTP/1.1 introduced Cache-Control: max-age=<seconds> as a replacement.
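
The clock-skew problem can be reproduced with Python's standard library. The sketch below assumes a server that intends a one-hour lifetime; the function name `is_fresh_by_expires` and the concrete timestamps are made up for illustration.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def is_fresh_by_expires(expires_header, client_now):
    """True while the client's clock has not reached the server's Expires time."""
    expires_at = parsedate_to_datetime(expires_header)
    return client_now < expires_at


server_now = datetime(2016, 9, 1, 12, 0, 0, tzinfo=timezone.utc)
# The server intends a one-hour lifetime and encodes it as an absolute time.
header = format_datetime(server_now + timedelta(hours=1), usegmt=True)

# A client whose clock agrees with the server sees a fresh copy.
in_sync = is_fresh_by_expires(header, server_now + timedelta(minutes=10))
# A client whose clock runs two hours fast treats the same copy as expired.
skewed = is_fresh_by_expires(header, server_now + timedelta(hours=2, minutes=10))

print(header, in_sync, skewed)
```

Both clients ask ten real minutes after the response was produced, yet they reach opposite conclusions, because the comparison mixes the server's clock with the client's.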
2) Cache-Control: max-age describes a relative time. Whether the cache is hit is judged entirely against the client's clock, so Cache-Control manages the cache more reliably than Expires.

Condition for reading from the cache: current time (client's) < time the response was cached (client's) + max-age

Cache-Control values include public, private, no-cache, no-store, no-transform, must-revalidate, proxy-revalidate, and max-age.

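The directives have the following meanings:

    • public: the response may be stored by any cache.

    • private: all or part of the response is intended for a single user and must not be stored by a shared cache. This lets the server mark part of a response as specific to the current user; that part is not valid for other users' requests.

    • no-cache: this does not mean "do not cache". A copy may be stored, but it must be confirmed (revalidated) with the server before it is used.

    • no-store: neither the request nor the response may be stored by any cache; nothing is kept at all.

    • max-age: the client accepts a response whose age is no greater than the specified time (in seconds). For example, with max-age=64200 the cached copy is usable while: client's current time < time the response was cached (client's) + 64200 s.

    • min-fresh: the client accepts a response that will remain fresh for at least the specified number of seconds from now.

    • max-stale: the client accepts a response that has passed its expiration time. If a value is given, the response may be stale by no more than that many seconds.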

Note: Expires and Cache-Control can be enabled individually or together. When both are present in the response headers, Cache-Control takes precedence over Expires.
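
As a rough illustration of these rules, the following Python sketch parses a Cache-Control value and decides whether a stored response may be served without contacting the server. The helper names `parse_cache_control` and `strong_cache_usable` are hypothetical, and the logic is deliberately simplified (for example, it ignores shared-cache directives and heuristic freshness).

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def parse_cache_control(value):
    """Split a Cache-Control value into a {directive: argument-or-None} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def strong_cache_usable(headers, stored_at, now):
    """True if the stored response may be used without asking the server."""
    directives = parse_cache_control(headers.get("Cache-Control", ""))
    # no-store forbids keeping a copy; no-cache requires revalidation first.
    if "no-store" in directives or "no-cache" in directives:
        return False
    # max-age takes precedence over Expires when both are present.
    if directives.get("max-age") is not None:
        return now < stored_at + timedelta(seconds=int(directives["max-age"]))
    if "Expires" in headers:
        return now < parsedate_to_datetime(headers["Expires"])
    return False
```

Because `max-age` is checked first, a response that carries both headers is judged by the relative lifetime and the Expires value is never consulted.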

3) Last-Modified/If-Modified-Since: this pair is used together with Cache-Control.

Last-Modified: the time the resource was last modified. The Web server sends it to the browser when it responds to a request.
If-Modified-Since: when a resource has expired (the strong cache misses) and the cached response carries a Last-Modified value, the browser requests the resource again and sends that value in the If-Modified-Since header. The Web server compares the header with the resource's current last-modified time. If the resource's modification time is newer, the resource has changed, and the server responds with HTTP 200 and the full content in the response body. If it is not newer, the resource is unchanged, and the server responds with HTTP 304 (no body, which saves bandwidth), telling the browser to keep using its cached copy.
Disadvantages:

    • Last-Modified is only accurate to the second. If a file is modified several times within one second, the header cannot reflect the latest change, and the updated file may not be delivered in time.

    • Some files are regenerated periodically with unchanged content; Last-Modified changes anyway and the cache cannot be used. The server may also fail to obtain the file's modification time accurately, or its time may be inconsistent with a proxy server's, which again prevents the cache from being used.
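
The server-side comparison, and the one-second granularity problem from the first bullet, can be sketched as follows. The function names `last_modified_header` and `respond_to_if_modified_since` are assumptions for this example; a real server would read the modification time from the file system.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def last_modified_header(mtime):
    """Format a modification time as an HTTP date; sub-second detail is dropped."""
    return format_datetime(mtime.replace(microsecond=0), usegmt=True)


def respond_to_if_modified_since(if_modified_since, mtime):
    """Return the status code a server would send for a conditional GET."""
    if if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since)
        if mtime.replace(microsecond=0) <= since:
            return 304  # unchanged: no body, the browser reuses its copy
    return 200  # changed (or no validator sent): send the full resource


first_write = datetime(2016, 9, 1, 12, 0, 0, 100000, tzinfo=timezone.utc)
validator = last_modified_header(first_write)

# A second write 0.8 s later falls into the same second and goes unnoticed.
second_write = first_write.replace(microsecond=900000)
print(respond_to_if_modified_since(validator, second_write))
```

The final call answers 304 even though the file changed, which is exactly the situation ETag was introduced to handle.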

ETag, introduced in HTTP/1.1, addresses these problems.

4) ETag/If-None-Match: this pair is also used together with Cache-Control.
ETag: when the Web server responds to a request, it tells the browser the identifier that uniquely identifies the current version of the resource on the server (the generation rule is up to the server). In older Apache releases, for example, the default ETag is derived from the file's inode number (INode), size, and last-modified time (MTime).
If-None-Match: when a resource has expired (according to the max-age given in Cache-Control) and the cached response carries an ETag, the browser requests the resource again and sends the ETag value in the If-None-Match header. The Web server compares the header with the resource's current ETag and decides whether to return 200 or 304.
An ETag is a server-side identifier for a resource version, generated automatically by the server or by the developer, and it allows more precise cache control than Last-Modified. When Last-Modified and ETag are used together, the server gives priority to the ETag.
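
A minimal sketch of ETag validation in Python is shown below. The generation rule used here (a truncated SHA-256 of the content) is just one possible choice and is not the rule any specific server uses; `make_etag` and `respond_to_if_none_match` are hypothetical names, and weak validators (the `W/` prefix) are not handled.

```python
import hashlib


def make_etag(body):
    """Derive an ETag from the content (a hypothetical generation rule)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond_to_if_none_match(if_none_match, body):
    """Return (status, headers, body) for a GET that may carry If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match is not None:
        # The header may list several ETags separated by commas, or "*".
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if "*" in candidates or etag in candidates:
            return 304, headers, b""  # match: the client's copy is current
    return 200, headers, body  # no match: send the new content
```

Because the identifier depends only on the content, two writes within the same second produce different ETags, and a file regenerated with identical content keeps its ETag.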

2. Browser request flowcharts
[Figure: flowchart of the browser's first request]

[Figure: flowchart when the browser requests the resource again]

3. User behavior and caching

Browser caching behavior also depends on the user's actions; see the conclusions in the article "Browser HTTP protocol caching mechanism" (reference 1) for details.

CDN Cache

The CDN cache is one kind of cache server.
CDN stands for Content Delivery Network. The idea is to add a new layer to the existing Internet architecture and publish a site's content to the network "edge" closest to the user, so that users fetch the content they need from a nearby node. This relieves Internet congestion and improves the site's response time. Technically, it addresses the root causes of slow site responses: limited network bandwidth, heavy user traffic, and unevenly distributed points of presence.

With a CDN cache in place, a request for the site proceeds as follows:
1) The user gives the browser the domain name to visit.
2) The browser calls the domain name resolution library to resolve the domain name. Because the CDN has adjusted the resolution process, the library generally obtains a CNAME record for the domain. To get the actual IP address, the browser must resolve the CNAME domain name in turn. This step uses global load-balancing DNS resolution, for example choosing the IP address by geographic location, so that the user reaches the nearest node.
3) This resolution yields the IP address of a CDN cache server, and the browser sends its request to that cache server.
4) If the requested file has not been modified, the cache server returns 304 itself, acting in the role of the origin server. If its copy has expired, the cache server resolves the requested domain name through the CDN's internal private DNS to get the site's actual IP address and submits the request to that address.
5) The cache server obtains the content from the actual IP address. It stores the content locally for later use and also returns it to the client, completing the data service.
6) The client receives the data returned by the cache server, which completes the request.
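
Steps 4) and 5) amount to a cache lookup with a fallback to the origin. The sketch below models that behavior in memory; the `Origin` and `EdgeCache` classes, the fixed `ttl`, and the "HIT"/"MISS" labels are assumptions for illustration, and DNS resolution and conditional requests are left out.

```python
class Origin:
    """Stands in for the origin web server behind the CDN."""

    def __init__(self, files):
        self.files = files
        self.requests = 0  # how many times the origin was actually contacted

    def fetch(self, path):
        self.requests += 1
        return self.files[path]


class EdgeCache:
    """A CDN cache server that keeps copies for a fixed lifetime (ttl)."""

    def __init__(self, origin, ttl):
        self.origin = origin
        self.ttl = ttl
        self.store = {}  # path -> (body, stored_at)

    def get(self, path, now):
        cached = self.store.get(path)
        if cached is not None and now < cached[1] + self.ttl:
            return cached[0], "HIT"  # served from the edge, origin untouched
        # Missing or expired: fetch from the origin, store locally, return it.
        body = self.origin.fetch(path)
        self.store[path] = (body, now)
        return body, "MISS"
```

Only misses reach the origin, so the `requests` counter on `Origin` shows directly how much load the edge cache absorbs.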

Reference articles:
1 Browser HTTP protocol caching mechanism
2 Principles of CDN Implementation
3 Introduction to the HTTP caching principle for back-end programmers
