[HTTP protocol learning] [reprint] ETag

Source: Internet
Author: User
ETag

Text/finalbsd
We all know that HTTP/1.1 provides the ETag header to determine whether a requested file has been modified.
Why use ETag? ETag mainly solves some problems that Last-Modified cannot:
1. Some files are rewritten periodically without their content changing (only the modification time changes). In that case we do not want the client to conclude that the file has been modified and fetch it again;
2. Some files are modified very frequently, within a single second (for example, N times within 1 s). If-Modified-Since can only check at one-second granularity, so such changes cannot be detected (likewise, the mtime recorded by Unix is only accurate to the second);
3. Some servers cannot accurately obtain a file's last modification time.

Hence the ETag (entity tag). An ETag is just a tag associated with a file. It can be a version label, such as v1.0.0, or a mysterious-looking code such as "2e681a-6-5d044840". The HTTP/1.1 standard does not specify what an ETag contains or how it is generated. The only rule is that the ETag must be enclosed in double quotes.

The ETag is generated by the server. The client asks whether the resource has changed by making a conditional request with If-Match or If-None-Match. We usually use If-None-Match. Requesting a file might go as follows:
==== First request ====
1. The client sends an HTTP GET request for a file;
2. The server processes the request and returns the file content with status code 200 and a set of headers, including an ETag such as "2e681a-6-5d044840" (assuming the server supports ETag generation and has it enabled).

==== Second request ====
1. The client sends an HTTP GET request for the same file. This time it also sends an If-None-Match header whose value is the ETag the server returned for the first request: "2e681a-6-5d044840".
2. The server finds that the ETag it received matches the one it has just computed, so the If-None-Match condition is false. Instead of 200 it returns 304, and the client keeps using its local cache.
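
The two requests above can be sketched from the server's side as follows. This is a minimal illustration, not how any particular server implements it: the function name `respond` and its parameters are hypothetical, and splitting the header on commas is a simplification that assumes the tags themselves contain no commas.

```python
from typing import Dict, Optional, Tuple


def respond(current_etag: str, body: bytes,
            if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a GET on one resource.

    current_etag is the quoted tag the server computes for the resource now;
    if_none_match is the raw request header value, or None if absent.
    """
    headers = {"ETag": current_etag}
    if if_none_match is not None:
        # The header may list several comma-separated tags, or "*".
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if "*" in candidates or current_etag in candidates:
            # A tag matched: the client's copy is current, send no body.
            return 304, headers, b""
    # First request, or the tag no longer matches: send the full content.
    return 200, headers, body
```

Calling `respond('"2e681a-6-5d044840"', data, None)` corresponds to the first request, and passing the same tag back as `if_none_match` corresponds to the second.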

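The process is simple. The question is: what if the server has also set Cache-Control: max-age and Expires?
The answer is that they are all used together. While the cached copy is still fresh according to max-age or Expires, the client does not need to ask the server at all. Once it does revalidate, it sends both If-Modified-Since and If-None-Match, and the server returns 304 only when both conditions are satisfied, that is, after checking both the modification time and the ETag. (Don't get stuck wondering which of the two is "really" being used.) Later revisions of the HTTP specification (RFC 7232 and its successors) instead tell the server to ignore If-Modified-Since whenever If-None-Match is present.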
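
The combined check can be sketched like this. It follows the rule as this article states it (every conditional header that is present must agree before 304 is returned), so it is an illustration of that description, not of a specific server. The function name `not_modified` and its parameters are hypothetical.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def not_modified(etag: str, last_modified: datetime,
                 if_none_match: Optional[str],
                 if_modified_since: Optional[str]) -> bool:
    """Return True if a 304 response is allowed under the article's rule.

    last_modified must be timezone-aware; the header values are raw strings.
    """
    if if_none_match is None and if_modified_since is None:
        return False  # not a conditional request at all
    if if_none_match is not None:
        tags = [tag.strip() for tag in if_none_match.split(",")]
        if etag not in tags and "*" not in tags:
            return False  # the ETag check failed
    if if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since)
        # HTTP dates carry whole seconds only, so drop the fraction.
        if last_modified.replace(microsecond=0) > since:
            return False  # the modification-time check failed
    return True
```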

Let's look at how Apache implements ETag.
1. Apache first decides whether the ETag should be weak. If not, it moves on to the second case, the strong ETag:

Strong ETag
The ETag value is set according to the configuration file. Apache's default FileETag setting is:
FileETag INode MTime Size
That is, the ETag value is computed from these three attributes and output in hex, with adjacent segments separated by "-". For example:
ETag: "2e681a-6-5d044840"
The three segments here are the hex representations of the inode, size, and mtime, in that order. (If you see a non-hex character here, that is, anything outside 0-f, something has gone badly wrong :))

Of course, we can change Apache's FileETag setting. With FileETag Size, for example, the resulting ETag might be:
ETag: "6"
In short, the ETag has as many segments as there are attributes configured. (Do not assume that an ETag always has exactly three segments.)
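
A rough sketch of this style of ETag in Python: hex segments joined by "-", with only the configured attributes included. It is not Apache's actual code. The names `format_etag` and `file_etag` are hypothetical, the segment order (inode, size, mtime) is taken from the examples above, and mtime is truncated to whole seconds as a simplifying assumption.

```python
import os

# Segment order assumed from the examples in this article.
_ORDER = ("inode", "size", "mtime")


def format_etag(inode: int, size: int, mtime: int, fields=_ORDER) -> str:
    """Build a quoted ETag from the selected attributes, each in hex."""
    values = {"inode": inode, "size": size, "mtime": mtime}
    segments = [format(values[name], "x") for name in _ORDER if name in fields]
    return '"' + "-".join(segments) + '"'


def file_etag(path: str, fields=_ORDER) -> str:
    """Compute the ETag for a file on disk from its stat() data."""
    st = os.stat(path)
    # Assumption: whole-second mtime; a real server may use finer resolution.
    return format_etag(st.st_ino, st.st_size, int(st.st_mtime), fields)
```

With all three fields, `format_etag(0x2e681a, 6, 0x5d044840)` reproduces the three-segment example, and passing `fields=("size",)` yields the one-segment form.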

Description
The above is the ETag implementation in Apache 2.2. Because HTTP/1.1 does not specify how ETags are implemented or formatted, you can modify the algorithm or write your own to produce an ETag such as "2e681a65d044840". The client remembers and caches this ETag (I haven't yet found where it is stored on Windows :( ) and sends the value as-is on the next access so that it can be compared with the ETag generated by the server.

Note:
Whatever the algorithm, the server has to compute the ETag, which costs some overhead and performance. For this reason, many websites (such as Yahoo!) have disabled ETag completely. This goes against the spirit of HTTP/1.1, which encourages servers to send an ETag whenever possible.

Weak ETag
Reconsider the three questions mentioned above:
Question 1: Some files are rewritten periodically without their content changing (only the modification time changes). In that case we do not want the client to conclude that the file has been modified and fetch it again.

Solution: With a strong ETag that includes mtime, the page has to be fetched again every time. The fix is to use an ETag that leaves out mtime, for example a FileETag setting such as the FileETag Size example above. The ETag then ignores the mtime change that alters Last-Modified and would otherwise make the If-Modified-Since (IMS) check fail. This has nothing to do with weak ETags.

Question 2: Some files are modified very frequently, within a single second (for example, N times within 1 s). If-Modified-Since can only check at one-second granularity, so such changes cannot be detected (likewise, the mtime recorded by Unix is only accurate to the second).

Solution: In this case Apache automatically compares the request time with the modification time. If the difference is less than 1 s, Apache assumes that the file may be modified again within the same second and generates a weak ETag. This ETag is generated from mtime only. Because mtime is only accurate to the second, the ETag generated within that second is always the same, which avoids the frequent cache refreshes within 1 s that a strong ETag would cause. (It may seem that this could be solved with Last-Modified alone, without ETag. However, this only applies to files that change extremely frequently; other files can still use strong ETag validation.) A weak ETag starts with W/, for example W/"2e681a".
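
The weak-versus-strong decision described here can be sketched as follows. This is a simplified illustration under stated assumptions, not Apache's code: `tag_for` and `weak_match` are hypothetical names, times are plain seconds as floats, and the opaque tag value is passed in instead of being computed.

```python
def tag_for(request_time: float, mtime: float, opaque: str) -> str:
    """Return a weak tag if the file changed less than 1 s before the request."""
    quoted = '"' + opaque + '"'
    if request_time - mtime < 1.0:
        # The file may change again within the same second: mark it weak.
        return "W/" + quoted
    return quoted


def weak_match(a: str, b: str) -> bool:
    """Weak comparison: the tags match if they are equal once any W/ is dropped."""
    def strip(tag: str) -> str:
        return tag[2:] if tag.startswith("W/") else tag
    return strip(a) == strip(b)
```

For example, `tag_for(100.5, 100.0, "2e681a")` gives the weak form W/"2e681a", while a request five seconds after the modification gives the strong form "2e681a".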

Question 3: Some servers cannot accurately obtain a file's last modification time.

Solution: Generate an ETag. Because the ETag can combine inode, mtime, and size, it does not depend on the modification time alone and so avoids this problem.
