ETag and resumable downloads


Address: http://blog.chinaunix.net/uid-20614434-id-2999833.html

 

Author: finalbsd
Date: 2008-07-08
My earlier post on ETag gave only the simplest description of resumable downloads, without digging any further. I spent some time studying the topic today, hoping to answer Laurence's question, haha :)

1. Concept of resumable download
A resumable download has two parts: the breakpoint and the resume.

Breakpoint: During a download, the file is divided into multiple parts, and the parts are downloaded at the same time. When the task is paused at some point, the position each part has reached is its breakpoint.

Resume: This one is easier to understand. When an unfinished download task starts again, the transfer continues from the previous breakpoint.

Note: The short description above comes from the Thunder (Xunlei) website.
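As a rough illustration of the two ideas above, the sketch below splits a file into parts and then works out what each unfinished part still has to request. The function names `split_segments` and `resume_ranges` are hypothetical, and the `bytes=start-end` strings use the Range header syntax described in section 2; this is not how any particular download tool is implemented.

```python
def split_segments(total_length, parts):
    """Split bytes 0..total_length-1 into up to `parts` inclusive (start, end) ranges."""
    if total_length <= 0 or parts <= 0:
        return []
    parts = min(parts, total_length)  # never create empty segments
    base, extra = divmod(total_length, parts)
    segments, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)  # spread the remainder over the first segments
        segments.append((start, start + size - 1))
        start += size
    return segments


def resume_ranges(segments, bytes_done):
    """Return a Range header value for the unfinished tail of each segment."""
    ranges = []
    for (start, end), done in zip(segments, bytes_done):
        breakpoint_offset = start + done  # first byte not yet downloaded
        if breakpoint_offset <= end:      # skip segments that are already complete
            ranges.append(f"bytes={breakpoint_offset}-{end}")
    return ranges


segments = split_segments(10, 3)
print(segments)
print(resume_ranges(segments, [4, 1, 0]))
```

Each segment's breakpoint is simply its start offset plus the number of bytes already saved, so that is all a client needs to persist in order to resume.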

2. HTTP/1.1 and resumable download
HTTP/1.1 defines several headers that support resumable data transfer.

2.1 If-Range
If-Range is another conditional request header (we have already discussed If-Match/If-None-Match and If-Modified-Since/If-Unmodified-Since). It exists so that a client that has already downloaded part of a resource does not have to download everything again from the beginning. (On some slow networks, a complete file might otherwise never finish downloading.) With If-Range, the client can continue from where the previous download stopped: if the resource has not changed, the server sends only the requested range; otherwise it sends the whole resource again.

If-Range format: If-Range: entity-tag | HTTP-date
That is, the If-Range value can be either the ETag or the Last-Modified value returned by the server:

If-Range: "df6b0-b4a-3be1b5e1"
If-Range: Tue, 08 Jul 2008 05:05:56 GMT

Logically, these two forms work in the same way as If-Match and If-Unmodified-Since. Their values are the ETag and Last-Modified values returned by the server.
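The check a server might perform on If-Range can be sketched as follows. This is a minimal sketch, not a complete implementation: the function name `if_range_allows_partial` is hypothetical, and it assumes an exact match is required for both forms (a weak ETag never matches, and a date must equal the current Last-Modified value exactly).

```python
from email.utils import parsedate_to_datetime


def if_range_allows_partial(if_range, current_etag, current_last_modified):
    """Return True if the Range request may be honoured, False to send the full resource."""
    value = if_range.strip()
    if value.startswith("W/"):
        return False  # assumption: weak validators are never accepted for If-Range
    if value.startswith('"'):
        return value == current_etag  # entity-tag form: exact string comparison
    try:
        # HTTP-date form: must equal the resource's Last-Modified timestamp
        return parsedate_to_datetime(value) == parsedate_to_datetime(current_last_modified)
    except (TypeError, ValueError):
        return False  # unparseable value: fall back to sending the full resource


etag = '"df6b0-b4a-3be1b5e1"'
modified = "Tue, 08 Jul 2008 05:05:56 GMT"
print(if_range_allows_partial('"df6b0-b4a-3be1b5e1"', etag, modified))
print(if_range_allows_partial("Tue, 08 Jul 2008 05:05:57 GMT", etag, modified))
```

A leading double quote is enough to tell the two forms apart, because an HTTP-date never starts with one.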

2.2 Range
Range is another request header sent by the client to the server. If the request has no Range header, any If-Range header sent by the client is ignored. Similarly, if the server does not support range requests, If-Range is ignored. In short, If-Range is meaningless without Range.

2.3 Accept-Ranges
Accept-Ranges is a response header. The server sends it to tell the client that it supports range requests. Its value is typically bytes. The client can then send a request header such as Range: bytes=2400-, which asks for the data from byte offset 2400 to the end of the file.
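To make the `bytes=2400-` syntax concrete, here is a small sketch of how a server might turn a single-range Range header into inclusive byte offsets. The function name `parse_byte_range` is hypothetical, and the sketch deliberately handles only one range per header.

```python
import re


def parse_byte_range(header, total_length):
    """Parse 'bytes=first-last', 'bytes=first-' or 'bytes=-suffix' into (start, end).

    Returns None when the header is malformed or cannot be satisfied.
    """
    match = re.fullmatch(r"bytes=([0-9]*)-([0-9]*)", header.replace(" ", ""))
    if not match or total_length <= 0:
        return None
    first, last = match.groups()
    if first == "":
        # Suffix form: the final `last` bytes of the file
        if last == "" or int(last) == 0:
            return None
        length = min(int(last), total_length)
        return (total_length - length, total_length - 1)
    start = int(first)
    if start >= total_length:
        return None  # starts beyond the end of the file
    end = total_length - 1 if last == "" else min(int(last), total_length - 1)
    if end < start:
        return None
    return (start, end)


print(parse_byte_range("bytes=2400-", 5000))
print(parse_byte_range("bytes=-500", 5000))
```

Spaces are stripped before matching so that loosely formatted values such as `bytes = 2400-` are still accepted; a stricter server could reject them instead.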

2.4 Content-Range
Content-Range is a response header. It describes the byte range of the content the server is returning and the total length of the entire resource.
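On the client side, this header tells the downloader where the received bytes belong in the file. The following is a minimal sketch of reading it; `parse_content_range` is a hypothetical helper, and it only handles the `bytes first-last/total` form (with `*` allowed for an unknown total).

```python
import re


def parse_content_range(value):
    """Parse 'bytes first-last/total' into (first, last, total); total is None if '*'."""
    match = re.fullmatch(r"bytes ([0-9]+)-([0-9]+)/([0-9]+|\*)", value.strip())
    if not match:
        return None
    first, last = int(match.group(1)), int(match.group(2))
    total = None if match.group(3) == "*" else int(match.group(3))
    if last < first or (total is not None and last >= total):
        return None  # inconsistent range
    return (first, last, total)


print(parse_content_range("bytes 822603-2844010/2844011"))
```

A client resuming at a given offset can compare that offset with the first number returned here before appending data to its partial file.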

3. How it works (this part is taken from an article found online)
The following shows some of the headers that IIS sends to the client in response to an initial download request. They give the client detailed information about the requested document.

HTTP/1.1 200 OK
Connection: close
Date: Tue, 19 Oct 2004 15:11:23 GMT
Accept-Ranges: bytes
Last-Modified: Sun, 26 Sep 2004 15:52:45 GMT
ETag: "47febb2cfd76c41:2062"
Cache-Control: private
Content-Type: application/x-zip-compressed
Content-Length: 2844011

After receiving these headers, if the download is interrupted, IE sends the ETag value and a Range header back to the server in its subsequent download request. The following shows some of the headers IE sends to the server when it tries to resume the interrupted download.

GET http://192.168.100.100/download.zip HTTP/1.0
Range: bytes=822603-
Unless-Modified-Since: Sun, 26 Sep 2004 15:52:45 GMT
If-Range: "47febb2cfd76c41:2062"

Note that the If-Range header contains the original ETag value, which the server can use to identify the file to be resent. The Unless-Modified-Since header contains the date and time associated with the initial download (in this example, the same value as the Last-Modified header of the initial response). The server uses this information to determine whether the file has been modified since the initial download. If it has been modified, the server sends the file again from the beginning.

These headers show that IE caches the entity tag provided by IIS and sends it back to the server in the If-Range header. This is a way to ensure that the download resumes from exactly the same document.

Unfortunately, not all browsers work in the same way. Other HTTP headers that a client may send to validate the document are If-Match, If-Unmodified-Since, and Unless-Modified-Since. The specification does not say which headers client software must support or which it must use. As a result, some clients send no validation headers at all, while IE uses only If-Range and Unless-Modified-Since. It is best to check for all of these headers in your code, so that your application follows the HTTP specification closely and works with multiple browsers.

The Range header specifies the requested byte range. In this example, it is the point from which the server should resume the document stream.
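The client side of this exchange can be sketched as below. This is only an illustration of the idea, not IE's actual logic: `build_resume_headers` is a hypothetical helper, it sends only the standard Range and If-Range headers, and it assumes the client saved the headers of the initial response in a dictionary.

```python
def build_resume_headers(initial_headers, bytes_received):
    """Build request headers for resuming a download after `bytes_received` bytes."""
    # Header names are case-insensitive, so normalise them before looking them up.
    saved = {name.lower(): value for name, value in initial_headers.items()}
    headers = {"Range": f"bytes={bytes_received}-"}
    etag = saved.get("etag")
    if etag and not etag.startswith("W/"):
        headers["If-Range"] = etag  # prefer a strong ETag as the validator
    elif "last-modified" in saved:
        headers["If-Range"] = saved["last-modified"]
    # With no validator at all, only Range is sent and the client cannot be sure
    # that the remaining bytes come from the same version of the document.
    return headers


initial = {
    "Accept-Ranges": "bytes",
    "Last-Modified": "Sun, 26 Sep 2004 15:52:45 GMT",
    "ETag": '"47febb2cfd76c41:2062"',
    "Content-Length": "2844011",
}
print(build_resume_headers(initial, 822603))
```

Preferring the ETag over the date is a design choice in this sketch; a client that only has Last-Modified falls back to the date form of If-Range.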

When IIS receives a request to resume a download, it sends back a response containing the following headers:

HTTP/1.1 206 Partial Content
Content-Range: bytes 822603-2844010/2844011
Accept-Ranges: bytes
Last-Modified: Sun, 26 Sep 2004 15:52:45 GMT
ETag: "47febb2cfd76c41:2062"
Cache-Control: private
Content-Type: application/x-zip-compressed
Content-Length: 2021408

Note that this response differs slightly from the response to the initial download request: the status of the resumed download is 206, whereas the initial download returned 200. This indicates that the content sent over the wire is only part of the document. The Content-Range header gives the exact position and number of the bytes transmitted.
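The numbers in a partial response follow directly from the requested start offset and the file size. The sketch below shows the arithmetic for an open-ended request such as `bytes=822603-`; the function name `partial_response` is hypothetical, and it returns only the status code and the range-related headers.

```python
def partial_response(start, total_length):
    """Return (status, headers) for a request of the form 'bytes=start-'."""
    if start < 0 or start >= total_length:
        # The requested range lies outside the file: 416 Range Not Satisfiable
        return 416, {"Content-Range": f"bytes */{total_length}"}
    end = total_length - 1  # byte offsets are zero-based and inclusive
    return 206, {
        "Content-Range": f"bytes {start}-{end}/{total_length}",
        "Content-Length": str(end - start + 1),
    }


print(partial_response(822603, 2844011))
```

For a 2844011-byte file resumed at offset 822603, the last byte is 2844010 and the body is 2844011 - 822603 = 2021408 bytes long, which matches the headers shown above.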

IE is very picky about headers. If the initial response does not contain an ETag header, IE will never attempt to resume the download. The other clients I tested do not use the ETag header. They simply rely on the document name, the requested range, and the Last-Modified header (if they try to validate the document at all).

Image reference: (thanks to I _amok)


Learn more about HTTP  

The headers shown in the previous section are sufficient for a resumable-download solution, but they do not cover the HTTP specification completely. In a single request, the Range header can ask for multiple ranges. This feature is called "multipart ranges". Do not confuse it with segmented downloading, which almost every download tool uses to increase download speed. These tools claim to speed up downloads by opening two or more concurrent connections, each requesting a different range of the document. Multipart ranges do not open multiple connections; instead, they let client software request, for example, the first 10 and the last 10 bytes of a document in a single request/response cycle.

Honestly, I have never come across a piece of software that uses this feature. However, I refuse to write "it is not fully HTTP-compliant" in the code's disclaimer. Skipping this feature would surely invoke Murphy's Law. In any case, the same kind of multipart format is used in email transmission to separate header information, plain text, and attachments from each other.
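For reference, a request for the first 10 and last 10 bytes of a document lists both ranges, comma-separated, in one Range header. The helper below is a small sketch of building such a value; `build_range_header` is a hypothetical name, and `None` is used here to mark an omitted start or end.

```python
def build_range_header(ranges):
    """Build a Range header value from (first, last) pairs.

    (first, None) means 'from first to the end of the file';
    (None, count) means 'the last count bytes' (suffix form).
    """
    parts = []
    for first, last in ranges:
        if first is None:
            parts.append(f"-{last}")
        elif last is None:
            parts.append(f"{first}-")
        else:
            parts.append(f"{first}-{last}")
    return "bytes=" + ",".join(parts)


# First 10 bytes (offsets 0 through 9) and the last 10 bytes in one request
print(build_range_header([(0, 9), (None, 10)]))
```

A server that honours both ranges answers with a single 206 response whose body is a multipart message containing one part per range.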
