How does HTTP determine the file size?

Source: Internet
Author: User

How can we determine the length of an HTTP object? There are roughly five situations:
* Any response message that must not contain a message body (for example, 1xx, 204, and 304 responses, and any response to a HEAD request) always ends at the first blank line (CRLF) after the header fields.

* If the message uses the media type "multipart/byteranges" and the transfer-length is not specified, the self-delimiting media type defines the transfer-length. This type cannot be used unless the sender knows that the receiver can parse this type.

* The server closes the connection, and the message length is determined by that closure. (Note: closing the connection cannot be used to mark the end of a request message, because the server would then be unable to send a response back to the client.)

* If the Content-Length header field is present, its value indicates both the entity-length and the transfer-length. If the two lengths differ (i.e. the Transfer-Encoding header field is set), the Content-Length header field must not be sent. If a message is received with both a Transfer-Encoding field and a Content-Length field, the Content-Length field must be ignored.

* If the Transfer-Encoding header field is present and its value is not "identity", the transfer-length is defined by the "chunked" transfer encoding, unless the message is terminated by closing the connection.
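
The situations listed above can be sketched as a small decision function. This is only an illustrative sketch: the function name `body_framing`, its return values, and the assumption that `headers` is a dict with lower-cased field names are all hypothetical and not taken from any library. The checks are ordered so that Transfer-Encoding overrides Content-Length, as the list states.

```python
def body_framing(status, method, headers):
    """Return (kind, length) describing how a response body is delimited.

    Hypothetical helper: `headers` is assumed to be a dict whose keys are
    lower-cased header field names.
    """
    # Responses that never carry a body end at the blank line after the headers.
    if method == "HEAD" or 100 <= status < 200 or status in (204, 304):
        return ("none", 0)
    # A Transfer-Encoding other than "identity" wins over Content-Length.
    encoding = headers.get("transfer-encoding", "identity").strip().lower()
    if encoding != "identity":
        return ("chunked", None)
    # Content-Length gives the exact number of body bytes.
    if "content-length" in headers:
        return ("content-length", int(headers["content-length"]))
    # A self-delimiting media type defines its own length.
    content_type = headers.get("content-type", "").strip().lower()
    if content_type.startswith("multipart/byteranges"):
        return ("multipart/byteranges", None)
    # Otherwise the body runs until the server closes the connection.
    return ("close", None)


print(body_framing(200, "GET", {"content-length": "1024"}))
```

A real client would also have to handle malformed values (for example a non-numeric Content-Length), which this sketch does not attempt.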

 

The first two situations are not discussed here. The remaining three are described in turn:

 

Using TCP connection closure to determine the file length:
In the HTTP/1.0 era, HTTP servers usually used short-lived connections: one GET request per TCP connection. When the data transfer was complete, the server closed the connection, and the client's read then returned end-of-file (a return value of 0 from recv on both Linux and Windows; some higher-level APIs, such as Java's InputStream.read, report -1 instead), so the client naturally knew the boundary and length of the file. This approach is generally not used in the HTTP/1.1 era. While the client is active, or for some idle period afterwards, the server does not cut off the TCP connection, and further requests and responses are transmitted over the same connection. This means the client can no longer tell that an HTTP file has been fully transmitted by waiting for read to report end-of-file: after a file has been transmitted, another read simply blocks. The client must therefore determine the file boundary by other means. Of course, determining the file length from TCP disconnection also has a problem of its own, that is, once the network condition is poor,
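
As a rough illustration of the short-connection approach described above, the following sketch reads from a socket until the peer closes it. The function name `read_until_close` is hypothetical, and a local `socket.socketpair()` stands in for a real client and server so that the example is self-contained.

```python
import socket


def read_until_close(sock, bufsize=4096):
    """Collect bytes until the peer closes the connection."""
    parts = []
    while True:
        data = sock.recv(bufsize)
        if not data:  # an empty result means the peer closed its side
            break
        parts.append(data)
    return b"".join(parts)


# A connected socket pair plays the roles of server and client.
server_side, client_side = socket.socketpair()
server_side.sendall(b"hello, world")
server_side.close()  # the "server" closes after sending the whole body

body = read_until_close(client_side)
client_side.close()
print(len(body))  # 12
```

On a persistent connection the loop above would never see the empty result between responses, which is why it would block as the text describes.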

Content-Length:
For static pages, images, or other files whose size can be determined in advance, the HTTP server adds a Content-Length field to the HTTP header containing the file size, and the client obtains the size by reading the response header. However, most websites are built on dynamic technologies, and the size of a webpage, especially a complex one, is hard for the server to know up front. If the server waited until the page was fully generated before sending it, latency would increase, which hurts both the user experience and server throughput. This calls for the chunked technique described next.
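
A minimal sketch of how a client might use Content-Length on a persistent connection follows. The helper name `read_body_by_content_length` is hypothetical, `headers` is assumed to be a dict with lower-cased field names, and an `io.BytesIO` stands in for the connection's byte stream.

```python
import io


def read_body_by_content_length(stream, headers):
    """Read exactly Content-Length bytes of body from a binary stream."""
    length = int(headers["content-length"])
    body = stream.read(length)
    if len(body) != length:
        # The stream ended early, so the body is incomplete.
        raise EOFError("stream ended before Content-Length bytes arrived")
    return body


# Simulated connection: one 18-byte body followed by the next response's bytes.
stream = io.BytesIO(b"<html>first</html>" + b"next response")
first = read_body_by_content_length(stream, {"content-length": "18"})
rest = stream.read()
print(first)
```

Because the client stops after exactly 18 bytes, the bytes of the following response remain in the stream untouched.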

chunked encoding:
With chunked encoding, the server sends each part of the response as soon as it has been generated. For example, it can send the static frame of the page first and the dynamic content afterwards. Put simply, chunked encoding divides the entire file into blocks, prefixes each block with its size, and ends with an empty block (which is not literally devoid of content).
The chunked encoding format is as follows:
    Chunked-Body = *chunk
                   "0" CRLF
                   footer
                   CRLF
    chunk        = chunk-size [ chunk-ext ] CRLF
                   chunk-data CRLF
Here *chunk represents zero or more chunks, and each chunk consists of two parts. The first part contains the chunk size (written in hexadecimal), the chunk extension (optional and usually omitted), and a CRLF (0x0D 0x0A). The second part contains the chunk data, followed by a CRLF.
Finally, the body ends with an empty chunk whose size is "0". In fact, this last part also contains the footer, which is metadata and can be ignored. It too ends with a blank line.
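
The format described above can be sketched as a small encoder and decoder. This is an illustrative sketch only: the function names `encode_chunked` and `decode_chunked` are hypothetical, the decoder ignores chunk extensions and footer lines, and it does no validation of malformed input.

```python
import io


def encode_chunked(pieces):
    """Encode an iterable of byte strings as a chunked body with no footer."""
    out = b""
    for piece in pieces:
        if piece:  # a zero-length chunk would be read as the terminator
            size = format(len(piece), "x").encode("ascii")  # hexadecimal size
            out += size + b"\r\n" + piece + b"\r\n"
    return out + b"0\r\n" + b"\r\n"  # last chunk, empty footer, final CRLF


def decode_chunked(stream):
    """Decode a chunked body from a binary stream and return the data."""
    body = b""
    while True:
        size_line = stream.readline().decode("ascii")
        # Drop any chunk extension after ";" and parse the hexadecimal size.
        size = int(size_line.split(";", 1)[0].strip(), 16)
        if size == 0:
            break
        body += stream.read(size)
        stream.read(2)  # skip the CRLF that follows the chunk data
    # Skip footer lines up to and including the closing blank line.
    while stream.readline() not in (b"\r\n", b""):
        pass
    return body


wire = encode_chunked([b"Hello, ", b"world"])
print(wire)
print(decode_chunked(io.BytesIO(wire)))
```

With a real connection, `stream` could be a buffered binary file object wrapped around the socket, which is an assumption of this sketch rather than something the text prescribes.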

The following is a Wireshark capture of the data from a website:

 
