HTTP transfer encoding increases the amount of data transmitted, all to solve this one problem


Last Update:2018-07-10 Source: Internet

Author: User

Tags rfc http 2


Image: by @Olga

Hi, everyone, I am the Incense ink shadow!

The HTTP protocol occupies an important place in networking knowledge. Its most basic elements are the request and response messages, and each message consists of headers and an entity. Most uses of HTTP depend on setting different headers on the request or the response.

This "practical HTTP" series sets aside the conventional header-by-header explanation and starts from real problems instead: what problem is each of these HTTP features actually meant to solve? Along the way it explains how each feature is designed and how it works.

HTTP is a stateless, "loose" protocol. It does not record state across requests, and because it involves two ends (client and server), distinguished by request and response, most of it is only a recommendation. In practice, either side can choose not to follow it.

"Here's a suggested retail price of $2 ..."

"Oh, don't accept suggestions!" ”

In the previous two articles we covered the HTTP caching mechanism and HTTP content (entity) encoding and compression. The article on content encoding also mentioned transfer encoding, which lets us optimize the way data is transmitted. Content encoding and transfer encoding complement each other, and they are usually used together.

This article covers HTTP's transfer encoding mechanism.

II. HTTP transfer encoding

2.1 What is transfer encoding?

Transfer encoding is indicated in the HTTP message headers by the Transfer-Encoding header, which names the transfer encoding currently in use.

Transfer-Encoding changes the format of the message and the way it is transmitted. It does not reduce the size of the transferred content and may even make it larger. That looks wasteful, but it exists to solve some specific problems.

In short, transfer encoding is meant to be used together with persistent connections: it is designed so that, on a persistent connection, data can be sent in chunks and the end of the transfer can be marked. This is explained in detail later.

In the early design, just as content encoding uses Accept-Encoding to indicate the compression types the client accepts, transfer encoding was paired with the TE request header to specify which transfer encodings are supported. In the latest HTTP/1.1 specification, however, chunked is the one transfer encoding that every recipient must understand, and in practice it is the only one in common use (the specification also registers compression codings such as gzip, but they are rarely used as transfer encodings), so there is usually no need to rely on the TE header.

These details are covered below. Since transfer encoding and persistent connections are closely related, let's start by understanding what a persistent connection is.

2.2 Persistent connection

A persistent connection is what is commonly called a long-lived connection; the name can be taken literally.

In early HTTP, transferring data went roughly through these steps: initiate the request, establish the connection, transfer the data, close the connection. A persistent connection drops the closing step, so the client and the server can keep transferring content over the same connection.

The goal is to improve transmission efficiency. HTTP is built on top of TCP, so it naturally inherits TCP's three-way handshake, slow start, and other characteristics, which makes every connection a valuable resource. To get as much performance out of HTTP as possible, persistent connections are important, and HTTP introduced mechanisms to support them.

The early HTTP/1.0 protocol had no persistent connections. The concept was introduced later through the Connection: Keep-Alive header, which tells the peer (client or server) not to close the TCP connection after sending the data, because it will be used again.

HTTP/1.1 recognized the importance of persistent connections. It specifies that all connections are persistent by default, unless Connection: close is sent explicitly in the headers to close the connection after the transfer completes.

In fact, in HTTP/1.1 the Connection header no longer needs the Keep-Alive value. For historical reasons, many clients and servers still send it.
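The defaults described above can be sketched as a small decision function. This is only a minimal illustration, not how any particular server implements it; the function name `is_persistent` and the plain-dict header representation are assumptions made for the example.

```python
def is_persistent(http_version: str, headers: dict) -> bool:
    """Decide whether the connection should stay open after this message.

    Hypothetical helper: `headers` is a plain dict of header name -> value.
    """
    # Header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    # Connection carries a comma-separated list of case-insensitive tokens.
    tokens = {
        token.strip().lower()
        for token in lowered.get("connection", "").split(",")
        if token.strip()
    }
    if "close" in tokens:
        return False  # explicit request to close after this exchange
    if http_version == "HTTP/1.1":
        return True  # HTTP/1.1 connections are persistent by default
    # HTTP/1.0: persistent only when keep-alive is requested explicitly.
    return "keep-alive" in tokens
```

With this sketch, an HTTP/1.1 message without a Connection header keeps the connection open, while an HTTP/1.0 message does so only when it carries Connection: Keep-Alive.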

A persistent connection raises another problem: how to determine that the current data has been sent completely.

2.3 Determining Transfer completion

In the early days, when persistent connections were not supported, the end of a transfer could be detected from the connection being closed. Most browsers did exactly that, but it was never the canonical approach. The Content-Length header should be used instead to specify the length of the entity content in the current transfer.

On a persistent connection, for example, the receiver relies on Content-Length to determine that the data has been sent completely.

Here Content-Length is the basis for judging that a response entity has been fully sent. For this to work, Content-Length must match the length of the content entity; if it does not, there will be problems.

If Content-Length is smaller than the length of the content entity, the content is truncated. If it is larger, the receiver cannot tell that the current response has ended, so the request stays stuck in the pending state.
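To make the framing role of Content-Length concrete, here is a minimal sketch that reads two responses back to back from the same byte stream, the way a client would on a persistent connection. The `read_response` helper and the in-memory `io.BytesIO` stand-in for a socket are assumptions for illustration only.

```python
import io


def read_response(stream):
    """Read one response whose body is framed by Content-Length."""
    status_line = stream.readline().rstrip(b"\r\n").decode("ascii")
    headers = {}
    while True:
        line = stream.readline().rstrip(b"\r\n")
        if not line:  # an empty line ends the header section
            break
        name, _, value = line.decode("ascii").partition(":")
        headers[name.strip().lower()] = value.strip()
    length = int(headers["content-length"])
    body = stream.read(length)
    if len(body) < length:
        # With an in-memory stream we hit end-of-data; a real socket
        # would keep waiting here for bytes that never arrive.
        raise EOFError("body shorter than Content-Length")
    return status_line, headers, body


# Two responses on one "connection": Content-Length tells the reader
# where the first body stops and the next response starts.
wire = io.BytesIO(
    b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"
    b"HTTP/1.1 200 OK\r\nContent-Length: 3\r\n\r\n123"
)
first = read_response(wire)
second = read_response(wire)
```

If the first Content-Length were 4 instead of 5, the first body would be cut to "hell" and the leftover byte would corrupt the start of the next response.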

Ideally, when we respond to a request, we know the size of its content entity. In practice, however, the length of the content entity is often not readily available, for example when the content comes from a network file or is generated dynamically. If we still want the length in advance, the only option is to allocate a large enough buffer, cache all the content, and then compute the length.

But this is not a good approach. Caching everything in a buffer consumes more memory, and it also takes more time, which makes the client wait too long.

At this point we need a new mechanism that does not depend on Content-Length to determine whether the current content entity has finished transmitting. That mechanism is the Transfer-Encoding header.

2.4 Transfer-Encoding: chunked

As mentioned earlier, in the latest HTTP/1.1 protocol chunked is in practice the only value used with Transfer-Encoding. It indicates that the message is sent using chunked transfer encoding.

Since there is effectively only this one option, we just need to specify Transfer-Encoding: chunked, and can then package the content entity into chunks for transmission.

Rules for chunked transfers:

1. Each chunk contains a data length value in hexadecimal and the actual data.

2. The length value occupies its own line and is separated from the actual data by CRLF (\r\n).

3. The length value does not count the CRLF at the end of the actual data; it counts only the data length of the current chunk.

4. Finally, a chunk with a length value of 0 is sent to mark the end of the current content entity.

In this example, the response headers first declare Transfer-Encoding: chunked. Then comes the first chunk, "0123456780", with a length of b (hexadecimal for 11), followed by the chunks "Hello Cxmydev" and "123", and finally a chunk of length 0 that marks the end of the current response.
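On the receiving side, the same rules are applied in reverse. The following is a minimal sketch of a decoder, assuming the whole body is available as a readable byte stream; the name `decode_chunked` is hypothetical, and any trailer fields are simply skipped.

```python
import io


def decode_chunked(stream):
    """Reassemble the content entity from a chunked message body."""
    body = bytearray()
    while True:
        size_line = stream.readline().decode("ascii")
        # Anything after ';' is a chunk extension and is ignored here.
        size = int(size_line.split(";", 1)[0].strip(), 16)
        if size == 0:
            break  # the zero-length chunk marks the end of the entity
        body += stream.read(size)
        stream.read(2)  # consume the CRLF that follows the chunk data
    # Skip optional trailer fields up to the terminating blank line.
    while stream.readline() not in (b"\r\n", b""):
        pass
    return bytes(body)


wire = io.BytesIO(b"5\r\nHello\r\n3\r\n123\r\n0\r\n\r\n")
content = decode_chunked(wire)
```

The receiver never needs to know the total length in advance: it keeps reading chunks until it sees the one with length 0.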

2.5 The chunked trailer

When we use chunked transfer encoding, there is an opportunity, once the chunks have been sent, to append a piece of data at the end of the chunked message. This is called the trailer.

Trailer data can be anything the server wants to pass along after the data itself. The client is free to ignore and discard the trailer content, so the two sides need to agree on what is sent there.

The trailer carries header fields. Apart from Transfer-Encoding, Trailer, and Content-Length, other HTTP headers can be sent as trailers.

In general, trailers are used to pass values that cannot be determined when the response message begins. For example, Content-MD5 is a header commonly appended as a trailer. As with the length, it is hard to compute the MD5 value of a content entity that has to be chunk-encoded at the moment the response starts.

Note that a Trailer header is added to the message headers to declare that a Content-MD5 trailer field will be sent at the end. If there are several trailer fields, separate their names with commas.
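As a rough illustration of the idea, the sketch below computes the MD5 digest incrementally while the chunks are written and appends it as a Content-MD5 trailer field after the zero-length chunk. The function name `chunked_with_md5_trailer` is an assumption for the example, and the digest is base64-encoded, which is the usual encoding for Content-MD5.

```python
import base64
import hashlib


def chunked_with_md5_trailer(parts):
    """Encode parts as chunks and append a Content-MD5 trailer field."""
    md5 = hashlib.md5()
    out = bytearray()
    for data in parts:
        if not data:
            continue
        md5.update(data)  # the digest is built up as the data streams out
        out += f"{len(data):x}\r\n".encode("ascii") + data + b"\r\n"
    digest = base64.b64encode(md5.digest()).decode("ascii")
    out += b"0\r\n"  # last chunk
    out += f"Content-MD5: {digest}\r\n".encode("ascii")  # trailer field
    out += b"\r\n"  # blank line ends the trailer section
    return bytes(out)


# The Trailer header announces which fields will follow the last chunk.
head = (
    b"HTTP/1.1 200 OK\r\n"
    b"Transfer-Encoding: chunked\r\n"
    b"Trailer: Content-MD5\r\n"
    b"\r\n"
)
message = head + chunked_with_md5_trailer([b"Hello", b"123"])
```

The digest only becomes known after the last chunk has been produced, which is exactly why it travels in the trailer rather than in the leading headers.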

III. Combining content encoding and transfer encoding

Content encoding and transfer encoding are generally used together. We first use content encoding to compress the content entity and then send it in chunks using transfer encoding. The client receives the chunks, reassembles them, and decompresses the result to recover the original data.
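The order of operations can be sketched end to end: compress first, chunk second, then undo both in reverse on the receiving side. This is a minimal illustration assuming gzip as the content encoding and a fixed chunk size of 64 bytes; the helper names `to_chunks` and `from_chunks` are hypothetical.

```python
import gzip
import io


def to_chunks(payload, size):
    """Split payload into chunks of at most `size` bytes (chunked format)."""
    out = bytearray()
    for start in range(0, len(payload), size):
        piece = payload[start:start + size]
        out += f"{len(piece):x}\r\n".encode("ascii") + piece + b"\r\n"
    return bytes(out + b"0\r\n\r\n")


def from_chunks(wire):
    """Reassemble the payload from a chunked body (trailers not handled)."""
    stream = io.BytesIO(wire)
    body = bytearray()
    while True:
        size = int(stream.readline().strip().decode("ascii"), 16)
        if size == 0:
            break
        body += stream.read(size)
        stream.read(2)  # CRLF after the chunk data
    return bytes(body)


original = b"practical http " * 100

# Sender: Content-Encoding: gzip first, then Transfer-Encoding: chunked.
wire = to_chunks(gzip.compress(original), 64)

# Receiver: undo the transfer encoding first, then the content encoding.
restored = gzip.decompress(from_chunks(wire))
```

Compression reduces what has to be sent, and chunking adds a small framing overhead on top of the compressed bytes.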

IV. Transfer encoding summary

By now we should have a reasonable understanding of transfer encoding. Here is a brief summary:

1. Transfer encoding is indicated with the Transfer-Encoding header. In the latest HTTP/1.1 protocol, the value used in practice is chunked, which denotes chunked encoding.

2. Transfer encoding mainly solves the problem of sending data in chunks over a persistent connection and determining when the content entity has finished transmitting.

3. Chunk format: data length (hexadecimal) + chunk data.

4. If there is additional data, a trailer can be used to send it after the last chunk.

5. Transfer encoding is typically used together with content encoding.

In addition, transfer encoding is part of the HTTP/1.1 standard, and every HTTP/1.1 implementation should support it. A server that receives a message with a transfer encoding it does not understand should reply directly with the 501 Not Implemented status code.
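A server-side check for this rule might look like the following sketch. The names `SUPPORTED_CODINGS` and `status_for_transfer_encoding` are hypothetical, and the example assumes a server that understands only chunked.

```python
# Assumption: this illustrative server implements only chunked.
SUPPORTED_CODINGS = {"chunked"}


def status_for_transfer_encoding(header_value):
    """Return 501 if any listed transfer coding is not understood, else None."""
    codings = [
        coding.strip().lower()
        for coding in header_value.split(",")
        if coding.strip()
    ]
    if any(coding not in SUPPORTED_CODINGS for coding in codings):
        return 501  # Not Implemented
    return None  # nothing to reject; continue processing the request
```

For example, `status_for_transfer_encoding("chunked")` returns `None`, while a value naming a coding the server does not implement yields 501.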

References:

Transfer-Encoding in the HTTP protocol: https://imququ.com/post/transfer-encoding-header-in-http.html
RFC 7230, Section 3.3.1, Transfer-Encoding: https://tools.ietf.org/html/rfc7230#page-28
RFC 7230, Section 4.4, Trailer: https://tools.ietf.org/html/rfc7230#section-4.4
RFC 7230, Section 4.1.2, Chunked Trailer Part: https://tools.ietf.org/html/rfc7230#section-4.1.2
