Content-Encoding and Transfer-Encoding in the HTTP Protocol

Transferred from: http://network.51cto.com/art/201509/491335.htm

Transfer-Encoding is an HTTP header field whose name literally means "transfer encoding". The HTTP protocol has another encoding-related header: Content-Encoding (content encoding). Content-Encoding is usually used to compress the entity body to optimize transmission, for example compressing text files with gzip, which can significantly reduce their size. Content encoding is usually optional; jpg/png files, for instance, are generally not compressed further, because those formats are already highly compressed, so compressing them again gains nothing and just wastes CPU.
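For reference, here is a minimal sketch (my own addition, not part of the original article) of what Content-Encoding looks like in practice with Node's built-in http and zlib modules: the body is gzip-compressed only when the client advertises gzip support in Accept-Encoding.

const http = require('http');
const zlib = require('zlib');

http.createServer(function (req, res) {
  const body = 'hello world! '.repeat(100);                 // repetitive text compresses well
  if (/\bgzip\b/.test(req.headers['accept-encoding'] || '')) {
    // Label the compressed body so the browser can decompress it transparently.
    res.writeHead(200, { 'Content-Type': 'text/plain', 'Content-Encoding': 'gzip' });
    res.end(zlib.gzipSync(body));
  } else {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end(body);
  }
}).listen(9090, '127.0.0.1');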

Transfer-Encoding, on the other hand, is used to change the format of the message. Not only does it not reduce the size of the entity being transmitted, it can even make the transmission slightly larger. So what is it for? That is the main topic of this article. For now, just remember that Content-Encoding and Transfer-Encoding complement each other: a single HTTP message may well have both content encoding and transfer encoding applied at the same time.

Persistent Connection

Let's set Transfer-Encoding aside for the moment and look at another important concept in the HTTP protocol: the persistent connection (colloquially, a long-lived connection). We know that HTTP runs on top of a TCP connection and therefore inherits TCP's three-way handshake and slow-start behavior, so reusing persistent connections as much as possible is essential for HTTP performance. For this reason, the HTTP protocol introduced the corresponding mechanisms.

HTTP/1.0's persistent connection mechanism was added later, via the Connection: keep-alive header: the server and the client can use it to tell the other side not to close the TCP connection after the data has been sent, so that it can be reused later. HTTP/1.1 specifies that all connections are persistent unless Connection: close is explicitly added to the headers. So, strictly speaking, the Connection header in HTTP/1.1 no longer needs the keep-alive value, but for historical reasons many web servers and browsers still keep the habit of sending Connection: keep-alive on HTTP/1.1 persistent connections.
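As an illustration only (my own sketch, not from the original article), Node's http module makes this reuse explicit on the client side: an http.Agent created with keepAlive: true keeps idle sockets open for later requests, which is essentially the client half of Connection: keep-alive. The host below is just a stand-in.

const http = require('http');

// Reuse idle TCP connections instead of opening a new one per request.
const agent = new http.Agent({ keepAlive: true });

http.get({ host: 'example.com', port: 80, path: '/', agent: agent }, function (res) {
  res.resume();                                   // drain the body
  res.on('end', function () {
    console.log('response finished, socket stays open for reuse');
  });
});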

The browser reuses an idle persistent connection that is already open, which avoids the slow three-way handshake and the congestion-adaptation phase of TCP slow start. That sounds wonderful. To dig into the behavior of persistent connections, I decided to write the simplest possible web server in Node for testing. Node provides an http module for quickly creating an HTTP web server, but I needed more control, so I created a TCP server with the net module instead:

require('net').createServer(function (sock) {
  sock.on('data', function (data) {
    sock.write('HTTP/1.1 200 OK\r\n');
    sock.write('\r\n');
    sock.write('Hello world!');
    sock.destroy();
  });
}).listen(9090, '127.0.0.1');

After starting the service, visiting 127.0.0.1:9090 in a browser correctly outputs the specified content, and everything looks fine. Now remove the sock.destroy() line so the connection becomes persistent, restart the service, and visit again. This time the result is a bit strange: the output appears slowly, and checking the request status in the Network panel shows it stuck in pending.

This is because, with a non-persistent connection, the browser can determine the boundary of the request or response entity by whether the connection has been closed; with a persistent connection, this method obviously no longer works. In the example above, even though I have already sent all the data, the browser does not know that. It has no way to tell whether more data will arrive on the still-open connection, so it can only sit there waiting.

Content-Length

The easiest way to solve this problem is to calculate the length of the entity and tell the other side via a header. That is what Content-Length is for. Let's modify the example above:

require('net').createServer(function (sock) {
  sock.on('data', function (data) {
    sock.write('HTTP/1.1 200 OK\r\n');
    sock.write('Content-Length: 12\r\n');
    sock.write('\r\n');
    sock.write('Hello world!');
  });
}).listen(9090, '127.0.0.1');

As you can see, this time the TCP connection is not closed after the data is sent, yet the browser can still output the content and finish the request, because it can determine where the response entity ends from the length given in Content-Length. What happens if Content-Length does not match the actual length of the entity? Interested readers can try it themselves; generally, if Content-Length is shorter than the actual length, the content gets truncated, and if it is longer than the entity, the request ends up pending.
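If you want to try that experiment, here is a minimal sketch (my own variation on the server above, not part of the original article) that deliberately mis-states Content-Length; swap the two header lines to compare truncation with pending.

require('net').createServer(function (sock) {
  sock.on('data', function (data) {
    sock.write('HTTP/1.1 200 OK\r\n');
    sock.write('Content-Length: 8\r\n');      // shorter than the 12-byte body: content is truncated
    // sock.write('Content-Length: 20\r\n');  // longer than the body: the request stays pending
    sock.write('\r\n');
    sock.write('Hello world!');
  });
}).listen(9090, '127.0.0.1');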

Because the Content-Length field must truthfully reflect the entity length, in practice the length is sometimes not easy to obtain, for example when the entity comes from a network file or is generated by a dynamic language. In those cases, to get an accurate length you can only open a buffer large enough to hold everything, wait until all the content has been generated, and then calculate it. This costs more memory, and it also makes the client wait longer.
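To make that cost concrete, here is a minimal sketch (an illustration I added, not from the original article) of the buffer-everything approach: every piece of dynamically generated output is held in memory just to learn its total length before the first byte is written to the client.

require('net').createServer(function (sock) {
  sock.on('data', function (data) {
    const parts = [];
    for (let i = 0; i < 3; i++) {
      // Pretend each part is slow to generate (a DB query, a template render, ...).
      parts.push(Buffer.from('part ' + i + ' of some dynamic output\n'));
    }
    const body = Buffer.concat(parts);                 // the whole entity buffered in memory
    sock.write('HTTP/1.1 200 OK\r\n');
    sock.write('Content-Length: ' + body.length + '\r\n');
    sock.write('\r\n');
    sock.write(body);
  });
}).listen(9090, '127.0.0.1');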

When doing web performance optimization, an important metric is TTFB (Time To First Byte), the time from when the client makes the request until it receives the first byte of the response. The Network panel built into most browsers shows this metric; the shorter the TTFB, the sooner the user starts to see page content and the better the experience. It is easy to see that having the server buffer everything just to compute the response entity's length runs counter to keeping TTFB short. But in an HTTP message the entity must come after the headers, and the order cannot be reversed, so we need a new mechanism: one that lets the receiver find the boundary of the entity without relying on a length in the headers.
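As a rough illustration (my own sketch, not part of the original article), TTFB can be approximated from Node itself by timing from the moment the request is made until the first byte of the response body arrives:

const http = require('http');

const start = process.hrtime.bigint();
http.get('http://127.0.0.1:9090/', function (res) {
  res.once('data', function () {
    // Elapsed nanoseconds converted to milliseconds at the first body byte.
    const ms = Number(process.hrtime.bigint() - start) / 1e6;
    console.log('TTFB ~ ' + ms.toFixed(1) + ' ms');
    res.resume();                                  // drain the rest of the body
  });
});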

Transfer-Encoding: chunked

Now the protagonist finally reappears: Transfer-Encoding exists precisely to solve the problem above. Historically, Transfer-Encoding could take several values, and a header named TE was introduced to negotiate which transfer encoding to use. However, the latest HTTP specification defines only one transfer encoding: chunked.

Chunked encoding is fairly simple. After adding Transfer-Encoding: chunked to the headers, the message's entity is transferred as a series of chunks. Each chunk consists of a hexadecimal length value and the data: the length value occupies its own line and does not include the CRLF (\r\n) that terminates it, nor the CRLF at the end of the chunk data. The last chunk must have a length of 0 and no data, indicating the end of the entity. Let's modify the previous code to follow this format:

require('net').createServer(function (sock) {
  sock.on('data', function (data) {
    sock.write('HTTP/1.1 200 OK\r\n');
    sock.write('Transfer-Encoding: chunked\r\n');
    sock.write('\r\n');

    sock.write('b\r\n');            // "b" is hexadecimal, i.e. 11 in decimal
    sock.write('01234567890\r\n');

    sock.write('5\r\n');
    sock.write('12345\r\n');

    sock.write('0\r\n');
    sock.write('\r\n');
  });
}).listen(9090, '127.0.0.1');

In this example, I indicate in the response headers that the entity that follows uses chunked encoding, then output 11 bytes of content, then 5 more bytes, and finally use a zero-length chunk to signal that the data is complete. Accessing this service in a browser gives the correct result. As you can see, this simple chunking strategy solves the problem raised earlier nicely.
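To see the chunked framing on the wire, here is a minimal sketch of a raw client I added for illustration (not part of the original article); it prints the response bytes with the CRLFs made visible:

const net = require('net');

const client = net.connect(9090, '127.0.0.1', function () {
  client.write('GET / HTTP/1.1\r\nHost: 127.0.0.1\r\n\r\n');
});
client.on('data', function (data) {
  // JSON.stringify escapes \r and \n, so the chunk length lines and boundaries are easy to see.
  console.log(JSON.stringify(data.toString()));
});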

As mentioned earlier, Content-Encoding and Transfer-Encoding are often used together: the entity is first content-encoded (for example with gzip), and the encoded result is then transferred in chunks. Below is the response I captured with telnet when requesting my test page; the chunked content is gzip-encoded:

shell> telnet 106.187.88.156 80

GET /test.php HTTP/1.1
Host: qgy18.imququ.com
Accept-Encoding: gzip

HTTP/1.1 200 OK
Server: nginx
Date: Sun, 17:25:23 GMT
Content-Type: text/html
Transfer-Encoding: chunked
Connection: keep-alive
Content-Encoding: gzip

1f
?H???W(?/?I?J0

You can also see similar results with Fiddler, the HTTP packet-capture tool; interested readers can try it themselves.
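Putting the two encodings together, here is a minimal sketch (my own extension of the article's toy server, not something from the original) that gzips the body first (Content-Encoding) and then sends the compressed bytes as a single chunk followed by the terminating zero-length chunk (Transfer-Encoding: chunked):

const zlib = require('zlib');

require('net').createServer(function (sock) {
  sock.on('data', function (data) {
    const gzipped = zlib.gzipSync('Hello world!');
    sock.write('HTTP/1.1 200 OK\r\n');
    sock.write('Content-Type: text/plain\r\n');
    sock.write('Content-Encoding: gzip\r\n');
    sock.write('Transfer-Encoding: chunked\r\n');
    sock.write('\r\n');
    sock.write(gzipped.length.toString(16) + '\r\n');  // chunk size in hexadecimal
    sock.write(gzipped);
    sock.write('\r\n');
    sock.write('0\r\n');                               // terminating zero-length chunk
    sock.write('\r\n');
  });
}).listen(9090, '127.0.0.1');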

============================

Note:

1) To check whether gzip compression configured on the server side is actually taking effect, just look for Content-Encoding: gzip in the response headers;

2) The purpose of Transfer-Encoding (transfer coding): it lets the receiver find the end of the entity without a Content-Length, at the cost of a slightly larger transfer.
