Transfer-Encoding in HTTP

Transfer-Encoding is an HTTP header field that literally means "transfer encoding". The HTTP protocol has another encoding-related header: Content-Encoding. Content-Encoding is usually used to compress the entity content so that it transfers more efficiently; for example, gzip-compressing a text file can greatly reduce its size. Content encoding is usually optional: it is typically not enabled for jpg/png files, because those formats are already highly compressed and compressing them again brings little benefit.
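To make the idea concrete, here is a minimal sketch of my own (not from the original article) showing gzip content encoding with Node's built-in zlib module; the port number and body text are arbitrary:

JS
// Minimal Content-Encoding sketch (my own illustration): gzip the body only
// when the client advertises support via Accept-Encoding.
const http = require('http');
const zlib = require('zlib');

http.createServer(function (req, res) {
    const body = 'hello world! '.repeat(100);   // repetitive text compresses well

    if (/\bgzip\b/.test(req.headers['accept-encoding'] || '')) {
        zlib.gzip(body, function (err, compressed) {
            if (err) { res.writeHead(500); return res.end(); }
            res.writeHead(200, {
                'Content-Type': 'text/plain',
                'Content-Encoding': 'gzip',
                'Content-Length': compressed.length
            });
            res.end(compressed);
        });
    } else {
        res.writeHead(200, { 'Content-Type': 'text/plain' });
        res.end(body);
    }
}).listen(9091, '127.0.0.1');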

Transfer-Encoding, on the other hand, is used to change the message format. It does not reduce the transmitted size of the entity content at all; it may even make the transfer slightly larger. So what is it for? That is the focus of this article. For now, just remember that Content-Encoding and Transfer-Encoding complement each other: a single HTTP message may well use both content encoding and transfer encoding.

 

Persistent Connection

Let's set Transfer-Encoding aside for a moment and look at another important concept in HTTP: the persistent connection. We know that HTTP runs on top of a TCP connection and therefore pays the same costs as TCP: the three-way handshake and slow start. To get the best possible HTTP performance, reusing connections is particularly important, so the HTTP protocol introduced a persistent connection mechanism.

The persistent connection mechanism was added to HTTP/1.0 only later. It is implemented through the Connection: keep-alive header: either the server or the client can use it to tell the other side that the TCP connection should not be closed after the data has been sent, so it can be reused for subsequent requests. HTTP/1.1 requires that all connections be persistent unless Connection: close is explicitly added to the header, so in HTTP/1.1 the Connection header field no longer needs the keep-alive value. For historical reasons, however, many web servers and browsers still keep the habit of sending Connection: keep-alive on HTTP/1.1 persistent connections.
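As a rough illustration of connection reuse (my own sketch, not part of the article; the paths and port are made up), Node's http module can be asked to keep connections alive through an Agent, and the reused local port shows that both requests travel over the same TCP connection:

JS
// Keep-alive sketch: one server, one client agent, two requests, one connection.
const http = require('http');

const server = http.createServer(function (req, res) {
    res.end('hello world!');            // the http module frames the response itself
});

server.listen(9090, '127.0.0.1', function () {
    const agent = new http.Agent({ keepAlive: true, maxSockets: 1 });

    function get(path, done) {
        http.get({ host: '127.0.0.1', port: 9090, path: path, agent: agent }, function (res) {
            console.log(path, 'via local port', res.socket.localPort);   // same port => same connection
            res.resume();
            res.on('end', done);
        });
    }

    get('/a', function () {
        get('/b', function () {          // reuses the idle persistent connection
            agent.destroy();
            server.close();
        });
    });
});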

The browser can reuse idle persistent connections that are already open, avoiding the slow three-way handshake and TCP's slow-start congestion adaptation. This sounds wonderful. To study the persistent connection feature thoroughly, I decided to write a very simple web server in Node for testing. Node provides the http module for quickly creating an HTTP web server, but I need more fine-grained control, so I used the net module to create a TCP server instead:

JS
require('net').createServer(function (sock) {
    sock.on('data', function (data) {
        sock.write('HTTP/1.1 200 OK\r\n');
        sock.write('\r\n');
        sock.write('hello world!');
        sock.destroy();   // close the TCP connection once the data is sent
    });
}).listen(9090, '127.0.0.1');

After the service is started, visiting 127.0.0.1:9090 in the browser correctly outputs the expected content, and everything works. Now remove sock.destroy() to make the connection persistent, restart the service, and try again. The result is a bit strange: no output appears for a long time, and the request status stays pending in the Network panel.

This is because, for a non-persistent connection, the browser can determine the boundary of the request or response entity by whether the connection is closed. For a persistent connection this obviously no longer works: in the example above, even though I have already sent all the data, the browser does not know that. It has no way of telling whether more data will arrive on the open connection, so it can only sit there and wait.

Content-Length

The easiest solution to this problem is to compute the entity length and tell the other side through a header, which is exactly what Content-Length is for. Let's revise the example above:

JS
require('net').createServer(function (sock) {
    sock.on('data', function (data) {
        sock.write('HTTP/1.1 200 OK\r\n');
        sock.write('Content-Length: 12\r\n');   // "hello world!" is 12 bytes
        sock.write('\r\n');
        sock.write('hello world!');
        // the connection is left open for reuse
    });
}).listen(9090, '127.0.0.1');

This time the TCP connection is not closed after the data is sent, yet the browser outputs the content and ends the request normally, because it can use the Content-Length information to determine that the response entity has ended. What happens if Content-Length does not match the actual entity length? If you are interested, try it yourself (a small sketch follows below). Normally, if Content-Length is shorter than the actual length, the content is truncated; if it is longer than the actual length, the request ends up pending.
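If you want to try the mismatch quickly, here is a small variation of the server above (my own sketch); the declared length is deliberately wrong:

JS
// Deliberately wrong Content-Length (my own sketch): the body is 12 bytes,
// but only 5 are declared, so the browser truncates the output to "hello".
// Declaring a value larger than 12 instead makes the request hang (pending).
require('net').createServer(function (sock) {
    sock.on('data', function (data) {
        sock.write('HTTP/1.1 200 OK\r\n');
        sock.write('Content-Length: 5\r\n');
        sock.write('\r\n');
        sock.write('hello world!');
    });
}).listen(9090, '127.0.0.1');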

Because Content-Length must accurately reflect the entity length, in practice the length is sometimes not so easy to obtain, for example when the entity comes from a network file or is generated by a dynamic language. In those cases, the only way to get an accurate length is to allocate a buffer large enough to hold everything and compute the length after all of the content has been generated. That means a larger memory overhead on the server and a longer wait for the client.
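As a sketch of what that buffering looks like in practice (my own example; it assumes a local file named page.html exists), the whole entity has to be held in memory before a single byte of it can be sent:

JS
// Buffer the entire entity first, solely to compute an accurate Content-Length.
const http = require('http');
const fs = require('fs');

http.createServer(function (req, res) {
    const chunks = [];
    fs.createReadStream('page.html')                         // stands in for dynamically produced content
        .on('data', function (chunk) { chunks.push(chunk); })   // everything kept in memory
        .on('end', function () {
            const body = Buffer.concat(chunks);
            res.writeHead(200, {
                'Content-Type': 'text/html',
                'Content-Length': body.length                // only known once generation is finished
            });
            res.end(body);                                   // the first byte leaves only now
        });
}).listen(9092, '127.0.0.1');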

When we optimize web performance, an important metric is TTFB (Time To First Byte), the time from the client sending a request until it receives the first byte of the response. Most browsers' Network panels show this metric. The shorter the TTFB, the sooner the user sees page content and the better the experience. Having the server buffer all the content just to compute the response entity length clearly runs counter to a short TTFB. Yet in an HTTP message the entity must come after the headers; the order cannot be reversed. So we need a new mechanism: one that lets the receiver find the entity boundary without relying on a length declared in the headers.

Transfer-Encoding: chunked

The protagonist of this article has finally appeared. Transfer-Encoding exists to solve exactly the problem above. Historically, Transfer-Encoding could take several values, so a header named TE was introduced to negotiate which transfer encoding to use. However, the latest HTTP specification defines only one transfer encoding: chunked.

Chunked encoding is quite simple. Once Transfer-Encoding: chunked is added to the header, the message uses chunked encoding, and the entity must be transmitted as a series of chunks. Each chunk consists of a hexadecimal length value and the data; the length value occupies its own line, and the length does not count the CRLF (\r\n) that ends the length line, nor the CRLF that ends the chunk data. The last chunk must have a length of 0 and carries no data, which signals that the entity has ended. Let's modify the previous code to follow this format:

JS
require('net').createServer(function (sock) {
    sock.on('data', function (data) {
        sock.write('HTTP/1.1 200 OK\r\n');
        sock.write('Transfer-Encoding: chunked\r\n');
        sock.write('\r\n');

        sock.write('b\r\n');              // chunk length in hex: 0xb = 11 bytes
        sock.write('01234567890\r\n');

        sock.write('5\r\n');
        sock.write('12345\r\n');

        sock.write('0\r\n');              // zero-length chunk: the entity is finished
        sock.write('\r\n');
    });
}).listen(9090, '127.0.0.1');

In the example above, the response header declares that the entity will use chunked encoding; then an 11-byte chunk is output, followed by a 5-byte chunk, and finally a zero-length chunk marks the end of the data. Accessing this service with a browser produces the correct result, so this simple chunking strategy effectively solves the problems described earlier.

As mentioned above, Content-Encoding and Transfer-Encoding are often used together: the chunked data carried by Transfer-Encoding is itself content-encoded. Below is the response I got by requesting a test page over telnet; the chunk content is gzip-encoded:

SHELL
> telnet 106.187.88.156 80
GET /test.php HTTP/1.1
Host: qgy18.imququ.com
Accept-Encoding: gzip

HTTP/1.1 200 OK
Server: nginx
Date: Sun, 03 May 2015 17:25:23 GMT
Content-Type: text/html
Transfer-Encoding: chunked
Connection: keep-alive
Content-Encoding: gzip

1f
�H���W(�/�I�J0

You can see a similar result with Fiddler, the well-known HTTP capture tool. If you are interested, try it yourself.
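To reproduce this combination yourself, here is a minimal sketch of my own (not the server behind the telnet example above; it assumes a local file named test.html exists): Node's http module switches to Transfer-Encoding: chunked automatically when no Content-Length is set, so piping a gzip stream into the response yields gzip-encoded chunks:

JS
// Content-Encoding (gzip) combined with Transfer-Encoding (chunked).
const http = require('http');
const zlib = require('zlib');
const fs = require('fs');

http.createServer(function (req, res) {
    res.writeHead(200, {
        'Content-Type': 'text/html',
        'Content-Encoding': 'gzip'
        // no Content-Length: Node adds Transfer-Encoding: chunked by itself
    });
    fs.createReadStream('test.html')    // assumed local file
        .pipe(zlib.createGzip())        // compress on the fly...
        .pipe(res);                     // ...and stream out chunk by chunk
}).listen(9093, '127.0.0.1');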
