HttpClient accept-encoding garbled

Source: Internet
Author: User

Workaround
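// Apache HttpClient 4.x classes: org.apache.http.HttpEntity, org.apache.http.util.EntityUtils,
// org.apache.http.client.entity.GzipDecompressingEntity / DeflateDecompressingEntity.
HttpEntity httpEntity = httpResponse.getEntity();
if (httpEntity != null) {
    // Wrap the entity in a decompressing entity that matches its Content-Encoding.
    if (httpEntity.getContentEncoding() != null) {
        if ("gzip".equalsIgnoreCase(httpEntity.getContentEncoding().getValue())) {
            httpEntity = new GzipDecompressingEntity(httpEntity);
        } else if ("deflate".equalsIgnoreCase(httpEntity.getContentEncoding().getValue())) {
            httpEntity = new DeflateDecompressingEntity(httpEntity);
        }
    }
    // Reading through the wrapper yields the decompressed bytes.
    htmlByte = EntityUtils.toByteArray(httpEntity);
}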
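
The same Content-Encoding check can be tried out without Apache HttpClient. The following is only a minimal sketch that swaps in the JDK's java.util.zip streams; the class BodyDecoder and its methods decode and gzip are hypothetical names introduced for illustration, and the deflate branch assumes the zlib-wrapped form of deflate.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical helper (not part of Apache HttpClient): decodes a response
// body according to the value of its Content-Encoding header.
final class BodyDecoder {

    static byte[] decode(byte[] body, String contentEncoding) {
        if (contentEncoding == null) {
            return body; // no Content-Encoding header: body is not compressed
        }
        String encoding = contentEncoding.trim().toLowerCase(Locale.ROOT);
        if (!encoding.equals("gzip") && !encoding.equals("deflate")) {
            return body; // unknown encoding: leave the bytes untouched
        }
        try (InputStream in = encoding.equals("gzip")
                ? new GZIPInputStream(new ByteArrayInputStream(body))
                : new InflaterInputStream(new ByteArrayInputStream(body))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Compresses bytes with gzip so the example has something to decode.
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] compressed = gzip("hello".getBytes(StandardCharsets.UTF_8));
        byte[] plain = decode(compressed, "gzip");
        System.out.println(new String(plain, StandardCharsets.UTF_8)); // prints "hello"
    }
}
```

Skipping this step and converting the compressed bytes straight to a string is what produces the garbled output.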

From: https://www.imququ.com/post/vary-header-in-http.html

Anyone who regularly captures packets and inspects HTTP requests will be familiar with the Vary response header. What is it for? When checking a page with the PageSpeed tool, you often see the hint "Specify a Vary: Accept-Encoding header". Why is that needed? This article records some of my research on Vary and answers these questions.

HTTP Content Negotiation

To understand what Vary does, first look at HTTP content negotiation. Sometimes the same URL can serve several different documents, so the server and client need a mechanism for choosing the most appropriate version. That mechanism is content negotiation.

There are two kinds of negotiation. In the first, the server sends the client a list of the available versions and lets the user choose, which can be done with the 300 Multiple Choices status code. This approach has several problems. It costs an extra network round trip, and some versions exist only for clients with particular technical capabilities, which ordinary users do not necessarily understand. For example, a server can usually deliver a static resource either compressed or uncompressed. The compressed version is obviously meant for clients that support compression, but a user asked to choose could easily pick the wrong one.

HTTP content negotiation therefore usually takes the other approach: the server automatically sends the most appropriate version based on certain fields in the client's request headers. Two kinds of request header fields can drive this mechanism: the dedicated content negotiation fields (the Accept fields) and other fields.

First, the Accept fields, which are described in the following table:

Request header    Description                                        Response header
Accept            Tells the server which media types to send         Content-Type
Accept-Language   Tells the server which languages to send           Content-Language
Accept-Charset    Tells the server which character sets to send      Content-Type
Accept-Encoding   Tells the server which compression methods to use  Content-Encoding

For example, the client sends the following request headers:

Accept: */*
Accept-Encoding: gzip,deflate,sdch
Accept-Language: zh-CN,en-US;q=0.8,en;q=0.6

These headers indicate that the client accepts resources of any MIME type, supports resources compressed with gzip, deflate, or sdch, and accepts three languages: zh-CN, en-US, and en. zh-CN has the highest weight (q values range from 0 to 1, and the default is 1), so the server should prefer returning the zh-CN version.
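
How the q values order the languages can be sketched in a few lines. This is only an illustration under simplifying assumptions: the class AcceptLanguage and its method preferred are hypothetical names, malformed q values are not handled, and wildcard tags get no special treatment.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper: orders the language tags of an Accept-Language
// header value from highest to lowest q weight.
final class AcceptLanguage {

    static List<String> preferred(String headerValue) {
        List<Map.Entry<String, Double>> items = new ArrayList<>();
        for (String part : headerValue.split(",")) {
            String[] pieces = part.trim().split(";");
            String tag = pieces[0].trim();
            if (tag.isEmpty()) {
                continue;
            }
            double q = 1.0; // default weight when no q parameter is given
            for (int i = 1; i < pieces.length; i++) {
                String param = pieces[i].trim();
                if (param.startsWith("q=")) {
                    q = Double.parseDouble(param.substring(2));
                }
            }
            items.add(new AbstractMap.SimpleEntry<String, Double>(tag, q));
        }
        // List.sort is stable, so tags with equal weight keep header order.
        items.sort((a, b) -> Double.compare(b.getValue(), a.getValue()));
        List<String> tags = new ArrayList<>();
        for (Map.Entry<String, Double> item : items) {
            tags.add(item.getKey());
        }
        return tags;
    }

    public static void main(String[] args) {
        // prints [zh-CN, en-US, en]
        System.out.println(preferred("zh-CN,en-US;q=0.8,en;q=0.6"));
    }
}
```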

The server's response headers might look like this:

Content-Type: text/javascript
Content-Encoding: gzip

The exact MIME type of this document is text/javascript, and its content is gzip-compressed. The response has no Content-Language field, which usually means that the language of the returned version is the one with the highest weight in the request's Accept-Language header.

Sometimes the four Accept fields above are not enough. For example, to serve different content to a particular browser such as IE6, the server needs the User-Agent field of the request. Similarly, the Cookie request header may also be used by the server as a basis for differentiated output.

There may be one or more intermediaries (such as cache servers) between the client and the server, and the most basic requirement for a cache is that it returns the correct document to the user. If the server returns different content for different User-Agent values, and a cache server stores the response sent to an IE6 user and then returns it to users of other browsers, something is bound to go wrong.

The HTTP protocol therefore specifies that if the content a server provides depends on a request header outside the regular Accept negotiation fields, such as User-Agent, the response must include a Vary header whose value contains User-Agent. Similarly, if the server uses both the User-Agent and Cookie request headers to generate content, the Vary header in the response should look like this:

Vary: User-Agent, Cookie

In other words, the Vary field lists request header fields and tells cache servers how to store and select the appropriate version when the same URL yields different versions of a document.
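
One way to picture what a cache does with Vary is a lookup key built from the URL plus the values of the request headers that Vary names. The sketch below is a toy model, not how any particular cache server is implemented: the class VaryCache and its methods are hypothetical, request header names are assumed to be already lowercased, and expiry and validation are ignored.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical toy cache that keeps one stored body per combination of
// URL and the request header values named in the response's Vary header.
final class VaryCache {
    private final Map<String, String> varyByUrl = new HashMap<>();
    private final Map<String, String> bodyByKey = new HashMap<>();

    // Assumes the keys of requestHeaders are lowercase header names.
    static String cacheKey(String url, String vary, Map<String, String> requestHeaders) {
        StringBuilder key = new StringBuilder(url);
        if (vary != null) {
            for (String name : vary.split(",")) {
                String header = name.trim().toLowerCase(Locale.ROOT);
                if (!header.isEmpty()) {
                    key.append('|').append(header).append('=')
                       .append(requestHeaders.getOrDefault(header, ""));
                }
            }
        }
        return key.toString();
    }

    void put(String url, Map<String, String> requestHeaders, String vary, String body) {
        varyByUrl.put(url, vary);
        bodyByKey.put(cacheKey(url, vary, requestHeaders), body);
    }

    // Returns the cached body, or null when no stored version matches.
    String get(String url, Map<String, String> requestHeaders) {
        if (!varyByUrl.containsKey(url)) {
            return null;
        }
        return bodyByKey.get(cacheKey(url, varyByUrl.get(url), requestHeaders));
    }

    public static void main(String[] args) {
        Map<String, String> ie6 = Collections.singletonMap("user-agent", "IE6");
        Map<String, String> chrome = Collections.singletonMap("user-agent", "Chrome");
        VaryCache cache = new VaryCache();
        cache.put("/page", ie6, "User-Agent", "ie6-body");
        System.out.println(cache.get("/page", ie6));    // prints ie6-body
        System.out.println(cache.get("/page", chrome)); // prints null
    }
}
```

Without the Vary value in the key, the second lookup would return the IE6 response to the Chrome user, which is exactly the failure described above.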

Buggy cache servers

Now back to the PageSpeed hint "Specify a Vary: Accept-Encoding header". According to the explanation above, Accept-Encoding is a dedicated content negotiation field. The server only needs to add a Content-Encoding field to the response to indicate the compression format, and omitting Content-Encoding means the content is not compressed. A cache server should store different content for different Content-Encoding values and then return the most appropriate version based on the Accept-Encoding field of each request.

However, some buggy cache servers ignore the Content-Encoding response header and may return a cached compressed version to clients that do not support compression. There are two ways to avoid this:

    1. Set the Cache-Control response header to private, which tells intermediaries not to cache the response;
    2. Add a Vary: Accept-Encoding response header, which explicitly tells cache servers to store separate versions according to the value of the Accept-Encoding field.

The second option is the usual choice, because it still lets intermediaries cache the response.

For static resources such as CSS and JS, the server should always enable gzip whenever the client supports it, and should also output Vary: Accept-Encoding so that buggy cache servers do not return the wrong version to users.
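
The server-side rule described here can be sketched as follows. This is a simplified illustration, not the logic of any real web server: the class GzipNegotiation and its methods are hypothetical, only the gzip token is examined, the "*" wildcard is ignored, and malformed q values are not handled.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the negotiation for a compressible static resource.
final class GzipNegotiation {

    // True when the Accept-Encoding value lists gzip with a non-zero weight.
    static boolean acceptsGzip(String acceptEncoding) {
        if (acceptEncoding == null) {
            return false;
        }
        for (String part : acceptEncoding.split(",")) {
            String[] pieces = part.trim().split(";");
            if (!pieces[0].trim().equalsIgnoreCase("gzip")) {
                continue;
            }
            for (int i = 1; i < pieces.length; i++) {
                String param = pieces[i].trim();
                // q=0 means the client explicitly refuses this encoding.
                if (param.startsWith("q=") && Double.parseDouble(param.substring(2)) == 0.0) {
                    return false;
                }
            }
            return true;
        }
        return false;
    }

    static Map<String, String> responseHeaders(String acceptEncoding) {
        Map<String, String> headers = new LinkedHashMap<>();
        // Sent on both variants so caches keep them apart.
        headers.put("Vary", "Accept-Encoding");
        if (acceptsGzip(acceptEncoding)) {
            headers.put("Content-Encoding", "gzip");
        }
        return headers;
    }

    public static void main(String[] args) {
        // prints {Vary=Accept-Encoding, Content-Encoding=gzip}
        System.out.println(responseHeaders("gzip,deflate,sdch"));
        // prints {Vary=Accept-Encoding}
        System.out.println(responseHeaders(null));
    }
}
```

Vary is attached to the uncompressed response as well, because a cache that first stores the uncompressed variant also needs to know that other variants exist.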

Nginx and SPDY

Usually the web server takes care of all this for us. In Nginx, the following directive automatically adds Vary: Accept-Encoding to responses for which gzip is enabled:

gzip_vary on;

Checking a JS file on my blog with curl gives the following response headers:

$ curl --head https://www.imququ.com/.../xx.js
HTTP/1.1 200 OK
Server: nginx
Date: Tue, 31 Dec 2013 16:34:48 GMT
Content-Type: application/x-javascript
Content-Length: 66748
Last-Modified: Tue, 31 Dec 2013 14:30:52 GMT
Connection: keep-alive
Vary: Accept-Encoding
ETag: "52c2d51c-104bc"
Expires: Fri, 29 Dec 2023 16:34:48 GMT
Cache-Control: max-age=315360000
Strict-Transport-Security: max-age=31536000
Accept-Ranges: bytes

As you can see, the server correctly outputs Vary: Accept-Encoding, and everything is normal.

But in Chrome's network capture tool, the response headers look like this:

HTTP/1.1 200 OK
cache-control: max-age=315360000
content-encoding: gzip
content-type: application/x-javascript
date:
expires: Fri, 29 Dec 2023 16:35:27 GMT
last-modified: Tue, 31 Dec 2013 14:30:52 GMT
server: nginx
status: 200
strict-transport-security: max-age=31536000
version: HTTP/1.1

My blog supports the SPDY/2 protocol, and Chrome uses SPDY when visiting it, so the response headers above look a little unusual: the field names are lowercase, and there are extra fields such as status and version. These changes will be covered separately (note: see "Request/response headers in SPDY 3.1"). What is surprising is that Vary: Accept-Encoding is gone from the response, even though nothing changed on the server side.

SPDY requires clients to support compression, which means a SPDY server can enable compression directly without looking at the Accept-Encoding request header. The following description comes from the SPDY/2 protocol that Nginx supports:

User-agents are expected to support gzip and deflate compression. Regardless of the Accept-Encoding sent by the user-agent, the server may select gzip or deflate encoding at any time.

For clients that support SPDY, Vary: Accept-Encoding therefore serves no purpose, so Nginx simply removes it, which saves a little traffic. curl and other clients that do not support SPDY still use plain HTTP, so they see the regular response headers.

Whether Nginx's approach is appropriate has long been debated, and not every web server that supports SPDY does the same. For example, a JS file from the Google home page still carries Vary: Accept-Encoding even when fetched over SPDY:

HTTP/1.1 200 OK
status: 200 OK
version: HTTP/1.1
age: 25762
alternate-protocol: 443:quic
cache-control: public, max-age=31536000
content-encoding: gzip
content-length: 154614
content-type: text/javascript; charset=UTF-8
date: Tue, 31 Dec 2013 23:23:51 GMT
expires: Wed, 31 Dec 2014 23:23:51 GMT
last-modified: Mon, 16 Dec 2013 21:54:35 GMT
server: sffe
vary: Accept-Encoding
x-content-type-options: nosniff
x-xss-protection: 1; mode=block

In addition, both Chrome and Firefox currently support SPDY, but the PageSpeed extensions for Chrome and Firefox do not treat SPDY specially, so testing my blog with them still produces the hint "Specify a Vary: Accept-Encoding header", which is both funny and frustrating. The online version of PageSpeed has already updated its rules, and the extensions will presumably follow soon.

PS: Vary has many pitfalls in IE, so use it with great care. Plenty has been written about this online, for example Hax's earlier article on IE and the Vary header, which is worth reading.
