Two headers in HTTP/1.1 are directly related to encoding: Content-Encoding (an entity header) and Transfer-Encoding (a general header).
Start with Content-Encoding, which indicates the encoding that has been applied to the entity. Content-Encoding is a property of the entity that the request URL identifies. For example, when the request URL is http://host/image.png.gz, the Content-Encoding value may be gzip. Content-Encoding values are case-insensitive, and the current HTTP/1.1 standard defines gzip, compress, deflate, identity, and so on.
Corresponding to the Content-Encoding header, an HTTP request can carry an Accept-Encoding header that describes which encodings the user agent (typically the browser) can accept. If the header is absent from the HTTP request, the server may assume that the user agent accepts any encoding.
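The negotiation described above can be sketched in Python. This is a minimal illustration, not a full implementation of the standard: the function name pick_encoding and the default list of server-supported codings are assumptions made for the example, and only the q parameter is parsed.

```python
def pick_encoding(accept_encoding, supported=("gzip", "deflate")):
    """Return the supported coding the client prefers, or None to send identity.

    `supported` is a hypothetical list of codings this server can produce.
    """
    if accept_encoding is None:
        # Header absent: the server may assume any coding is acceptable.
        return supported[0]

    prefs = {}
    for item in accept_encoding.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, params = item.partition(";")
        params = params.strip()
        q = 1.0  # default weight when no q-value is given
        if params.lower().startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat an unparsable weight as "not acceptable"
        prefs[name.strip().lower()] = q  # coding names are case-insensitive

    best, best_q = None, 0.0
    for coding in supported:
        # "*" matches any coding that is not listed explicitly.
        q = prefs.get(coding, prefs.get("*", 0.0))
        if q > best_q:
            best, best_q = coding, q
    return best
```

For example, pick_encoding("gzip;q=0.2, deflate;q=0.8") selects deflate, while a header that lists only identity yields None, meaning the body is sent unencoded.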
Next is Transfer-Encoding, which indicates the encoding applied to the entity for the purpose of safe transport or data compression. Transfer-Encoding differs from Content-Encoding in two ways:
1. Transfer-Encoding applies only during transmission; it is not a property of the entity that the request URL identifies.
2. Transfer-Encoding is a "hop-by-hop" header, while Content-Encoding is an "end-to-end" header.
An example of what this header is for: the request URL is http://host/abc.txt, and the server compresses the file's data with gzip before sending it in order to save bandwidth. The receiver, seeing that Transfer-Encoding is gzip, decodes the data first and then has the requested entity.
In addition, several encodings may be applied to the same entity, so the order of the encodings in the Transfer-Encoding header matters: the header lists them in the order they were applied, and the receiver decodes them in reverse. Transfer-Encoding values are likewise case-insensitive, and the current HTTP/1.1 standard defines gzip, compress, deflate, identity, chunked, and so on.
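To illustrate why the order matters, here is a small Python sketch that undoes a list of codings in reverse. The names DECODERS and decode_body are assumptions for this example; chunked and compress are left out to keep it short, and deflate is treated as the zlib format.

```python
import gzip
import zlib

# Hypothetical lookup table: coding name -> function that undoes it.
DECODERS = {
    "gzip": gzip.decompress,
    "deflate": zlib.decompress,      # zlib format (RFC 1950)
    "identity": lambda data: data,   # no transformation
}

def decode_body(body, codings_header):
    """Undo the codings named in a header value such as "gzip, deflate"."""
    codings = [c.strip().lower() for c in codings_header.split(",") if c.strip()]
    # The header lists codings in the order they were applied,
    # so decoding has to walk the list backwards.
    for coding in reversed(codings):
        body = DECODERS[coding](body)
    return body
```

If the sender applied gzip first and deflate second, the header value is "gzip, deflate" and the receiver must undo deflate before gzip; decoding in the listed order would fail.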
Example message:
Server: Apache-Coyote/1.1
Cache-Control: no-store
Pragma: no-cache
Expires: Thu, 1970 00:00:00 GMT
Content-Type: text/html;charset=GBK
Transfer-Encoding: chunked
Content-Encoding: gzip
Vary: Accept-Encoding
Date: Mon, 2013 02:37:55 GMT

[binary body omitted: gzip-compressed data sent in chunks]
I. Introduction to the meaning of Transfer-Encoding
Sometimes a web server cannot determine the size of the message at the time it generates the HTTP response headers. In that case the server generally omits the Content-Length header and uses chunked encoding, which supplies the length of the body content dynamically.
An HTTP response sent with chunked encoding sets the following in the message header:
Transfer-Encoding: chunked
This indicates that the message body is transmitted using chunked encoding.
A chunked body is a concatenation of several chunks and ends with a chunk whose length is 0. Each chunk has two parts, a head and a body. The head gives the total number of bytes in the body that follows (as a hexadecimal number) and optionally a unit (usually omitted); the body is the actual content of the specified length. The two parts are separated by a carriage return and line feed (CRLF). The content after the final zero-length chunk is called the footer, which holds some additional header information (it can usually be ignored). The chunk encoding format is as follows:
Chunked-Body = *chunk
               "0" CRLF
               footer
               CRLF
chunk        = chunk-size [ chunk-ext ] CRLF
               chunk-data CRLF
hex-no-zero  = <HEX excluding "0">
Finally, here is a PHP version of the chunked decoding code:
$bodycontent = '';
// Each chunk starts with a line holding its size in hexadecimal.
$chunk_size = (int) hexdec(fgets($socket_fd, 4096));
while (!feof($socket_fd) && $chunk_size > 0) {
    $bodycontent .= fread($socket_fd, $chunk_size);
    fread($socket_fd, 2); // skip the \r\n that follows the chunk data
    $chunk_size = (int) hexdec(fgets($socket_fd, 4096));
}
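For comparison, the same format can be exercised in both directions on an in-memory byte string. The Python sketch below is illustrative only: the function names encode_chunked and decode_chunked and the default chunk size are assumptions, chunk extensions are skipped, and footer fields are ignored.

```python
def encode_chunked(data, chunk_size=8):
    """Split data into chunks: hex size line, CRLF, chunk data, CRLF."""
    out = bytearray()
    for i in range(0, len(data), chunk_size):
        piece = data[i:i + chunk_size]
        out += b"%x\r\n" % len(piece) + piece + b"\r\n"
    out += b"0\r\n\r\n"  # zero-length chunk, empty footer, final CRLF
    return bytes(out)

def decode_chunked(raw):
    """Reassemble the body from a complete chunked message held in memory."""
    body = bytearray()
    pos = 0
    while True:
        line_end = raw.index(b"\r\n", pos)
        # Drop any chunk extension after ";" and parse the hexadecimal size.
        size_field = raw[pos:line_end].split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            break  # last chunk; footer fields, if any, are ignored here
        body += raw[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(body)
```

Unlike the socket loop above, this version assumes the whole message has already been read, so it does not need to deal with short reads.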
II. Introduction to the meaning of Content-Encoding
Content-Encoding is a header of the HTTP protocol seen in responses; its general form is, for example:
Content-Encoding: gzip,deflate,compress
The note on Content-Encoding in the standard states that deflate refers to the zlib format described in RFC 1950. In other words, when Content-Encoding is deflate, the content should be in zlib format.
compress is said to be supported by Chrome, but I have not yet seen a web server that supports it.
The relationship between gzip, deflate, and zlib:
Deflate (RFC 1951): a compression algorithm that encodes data using LZ77 and Huffman coding;
Zlib (RFC 1950): a format that is a simple wrapper around deflate;
Gzip (RFC 1952): a format that is also a wrapper around deflate.
Deflate is therefore the core algorithm. The zlib and gzip formats differ only in their header and trailer; the actual content in both is deflate-encoded, namely:
gzip = gzip header (10 bytes) + DEFLATE-encoded actual content + gzip trailer (8 bytes)
[For an implementation of gzip, see GZIPOutputStream.java]
zlib = zlib header + DEFLATE-encoded actual content + zlib trailer
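The framing can be checked with Python's zlib module, where the wbits argument selects the wrapper: -15 for raw deflate, 15 for zlib, 31 for gzip. This is a small sketch; the helper name deflate_with and the sample data are assumptions. It also relies on two facts not stated above: without a preset dictionary the zlib header is 2 bytes and its trailer is a 4-byte Adler-32 checksum, and the 10-byte gzip header is the minimal one with no optional fields such as a file name.

```python
import zlib

def deflate_with(wbits, data):
    """Compress data with the wrapper selected by wbits."""
    comp = zlib.compressobj(6, zlib.DEFLATED, wbits)
    return comp.compress(data) + comp.flush()

data = b"hello hello hello hello " * 20

raw_framed = deflate_with(-15, data)   # raw deflate stream (RFC 1951)
zlib_framed = deflate_with(15, data)   # zlib wrapper (RFC 1950)
gzip_framed = deflate_with(31, data)   # gzip wrapper (RFC 1952)

# Strip the wrappers by hand and inflate what is left as raw deflate.
from_zlib = zlib.decompress(zlib_framed[2:-4], -15)     # 2-byte header, 4-byte trailer
from_gzip = zlib.decompress(gzip_framed[10:-8], -15)    # 10-byte header, 8-byte trailer
```

Both from_zlib and from_gzip come back equal to data, which shows that the payload between header and trailer is an ordinary deflate stream in each case.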
When accessing www.163.com, the response message contains a gzip header, while the response message from www.baidu.com does not.
Gzip appears to be well supported by everyone; the gzip header causes no problems.
(I have not verified the following.)
For deflate, that is, the zlib format:
IE cannot open the page in this case, including IE6, IE7, and IE8; it shows a blank page or an error. Other browsers such as Firefox, Chrome, and Opera open it normally. For IE to open the page, the content must be data in the raw deflate format, that is, with the zlib header and zlib trailer removed. I do not know why IE has not fixed this bug; such a simple problem is understandable in IE6, but it should not still appear in IE8.
To accommodate IE, the zlib header and zlib trailer have to be left off when compressing with deflate. Fortunately, other browsers also handle this raw deflate format correctly.
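A receiver that wants to accept both variants can try the zlib format first and fall back to raw deflate. The Python sketch below shows that idea under stated assumptions: the names raw_deflate and inflate_lenient are made up for the example, the fallback is a heuristic, and it is not a description of how any particular browser is implemented.

```python
import zlib

def raw_deflate(data):
    """Compress without the zlib header and trailer (raw deflate, RFC 1951)."""
    comp = zlib.compressobj(6, zlib.DEFLATED, -15)
    return comp.compress(data) + comp.flush()

def inflate_lenient(payload):
    """Decode a 'deflate' body whether or not it carries the zlib wrapper."""
    try:
        return zlib.decompress(payload)        # zlib-wrapped (RFC 1950)
    except zlib.error:
        return zlib.decompress(payload, -15)   # fall back to raw deflate
```

Calling inflate_lenient on either zlib.compress(data) or raw_deflate(data) returns the original data for typical compressible input.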