1. What is keep-alive mode?
HTTP works in a request/response pattern. In normal (non-keep-alive) mode, the client and server open a new connection for every request/response pair and close it as soon as the exchange completes (HTTP itself is a connectionless protocol). In keep-alive mode (also known as persistent connections or connection reuse), the connection between client and server stays open after a response, so subsequent requests to the same server do not need to establish a new connection.
In HTTP/1.0 keep-alive is off by default; the client must add "Connection: Keep-Alive" to the request headers to enable it. In HTTP/1.1 keep-alive is on by default and is turned off only by adding "Connection: close". Most browsers now use HTTP/1.1, so they request keep-alive connections by default; whether a connection is actually kept alive then depends on the server configuration.
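The defaults described above can be written down as a small decision function. This is only a sketch of the rule as stated in this section; the function name is_persistent and its parameters are made up for illustration and do not come from any library.

```python
def is_persistent(http_version, connection_header=None):
    """Return True if the connection should stay open after this message.

    http_version      -- "HTTP/1.0" or "HTTP/1.1" (hypothetical parameter)
    connection_header -- raw value of the Connection header, or None if absent
    """
    # The Connection header is a comma-separated list of case-insensitive tokens.
    tokens = {
        token.strip().lower()
        for token in (connection_header or "").split(",")
        if token.strip()
    }
    if "close" in tokens:
        # An explicit "close" always wins, whatever the protocol version.
        return False
    if http_version == "HTTP/1.1":
        # HTTP/1.1: persistent unless told otherwise.
        return True
    if http_version == "HTTP/1.0":
        # HTTP/1.0: persistent only when explicitly requested.
        return "keep-alive" in tokens
    return False
```

For example, is_persistent("HTTP/1.0") is False, while is_persistent("HTTP/1.0", "Keep-Alive") is True.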
2. Advantages of enabling Keep-alive
As the analysis above suggests, keep-alive mode is more efficient because it avoids the cost of repeatedly establishing and tearing down connections. RFC 2616 summarizes the advantages as follows:
By opening and closing fewer TCP connections, CPU time is saved in routers and hosts (clients, servers, proxies, gateways, tunnels, or caches), and memory used for TCP protocol control blocks can be saved in hosts.
HTTP requests and responses can be pipelined on a connection. Pipelining allows a client to make multiple requests without waiting for each response, allowing a single TCP connection to be used much more efficiently, with much lower elapsed time.
Network congestion is reduced by reducing the number of packets caused by TCP opens, and by allowing TCP sufficient time to determine the congestion state of the network.
Latency on subsequent requests is reduced since there is no time spent in TCP's connection opening handshake.
HTTP can evolve more gracefully, since errors can be reported without the penalty of closing the TCP connection. Clients using future versions of HTTP might optimistically try a new feature, but if communicating with an older server, retry with the old semantics after an error is reported.
RFC 2616 (p. 47) also states that a single-user client should not maintain more than 2 connections with any server or proxy, and that a proxy should use up to 2 * N connections to another server or proxy, where N is the number of simultaneously active users. These guidelines are intended to improve HTTP response times and avoid congestion (redundant connections do not make anything run faster).
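Connection reuse can be observed with nothing but the Python standard library. The sketch below starts a throwaway local server that replies with the client's source port, then sends several requests through one http.client.HTTPConnection. If every reply carries the same port, all requests travelled over the same TCP connection. The handler class and the function name are invented for this illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class ClientPortHandler(BaseHTTPRequestHandler):
    # Speaking HTTP/1.1 makes http.server keep the connection open by default.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        # The body is the TCP source port of the client for this connection.
        body = str(self.client_address[1]).encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        # Content-Length lets the client find the end of the body
        # without waiting for the connection to close.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def client_ports_over_one_connection(request_count):
    """Send request_count GETs over one connection; return the ports the server saw."""
    server = HTTPServer(("127.0.0.1", 0), ClientPortHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    conn = http.client.HTTPConnection(host, port, timeout=5)
    ports = []
    try:
        for _ in range(request_count):
            conn.request("GET", "/")
            response = conn.getresponse()
            ports.append(int(response.read()))
    finally:
        conn.close()        # the client closes first so the handler sees EOF
        server.shutdown()
        server.server_close()
    return ports
```

Calling client_ports_over_one_connection(3) should return a list of three identical port numbers, since no new TCP connection is opened between the requests.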
3. Back to our question: how is the length of the message content determined?
In keep-alive mode, how does the client know that it has received the whole response to a request (in other words, how does it know the server has finished sending)? The server no longer closes the connection after sending the data, so the client can no longer rely on reading EOF (-1) to detect the end of the response. (You could still wait for the connection to close, but that would be very inefficient.) There are two ways to tell.
3.1. Use the Content-Length message header field
As the name suggests, Content-Length gives the length of the entity body, and the client (or server) can use this value to decide whether all of the data has been received. But what if the message carries no Content-Length? And under what circumstances is it absent? Read on.
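To make the idea concrete, here is a minimal sketch of a reader that uses Content-Length to find the end of each response on a connection that stays open. It works on any binary file-like object (for a real socket, something like sock.makefile("rb")); the function name read_response and the sample bytes are made up for illustration, and error handling is deliberately thin.

```python
import io


def read_response(stream):
    """Read one response from a binary stream, using Content-Length for the body."""
    status_line = stream.readline().rstrip(b"\r\n").decode("latin-1")
    headers = {}
    while True:
        line = stream.readline()
        if line in (b"\r\n", b"\n", b""):
            break  # a blank line ends the header section
        name, _, value = line.decode("latin-1").partition(":")
        headers[name.strip().lower()] = value.strip()
    length = int(headers["content-length"])
    body = stream.read(length)  # read exactly the announced number of bytes
    if len(body) != length:
        raise EOFError("connection closed before the full body arrived")
    return status_line, headers, body


# Two responses back to back on the same "connection" (invented sample data).
wire = io.BytesIO(
    b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"
    b"HTTP/1.1 200 OK\r\nContent-Length: 3\r\n\r\nbye"
)
first = read_response(wire)
second = read_response(wire)
```

Because the reader stops after exactly Content-Length bytes, the second call starts cleanly at the next status line, with no EOF needed in between.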
3.2. Use the Transfer-Encoding message header field
When a client requests a static page or an image, the server knows the exact size of the content and tells the client how much data to expect through the Content-Length header field. For a dynamic page, however, the server cannot know the content size in advance, so it can send the data with "Transfer-Encoding: chunked" instead. In other words, when the server generates data and sends it to the client at the same time, it must use "Transfer-Encoding: chunked" rather than Content-Length.
Chunked encoding sends the data as a series of chunks. A chunked body is a concatenation of any number of chunks, terminated by a chunk whose length is 0. Each chunk has a head and a body. The head gives the number of bytes in the body as a hexadecimal number, optionally followed by chunk extensions (usually omitted); the body is the actual content, exactly that many bytes long. The two parts are separated by a carriage return and line feed (CRLF). After the final zero-length chunk comes the footer, which holds additional header fields (and can usually be ignored). The format of the chunked encoding is as follows:
    Chunked-Body   = *chunk
                     "0" CRLF
                     footer
                     CRLF

    chunk          = chunk-size [ chunk-ext ] CRLF
                     chunk-data CRLF

    hex-no-zero    = <HEX excluding "0">

    chunk-size     = hex-no-zero *HEX
    chunk-ext      = *( ";" chunk-ext-name [ "=" chunk-ext-val ] )
    chunk-ext-name = token
    chunk-ext-val  = token | quoted-string
    chunk-data     = chunk-size(OCTET)

    footer         = *entity-header
That is, a chunked body consists of four parts: (1) zero or more chunks, (2) "0" CRLF, (3) the footer, and (4) CRLF. Each chunk in turn consists of chunk-size, chunk-ext (optional), CRLF, chunk-data, and CRLF.
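The chunk layout above translates fairly directly into code. The following sketch shows a simplified encoder and decoder: the encoder never emits chunk extensions or footer fields, and the decoder skips both. The names encode_chunked and decode_chunked are invented for this example.

```python
import io


def encode_chunked(pieces):
    """Encode an iterable of byte strings as a chunked body."""
    out = bytearray()
    for piece in pieces:
        if piece:  # an empty piece would look like the terminating chunk
            out += b"%x\r\n" % len(piece) + piece + b"\r\n"
    out += b"0\r\n"  # last chunk, length 0
    out += b"\r\n"   # empty footer, then the final CRLF
    return bytes(out)


def decode_chunked(stream):
    """Read one chunked body from a binary stream and return the decoded bytes."""
    body = bytearray()
    while True:
        size_line = stream.readline()
        if not size_line:
            raise EOFError("stream ended inside a chunked body")
        # chunk-size is hexadecimal; anything after ";" is a chunk extension.
        size = int(size_line.split(b";", 1)[0].strip(), 16)
        if size == 0:
            break
        data = stream.read(size)
        if len(data) != size:
            raise EOFError("stream ended inside a chunk")
        body += data
        if stream.readline() != b"\r\n":
            raise ValueError("chunk data not followed by CRLF")
    # Skip optional footer lines up to the blank line that ends the body.
    while stream.readline() not in (b"\r\n", b""):
        pass
    return bytes(body)
```

A round trip such as decode_chunked(io.BytesIO(encode_chunked([b"Wiki", b"pedia"]))) returns the original bytes, and the stream is left positioned right after the body, ready for the next message on the same connection.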
4. Summary of message length
Both methods above come down to the same question: how to determine the length of an HTTP message. RFC 2616 summarizes message length as follows. The transfer-length of a message is the length of the message-body as it appears in the message, that is, after any transfer-codings have been applied. When a message includes a message-body, its transfer-length is determined by one of the following (in order of precedence, highest first):
Any response message that must not include a message-body (such as the 1xx, 204, and 304 responses and any response to a HEAD request) is always terminated by the first empty line (CRLF) after the header fields.
If a Transfer-Encoding header field is present and has any value other than "identity", the transfer-length is defined by the "chunked" transfer-coding, unless the message is terminated by closing the connection.
If a Content-Length header field is present, its value represents both the entity-length and the transfer-length. If the two lengths differ (that is, if a Transfer-Encoding header field is present), the Content-Length header field must not be sent. If a message is received with both a Transfer-Encoding and a Content-Length header field, the Content-Length field must be ignored.
If the message uses the media type "multipart/byteranges" and the transfer-length is not otherwise specified, this self-delimiting media type defines the transfer-length. The type must not be used unless the sender knows that the recipient can parse it.
The server closes the connection, and that determines the message length. (Closing the connection cannot be used to mark the end of a request body, because the server would then be unable to send a response back to the client.)
For compatibility with HTTP/1.0 applications, an HTTP/1.1 request that contains a message-body must include a valid Content-Length header field unless the server is known to be HTTP/1.1 compliant. If a request contains a message-body and no Content-Length is given, the server should respond with 400 (Bad Request) if it cannot determine the length of the message, or with 411 (Length Required) if it insists on receiving a valid Content-Length.
All HTTP/1.1 applications that receive entities must accept the "chunked" transfer-coding, which allows this mechanism to be used when the message length cannot be determined in advance. A message must not include both a Content-Length header field and a non-identity transfer-coding. If a message does include a non-identity transfer-coding, the Content-Length must be ignored.
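The order of precedence summarized in this section can be sketched as a single function that decides how the body of a response is delimited. This is an illustration of the rules as listed here, not a complete implementation; the function name body_framing, its return labels, and the convention that header names arrive lowercased in a dict are all assumptions made for the example.

```python
def body_framing(method, status, headers):
    """Decide how a response body is delimited.

    method  -- request method the response answers, e.g. "GET" or "HEAD"
    status  -- response status code as an int
    headers -- dict of response headers with lowercase names (an assumption)

    Returns (kind, length); length is an int only when it is known up front.
    """
    # 1. Responses that never carry a body end at the blank line after the headers.
    if method.upper() == "HEAD" or 100 <= status < 200 or status in (204, 304):
        return ("none", 0)
    # 2. Any Transfer-Encoding other than "identity" means chunked framing,
    #    and it takes priority over Content-Length.
    coding = headers.get("transfer-encoding", "identity").strip().lower()
    if coding != "identity":
        return ("chunked", None)
    # 3. Otherwise Content-Length gives the exact number of body bytes.
    if "content-length" in headers:
        return ("content-length", int(headers["content-length"]))
    # 4. multipart/byteranges delimits itself.
    if headers.get("content-type", "").lower().startswith("multipart/byteranges"):
        return ("multipart/byteranges", None)
    # 5. Last resort: the body runs until the server closes the connection.
    return ("until-close", None)
```

For instance, a 200 response carrying both "transfer-encoding: chunked" and "content-length: 10" yields ("chunked", None), because Content-Length is ignored in that case.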
5. HTTP header field summary
Finally, here is a summary of the header fields of the HTTP protocol.
1. Accept: Tells the web server which media types the client accepts. */* means any type, type/* means all subtypes of that type, and type/sub-type names one specific type.
2. Accept-Charset: The browser declares the character sets it accepts.
Accept-Encoding: The browser declares the encodings it accepts, usually compression methods: whether compression is supported and which methods (gzip, deflate).
Accept-Language: The browser declares the languages it accepts. Language differs from character set: Chinese is a language, and Chinese has several character sets, such as Big5, GB2312, and GBK.
3. Accept-Ranges: The web server indicates whether it accepts requests for part of an entity (such as part of a file). bytes: accepted; none: not accepted.
4. Age: When a proxy server answers a request from its own cache, this header indicates how much time has passed since the entity was generated.
5. Authorization: After the client receives a WWW-Authenticate response from the web server, it uses this header to send its authentication information to the web server.
6. Cache-Control:
Request: no-cache (do not use a cached entity; fetch it from the web server now); max-age (accept only objects whose Age is less than the max-age value and that have not expired); max-stale (accept expired objects, as long as they have been expired for less than the max-stale value); min-fresh (accept only cached objects whose freshness lifetime is greater than the sum of their current Age and the min-fresh value).
Response: public (the cached content may be used to answer any user); private (the cached content may only be used to answer the user who originally requested it); no-cache (the content may be cached, but must be revalidated with the web server before it is returned to a client); max-age (the freshness lifetime of the object in this response).
Both: no-store (caching is not allowed).
7. Connection:
Request: close (tells the web server or proxy to close the connection after responding to this request, without waiting for further requests on it); Keep-Alive (tells the web server or proxy to keep the connection open after responding to this request and wait for further requests on it).
Response: close (the connection is closed); Keep-Alive (the connection stays open, waiting for further requests on it).
Keep-Alive: If the browser asks to keep the connection open, this header says how long (in seconds) it wants the web server to keep it open. Example: Keep-Alive: 300
8. Content-Encoding: The web server indicates which compression method (gzip, deflate) it used to compress the object in the response. Example: Content-Encoding: gzip
9. Content-Language: The web server tells the browser the language of the object in the response.
10. Content-Length: The web server tells the browser the length of the object in the response. Example: Content-Length: 26012
11. Content-Range: The web server indicates which part of the whole object the response contains. Example: Content-Range: bytes 21010-47021/47022
12. Content-Type: The web server tells the browser the type of the object in the response. Example: Content-Type: application/xml
13. ETag: A tag value for an object (such as a URL). Take an HTML file as an example: if the file is modified, its ETag changes as well. The ETag therefore plays a role similar to Last-Modified and mainly lets the web server determine whether an object has changed. For example, the browser receives an ETag the first time it requests an HTML file. The next time it requests that file, it sends the ETag it received earlier to the web server, which compares it with the file's current ETag and so knows whether the file has changed.
14. Expires: The web server indicates when the entity expires. An expired object may be used to answer client requests only after it has been revalidated with the web server. This is an HTTP/1.0 header. Example: Expires: Sat, 10:02:12 GMT
15. Host: The client specifies the domain name or IP address, and the port number, of the web server it wants to reach. Example: Host: rss.sina.com.cn
16. If-Match: The requested action is performed only if the object's ETag has not changed, which means the object itself has not changed.
17. If-None-Match: The requested action is performed only if the object's ETag has changed, which means the object itself has changed.
18. If-Modified-Since: If the requested object has been modified after the time given in this header, the requested action (such as returning the object) is performed; otherwise status code 304 is returned, telling the browser that the object has not been modified. Example: If-Modified-Since: Thu, Apr 09:14:42 GMT
19. If-Unmodified-Since: The requested action (such as returning the object) is performed only if the requested object has not been modified after the time given in this header.
20. If-Range: The browser tells the web server: if the object I requested has not changed, send me the part I am missing; if it has changed, send me the whole object. The browser lets the web server check this by sending either the ETag of the requested object or the last modification time it knows. Always used together with the Range header.
21. Last-Modified: The web server's view of when the object was last modified, such as the last modification time of a file or the last generation time of a dynamic page. Example: Last-Modified: Tue, May 02:42:43 GMT
22. Location: The web server tells the browser that the object it is trying to access has moved, and that it should fetch it from the location given in this header. Example: Location: http://i0.sinaimg.cn/dy/deco/2008/0528/sinahome_0803_ws_005_text_0.gif
23. Pragma: Mainly used as Pragma: no-cache, which is equivalent to Cache-Control: no-cache. Example: Pragma: no-cache
24. Proxy-Authenticate: The proxy server responds to the browser and requires it to provide proxy authentication information.
Proxy-Authorization: The browser answers the proxy server's authentication request and provides its own identity information.
25. Range: The browser (or a multi-threaded downloader such as FlashGet) tells the web server which part of the object it wants. Example: Range: bytes=1173546-
26. Referer: The browser tells the web server which page or URL the URL in the current request was obtained or clicked from. Example: Referer: http://www.sina.com/
27. Server: The web server indicates which software it runs and its version. Example: Server: Apache/2.0.61 (Unix)
28. User-Agent: The browser identifies itself (which browser it is). Example: User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14
29. Transfer-Encoding: The web server indicates how the response message body (not the object inside it) is encoded, for example whether it is chunked. Example: Transfer-Encoding: chunked
30. Vary: The web server uses this header to tell cache servers under what conditions the object in this response may be used to answer subsequent requests. Suppose the origin web server answers the first request with the headers Content-Encoding: gzip and Vary: Accept-Encoding. The cache server then examines each subsequent request and checks whether its Accept-Encoding matches that of the request that produced the cached response, that is, whether the same content encoding is acceptable. This keeps the cache server from returning a compressed entity in its cache to a browser that cannot decompress it. Example: Vary: Accept-Encoding
31. Via: Lists the proxies that lie between the client and the origin content server (OCS), in either direction, and the protocol (and version) each one uses to forward the request. When a client request reaches the first proxy server, that server adds a Via header with its own information to the request it sends on. When the next proxy receives that request, it copies the Via header from the previous proxy into its own request and appends its own information, and so on. When the OCS receives the request from the last proxy server, it can read the Via header to learn the route the request has taken. Example: Via: 1.0 236.d0707195.sina.com.cn:80 (squid/2.6.STABLE13)
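The conditional headers in the list above (ETag with If-None-Match, Last-Modified with If-Modified-Since) can be illustrated with a short server-side check. This is a simplified sketch: it does not distinguish weak from strong ETags, the function name conditional_status is invented, header names are assumed to be lowercased, and the dates in the usage lines are made-up sample values.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_status(request_headers, current_etag, last_modified):
    """Return 304 if the client's cached copy is still valid, otherwise 200.

    request_headers -- dict of request headers with lowercase names (assumption)
    current_etag    -- the object's current ETag, including its quotes
    last_modified   -- timezone-aware datetime of the last modification
    """
    if_none_match = request_headers.get("if-none-match")
    if if_none_match is not None:
        # If-None-Match may list several ETags, or "*" for "any version".
        tags = {tag.strip() for tag in if_none_match.split(",")}
        return 304 if "*" in tags or current_etag in tags else 200
    if_modified_since = request_headers.get("if-modified-since")
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            return 200  # unparseable date: ignore the condition
        if last_modified <= since:
            return 304
    return 200


# Made-up sample values for illustration.
modified = datetime(2008, 6, 1, 12, 0, 0, tzinfo=timezone.utc)
unchanged = conditional_status({"if-none-match": '"abc123"'}, '"abc123"', modified)
changed = conditional_status({"if-none-match": '"old"'}, '"abc123"', modified)
```

Here unchanged is 304 and changed is 200. The ETag check comes first because it is the more precise validator; the date is used only when no ETag was sent.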
===============================================================================
HTTP request header example:
Host: rss.sina.com.cn
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: zh-cn,zh;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: gb2312,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Cookie: userid= C5BYPXRIMDMSIQMSBPNE1VN8ZQMDWSM3WRLEB3VRWTNRTW    <-- Cookie
If-Modified-Since: Sun, June 12:05:30 GMT
Cache-Control: max-age=0

HTTP response header example:
Status: OK - 200    <-- response status code, indicating the result of the web server's processing
Date: Sun, 12:35:47 GMT
Server: Apache/2.0.61 (Unix)
Last-Modified: Sun, June 12:35:30 GMT
Accept-Ranges: bytes
Content-Length: 18616
Cache-Control: max-age=120
Expires: Sun, June 12:37:47 GMT
Content-Type: application/xml
Age: 2
X-Cache: HIT from 236-41.D07071951.sina.com.cn    <-- HTTP header used by the reverse proxy server
Via: 1.0 236-41.D07071951.sina.com.cn:80 (squid/2.6.STABLE13)
Connection: close
HTTP protocol keep-alive mode and HTTP header field summary