Long connections and short connections in the HTTP protocol (the keep-alive state)

What is a long connection

HTTP/1.1 specifies that connections are long connections by default (HTTP persistent connection, also translated as "persistent connection"): after a data transfer completes, the TCP connection is kept open (no RST packet, no four-way close handshake), waiting for subsequent requests to the same host to reuse the same channel. The opposite is a short connection.
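
As a minimal sketch of this reuse (assuming a reachable host, here the hypothetical example.com), Python's http.client can send several requests over one HTTPConnection object, i.e. one underlying TCP connection:

```python
import http.client

# Minimal sketch, assuming example.com is reachable and keeps the connection
# alive: with HTTP/1.1 the same TCP connection (one HTTPConnection object
# here) can carry several request/response exchanges in sequence.
conn = http.client.HTTPConnection("example.com", 80)  # hypothetical host

for path in ("/", "/index.html"):
    conn.request("GET", path)
    resp = conn.getresponse()
    body = resp.read()          # the body must be drained before reusing the socket
    print(path, resp.status, len(body), "bytes")

# The four-way teardown only happens here (or when the server times out the
# idle connection). If the server closed earlier, http.client reconnects.
conn.close()
```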

The HTTP header Connection: keep-alive began as an experimental extension negotiated between HTTP/1.0 browsers and servers. The current HTTP/1.1 specification, RFC 2616, does not describe it, because the behaviour it asks for is already enabled by default and the header is no longer needed; in practice, though, you will still see browsers send it with their requests. If an HTTP/1.1 request does not want a long connection, add Connection: close to the request header.

"HTTP: The Definitive Guide" mentions that some old HTTP/1.0 proxies do not understand keep-alive, which breaks long connections between client, proxy and server: the client sends keep-alive, the proxy does not understand it and forwards the message to the server unchanged, the server responds with keep-alive, and the proxy forwards that back to the client. As a result neither the "client-to-proxy" connection nor the "proxy-to-server" connection is closed, but when the client sends a second request, the proxy does not expect any further requests on that connection and ignores it, so the long connection fails. The book also describes the workaround: when the peer is found to speak HTTP/1.0, ignore keep-alive, so the client knows it should not use a long connection. In practice you rarely need to worry about this, because the proxy is usually under your own control, for example an Nginx proxy: the proxy server has its own long-connection handling logic and the backend needs no special patching. A common setup is that the client talks to the Nginx proxy server over HTTP/1.1 with long connections, while the Nginx proxy server talks to the backend server over HTTP/1.0 with short connections.
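
A quick way to observe the Connection: close behaviour is a hand-written HTTP/1.1 request over a raw socket. The sketch below (hypothetical target example.com) opts out of the default persistent connection and then reads until the server closes the socket:

```python
import socket

# Minimal sketch, assuming a reachable HTTP server at example.com:80: send an
# HTTP/1.1 request that opts out of the default persistent connection by
# adding "Connection: close", then read until the server closes the socket.
HOST = "example.com"  # hypothetical target, replace as needed

request = (
    "GET / HTTP/1.1\r\n"
    f"Host: {HOST}\r\n"
    "Connection: close\r\n"   # ask the server to close after this response
    "\r\n"
)

with socket.create_connection((HOST, 80)) as sock:
    sock.sendall(request.encode("ascii"))
    chunks = []
    while True:
        data = sock.recv(4096)
        if not data:          # empty read: the server has closed the connection
            break
        chunks.append(data)

response = b"".join(chunks)
print(response.split(b"\r\n\r\n", 1)[0].decode("latin-1"))  # status line + headers
```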

In actual use, the presence of keep-alive in the HTTP header does not mean a long connection must be used; both the client and the server may ignore it, that is, not follow the standard. For example, if I write my own multi-threaded HTTP client to download a file, I do not have to follow it: the concurrent or consecutive GET requests each go over a separate TCP channel, each TCP channel carries exactly one GET, and immediately after the GET completes the TCP connection is closed with a four-way teardown. The code is simpler this way, and even though the HTTP header still says Connection: keep-alive, it cannot be called a long connection. Under normal circumstances, browser clients and web servers do implement this standard, because web pages consist of many small files, and keeping the connection alive to avoid the cost of repeatedly opening TCP connections is worthwhile.

I previously used libcurl for uploads and downloads as short connections, and a packet capture shows: 1) each TCP channel carries only one POST; 2) after the data transfer completes you can see the four-way teardown packets. As long as you do not call curl_easy_cleanup, curl's handle may stay valid and reusable. "May", because a connection has two ends: if the server closes it, my client side cannot maintain a long connection on its own. With the WinHTTP library on Windows, if I close the handle right after the POST/GET, the TCP connection is not closed immediately but lingers for a while; in this respect WinHTTP supports what keep-alive asks for. Even without keep-alive, the WinHTTP library may still add this TCP channel reuse feature, while other network libraries such as libcurl do not.
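
The "one GET per TCP connection" downloader described above might look like the following sketch, assuming a hypothetical server at example.com that honours Range requests:

```python
import http.client
from concurrent.futures import ThreadPoolExecutor

# Minimal sketch of the "short connection" style: every chunk is fetched on
# its own TCP connection, which is torn down right after the single GET,
# regardless of any keep-alive header. Host and path are made up.
HOST = "example.com"
PATH = "/bigfile.bin"   # hypothetical resource supporting Range requests

def fetch_range(start, end):
    conn = http.client.HTTPConnection(HOST, 80)       # fresh TCP connection
    conn.request("GET", PATH, headers={"Range": f"bytes={start}-{end}"})
    resp = conn.getresponse()
    data = resp.read()
    conn.close()                                       # four-way teardown right away
    return start, data

ranges = [(i * 65536, i * 65536 + 65535) for i in range(4)]
with ThreadPoolExecutor(max_workers=4) as pool:
    parts = dict(pool.map(lambda r: fetch_range(*r), ranges))

blob = b"".join(parts[start] for start, _ in ranges)
print(len(blob), "bytes downloaded over 4 separate TCP connections")
```
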
Expiration time for long connections


A response header such as Keep-Alive: timeout=20 indicates that the TCP channel can be kept open for 20 seconds. There may also be a max=xxx parameter, indicating that the long connection is closed after at most xxx requests. For the client it is fine even if the server does not announce a timeout: when the server initiates the four-way teardown of the TCP connection, the client knows the connection is no longer valid, and TCP itself has keepalive heartbeat packets to detect whether the current connection is still alive, so there are plenty of ways to avoid wasting resources.
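
A client can inspect this policy if the server advertises it. The sketch below (hypothetical host, and many servers simply omit the header) reads the Keep-Alive response header and pulls out the timeout and max parameters, e.g. "Keep-Alive: timeout=20, max=100":

```python
import http.client

# Minimal sketch: fetch one resource and look at the server's announced
# keep-alive policy, if any. example.com is a stand-in host.
conn = http.client.HTTPConnection("example.com", 80)
conn.request("GET", "/")
resp = conn.getresponse()
resp.read()

keep_alive = resp.getheader("Keep-Alive")     # None if the server omits it
params = {}
if keep_alive:
    for item in keep_alive.split(","):
        key, _, value = item.strip().partition("=")
        params[key.lower()] = value

print("Connection header:", resp.getheader("Connection"))
print("idle timeout (s):", params.get("timeout"), "| max requests:", params.get("max"))
conn.close()
```
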
Identifying the end of a data transfer on a long connection

After switching to long connections, how do the client and server know that a transfer has finished? There are two cases: 1) check whether the amount of data received has reached the size announced in Content-Length; 2) dynamically generated content has no Content-Length and is sent with chunked transfer encoding (Transfer-Encoding: chunked), in which case the end is determined from the chunked encoding itself: the data ends with an empty (zero-length) chunk, which marks the end of the body. A more detailed introduction can be found in this article.
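
The second case is easy to see in code. A minimal chunked-body decoder (the sample bytes are made up for illustration) keeps reading chunks until it hits the terminating zero-length chunk:

```python
import io

def decode_chunked(stream) -> bytes:
    """Decode a chunked-encoded body, stopping at the empty (zero-size) chunk."""
    body = bytearray()
    while True:
        size_line = stream.readline().strip()      # e.g. b"4" or b"0"
        size = int(size_line.split(b";")[0], 16)   # hex size; ignore chunk extensions
        if size == 0:                              # zero-length chunk: end of the body
            stream.readline()                      # final CRLF (trailers not handled here)
            return bytes(body)
        body += stream.read(size)
        stream.readline()                          # CRLF that follows each chunk's data

# Made-up sample body: "Wiki" + "pedia", then the terminating 0-length chunk.
sample = io.BytesIO(b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n")
print(decode_chunked(sample))                      # b'Wikipedia'
```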

Limit number of concurrent connections
In web development you need to pay attention to the number of concurrent browser connections. The RFC says a client should open at most two channels to the same server, but servers and hand-written clients do not necessarily comply: some servers allow only one TCP connection at a time, which stops a multi-threaded downloader (a client that opens several TCP channels to the server at once to pull data) from showing its strength, while other servers impose no limit at all. Browser clients are more disciplined and restrict how many concurrent TCP connections they open to download resources under the same domain name. The concurrency limit is also related to long connections: when a web page is opened, the download of its many resources may be spread over only a few TCP connections, which is exactly TCP channel reuse (long connections). If the number of concurrent connections is small, downloading all the resources on the page takes longer (the user feels the page opens slowly), but the peak resource consumption generated on the server is also lower. Browsers only limit concurrency per domain name, which means web developers can put resources under different domain names, and also put those resources on different machines, which solves the problem nicely.
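
On the client side, the per-host limit and connection reuse are usually handled by a connection pool. A minimal sketch, assuming the third-party requests library and a hypothetical host, caps the pool at two reusable connections per host, roughly mirroring the old per-domain guideline:

```python
import requests
from requests.adapters import HTTPAdapter

# Minimal sketch: cap the number of pooled TCP connections per host so that
# repeated requests to the same domain reuse at most two connections.
session = requests.Session()
adapter = HTTPAdapter(pool_connections=10,  # number of distinct host pools kept
                      pool_maxsize=2)       # at most 2 reusable connections per host
session.mount("http://", adapter)
session.mount("https://", adapter)

# Sequential requests to the same (hypothetical) host reuse pooled connections.
for path in ("/a.css", "/b.js", "/c.png"):
    resp = session.get(f"http://example.com{path}")   # hypothetical resources
    print(path, resp.status_code)

session.close()
```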

Easily confused concepts: TCP keepalive and HTTP keep-alive

TCP keepalive checks whether the current TCP connection is still alive; HTTP keep-alive makes a TCP connection stay alive longer. They are concepts at different layers.

How TCP keepalive behaves: when a connection has had no data traffic for "a period of time", one side sends a heartbeat packet (keepalive probe); if the peer replies, the current connection is still valid and monitoring continues. That "period of time" can be configured; the details are easy to look up.
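
As a minimal sketch of that configuration, the snippet below enables TCP keepalive on a client socket and tunes the probe timers; the TCP_KEEP* options shown are Linux-specific (other platforms expose different knobs), and the peer address is hypothetical:

```python
import socket

# Enable kernel-level keepalive probes on an otherwise idle TCP connection.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)   # turn on keepalive probes

if hasattr(socket, "TCP_KEEPIDLE"):                           # Linux-only constants
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 60)   # idle seconds before first probe
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 10)  # seconds between probes
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 3)     # failed probes before the kernel drops the connection

sock.connect(("example.com", 80))   # hypothetical peer
# ... the kernel now probes the idle connection and resets it if the peer is gone ...
sock.close()
```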

HTTP pipelining technology

One benefit of using HTTP long connections (HTTP persistent connections) is that HTTP pipelining (also translated as pipelined connections) becomes possible: within one TCP connection, several HTTP requests can be in flight at the same time, with the next request sent before the response to the previous one has completed. According to the Wikipedia article, the technique is not widely used at present. It requires support from both the client and the server; some browsers support it fully, while on the server side support only requires returning responses in the same order as the requests arrived (i.e. request & response in FIFO order). The wiki even points out that as long as a server can correctly handle requests from a client that uses HTTP pipelining, the server effectively supports pipelining.

Because the server must return response data in the same order as the client's requests, this FIFO requirement easily leads to head-of-line blocking: the response to the first request holds up all the requests behind it. For this reason the performance improvement from HTTP pipelining is not obvious (the wiki mentions that this problem is to be solved in HTTP/2.0). In addition, the technique should only be used with idempotent HTTP methods, because the client does not know how far the server has got when something goes wrong, and a retry could produce unpredictable results. The POST method is not idempotent: for the same message, the first POST and a second POST may behave differently on the server.

The HTTP persistent connection wiki brings up HTTP/1.1 pipelining when discussing the RFC guideline of at most two connections per user: with pipelining implemented well, multiple connections do not improve performance. I agree: concurrency is already available within a single connection, so multiple connections are unnecessary, unless resource limits on a single connection become the bottleneck and force opening more connections to grab resources. At present browsers do not pay much attention to this technique; after all, the performance improvement is limited.
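
A rough sketch of pipelining over a raw socket, assuming a hypothetical server at example.com that tolerates pipelined requests: both GETs are written back-to-back before any response is read, and the responses must come back in the same FIFO order:

```python
import socket

HOST = "example.com"   # hypothetical host

# Two requests concatenated; the second is sent before the first response arrives.
pipelined = (
    "GET / HTTP/1.1\r\nHost: {h}\r\n\r\n"
    "GET /robots.txt HTTP/1.1\r\nHost: {h}\r\nConnection: close\r\n\r\n"
).format(h=HOST)

with socket.create_connection((HOST, 80)) as sock:
    sock.sendall(pipelined.encode("ascii"))
    raw = b""
    while True:
        data = sock.recv(4096)
        if not data:
            break
        raw += data

# Rough check: both responses arrive on the same connection, first one first (FIFO).
status_lines = [line for line in raw.split(b"\r\n") if line.startswith(b"HTTP/1.1 ")]
print(len(status_lines), "responses on one TCP connection:", status_lines)
```
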
Reposted from: http://www.cnblogs.com/cswuyg/p/3653263.html
