This article summarizes the concepts of long (persistent) connections and short connections in network programming.
Keywords: Keep-Alive, concurrent connection limit, TCP, HTTP
I. What is a long connection
HTTP/1.1 specifies that long connections are used by default (HTTP persistent connection). After a data transfer completes, the TCP connection is kept open (no RST packet is sent and no four-way close handshake takes place), waiting for further requests to the same domain name to reuse the channel. The opposite is a short connection.
The HTTP header Connection: keep-alive began as an experimental extension used by HTTP/1.0 browsers and servers. The current HTTP/1.1 specification, RFC 2616, does not specify it, because the behavior it asks for is already on by default and the header is no longer needed. In practice, though, browsers still send it with their requests. If an HTTP/1.1 request does not want a long connection, it adds Connection: close to the request header.
HTTP: The Definitive Guide points out that some old HTTP/1.0 proxies do not understand keep-alive, which makes long connections fail. Consider the chain client, proxy, server: the client sends keep-alive, the proxy does not recognize it and forwards the message to the server unchanged; the server answers with keep-alive, and the proxy forwards that to the client as well. Now neither the client-to-proxy connection nor the proxy-to-server connection is closed, but when the client sends a second request on the same connection, the proxy does not expect another request there and ignores it, so the long connection fails. The book also describes the fix: when the HTTP version is 1.0, ignore keep-alive, so that the client knows it should not use a long connection.
In practice there is usually no need to worry about this. Often the proxy is under our own control, for example an Nginx proxy, and the proxy server has its own long-connection handling, so the server needs no workaround. A common setup is that the client talks to the Nginx proxy over HTTP/1.1 with long connections, while the Nginx proxy talks to the back-end server over HTTP/1.0 with short connections.
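The default rules described above (HTTP/1.1 is persistent unless Connection: close is sent; HTTP/1.0 is persistent only when Connection: keep-alive is sent) can be sketched in a few lines of Python. The function name is_persistent is hypothetical, and the sketch only looks at the version string and the Connection header; a real implementation would also have to deal with proxies, as the text explains.

```python
def is_persistent(http_version, connection_header=None):
    """Return True if a message with this version and Connection header
    asks for the TCP connection to stay open after the exchange."""
    # The Connection header is a comma-separated, case-insensitive token list.
    tokens = {
        token.strip().lower()
        for token in (connection_header or "").split(",")
        if token.strip()
    }
    if "close" in tokens:
        return False          # explicit request for a short connection
    if http_version == "HTTP/1.1":
        return True           # persistent by default
    if http_version == "HTTP/1.0":
        return "keep-alive" in tokens  # opt-in extension
    return False              # unknown version: be conservative


if __name__ == "__main__":
    for version, header in [("HTTP/1.1", None), ("HTTP/1.1", "close"),
                            ("HTTP/1.0", None), ("HTTP/1.0", "Keep-Alive")]:
        print(version, header, is_persistent(version, header))
```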
In actual use, the presence of keep-alive in the HTTP header does not mean a long connection must be used. Both client and server may ignore it and not follow the standard. For example, when I write my own multi-threaded HTTP client to download files, I need not follow it: concurrent or consecutive GET requests each go over their own TCP connection, each connection carries exactly one GET, and the four-way TCP close follows immediately after the GET completes. The code is simpler this way, and although the HTTP header still says Connection: keep-alive, this cannot be called a long connection. Browsers and web servers normally do implement the standard, because the files they exchange are small and numerous, so keeping connections open is worthwhile to save the cost of reopening TCP connections.
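To make the difference between one GET per TCP connection and a reused connection concrete, here is a minimal sketch using only Python's standard library. It starts a throwaway server on the loopback address and counts how many distinct client ports the server sees. The handler class, the function name count_tcp_connections and the request count are assumptions made for illustration; they are not taken from any client discussed in this article.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets http.server keep connections open
    seen_ports = []                # client source port of every request

    def do_GET(self):
        CountingHandler.seen_ports.append(self.client_address[1])
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def count_tcp_connections(reuse, num_requests=3):
    """Send num_requests GETs and return how many TCP connections were used."""
    CountingHandler.seen_ports = []
    server = HTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    conn = None
    try:
        for _ in range(num_requests):
            if conn is None or not reuse:
                conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
            conn.request("GET", "/")
            conn.getresponse().read()
            if not reuse:
                conn.close()  # short connection: close after a single GET
    finally:
        if conn is not None:
            conn.close()
        server.shutdown()
        server.server_close()
    return len(set(CountingHandler.seen_ports))


if __name__ == "__main__":
    print("reused connection:", count_tcp_connections(reuse=True))
    print("one GET per connection:", count_tcp_connections(reuse=False))
```

In both modes the request headers look the same to the server; only the client's decision to keep or close the socket determines whether the connection is actually long.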
I previously used libcurl for uploads and downloads with short connections. A packet capture shows that (1) each TCP connection carries only one POST, and (2) the four-way close packets appear as soon as the data transfer completes. As long as curl_easy_cleanup is not called, the curl handle may remain valid and reusable. I say "may" because a connection has two sides: if the server closes it, the client cannot keep a long connection. With the WinHTTP library on Windows, when I close the handle right after a POST or GET, the TCP connection does not close immediately but lingers for a while. In this respect WinHTTP provides what keep-alive asks for: even without a Keep-Alive header, WinHTTP may add this TCP connection reuse itself, whereas other network libraries such as libcurl did not in my experience. I had observed before that WinHTTP does not tear down TCP connections promptly.
II. Long connection expiration time
A client's long connection cannot be held forever; there is a timeout. The server sometimes tells the client the timeout, for example Keep-Alive: timeout=20, which means the TCP connection may be kept for 20 seconds. There may also be max=XXX, which means the long connection is closed after at most XXX requests. It is also fine if the server does not announce a timeout: the server may start the four-way close itself, which tells the client that the TCP connection is no longer valid, and TCP also has heartbeat packets to check whether the connection is still alive. There are many ways to avoid wasting resources.
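As an illustration of how a client might read these hints, the sketch below parses a Keep-Alive header value such as timeout=20, max=100 into a dictionary. The function name parse_keep_alive and the value max=100 are hypothetical; parameters that are not plain integers are simply skipped.

```python
def parse_keep_alive(value):
    """Parse a Keep-Alive header value like 'timeout=20, max=100'
    into {'timeout': 20, 'max': 100}."""
    params = {}
    for part in value.split(","):
        name, sep, raw = part.strip().partition("=")
        raw = raw.strip()
        if sep and raw.isdigit():
            params[name.strip().lower()] = int(raw)
    return params


if __name__ == "__main__":
    hints = parse_keep_alive("timeout=20, max=100")
    # A client could drop its idle connection slightly before the timeout.
    print(hints.get("timeout"), hints.get("max"))
```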
III. Identifying the end of a transfer on a long connection
With a long connection, how do the client and server know that a transfer has finished? There are two cases: (1) check whether the amount of data received has reached the size given by Content-Length; (2) dynamically generated content has no Content-Length and is sent with chunked transfer encoding, so the end is determined from the chunked encoding: the data ends with an empty chunk, which marks the end of the transfer. A more detailed introduction can be seen in this article.
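The chunked case can be sketched as a small decoder. Each chunk is a hexadecimal size line, CRLF, the data, and another CRLF; a chunk of size 0 ends the body. The function name decode_chunked is hypothetical, and the sketch assumes the whole encoded body is already in memory and ignores trailers, which a real client reading from a socket could not assume.

```python
def decode_chunked(raw):
    """Decode a complete chunked-encoded body and return the payload bytes."""
    body = bytearray()
    pos = 0
    while True:
        line_end = raw.index(b"\r\n", pos)
        # The size line may carry extensions after ';', which are ignored here.
        size_field = raw[pos:line_end].split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            return bytes(body)   # empty chunk: end of the transfer
        body += raw[pos:pos + size]
        pos += size + 2          # skip the chunk data and its trailing CRLF


if __name__ == "__main__":
    encoded = b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"
    print(decode_chunked(encoded))
```

For the Content-Length case the check is simpler: the body is complete once the number of bytes received equals the announced length.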
IV. Limit on the number of concurrent connections
In web development you need to pay attention to the number of concurrent connections a browser opens. The RFC says that a client should keep at most two connections to a server, but servers and individual clients do not necessarily follow this. Some servers allow only one TCP connection per client at a time, which defeats multi-threaded downloading on the client (opening several TCP connections to the server and pulling data over all of them at once); other servers impose no limit. Browsers are more disciplined: according to an analysis on Zhihu (see the learning materials), they limit how many concurrent TCP connections can be opened to the same domain name to download resources. The concurrency limit is also related to long connections: when a web page is opened, the downloads of many resources may share only a few TCP connections, which is TCP connection reuse (long connections). A small number of concurrent connections means that downloading all the resources on the page takes longer (the user feels the page is slow to open), while a large number can produce a higher peak of resource consumption on the server. Browsers limit concurrent connections only per domain name, so web developers can put resources under different domain names, and on different machines, to work around the limit.
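A rough back-of-the-envelope model shows why spreading resources over several domain names helps. The sketch assumes every resource takes one equal time slot to download and that each connection carries one resource per slot; the function name download_rounds and the numbers used (24 resources, a limit of 6 connections per domain) are illustrative assumptions, not measured browser behavior.

```python
import math


def download_rounds(num_resources, per_domain_limit, num_domains=1):
    """Number of equal time slots needed to fetch all resources when the
    browser opens at most per_domain_limit connections to each domain."""
    parallel = per_domain_limit * num_domains
    return math.ceil(num_resources / parallel)


if __name__ == "__main__":
    print(download_rounds(24, 6))      # everything on one domain
    print(download_rounds(24, 6, 2))   # resources split over two domains
```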
V. Easily confused concepts: TCP keep-alive and HTTP keep-alive
TCP keep-alive checks whether the current TCP connection is still alive; HTTP keep-alive makes a TCP connection live longer. They are concepts at different layers. TCP keep-alive works like this: when a connection has carried no data for "a period of time", one side sends a heartbeat packet (keep-alive packet); if the other side replies, the connection is still valid and monitoring continues. This "period of time" is configurable. Settings for the WinHTTP library:
WINHTTP_OPTION_WEB_SOCKET_KEEPALIVE_INTERVAL
Sets the interval, in milliseconds, to send a keep-alive packet over the connection. The default interval is 30000 (30 seconds). The minimum interval is 15000 (15 seconds). Using WinHttpSetOption to set a value lower than 15000 will return with ERROR_INVALID_PARAMETER.
libcurl settings: http://curl.haxx.se/libcurl/c/curl_easy_setopt.html
CURLOPT_TCP_KEEPALIVE
Pass a long. If set to 1, TCP keepalive probes will be sent. The delay and frequency of these probes can be controlled by the CURLOPT_TCP_KEEPIDLE and CURLOPT_TCP_KEEPINTVL options, provided the operating system supports them. Set to 0 (default behavior) to disable keepalive probes. (Added in 7.25.0)
CURLOPT_TCP_KEEPIDLE
Pass a long. Sets the delay, in seconds, that the operating system will wait while the connection is idle before sending keepalive probes. Not all operating systems support this option. (Added in 7.25.0)
CURLOPT_TCP_KEEPINTVL
Pass a long. Sets the interval, in seconds, that the operating system will wait between sending keepalive probes. Not all operating systems support this option. (Added in 7.25.0)
CURLOPT_TCP_KEEPIDLE is how long the connection must be idle before the first heartbeat packet is sent; CURLOPT_TCP_KEEPINTVL is the interval between heartbeat packets. A packet capture of the heartbeat and the connection close shows the following: at about 44 seconds the client sends a heartbeat packet, the server responds promptly, and the TCP connection stays open. After the connection has been idle for 60 seconds, the server initiates the FIN packet and disconnects.
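The same idle delay and probe interval exist at the socket level, which is where the library options above end up. As a minimal sketch in Python, the helper below turns on TCP keep-alive for a socket and sets the two timers where the platform exposes them. The function name enable_tcp_keepalive and the default values of 60 and 10 seconds are assumptions for illustration; socket.TCP_KEEPIDLE and socket.TCP_KEEPINTVL are not available on every operating system, so the code checks for them first.

```python
import socket


def enable_tcp_keepalive(sock, idle_seconds=60, interval_seconds=10):
    """Enable TCP keep-alive probes on sock and, where supported,
    set the idle delay and the interval between probes."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Idle time before the first probe (counterpart of CURLOPT_TCP_KEEPIDLE).
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_seconds)
    # Time between probes (counterpart of CURLOPT_TCP_KEEPINTVL).
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_seconds)


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_tcp_keepalive(s)
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```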
VI. HTTP pipelining
One benefit of HTTP long connections (HTTP persistent connection) is that they make HTTP pipelining possible (also translated as pipelined connections). Pipelining means that within one TCP connection several HTTP requests can be in flight in parallel: the next HTTP request is sent before the response to the previous one has completed. According to the wiki, this technique is not widely used at present. It requires support on both the client and the server. Some browsers fully support it, and on the server side support only requires responding to HTTP requests in the correct order (that is, requests and responses in FIFO order); the wiki also notes that a server counts as supporting HTTP pipelining as long as it correctly handles requests from clients that pipeline.
Because the server must return responses in the same order as the client sent the requests, FIFO is required, and this easily causes head-of-line blocking: a slow response to the first request holds up the responses to the requests behind it. For this reason the performance gain from HTTP pipelining is not significant (the wiki mentions that HTTP 2.0 will solve this problem). In addition, only idempotent HTTP methods should be pipelined, because if something goes wrong the client does not know how far processing has got, and a retry could have unpredictable results. POST is not idempotent: the same message may be handled differently by the server the first and the second time it is posted.
The wiki on HTTP persistent connections mentions that the RFC's guideline of at most two connections per user rests on pipelining: if pipelining is implemented well, multiple connections do not improve performance. I agree: once concurrency is achieved within a single connection, multiple connections are unnecessary, unless the bottleneck is a per-connection resource limit that forces more connections to be opened to compete for resources.
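Head-of-line blocking under the FIFO rule can be illustrated with a toy model. Assume all pipelined requests arrive at time 0 and the server works on them in parallel, each response becoming ready after its own service time; because responses must go out in request order, a response cannot be delivered before the one ahead of it. The function name fifo_delivery_times and the sample service times are hypothetical.

```python
def fifo_delivery_times(service_times):
    """Given when each response becomes ready (in request order),
    return when each one can actually be delivered under FIFO ordering."""
    delivered = []
    previous = 0
    for ready_at in service_times:
        # A response waits for every response queued ahead of it.
        previous = max(previous, ready_at)
        delivered.append(previous)
    return delivered


if __name__ == "__main__":
    # One slow request at the head delays two fast ones behind it.
    print(fifo_delivery_times([5, 1, 1]))
    # With the slow request last, nothing is held up.
    print(fifo_delivery_times([1, 1, 5]))
```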
Browsers do not attach much importance to this technique at the moment; after all, the performance gain is limited.
This article is located at: http://www.cnblogs.com/cswuyg/p/3653263.html
VII. Learning materials
1. HTTP keep-alive mode: http://www.cnblogs.com/skynet/archive/2010/12/11/1903347.html
2. Browser's concurrent request limit: http://www.zhihu.com/question/20474326
3. RFC document, Connection section: http://tools.ietf.org/html/rfc2616#page-44
4. C++ TCP KeepAlive in network programming: http://blog.csdn.net/weiwangchao_/article/details/7225338
5. HTTP persistent connection: http://en.wikipedia.org/wiki/HTTP_persistent_connection
6. HTTP pipelining: http://en.wikipedia.org/wiki/HTTP_pipelining
7. Head-of-line blocking: http://en.wikipedia.org/wiki/Head-of-line_blocking
8. HTTP: The Definitive Guide, chapter 4, Connection Management
PS: 2014.7.27, second supplement: read chapter 4 of HTTP: The Definitive Guide to fill in the theory of long connections.
Long connections and short connections for HTTP