This article summarizes the concepts of long connections and short connections as they come up in network programming.
Keywords: keep-alive, concurrent connection limit, TCP, HTTP
I. What is a long connection
HTTP/1.1 specifies that long connections are kept by default (HTTP persistent connection). After a data transfer completes, the TCP connection stays open (no RST packet is sent and no four-way close takes place), so later requests to the same domain name can keep using the same channel. The opposite is a short connection.
The HTTP header Connection: Keep-Alive was an experimental extension used by HTTP/1.0 browsers and servers. The current HTTP/1.1 document, RFC 2616, does not define it, because the functionality it asks for is on by default and the header no longer needs to be sent; in practice, though, browsers still include it in their requests. If an HTTP/1.1 request does not want a long connection, it adds Connection: close to the request headers.
"HTTP: The Definitive Guide" mentions that some old HTTP/1.0 proxies do not understand Keep-Alive, which makes long connections fail. Consider a client, a proxy and a server: the client sends Keep-Alive, the proxy does not recognize it and forwards the message unchanged to the server; the server answers with Keep-Alive, and the proxy forwards that to the client as well. Now neither the "client-to-proxy" connection nor the "proxy-to-server" connection is closed, but the proxy assumes no further request will arrive on the current connection, so when the client sends a second request the proxy ignores it and the long connection fails. The book also gives the fix: when the HTTP version of a message is 1.0, ignore Keep-Alive, so that the client knows it should not use a long connection.
In practice there is usually no need to worry about all this. Often the proxy is under our own control, for example an Nginx proxy, and the proxy server already contains the logic for handling long connections, so the back-end service needs no workaround. A common arrangement is that the client and the Nginx proxy use HTTP/1.1 with long connections, while the Nginx proxy and the back-end server use HTTP/1.0 with short connections.
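The default rules described above can be written down as a small decision function. This is only a sketch of the end-to-end rule (HTTP/1.1 is persistent unless Connection: close is present, HTTP/1.0 is persistent only when keep-alive is requested); the function name is_persistent is made up for illustration, and the old-proxy caveat is deliberately not modelled.

```python
def is_persistent(http_version: str, headers: dict) -> bool:
    """Return True if the connection should stay open after this message."""
    # Header names are case-insensitive; Connection holds comma-separated tokens.
    connection = ""
    for name, value in headers.items():
        if name.lower() == "connection":
            connection = value
    tokens = {t.strip().lower() for t in connection.split(",") if t.strip()}

    if http_version == "HTTP/1.1":
        # Persistent by default; only "Connection: close" turns it off.
        return "close" not in tokens
    # HTTP/1.0 (and anything else): short connection unless keep-alive is requested.
    return "keep-alive" in tokens and "close" not in tokens
```

For example, is_persistent("HTTP/1.1", {}) is true, while the same call with {"Connection": "close"} is false.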
A client that downloads a file with multiple threads does not have to follow this standard. Browsers, which issue concurrent requests, and web servers do follow it, because the files they exchange are small and numerous, and keeping long connections reduces the overhead of setting up new TCP connections again and again.
I previously used libcurl for uploads and downloads with short connections. A packet capture shows: 1. each TCP channel carries only one POST; 2. the four-way close packets appear once the data transfer is complete. As long as you do not call curl_easy_cleanup, the curl handle may remain valid and reusable. I say "may" because a connection has two sides: if the server closes it, the client cannot keep a long connection.
With the WinHTTP library on Windows, when I POST/GET data and then close the handle, the TCP connection is not closed immediately but lingers for a short while. Here the lower layer of WinHTTP provides what keep-alive needs: even without Keep-Alive, WinHTTP may add this kind of TCP channel reuse, whereas other network libraries such as libcurl do not. I had observed before that WinHTTP does not disconnect TCP connections promptly.
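The difference between reusing a connection and opening a new one per request can also be observed without a packet capture. The following is a loopback sketch using Python's standard library rather than libcurl or WinHTTP; the handler EchoPortHandler and the helper fetch_ports are hypothetical names. The server replies with the client's source port, so a reused TCP connection shows the same port every time.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class EchoPortHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets http.server keep the connection open

    def do_GET(self):
        # Reply with the client's source port so connection reuse is visible.
        body = str(self.client_address[1]).encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_ports(reuse: bool, count: int = 3) -> list:
    """Send `count` GETs to a local server and return the source ports it saw."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), EchoPortHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    ports = []
    try:
        conn = http.client.HTTPConnection(host, port, timeout=5)
        for _ in range(count):
            if not reuse:
                # Short connection: drop the old socket and open a new one.
                conn.close()
                conn = http.client.HTTPConnection(host, port, timeout=5)
            conn.request("GET", "/")
            # The body must be read completely before the socket can be reused.
            ports.append(conn.getresponse().read().decode())
        conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return ports
```

With reuse=True all requests travel over one TCP connection, so the returned list contains a single repeated port; with reuse=False each request opens its own connection.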
II. Long connection expiration time
A client cannot hold a long connection open forever; there is a timeout. The server sometimes tells the client what the timeout is, for example Keep-Alive: timeout=20, which means the TCP channel may be kept for 20 seconds. There may also be max=XXX, which means the long connection is closed after at most XXX requests. It is also fine if the server does not announce a timeout: the server can start the four-way close to tear down the TCP connection, and the client then knows that the connection is no longer valid. TCP also has heartbeat packets to detect whether a connection is still alive, so there are several ways to avoid wasting resources.
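A client that wants to honour these hints has to read them out of the header value first. A minimal sketch, assuming the value has the comma-separated name=number form shown above; parse_keep_alive is a hypothetical helper name, and parameters that are not plain integers are simply skipped.

```python
def parse_keep_alive(value: str) -> dict:
    """Parse a Keep-Alive header value such as 'timeout=20, max=100'."""
    params = {}
    for part in value.split(","):
        name, sep, raw = part.strip().partition("=")
        # Keep only well-formed name=integer pairs.
        if sep and raw.strip().isdigit():
            params[name.strip().lower()] = int(raw.strip())
    return params
```

The result for "timeout=20, max=100" is a dictionary with the keys timeout and max, which the client can use to decide when to stop reusing the connection.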
III. Identifying the end of data transmission on a long connection
With a long connection, how do the client and the server know that a transfer is over? There are two cases: 1. check whether the amount of data received has reached the size given by Content-Length; 2. dynamically generated content has no Content-Length and is sent with chunked transfer encoding, in which case the end is determined from the chunked encoding itself: the data ends with an empty (zero-length) chunk, which marks the end of the transfer. A more detailed introduction can be found in this article.
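Both cases can be sketched in one reader function. This is a simplified illustration, not a full HTTP parser: read_body is a hypothetical name, the stream is assumed to be a binary file-like object positioned just after the headers, and error handling for malformed input is omitted.

```python
import io


def read_body(stream, headers: dict) -> bytes:
    """Read one message body, framed by Content-Length or by chunked encoding."""
    lowered = {k.lower(): v for k, v in headers.items()}
    if "chunked" in lowered.get("transfer-encoding", "").lower():
        body = b""
        while True:
            size_line = stream.readline()  # e.g. b"5\r\n" or b"5;ext=1\r\n"
            size = int(size_line.split(b";")[0].strip(), 16)
            if size == 0:
                # Zero-length chunk: end of body. Skip optional trailers
                # up to the blank line that terminates the message.
                while stream.readline() not in (b"\r\n", b"\n", b""):
                    pass
                return body
            body += stream.read(size)
            stream.readline()  # CRLF that follows each chunk's data
    # No chunked encoding: the body is exactly Content-Length bytes long.
    return stream.read(int(lowered.get("content-length", "0")))
```

Because the reader stops exactly at the end of the body, whatever follows in the stream is left untouched for the next response on the same connection.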
IV. Limits on the number of concurrent connections
In web development you need to pay attention to the number of concurrent connections a browser opens. The RFC says a client should keep at most two connections to a server, but servers and individual clients do not have to follow this. Some servers allow only one TCP connection at a time, so a client's multi-threaded download (several TCP channels to the server pulling data at the same time) cannot show its strength; other servers impose no limit. Browsers are better behaved: there is an analysis on Zhihu (see the learning materials) showing that they limit how many concurrent TCP connections may be opened to the same domain name to download resources.
The concurrency limit is also related to long connections. When a web page is opened, the downloads of its many resources may be placed on only a few TCP connections; this is TCP channel reuse (long connections). If the number of concurrent connections is small, all the resources on the page take longer to download (the user feels that the page opens sluggishly); if it is large, the server may see higher peaks of resource consumption. Browsers only limit concurrent connections per domain name, which means web developers can place resources under different domain names, and also on different machines, which solves the problem neatly.
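Spreading resources over several domain names can be sketched as a stable mapping from resource path to host. Everything here is illustrative: the host names, the per-host limit of 6 and the helper names shard_host and max_parallel are assumptions, not values taken from any browser or from the text above.

```python
import zlib


def shard_host(path: str, hosts: list) -> str:
    """Pick a host for a resource path, always the same one for the same path."""
    # crc32 is stable across runs (unlike the built-in hash() for strings),
    # so a resource keeps its URL and browser caches stay effective.
    return hosts[zlib.crc32(path.encode("utf-8")) % len(hosts)]


def max_parallel(per_host_limit: int, host_count: int) -> int:
    """Upper bound on simultaneous downloads when the limit applies per host."""
    return per_host_limit * host_count


# Hypothetical host names used only for this example.
HOSTS = ["static1.example.com", "static2.example.com", "static3.example.com"]
```

With an assumed limit of 6 connections per host, three hosts raise the ceiling to max_parallel(6, 3), that is 18 simultaneous downloads, at the cost of extra DNS lookups and TCP handshakes.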
V. Easily confused concepts: TCP keep-alive and HTTP keep-alive
TCP keep-alive checks whether the current TCP connection is still alive; HTTP keep-alive lets a TCP connection live longer. They are concepts at different layers. TCP keep-alive behaves as follows: when a connection has carried no data "for a period of time", one side sends a heartbeat packet (keep-alive packet); if the other side replies, the connection is still valid and monitoring continues. This "period of time" can be configured. Settings for the WinHTTP library:
WINHTTP_OPTION_WEB_SOCKET_KEEPALIVE_INTERVAL
Sets the interval, in milliseconds, to send a keep-alive packet over the connection. The default interval is 30000 (30 seconds). The minimum interval is 15000 (15 seconds). Using WinHttpSetOption to set a value lower than 15000 will return with ERROR_INVALID_PARAMETER.
Settings for libcurl: http://curl.haxx.se/libcurl/c/curl_easy_setopt.html
CURLOPT_TCP_KEEPALIVE: Pass a long. If set to 1, TCP keepalive probes will be sent. The delay and frequency of these probes can be controlled by the CURLOPT_TCP_KEEPIDLE and CURLOPT_TCP_KEEPINTVL options, provided the operating system supports them. Set to 0 (default behavior) to disable keepalive probes. (Added in 7.25.0)
CURLOPT_TCP_KEEPIDLE: Pass a long. Sets the delay, in seconds, that the operating system will wait while the connection is idle before sending keepalive probes. Not all operating systems support this option. (Added in 7.25.0)
CURLOPT_TCP_KEEPINTVL: Pass a long. Sets the interval, in seconds, that the operating system will wait between sending keepalive probes. Not all operating systems support this option. (Added in 7.25.0)
CURLOPT_TCP_KEEPIDLE is how long the connection must be idle before a heartbeat packet is sent; CURLOPT_TCP_KEEPINTVL is the interval between heartbeat packets. In a packet capture taken while opening a web page, the heartbeat packets and the closing of the connection can be observed: about 44 seconds in, the client sends a heartbeat packet, the server answers promptly, and the TCP connection stays up. Once the connection has been idle for 60 seconds, the server actively sends a FIN packet and disconnects.
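The same three knobs exist at the socket level. The sketch below uses Python's socket module rather than libcurl; enable_tcp_keepalive is a hypothetical helper, the values 60 and 10 are arbitrary examples, and the timing options are set only where the platform exposes them.

```python
import socket


def enable_tcp_keepalive(sock, idle: int = 60, interval: int = 10) -> None:
    """Turn on TCP keepalive probes for `sock` (timings in seconds)."""
    # Counterpart of CURLOPT_TCP_KEEPALIVE = 1.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The per-socket timing options are platform-specific,
    # so they are applied only when the constants exist.
    if hasattr(socket, "TCP_KEEPIDLE"):
        # Idle time before the first probe (like CURLOPT_TCP_KEEPIDLE).
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        # Time between successive probes (like CURLOPT_TCP_KEEPINTVL).
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
```

The options can be set on a socket before it is connected; the probes only start once the established connection has been idle for the configured time.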
VI. HTTP pipelining
One benefit of HTTP long connections (HTTP persistent connection) is that they make HTTP pipelining possible. Pipelining means that within one TCP connection several HTTP requests can be in flight in parallel: the next HTTP request is sent before the response to the previous one has been completely received.
According to the wiki, this technique is not widely used at present. It requires support from both the client and the server. Some browsers support it fully, and all the server has to do is answer the HTTP requests in order (requests and responses in FIFO order); the wiki also points out that as long as the server handles the requests of a pipelining client correctly, the server counts as supporting HTTP pipelining.
Because the server must return responses in the same order in which the client sent the requests, that is, FIFO, head-of-line blocking arises easily: the response to an earlier request holds up the responses to the requests behind it. For this reason the performance gain from HTTP pipelining is not significant (the wiki mentions that this problem will be solved in HTTP/2.0). In addition, the technique should only be used with idempotent HTTP methods: when something goes wrong the client does not know how far processing got, and a retry may have unpredictable results. The POST method is not idempotent: the same message may have a different effect on the server the first and the second time it is posted.
The wiki on HTTP long connections mentions that pipelining is the reason the RFC gives a guideline of at most two connections per user: when pipelining is implemented well, additional connections do not improve performance. I agree: concurrency is already achieved within a single connection, so multiple connections are unnecessary, unless the bottleneck is a resource limit on a single connection that forces more connections to be opened to grab more resources.
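Head-of-line blocking under FIFO ordering can be shown with a tiny model. The assumptions are mine and deliberately simple: all requests are sent at time 0, the server works on them concurrently, transmission time is ignored, and pipelined_delivery_times is a hypothetical name.

```python
from itertools import accumulate


def pipelined_delivery_times(ready_times):
    """Return when each response can be delivered on a pipelined connection.

    ready_times[i] is the moment the server has finished producing response i.
    Responses must leave in request order, so response i cannot be delivered
    before every earlier response is ready as well.
    """
    # Running maximum: each response waits for the slowest one before it.
    return list(accumulate(ready_times, max))
```

With ready times [5, 1, 1] the two fast responses are both held back until time 5 by the slow first one, which is exactly the blocking described above; with [1, 5, 2] only the third response is delayed.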
Browsers do not attach much importance to this technique at present; after all, the performance gain is limited. This article is located at: http://www.cnblogs.com/cswuyg/p/3653263.html
VII. Learning materials
1. HTTP keep-alive mode: http://www.cnblogs.com/skynet/archive/2010/12/11/1903347.html
2. The browser's concurrent request limit: http://www.zhihu.com/question/20474326
3. RFC document, connection section: http://tools.ietf.org/html/rfc2616#page-44
4. TCP keepalive in C/C++ network programming: http://blog.csdn.net/weiwangchao_/article/details/7225338
5. HTTP persistent connection: http://en.wikipedia.org/wiki/http_persistent_connection
6. HTTP pipelining: http://en.wikipedia.org/wiki/http_pipelining
7. Head-of-line blocking: http://en.wikipedia.org/wiki/head-of-line_blocking
8. "HTTP: The Definitive Guide", Chapter 4, Connection Management
PS (2014.7.27, second supplement): read Chapter 4 of "HTTP: The Definitive Guide" and added to the theoretical background on long connections.
Long connections and short connections for HTTP