Keepalive and heartbeat packets for probing TCP connections. Keywords: tcp keepalive, heartbeat, keepalive

Last Update:2016-08-12 Source: Internet

Author: User

1. The need for TCP keepalive

1) Many firewalls automatically close idle connections.

2) When a connection is broken abnormally, the server cannot detect it. A detection mechanism is needed so that the resources can be reclaimed.

2. Factors that cause TCP disconnection

If the network is working properly and the socket is closed gracefully with close(), everything is fine. However, in many situations, such as a network cable failure, a sudden power outage on the client side, or a crash, the server cannot detect that the connection has been lost.

3. Two ways to keep a connection alive:

1) An application-level heartbeat mechanism

Define a custom heartbeat message. Usually the client sends it actively, and the server replies on receipt (or does not reply at all). The details are not covered here.

PS: Some people, looking at it from a functional point of view, list a third way: using third-party software to probe the connection and judge whether it is still valid. This approach is very restrictive and is not part of the software's own implementation, so it is not discussed here.

2) The keepalive feature built into the TCP protocol

Turn on the keepalive option. Its individual properties can also be set through the API.

4. The pros and cons of the two ways

The keepalive feature built into TCP is simple to use and reduces the complexity of the application-layer code. My guess is that it also uses less traffic: an application-layer message picks up additional protocol headers on its way down to the protocol layer, whereas the probe packets generated by TCP itself should, in theory, be leaner (fewer bytes for the same job).

An application-layer heartbeat, implemented by the application itself, defines an additional message type for the heartbeat. It is an ordinary application message, just a special one that is used only for liveness checks. It is usually small and may consist of nothing but the message header, unless additional information needs to be carried.
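
As a minimal sketch of such a header-only heartbeat message in C, assume a hypothetical wire format of a 2-byte message type followed by a 2-byte body length, both big-endian. The format, the constants, and the function names are illustrative assumptions, not taken from any particular protocol.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wire format: 2-byte type + 2-byte body length, big-endian. */
enum { MSG_TYPE_DATA = 1, MSG_TYPE_HEARTBEAT = 2 };
#define MSG_HEADER_SIZE ((size_t)4)

/* Write a header-only heartbeat into buf.
 * Returns the number of bytes written, or 0 if buf is too small. */
size_t encode_heartbeat(uint8_t *buf, size_t buf_len)
{
    assert(buf != NULL);
    if (buf_len < MSG_HEADER_SIZE)
        return 0;
    buf[0] = (uint8_t)(MSG_TYPE_HEARTBEAT >> 8);   /* type, high byte */
    buf[1] = (uint8_t)(MSG_TYPE_HEARTBEAT & 0xFF); /* type, low byte */
    buf[2] = 0;                                    /* body length = 0 */
    buf[3] = 0;
    return MSG_HEADER_SIZE;
}

/* Return 1 if buf starts with a heartbeat header, 0 otherwise. */
int is_heartbeat(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    if (len < MSG_HEADER_SIZE)
        return 0;
    int type = (buf[0] << 8) | buf[1];
    return type == MSG_TYPE_HEARTBEAT;
}
```

The receiver can call is_heartbeat() on each incoming header and handle heartbeats separately from ordinary data messages.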

As I understand it, an application-level heartbeat has two benefits:

The first is flexibility. The protocol-layer heartbeat provides only bare liveness detection, while an application-layer heartbeat can be controlled at will. For example, the protocol may offer only second-level granularity, but you are free to use millisecond-level intervals (although heartbeats that frequent are rare in practice), and the packet can even carry additional information.

The second is generality: an application-layer heartbeat does not depend on the transport protocol. If one day you switch from TCP to UDP, which provides no heartbeat mechanism at the protocol layer, your application-layer heartbeat still works, perhaps with only a few changes.

The downsides of an application-layer heartbeat are also obvious: it takes more development effort, and depending on the network framework in use it may complicate the code structure. And, following the guess above, it also consumes more traffic, since it is essentially an ordinary packet.

5. Which heartbeat should you choose?

Section 4 has covered the pros and cons. If you are fairly sure that you will not switch protocols and you only need liveness detection, the one built into the protocol is perfectly fine: easy to use and efficient. Some overconfident people always prefer their own implementation to what a mature protocol or the system kernel already provides, when in fact an application-layer implementation is often the clumsier one. I have read claims online that the protocol's keepalive is unreliable, but they tend to be wishful and taken for granted, without any factual arguments or experimental data. If you have a different view, you are welcome to discuss it.

6. How to use keepalive on Unix-like platforms

Keepalive is turned off by default: although the traffic is very small, it is still overhead. The user has to turn it on manually. There are two ways to do so.

1) Set it individually for each socket in the code, which is flexible.

Besides the keepalive switch itself, there are three properties: keepidle, keepinterval, and keepcount. They are simple to use:

int keepalive = 1;     // Turn on the keepalive property. Default value: 0 (off)
int keepidle = 60;     // Start probing if there is no data exchange for 60 seconds. Default value: 7200 (s)
int keepinterval = 5;  // Send probe packets at an interval of 5 seconds. Default value: 75 (s)
int keepcount = 2;     // Number of probe attempts. If all of them time out, the connection is considered invalid. Default value: 9 (times)
setsockopt(s, SOL_SOCKET, SO_KEEPALIVE, (void*)&keepalive, sizeof(keepalive));
setsockopt(s, SOL_TCP, TCP_KEEPIDLE, (void*)&keepidle, sizeof(keepidle));
setsockopt(s, SOL_TCP, TCP_KEEPINTVL, (void*)&keepinterval, sizeof(keepinterval));
setsockopt(s, SOL_TCP, TCP_KEEPCNT, (void*)&keepcount, sizeof(keepcount));

This requires #include <netinet/tcp.h>; otherwise SOL_TCP and the three macros TCP_KEEPIDLE, TCP_KEEPINTVL, and TCP_KEEPCNT cannot be found.
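
For reuse, the same calls can be wrapped in a small function that also checks the return values. This is a sketch for Linux: enable_keepalive is a hypothetical helper name, and IPPROTO_TCP is used as the option level (on Linux it has the same value as SOL_TCP).

```c
#include <assert.h>
#include <netinet/in.h>   /* IPPROTO_TCP */
#include <netinet/tcp.h>  /* TCP_KEEPIDLE, TCP_KEEPINTVL, TCP_KEEPCNT (Linux) */
#include <sys/socket.h>   /* setsockopt, SOL_SOCKET, SO_KEEPALIVE */

/* Hypothetical helper: enable keepalive on fd with the given tuning.
 * Returns 0 on success, -1 if any setsockopt call fails (errno is set). */
int enable_keepalive(int fd, int idle_s, int interval_s, int count)
{
    int on = 1;

    assert(idle_s > 0 && interval_s > 0 && count > 0);
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof(idle_s)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval_s, sizeof(interval_s)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof(count)) < 0)
        return -1;
    return 0;
}
```

Calling enable_keepalive(s, 60, 5, 2) on a TCP socket s applies the same values as the snippet above.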

PS: I can't help complaining: the web is full of irresponsible reposts, the search results all look the same, and many people never verified anything. It took some effort to find a working example like this one; most of the posted code does not work.

2) Modify the system configuration, which applies to all sockets on the whole system.

We can use the cat command to view the system defaults.

# cat /proc/sys/net/ipv4/tcp_keepalive_time
7200

# cat /proc/sys/net/ipv4/tcp_keepalive_intvl
75

# cat /proc/sys/net/ipv4/tcp_keepalive_probes
9

Modify them:

# echo N > /proc/sys/net/ipv4/tcp_keepalive_time    (N is the idle time in seconds)

# echo 5 > /proc/sys/net/ipv4/tcp_keepalive_intvl

# echo 3 > /proc/sys/net/ipv4/tcp_keepalive_probes
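
To read the same values from a program rather than with cat, a small helper along these lines could be used. This is only a sketch: read_int_file is a hypothetical name, and it simply parses one integer from whatever path it is given.

```c
#include <assert.h>
#include <stdio.h>

/* Read a single integer from a file such as
 * /proc/sys/net/ipv4/tcp_keepalive_time.
 * Returns 0 and stores the value in *out on success, -1 on failure. */
int read_int_file(const char *path, long *out)
{
    assert(path != NULL && out != NULL);
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    int matched = fscanf(f, "%ld", out);
    fclose(f);
    return matched == 1 ? 0 : -1;
}
```

For example, read_int_file("/proc/sys/net/ipv4/tcp_keepalive_time", &value) stores the current idle time in value on a Linux system where that file exists.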

Recommended links:

A casual talk about heartbeat packets and the keepalive mechanism of the TCP protocol

TCP Keepalive HOWTO

http://blog.csdn.net/aa2650/article/details/17027845

Many application-layer protocols have a heartbeat mechanism: usually the client sends a packet to the server at regular intervals to tell the server that it is still online, and to transmit some data that may be needed. Typical users of heartbeat packets are IM protocols, such as QQ/MSN/Fetion.

Anyone who has studied TCP/IP knows that the two main transport-layer protocols are UDP and TCP: UDP is connectionless and packet-oriented, while TCP is connection-oriented and stream-oriented.

So it is easy to understand that a client using UDP (such as the early "OICQ"; I heard that oicq.com was squatted a couple of days ago, fond old memories) needs to send heartbeat packets periodically to tell the server that it is online.

However, MSN and today's QQ tend to use TCP connections, and although TCP/IP provides an optional keepalive (ACK-ACK packet) mechanism underneath, they still implement a higher-level heartbeat packet. This seems to waste both traffic and CPU, which is a bit baffling.

I looked into it specifically. The TCP keepalive mechanism works like this. First, it appears to be off by default; to turn it on, use setsockopt to set SO_KEEPALIVE at the SOL_SOCKET level to 1. Three parameters can then be set, tcp_keepalive_time/tcp_keepalive_probes/tcp_keepalive_intvl: how long the connection must be idle before keepalive ACK packets start being sent, how many unanswered ACK packets it takes before the peer is considered dead, and the interval between two ACK packets. On the Ubuntu Server 10.04 machine I tested, the defaults are 7200 seconds (2 hours, which is painfully long!), 9 times, and 75 seconds. So each connection has a timeout window: while there is no traffic on the connection the window shrinks, and when it reaches zero, TCP sends an empty packet with the ACK flag set (the keepalive probe). On receiving it, the peer should reply with an ACK if the connection is healthy, or with an RST if something is wrong with the connection (for example, the peer has rebooted and lost the connection state). If the peer does not reply at all, the server sends another ACK every intvl seconds, and if that many consecutive probes (the probes value) go unanswered, the connection is considered broken.
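
Putting the three parameters together gives a rough estimate of how long a silently dead peer goes unnoticed: the idle time plus one interval per unanswered probe. The helper below is an illustrative sketch (the function name is made up here), and the result is only approximate, since the exact timing depends on the TCP implementation.

```c
#include <assert.h>

/* Approximate number of seconds between the last traffic on a connection
 * and the moment keepalive reports a silent peer as dead. */
long keepalive_detection_seconds(long idle_s, long interval_s, long probes)
{
    assert(idle_s >= 0 && interval_s >= 0 && probes >= 0);
    return idle_s + interval_s * probes;
}
```

With the defaults quoted above this gives 7200 + 9 * 75 = 7875 seconds, a little over 2 hours and 11 minutes.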

Here is a very detailed article: http://tldp.org/HOWTO/html_single/TCP-Keepalive-HOWTO. It covers an introduction to keepalive, the relevant kernel parameters, the C programming interface, and how to enable keepalive for existing applications (whether or not they can be modified), and is well worth reading.

Section 2.4 of that article, "Preventing disconnection due to network inactivity", deals with connections being cut off because they have carried no packets for a long time. It says that many network devices, especially NAT routers, cannot keep track of all the connections passing through them because of hardware limits (such as memory and CPU processing power), so when necessary they pick some inactive connections from the connection pool and kick them out. The typical policy is LRU: the connection that has gone longest without data is kicked out. With TCP keepalive (and a modified time parameter), you can make the connection generate some ACK packets every few minutes to reduce the risk of being kicked out, at the cost of some extra network and CPU load.

As mentioned earlier, many IM protocols implement their own heartbeat mechanism rather than relying directly on the underlying one, and I do not know the real reason.

In my opinion, for some simple protocols, using the underlying mechanism directly can be completely transparent to the upper layer, which reduces development difficulty and removes the need to manage connection state. Protocols that implement their own heartbeat presumably want to transmit some data along with the heartbeat packet so that the server learns more about the client's state. For example, some clients like to collect information about the user... Since a packet has to be sent anyway, they might as well stuff some data into it; otherwise the packet header is wasted...

That is probably how it is. If any expert knows the real reason, I would be glad to be enlightened.

@2012-04-21

P.S. After consulting a colleague who has worked on IM, the answer seems to be that a self-implemented heartbeat is generic: it works regardless of whether the underlying protocol is UDP or TCP. If you only use TCP, then using the keepalive mechanism directly is sufficient.

@2015-09-14
Adding @Jack's reply:
"Besides showing that the application is still alive (the process is still running and the network is reachable), a heartbeat more importantly indicates that the application is still working properly. TCP keepalive, by contrast, is handled by the operating system: even if the process is deadlocked or blocked, the operating system keeps sending and receiving TCP keepalive messages as usual, and the peer cannot detect the problem." From "Linux Multithreaded Server Programming"

When reprinting, please state that this comes from https://www.felix021.com/blog/read.php?2076 ; if the text is itself a reprint, please cite its original source. Thank you :)

1. The keepalive mechanism fails to detect many situations, such as the network connection being disabled by software. It is not reliable, and the problem is especially serious when network conditions are complex.
2. A self-implemented heartbeat can add more flexible and practical mechanisms. For example, when a heartbeat is missed, you can check again immediately with a shorter check interval, so that you sense the network state sooner instead of waiting for a fixed period.
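
The second point can be sketched as a small state machine: while replies arrive, checks run at the normal interval; after a missed heartbeat, the next check is scheduled much sooner, and a few consecutive misses mark the peer as dead. All names and the example intervals below are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { HB_ALIVE, HB_SUSPECT, HB_DEAD } hb_state;

typedef struct {
    int normal_interval_ms; /* delay between checks while the peer is healthy */
    int retry_interval_ms;  /* shorter delay used after a missed heartbeat */
    int max_missed;         /* consecutive misses before the peer is declared dead */
    int missed;             /* consecutive misses seen so far */
} hb_monitor;

/* Record the outcome of one heartbeat check and choose the delay
 * before the next one. */
hb_state hb_on_check(hb_monitor *m, int reply_received, int *next_delay_ms)
{
    assert(m != NULL && next_delay_ms != NULL);
    if (reply_received) {
        m->missed = 0;
        *next_delay_ms = m->normal_interval_ms;
        return HB_ALIVE;
    }
    m->missed++;
    if (m->missed >= m->max_missed) {
        *next_delay_ms = 0; /* no further checks are needed */
        return HB_DEAD;
    }
    *next_delay_ms = m->retry_interval_ms; /* re-check sooner than usual */
    return HB_SUSPECT;
}
```

With, say, a 30-second normal interval, a 1-second retry interval, and three allowed misses, a dead peer is noticed about two seconds after the first missed heartbeat rather than after several full intervals.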

The advantage of an application-level heartbeat is that it tells you whether the applications at both ends are still there, not just the communication software.
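
One way to make a heartbeat reflect the application's health, rather than just the process being up, is to send it only while the worker is making progress. The sketch below assumes a hypothetical design in which the worker calls app_mark_progress() after each unit of work and the heartbeat sender consults app_should_send_heartbeat(); the names and the use of plain seconds are illustrative.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    long last_progress_s; /* time of the last completed unit of work, in seconds */
} app_health;

/* Called by the worker each time it finishes a unit of work. */
void app_mark_progress(app_health *h, long now_s)
{
    assert(h != NULL);
    h->last_progress_s = now_s;
}

/* Return 1 if the worker has made progress within the last max_stall_s
 * seconds, so a heartbeat may be sent; return 0 if it appears stuck. */
int app_should_send_heartbeat(const app_health *h, long now_s, long max_stall_s)
{
    assert(h != NULL);
    return (now_s - h->last_progress_s) <= max_stall_s;
}
```

If the worker deadlocks, the heartbeats stop and the peer notices, which is exactly what kernel-level keepalive cannot provide.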

TCP's keepalive is very resource-intensive. In general a server has far more CPU capacity to spare than memory and IO, so I see many web servers set keep-alive very low and re-establish the connection when needed, spending CPU to free up memory. Maybe this is not the same thing you are talking about...
The problem is that a heartbeat implemented at the application level certainly consumes more resources than the underlying protocol does. So I think the key questions are whether spending these resources is necessary and, if it is, how to get the most out of the resources spent.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
