KeepAlive and heartbeat packets in TCP connection detection

Source: Internet
Author: User

When client/server software communicates over a TCP connection, one side can fail unexpectedly: the program crashes, the machine goes down, the network cable is unplugged, or a router fails. The other side cannot tell that the TCP connection is no longer valid unless it keeps sending data on that connection. Most of the time this is not what we want. Both the server and the client should be able to detect the failure promptly and efficiently, then clean up gracefully and report the error to the user.

Two techniques can detect the abnormal disconnection of one side promptly and reliably. One is the keepalive mechanism implemented in the TCP protocol layer; the other is a heartbeat packet implemented by the application layer itself.

TCP does not enable keepalive by default, for two reasons. First, keepalive probes consume additional bandwidth and traffic; the amount is trivial, but it still adds cost in a metered environment. Second, poorly chosen keepalive settings can tear down a healthy TCP connection because of a transient network fluctuation. In addition, the default keepalive idle timeout is 7,200,000 milliseconds (2 hours), and the default number of probes is 5.

For Win2K/XP/2003, the keepalive parameters that affect every connection on the system are under the following registry key:

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters]

"KeepAliveTime"=dword:006ddd00

"KeepAliveInterval"=dword:000003e8

"MaxDataRetries"="5"

For practical applications, 2 hours of idle time is too long. Therefore, we need to manually turn on the KeepAlive function and set reasonable keepalive parameters.
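The registry values listed above are hexadecimal, which hides the 2-hour figure. A small sketch that decodes them (the helper names `reg_dword_to_ulong` and `ms_to_minutes` are invented for this illustration, not part of any Windows API):

```c
#include <assert.h>
#include <stdlib.h>

/* Parse the hexadecimal digits that follow "dword:" in a .reg value. */
unsigned long reg_dword_to_ulong(const char *hex_digits)
{
    assert(hex_digits != NULL);
    return strtoul(hex_digits, NULL, 16);
}

/* Convert a duration in milliseconds to whole minutes. */
unsigned long ms_to_minutes(unsigned long ms)
{
    return ms / 60000UL;
}
```

Decoding `006ddd00` gives 7,200,000 ms, which is 120 minutes, and `000003e8` gives 1,000 ms, that is, one second between probes.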

Enabling KeepAlive

// Requires <winsock2.h>; socket_handle is a connected SOCKET.
BOOL bkeepalive = TRUE;
int nret = ::setsockopt(socket_handle, SOL_SOCKET, SO_KEEPALIVE, (char*)&bkeepalive, sizeof(bkeepalive));
if (nret == SOCKET_ERROR)
{
    return FALSE;
}

Setting the KeepAlive parameters

// Requires <mstcpip.h> for tcp_keepalive and SIO_KEEPALIVE_VALS.
tcp_keepalive alive_in = {0};
tcp_keepalive alive_out = {0};
alive_in.keepalivetime = 5000;      // idle time (ms) on the connection before the first keepalive probe is sent
alive_in.keepaliveinterval = 1000;  // interval (ms) between two consecutive keepalive probes
alive_in.onoff = TRUE;
unsigned long ulbytesreturn = 0;
nret = WSAIoctl(socket_handle, SIO_KEEPALIVE_VALS, &alive_in, sizeof(alive_in),
                &alive_out, sizeof(alive_out), &ulbytesreturn, NULL, NULL);
if (nret == SOCKET_ERROR)
{
    return FALSE;
}

Once keepalive is enabled, a server program that uses the IOCP model sees GetQueuedCompletionStatus return FALSE as soon as the dead connection is detected, so the server can close the connection and release its resources promptly. For a client that uses the select model, a select call blocked waiting to read on the socket returns immediately and the receive fails with SOCKET_ERROR, so the client knows the connection is no longer valid and has the opportunity to clean up, alert the user, or reconnect.
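The paragraph above describes Winsock. On a POSIX system the same situation typically shows up as a failing recv after select or poll reports the socket readable. As a rough sketch under that assumption (the names `recv_status_t` and `classify_recv` are made up for this illustration), a client could sort recv results like this:

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>

typedef enum {
    RECV_DATA,         /* n > 0: payload bytes were read */
    RECV_PEER_CLOSED,  /* n == 0: the peer closed the connection gracefully */
    RECV_RETRY,        /* transient condition: try again later */
    RECV_CONN_DEAD     /* hard error: clean up and possibly reconnect */
} recv_status_t;

/* n is the return value of recv(); err is the errno captured right after it. */
recv_status_t classify_recv(ssize_t n, int err)
{
    assert(n >= -1);
    if (n > 0)
        return RECV_DATA;
    if (n == 0)
        return RECV_PEER_CLOSED;
    if (err == EAGAIN || err == EWOULDBLOCK || err == EINTR)
        return RECV_RETRY;
    /* Assumption: on Linux, failed keepalive probes are usually reported
       here as ETIMEDOUT; any other hard error is treated the same way. */
    return RECV_CONN_DEAD;
}
```

Only the last case needs the cleanup path; a graceful close (`RECV_PEER_CLOSED`) is the normal end of a connection and is not what keepalive is meant to catch.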

The other technique is for the application to send its own heartbeat packets to check the health of the connection. The client periodically sends a short packet to the server from a timer or a low-priority thread and waits for the server's response. If the client receives no response within a certain period, it considers the connection unavailable; similarly, if the server receives no heartbeat packet from a client within a certain period, it considers that client disconnected.
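The bookkeeping behind such a heartbeat can be sketched independently of the socket code. The following is a minimal illustration, not a complete protocol: the type `heartbeat_t` and its functions are hypothetical names, and the caller is assumed to supply timestamps in milliseconds from a monotonic clock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-connection heartbeat state (all times in milliseconds). */
typedef struct {
    uint64_t last_seen_ms;  /* when the last packet from the peer arrived */
    uint64_t last_sent_ms;  /* when we last sent a heartbeat */
    uint64_t interval_ms;   /* how often to send a heartbeat */
    uint64_t timeout_ms;    /* silence longer than this means the peer is gone */
} heartbeat_t;

void heartbeat_init(heartbeat_t *hb, uint64_t now_ms,
                    uint64_t interval_ms, uint64_t timeout_ms)
{
    assert(hb != NULL);
    hb->last_seen_ms = now_ms;
    hb->last_sent_ms = now_ms;
    hb->interval_ms = interval_ms;
    hb->timeout_ms = timeout_ms;
}

/* Call whenever any packet (heartbeat reply or data) arrives from the peer. */
void heartbeat_on_receive(heartbeat_t *hb, uint64_t now_ms)
{
    hb->last_seen_ms = now_ms;
}

/* Returns true when the next heartbeat is due, and records the send time. */
bool heartbeat_should_send(heartbeat_t *hb, uint64_t now_ms)
{
    if (now_ms - hb->last_sent_ms >= hb->interval_ms) {
        hb->last_sent_ms = now_ms;
        return true;
    }
    return false;
}

/* Returns true when the peer has been silent for longer than the timeout. */
bool heartbeat_peer_lost(const heartbeat_t *hb, uint64_t now_ms)
{
    return now_ms - hb->last_seen_ms > hb->timeout_ms;
}
```

A client would call `heartbeat_should_send` from its timer and send the short packet when it returns true; both sides would call `heartbeat_peer_lost` periodically and drop the connection when it returns true. Choosing a timeout of several intervals tolerates an occasional lost or delayed heartbeat.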

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Here, "abnormal disconnection" means that the TCP connection is not closed gracefully, for example because of a physical link failure such as a broken network cable, or a sudden power outage on a host.

There are two ways to detect it:

1. Both ends of the TCP connection exchange handshake messages on a timer

2. Use the keepalive probes built into the TCP protocol stack

The second method is simple and reliable: you only need to set the keepalive probe options on the sockets at both ends of the TCP connection. This article therefore only describes how to implement the second method under Linux and Windows 2000 (it has not been tested on other platforms).
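Windows 2000 platform

Header files

#include <winsock2.h>
#include <mstcpip.h>

Structure and macro definitions (mstcpip.h already provides these; define them yourself only if the header is unavailable)

struct tcp_keepalive {
    u_long onoff;
    u_long keepalivetime;
    u_long keepaliveinterval;
};
#define SIO_KEEPALIVE_VALS _WSAIOW(IOC_VENDOR, 4)

// sock is a connected SOCKET.
int opt = 1;
tcp_keepalive live, liveout;
live.keepaliveinterval = 500;   // interval between probes (in milliseconds)
live.keepalivetime = 3000;      // idle time before the first probe (in milliseconds)
live.onoff = TRUE;
int iRet = setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE, (char *)&opt, sizeof(int));
if (iRet == 0) {
    DWORD dw;
    if (WSAIoctl(sock, SIO_KEEPALIVE_VALS,
                 &live, sizeof(live), &liveout, sizeof(liveout),
                 &dw, NULL, NULL) == SOCKET_ERROR) {
        // delete client
        return;
    }
}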

Code under ACE // by rainfish blog.csdn.net/bat603

int opt = 1;
// In testing, the number of probes was 5. With the settings below, the first probe is sent
// 10 seconds after the last message, then one probe every 5 seconds, 5 times in a row,
// so a broken network is discovered after 35 seconds.
tcp_keepalive live, liveout;
live.keepaliveinterval = 5000;  // interval between probes (in milliseconds)
live.keepalivetime = 10000;     // idle time before the first probe is sent (in milliseconds)
live.onoff = TRUE;
int iRet = stream.set_option(SOL_SOCKET, SO_KEEPALIVE, &opt, sizeof(int));
if (iRet == 0) {
    DWORD dw;
    // This shows how to get the socket under ACE: the ACE handle is the SOCKET
    ACE_HANDLE h = stream.get_handle();
    if (WSAIoctl((SOCKET)h, SIO_KEEPALIVE_VALS, &live, sizeof(live),
                 &liveout, sizeof(liveout), &dw, NULL, NULL) == SOCKET_ERROR) {
        // delete client
        return;
    }
}
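The 35-second figure in the comment of the ACE code is the idle time plus one interval per probe. A tiny sketch of that arithmetic, following the article's own accounting (the function name `keepalive_detect_ms` is invented here; real stacks may differ slightly in when they declare the connection dead):

```c
#include <assert.h>

/* Worst-case time, in milliseconds, from the last traffic on the connection
   until it is declared broken: idle time, then `probes` probes spaced
   `interval_ms` apart. */
unsigned long keepalive_detect_ms(unsigned long idle_ms,
                                  unsigned long interval_ms,
                                  unsigned int probes)
{
    assert(probes > 0);
    return idle_ms + interval_ms * probes;
}
```

With the values used above (10,000 ms idle, 5,000 ms interval, 5 probes) this gives 35,000 ms. Lowering the idle time shortens detection the most, at the cost of more probe traffic on quiet connections.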


Under the Linux platform

#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
// keepalive implementation; all values are in seconds
// The code below uses ACE for logging; without ACE, replace the ACE calls with the corresponding Linux interfaces
int keepalive = 1;     // enable keepalive
int keepidle = 5;      // idle time on the connection before the first keepalive probe
int keepinterval = 5;  // interval between two keepalive probes
int keepcount = 3;     // number of unanswered keepalive probes before the connection is considered broken
if (setsockopt(s, SOL_SOCKET, SO_KEEPALIVE, (void *)&keepalive, sizeof(keepalive)) == -1)
{
    ACE_DEBUG((LM_INFO,
               ACE_TEXT("(%P|%t) setsockopt SO_KEEPALIVE error!\n")));
}

if (setsockopt(s, SOL_TCP, TCP_KEEPIDLE, (void *)&keepidle, sizeof(keepidle)) == -1)
{
    ACE_DEBUG((LM_INFO,
               ACE_TEXT("(%P|%t) setsockopt TCP_KEEPIDLE error!\n")));
}

if (setsockopt(s, SOL_TCP, TCP_KEEPINTVL, (void *)&keepinterval, sizeof(keepinterval)) == -1)
{
    ACE_DEBUG((LM_INFO,
               ACE_TEXT("(%P|%t) setsockopt TCP_KEEPINTVL error!\n")));
}

if (setsockopt(s, SOL_TCP, TCP_KEEPCNT, (void *)&keepcount, sizeof(keepcount)) == -1)
{
    ACE_DEBUG((LM_INFO,
               ACE_TEXT("(%P|%t) setsockopt TCP_KEEPCNT error!\n")));
}
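For readers without ACE, the same four options can be set with plain C on Linux. This is a minimal sketch, not the original author's code: the function name `enable_keepalive` is an assumption made for this example, and it uses `IPPROTO_TCP` as the option level, which on Linux has the same value as `SOL_TCP`.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Enable TCP keepalive on fd with the given idle time (seconds), probe
   interval (seconds) and probe count.
   Returns 0 on success, or -1 on the first failing setsockopt (errno is set). */
int enable_keepalive(int fd, int idle_s, int interval_s, int count)
{
    int on = 1;

    assert(fd >= 0);
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) == -1)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof(idle_s)) == -1)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval_s, sizeof(interval_s)) == -1)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof(count)) == -1)
        return -1;
    return 0;
}
```

Calling `enable_keepalive(s, 5, 5, 3)` on a connected socket mirrors the values used in the ACE version above. The per-socket options override the system-wide defaults only for that socket.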
