Keepalive and heartbeat packets for TCP connection detection

Source: Internet
Author: User

Consider client/server (C/S) software built on TCP connections. While both sides are idle, if either side crashes unexpectedly, loses power, has its network cable unplugged, or sits behind a router that fails, the other side cannot know that the TCP connection is dead unless it keeps sending data on the connection and an error comes back. Most of the time this is not what we want. We want both the server and the client to detect the connection failure promptly and reliably, then clean up gracefully and report the error to the user.

Two techniques are available for detecting the abnormal disconnection of one side promptly and reliably. One is keepalive, implemented in the TCP protocol layer; the other is the heartbeat packet, implemented in the application layer.

By default, keepalive is not enabled for TCP, because it consumes additional bandwidth and traffic. The overhead is negligible, but it does raise costs in pay-by-traffic environments. In addition, if the keepalive settings are unreasonable, a healthy TCP connection may be dropped because of a transient network fluctuation. Finally, the default keepalive time is 7,200,000 milliseconds, that is, 2 hours, and the number of probes is 5.

For Windows 2000/XP/2003, the keepalive parameters that affect every connection on the system are found under the following registry key:

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters]
"KeepAliveTime" = DWORD: 006ddd00
"KeepAliveInterval" = DWORD: 000003e8
"MaxDataRetries" = "5"

For practical programs, two hours of idle time is far too long. We therefore need to enable keepalive explicitly and set reasonable keepalive parameters.
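The DWORD values in the registry listing above are hexadecimal counts of milliseconds. As a quick check of the arithmetic, here is a minimal sketch in C; the helper names `registry_dword_to_ms` and `print_keepalive_value` are hypothetical and exist only for this illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: parse a registry-style hexadecimal DWORD string
   (for example "006ddd00") and return its numeric value, which for the
   keepalive settings is a number of milliseconds. */
unsigned long registry_dword_to_ms(const char *hex)
{
    return strtoul(hex, NULL, 16);
}

/* Hypothetical helper: print a setting in milliseconds and whole seconds. */
void print_keepalive_value(const char *name, const char *hex)
{
    unsigned long ms = registry_dword_to_ms(hex);
    printf("%s = %lu ms (%lu s)\n", name, ms, ms / 1000);
}
```

Calling `print_keepalive_value("KeepAliveTime", "006ddd00")` reports 7,200,000 ms, the two-hour default, and "000003e8" decodes to 1,000 ms, a one-second probe interval.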

Once keepalive is enabled, a server program that uses the IOCP model gets an immediate FALSE return from GetQueuedCompletionStatus as soon as the connection is detected to be broken, which lets the server clean up the connection promptly and release the resources tied to it. For a client that uses the select model, when the connection breaks, the select call blocked on the socket for recv returns immediately and the read fails with SOCKET_ERROR, so the client learns that the connection is invalid and has the chance to clean up promptly, notify the user, or reconnect.

The other technique is for the application itself to send heartbeat packets to check the health of the connection. The client can periodically send a short, compact packet to the server from a timer or a low-priority thread and wait for the server to respond. If the client does not receive the server's response within a certain period of time, it considers the connection unavailable. Likewise, if the server does not receive a heartbeat packet from the client within a certain period of time, it considers the client disconnected.
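The timeout bookkeeping behind a heartbeat can be sketched in a few lines of C. The structure and function names here (`heartbeat`, `heartbeat_init`, `heartbeat_mark`, `heartbeat_expired`) are hypothetical, and the sketch covers only the bookkeeping; sending and receiving the packets is left to the application.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical per-connection heartbeat state. */
struct heartbeat {
    time_t last_seen;    /* when the peer was last heard from */
    int    timeout_sec;  /* longer silence than this means the peer is gone */
};

/* Start tracking a connection at time `now`. */
void heartbeat_init(struct heartbeat *hb, time_t now, int timeout_sec)
{
    hb->last_seen = now;
    hb->timeout_sec = timeout_sec;
}

/* Call whenever a heartbeat packet or reply arrives from the peer. */
void heartbeat_mark(struct heartbeat *hb, time_t now)
{
    hb->last_seen = now;
}

/* Call periodically from a timer; true means the peer has been silent
   for longer than the timeout and the connection should be dropped. */
bool heartbeat_expired(const struct heartbeat *hb, time_t now)
{
    return difftime(now, hb->last_seen) > hb->timeout_sec;
}
```

Both sides can use the same bookkeeping: the client marks the time of each server reply, and the server marks the time of each client heartbeat.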

// Enable keepalive
BOOL bKeepAlive = TRUE;
int nRet = ::setsockopt(socket_handle, SOL_SOCKET, SO_KEEPALIVE,
                        (char *)&bKeepAlive, sizeof(bKeepAlive));
if (nRet == SOCKET_ERROR) {
    return false;
}

// Set the keepalive parameters
tcp_keepalive alive_in = {0};
tcp_keepalive alive_out = {0};
alive_in.keepalivetime = 5000;      // idle time (ms) before the first keepalive probe
alive_in.keepaliveinterval = 1000;  // interval (ms) between two keepalive probes
                                    // (the original value was lost from the source; 1000 is a placeholder)
alive_in.onoff = TRUE;
unsigned long ulBytesReturn = 0;
nRet = WSAIoctl(socket_handle, SIO_KEEPALIVE_VALS,
                &alive_in, sizeof(alive_in),
                &alive_out, sizeof(alive_out),
                &ulBytesReturn, NULL, NULL);
if (nRet == SOCKET_ERROR) {
    return false;
}

++ ++ ++

Under Windows, an "abnormal disconnection" is one in which the TCP connection is not closed gracefully, for example because of a network cable fault or another physical link failure, or because a host suddenly loses power.

There are two ways to detect it:

1. Send a handshake (heartbeat) message over the TCP connection at regular intervals.

2. Use the keepalive detection built into the TCP protocol stack.
The second method is simple and reliable: you only need to enable keepalive detection on the two sockets of the TCP connection.
This article therefore describes only the second method, as implemented on Linux and Windows 2000 (it has not been tested on other platforms).
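Header files on Windows 2000

#include <winsock2.h>
#include <mstcpip.h>
// <mstcpip.h> declares the SIO_KEEPALIVE_VALS macro and the structure below.
// Define the structure yourself only if your SDK lacks the header:
// struct tcp_keepalive {
//     u_long onoff;
//     u_long keepalivetime;
//     u_long keepaliveinterval;
// };

// sock is the connected SOCKET
int opt = 1;
tcp_keepalive live, liveout;
live.keepaliveinterval = 500;   // interval between probes (in milliseconds)
live.keepalivetime = 3000;      // idle time before the first probe (in milliseconds)
live.onoff = TRUE;
int iRet = setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE, (char *)&opt, sizeof(int));
if (iRet == 0) {
    DWORD dw;
    if (WSAIoctl(sock, SIO_KEEPALIVE_VALS,
                 &live, sizeof(live), &liveout, sizeof(liveout),
                 &dw, NULL, NULL) == SOCKET_ERROR) {
        // Delete the client
        return;
    }
}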

Code under ACE // By rainfish blog.csdn.net/bat603

int opt = 1;
// In testing, the probe count was 5. With the settings below, probing starts
// 10 seconds after the last message and repeats every 5 seconds; after 5 probes
// in a row go unanswered, the network is considered disconnected, which takes
// 35 seconds in total.
tcp_keepalive live, liveout;
live.keepaliveinterval = 5000;  // interval between probes (in milliseconds)
live.keepalivetime = 10000;     // idle time before the first probe is sent (in milliseconds)
live.onoff = TRUE;
int iRet = stream.set_option(SOL_SOCKET, SO_KEEPALIVE, &opt, sizeof(int));
if (iRet == 0) {
    DWORD dw;
    // Under ACE the socket is obtained from the handle: h is the stream's
    // handle, and casting it to SOCKET gives the underlying socket.
    if (WSAIoctl((SOCKET)h, SIO_KEEPALIVE_VALS, &live, sizeof(live),
                 &liveout, sizeof(liveout), &dw, NULL, NULL) == SOCKET_ERROR) {
        // Delete the client
        return;
    }
}
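The 35-second figure in the comment above follows from a simple model: the idle time before the first probe, plus one interval for each unanswered probe. A minimal sketch of that arithmetic, where the function name `keepalive_detect_ms` is hypothetical and real stacks may time the last probe slightly differently:

```c
/* Estimated time, in milliseconds, from the last received data until a
   dead peer is detected: idle time before the first probe, plus one
   probe interval for each of the `probes` unanswered probes. */
unsigned long keepalive_detect_ms(unsigned long idle_ms,
                                  unsigned long interval_ms,
                                  unsigned int probes)
{
    return idle_ms + interval_ms * probes;
}
```

With the settings above, `keepalive_detect_ms(10000, 5000, 5)` gives 35,000 ms. With the system defaults quoted earlier (7,200,000 ms idle, 1,000 ms interval, 5 probes), the same model gives 7,205,000 ms, which is why the defaults are impractical for most programs.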

Linux

#include <sys/socket.h>
#include <netinet/tcp.h>
// Keepalive implementation; all times are in seconds.
// The following code requires ACE. If ACE is not available, replace the ACE calls with the corresponding Linux interfaces.
int keepAlive = 1;     // enable keepalive
int keepIdle = 5;      // idle time before the first keepalive probe is sent
int keepInterval = 5;  // interval between two keepalive probes
int keepCount = 3;     // number of unanswered probes before the connection is dropped
if (setsockopt(s, SOL_SOCKET, SO_KEEPALIVE, (void *)&keepAlive, sizeof(keepAlive)) == -1)
{
    ACE_DEBUG((LM_INFO,
               ACE_TEXT("(%P|%t) setsockopt SO_KEEPALIVE error!\n")));
}

if (setsockopt(s, SOL_TCP, TCP_KEEPIDLE, (void *)&keepIdle, sizeof(keepIdle)) == -1)
{
    ACE_DEBUG((LM_INFO,
               ACE_TEXT("(%P|%t) setsockopt TCP_KEEPIDLE error!\n")));
}

if (setsockopt(s, SOL_TCP, TCP_KEEPINTVL, (void *)&keepInterval, sizeof(keepInterval)) == -1)
{
    ACE_DEBUG((LM_INFO,
               ACE_TEXT("(%P|%t) setsockopt TCP_KEEPINTVL error!\n")));
}

if (setsockopt(s, SOL_TCP, TCP_KEEPCNT, (void *)&keepCount, sizeof(keepCount)) == -1)
{
    ACE_DEBUG((LM_INFO,
               ACE_TEXT("(%P|%t) setsockopt TCP_KEEPCNT error!\n")));
}
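For programs that do not use ACE, the same settings can be applied with plain Linux interfaces. The following is a minimal sketch rather than a drop-in replacement: the function name `enable_keepalive` is hypothetical, errors are reported with perror instead of ACE_DEBUG, and IPPROTO_TCP is used as the option level, which on Linux has the same value as SOL_TCP.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdio.h>
#include <sys/socket.h>

/* Enable keepalive on a TCP socket and set its timing parameters.
   idle_sec:     idle time before the first probe, in seconds
   interval_sec: interval between probes, in seconds
   count:        unanswered probes before the connection is dropped
   Returns 0 on success, -1 on the first setsockopt failure. */
int enable_keepalive(int fd, int idle_sec, int interval_sec, int count)
{
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) == -1) {
        perror("setsockopt SO_KEEPALIVE");
        return -1;
    }
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_sec, sizeof(idle_sec)) == -1) {
        perror("setsockopt TCP_KEEPIDLE");
        return -1;
    }
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval_sec, sizeof(interval_sec)) == -1) {
        perror("setsockopt TCP_KEEPINTVL");
        return -1;
    }
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof(count)) == -1) {
        perror("setsockopt TCP_KEEPCNT");
        return -1;
    }
    return 0;
}
```

Calling `enable_keepalive(fd, 5, 5, 3)` on a TCP socket applies the same values as the ACE listing above; the settings can be read back with getsockopt to confirm they took effect.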

++ ++
