ESFramework 4.0 Quick Start (07) -- playing with "Heartbeat"


A system that communicates over the Internet using TCP will sooner or later run into a headache: dropped connections. The problem is far more complex than it first appears. The network topology is complicated; the path from node A to node B may pass through any number of switches, routers, firewalls, and other devices, each configured differently; and congestion and delay can occur anywhere along the way. All of this makes dropped connections hard to handle in code.

 

I. TCP disconnection from the program's perspective

A TCP connection can drop for many reasons: the client machine suddenly loses power, the OS crashes, a router restarts, the network link is poor, P2P download software starves the network of bandwidth, the Internet itself is unstable, and so on. From the program's perspective, however, they all fall into one of two cases:

1. The program can immediately detect the disconnection.

That is, when the client drops, the server thread that is reading from or writing to the corresponding TCP connection gets an exception. This case is relatively easy to handle: ESFramework raises the SomeOneDisconnected event of IUserManager to notify the application.

event CbGeneric<UserData> SomeOneDisconnected;

2. The program cannot immediately detect a disconnection.

As we all know, establishing a TCP connection takes a three-way handshake, and closing one takes a four-way handshake (the "four waves").

A dropped connection is usually no big deal in itself: once the four waves complete successfully, the server and the client each only need to do some cleanup.

The trouble starts when the connection breaks without any chance to complete the four waves (for example, the client's operating system crashes, or a network cable between the client machine and the server is unplugged). The server then believes the client is still online, and the client believes it is still connected to the server. This kind of misjudgment can cause real trouble. For example, the client sends a command to the server; the server never receives it and stays in a waiting state, while the client keeps waiting for a reply from the server. If other parts of the program decide what to do next based on the connection state, they will go wrong too, because the program has misjudged that state.

Clearly, the longer the wrong connection state persists, the greater the potential harm. Even if we take no extra measures, the server will eventually notice that the client has dropped, but by then several minutes or even tens of minutes may have passed, which is intolerable for most applications. The remedy is to help the program learn of a dropped TCP connection as early as possible.

First, we can call Socket.IOControl to set KeepAliveValues on the socket and so control the underlying TCP keep-alive mechanism. For example, we can have it probe every 2 seconds and throw an exception once probing has failed for more than 10 seconds.

// socket is the connected System.Net.Sockets.Socket instance
byte[] inOptionValues = FillKeepAliveStruct(1, 10000, 2000);
socket.IOControl(IOControlCode.KeepAliveValues, inOptionValues, null);
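The snippet above calls FillKeepAliveStruct without showing it. What follows is only a minimal sketch of what such a helper might look like, not the author's actual implementation. It assumes the three arguments map, in order, to the onoff, keepalivetime, and keepaliveinterval fields of the Windows tcp_keepalive structure (three 32-bit unsigned integers, times in milliseconds); the class name KeepAliveHelper and the parameter names are made up for illustration.

```csharp
using System;

// Hypothetical helper class; the name is an assumption, not an ESFramework type.
public static class KeepAliveHelper
{
    // Packs three 32-bit unsigned values in the layout of the Windows
    // tcp_keepalive structure: onoff, keepalivetime (ms), keepaliveinterval (ms).
    // Assumption: this is the argument order the article's call relies on.
    public static byte[] FillKeepAliveStruct(uint onOff, uint keepAliveTimeMs, uint keepAliveIntervalMs)
    {
        byte[] buffer = new byte[12]; // 3 fields x 4 bytes each
        BitConverter.GetBytes(onOff).CopyTo(buffer, 0);
        BitConverter.GetBytes(keepAliveTimeMs).CopyTo(buffer, 4);
        BitConverter.GetBytes(keepAliveIntervalMs).CopyTo(buffer, 8);
        return buffer;
    }
}
```

In that structure, keepalivetime is the idle time before the first probe is sent and keepaliveinterval is the gap between subsequent probes when no reply arrives. The KeepAliveValues control code is, as far as we know, specific to Windows sockets, so this approach should not be assumed to work on other platforms.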

In our experience this setting solves part of the problem, but some dropped connections still go unnoticed for well over 10 seconds. So this remedy alone is far from enough; we also need our own connection-state detection at the application layer, commonly known as a "heartbeat".

 

II. The "heartbeat" mechanism

The heartbeat mechanism is simple: the client sends a heartbeat message to the server every N seconds, and the server, on receiving it, sends the same heartbeat message back to the client. If the server or the client receives no message of any kind (heartbeat messages included) within M seconds (M > N), the heartbeat has timed out and the TCP connection is considered dropped.
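The server side of this rule can be sketched in a few lines. The class below is a hypothetical illustration, not ESFramework's HeartBeatChecker: it records when each client was last heard from and reports the clients that have been silent for longer than M seconds. The names HeartbeatMonitor, Touch, and CollectTimedOut are assumptions, and the current time is passed in explicitly so the logic is easy to follow and to test.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical server-side bookkeeping; not part of ESFramework.
public class HeartbeatMonitor
{
    private readonly Dictionary<string, DateTime> lastSeen = new Dictionary<string, DateTime>();
    private readonly TimeSpan timeout; // the M value from the text

    public HeartbeatMonitor(int timeoutSecs)
    {
        timeout = TimeSpan.FromSeconds(timeoutSecs);
    }

    // Call for every message received from a client, heartbeat or not.
    public void Touch(string userId, DateTime now)
    {
        lastSeen[userId] = now;
    }

    // Returns, and stops tracking, every client silent for longer than M seconds.
    public List<string> CollectTimedOut(DateTime now)
    {
        var expired = new List<string>();
        foreach (var pair in lastSeen)
        {
            if (now - pair.Value > timeout)
            {
                expired.Add(pair.Key);
            }
        }
        // Remove in a second pass; a dictionary must not be modified while enumerating it.
        foreach (string userId in expired)
        {
            lastSeen.Remove(userId);
        }
        return expired;
    }
}
```

A real server would call CollectTimedOut from a timer and close the connection of each returned client. The sketch is not thread-safe; a production version would need locking or a concurrent collection.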

Different applications are sensitive to dropped TCP connections to different degrees, so N and M can be set accordingly: the higher the required sensitivity, the smaller N and M should be, and the lower the sensitivity, the larger they can be. Higher sensitivity comes at a price, because heartbeat messages must be sent more often; when thousands of connections are all sending heartbeats at a high rate, the resources consumed cannot be ignored.

The quality of the network (its latency, for example) also affects the choice of N and M. If latency is high, the gap between N and M should be larger (for example, M three times N). Otherwise a misjudgment may occur: the TCP connection is still up, but because a heartbeat message arrives late we conclude that it has dropped.

ESFramework has a built-in heartbeat mechanism. When the heartbeat times out, the server will trigger the SomeOneTimeOuted event of IUserManager to notify our application.

UserManager performs heartbeat detection through ESBasic.Threading.Application.HeartBeatChecker, and the incluvespaninsecs property of HeartBeatChecker sets the M value described above.

The client uses ESPlus.Application.Basic.Passive.HeartBeater to send heartbeat messages to the server periodically. The DetectSpanInSecs property of HeartBeater sets the N value.

When you use the Rapid engine, the heartbeat components are assembled for you automatically. Because RapidServerEngine and RapidPassiveEngine do not expose HeartBeatChecker and HeartBeater, M and N cannot be set on those components directly. Instead, RapidServerEngine and RapidPassiveEngine provide the TimeoutSpanInSecs and HeartBeatSpanInSecs properties for setting M and N indirectly.

 

III. Dropped TCP connections must be closed

Whether the connection dropped normally (detected immediately) or through a heartbeat timeout (not detected immediately), the corresponding TCP connection must be closed to release system resources. The ITcpServerEngine interface provides the CloseOneConnection method for closing a given connection.

void CloseOneConnection(IUserAddress address);

On a normal disconnection, ITcpServerEngine closes the TCP connection automatically. On a heartbeat timeout, however, the connection has to be closed manually. Fortunately, if we use the components in the ESPlus.Application.Basic namespace, ESPlus closes timed-out connections for us. The Rapid engine is built on the components under ESPlus.Application.Basic, so anyone using the Rapid engine does not need to close such connections by hand.

Note that with the Rapid engine, when a connection times out the server first raises the SomeOneTimeOuted event of IUserManager and then raises its SomeOneDisconnected event (the latter is triggered when ESPlus calls the CloseOneConnection method).
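One practical consequence of this ordering is that, in the timeout case, handlers attached to both events will run for the same user. If both handlers perform cleanup, that cleanup should tolerate being invoked twice. The sketch below illustrates one way to do this under stated assumptions: SessionTable, Add, and OnUserGone are hypothetical names, users are identified by a plain string, and the actual wiring to the IUserManager events is left out because their exact delegate signatures are not shown here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical application-side session table; not an ESFramework type.
public class SessionTable
{
    private readonly HashSet<string> online = new HashSet<string>();

    // Number of times per-user cleanup actually ran (for illustration only).
    public int CleanupCount { get; private set; }

    public void Add(string userId)
    {
        online.Add(userId);
    }

    // Intended to be called from both the timeout and the disconnect notification.
    // HashSet.Remove returns false on the second call, so cleanup runs only once.
    public void OnUserGone(string userId)
    {
        if (online.Remove(userId))
        {
            CleanupCount++; // release per-user resources here
        }
    }
}
```

With this shape, it does not matter whether a user leaves through a normal disconnection (one notification) or a heartbeat timeout (two notifications): the resources are released exactly once.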

 

IV. UDP and "heartbeat"

The previous sections are about TCP disconnection. Let's take a look at UDP.

Because UDP is a connectionless protocol, an application built on the ESFramework UDP engine almost certainly needs a heartbeat mechanism. Heartbeat messages confirm that the client is still online, so that the server neither releases a live session by mistake nor keeps an expired session around for a long time.

The heartbeat components in ESFramework are protocol-independent, so they work for both TCP and UDP applications. However, there is no wrapper for UDP comparable to the Rapid engine, so if you develop with the ESFramework UDP engine you need to assemble the heartbeat components yourself.


 

 
