ESFramework Development Manual (07) - Heartbeat Mechanism


The earlier articles described the infrastructure needed to develop with ESFramework, but that alone is not enough. To make good use of ESFramework, we also need some background knowledge. The heartbeat mechanism introduced in this article is one such topic: it is indispensable for communication systems that run under harsh Internet conditions.

A system that communicates over the Internet using TCP will run into a headache: disconnection. The problem of TCP disconnection is far more complex than we might imagine. The network topology is complex: the path from start node A to end node B may pass through N switches, routers, firewalls, and other hardware devices, each configured differently. Add the possible congestion and delay in the network, and handling disconnection in a program becomes very difficult.

1. TCP disconnection from the program's perspective

TCP disconnection can have many causes: the client's computer suddenly loses power, the OS crashes, a router restarts, the network connection is poor, P2P download software starves the network of resources, the Internet itself is unstable, and so on. From the program's perspective, however, all of them fall into two cases: disconnections the program can detect immediately, and disconnections it cannot.

The program immediately detects a disconnection.

That is, when the client drops, the server thread that is reading from or writing to the corresponding TCP connection throws an exception. This case is relatively easy to handle: ESFramework triggers the SomeOneDisconnected event of IUserManager to notify our application.

 
/// <summary>
/// This event is triggered when the client connection is closed. Do not subscribe to this event remotely.
/// </summary>
event CbGeneric<UserData, DisconnectedType> SomeOneDisconnected;
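
At the socket level, "immediately detectable" usually means that a read returns zero bytes or throws. The following is a minimal sketch that uses only the standard .NET Socket class and says nothing about ESFramework's internals; the names DisconnectDetectionSketch and ReadOrDetectDisconnect are hypothetical.

```csharp
using System;
using System.Net;
using System.Net.Sockets;

public static class DisconnectDetectionSketch
{
    // Performs one blocking read and reports whether the peer is gone.
    // Returns true when the connection should be treated as disconnected.
    public static bool ReadOrDetectDisconnect(Socket socket, byte[] buffer, out int bytesRead)
    {
        bytesRead = 0;
        try
        {
            bytesRead = socket.Receive(buffer);
            // Zero bytes means the peer closed the connection gracefully.
            return bytesRead == 0;
        }
        catch (SocketException)
        {
            // For example, the connection was reset by the peer.
            return true;
        }
        catch (ObjectDisposedException)
        {
            // The socket was already closed on our side.
            return true;
        }
    }
}
```

A server read loop could call this for each read and, when it returns true, close the socket and raise its own disconnection notification.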

The program cannot immediately detect a disconnection.

We all know that establishing a TCP connection takes a three-way handshake, and closing one takes a four-way handshake (the four "waves"). An ordinary disconnection is no big deal: once the four-way close completes successfully, the server and the client only need to do some cleanup.

The trouble starts when the connection breaks without any chance to complete the four-way close (for example, when the client's operating system crashes, or a network cable between the client and the server is unplugged). The server then believes the client is still online, and the client believes it is still connected as well. This misjudgment of the real state can lead to many failures. For example, the client sends a command that never arrives, so the server keeps waiting for it, while the client, assuming the server received the command, waits indefinitely for the reply. If other parts of the program act on the current connection state, they may go wrong too, because that state is wrong.

There is no doubt that the longer the misjudgment lasts, the greater the potential harm. Of course, even if we take no additional measures, the server will eventually notice that the client has dropped. By then, however, several minutes or even tens of minutes may have passed, which is intolerable for most applications. The remedy is therefore to help the program learn of a broken TCP connection as early as possible.

First, we can call the Socket.IOControl method with KeepAliveValues to control the underlying TCP keep-alive mechanism. For example, we can probe every 2 seconds and have an exception thrown once probing has failed for more than 10 seconds.

byte[] inOptionValues = FillKeepAliveStruct(1, 10000, 2000);
socket.IOControl(IOControlCode.KeepAliveValues, inOptionValues, null);
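
The FillKeepAliveStruct helper is not shown in this article. The sketch below is one plausible way to write it, assuming its three arguments map in order onto the values that the Windows keep-alive control code expects as three consecutive 32-bit integers: an on/off flag, the idle time before the first probe (in milliseconds), and the interval between unanswered probes (in milliseconds). The class name KeepAliveSketch, the method EnableKeepAlive, and the parameter names are assumptions.

```csharp
using System;
using System.Net.Sockets;

public static class KeepAliveSketch
{
    // Hypothetical reconstruction of FillKeepAliveStruct: packs three unsigned
    // 32-bit values into a 12-byte buffer in the machine's native byte order.
    public static byte[] FillKeepAliveStruct(uint onOff, uint keepAliveTimeMs, uint keepAliveIntervalMs)
    {
        byte[] buffer = new byte[12];
        BitConverter.GetBytes(onOff).CopyTo(buffer, 0);               // 1 = enable keep-alive
        BitConverter.GetBytes(keepAliveTimeMs).CopyTo(buffer, 4);     // idle time before first probe
        BitConverter.GetBytes(keepAliveIntervalMs).CopyTo(buffer, 8); // gap between unanswered probes
        return buffer;
    }

    // Applies the values used in the article to an existing socket.
    public static void EnableKeepAlive(Socket socket)
    {
        byte[] inOptionValues = FillKeepAliveStruct(1, 10000, 2000);
        socket.IOControl(IOControlCode.KeepAliveValues, inOptionValues, null);
    }
}
```

IOControlCode.KeepAliveValues is a Windows-specific control code, so the IOControl call may not be supported on other platforms.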

In our experience this setting solves part of the problem, but some disconnections are still noticed only well after 10 seconds. So this remedy is far from enough. We also need our own connection-state detection at the application layer, which is what is commonly called a "heartbeat".

2. "Heartbeat" Mechanism

The heartbeat mechanism is simple: the client sends a heartbeat message to the server every N seconds, and the server echoes the same heartbeat message back to the client. If the server or the client receives no message of any kind (heartbeats included) within M seconds (M > N), the heartbeat has timed out and the TCP connection is considered disconnected.
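
The timeout rule above can be sketched as server-side bookkeeping: record the time of the last message from each client, and periodically collect the clients that have been silent for more than M seconds. This is an illustrative sketch, not ESFramework's own heartbeat checker; the class HeartbeatMonitor and its members are hypothetical, and the current time is passed in explicitly so that the logic is easy to test.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical bookkeeping for the "no message within M seconds" rule.
public class HeartbeatMonitor
{
    private readonly TimeSpan timeout; // the M value
    private readonly Dictionary<string, DateTime> lastSeen = new Dictionary<string, DateTime>();
    private readonly object locker = new object();

    public HeartbeatMonitor(int timeoutInSecs)
    {
        if (timeoutInSecs <= 0) throw new ArgumentOutOfRangeException(nameof(timeoutInSecs));
        timeout = TimeSpan.FromSeconds(timeoutInSecs);
    }

    // Call for every message received from a client, heartbeat or not.
    public void MessageReceived(string userId, DateTime now)
    {
        lock (locker) { lastSeen[userId] = now; }
    }

    // Returns, and stops tracking, the clients silent for longer than M.
    public List<string> CollectTimedOut(DateTime now)
    {
        var timedOut = new List<string>();
        lock (locker)
        {
            foreach (KeyValuePair<string, DateTime> pair in lastSeen)
            {
                if (now - pair.Value > timeout) { timedOut.Add(pair.Key); }
            }
            foreach (string id in timedOut) { lastSeen.Remove(id); }
        }
        return timedOut;
    }
}
```

In a real server, a timer would call CollectTimedOut at a fixed interval and close the connection of every client it returns.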

Because applications differ in how sensitive they are to TCP disconnection, N and M can be chosen per application. The higher the required sensitivity, the smaller N and M should be; the lower the required sensitivity, the larger they can be. Higher sensitivity has a cost: heartbeat messages must be sent more often. If thousands of connections send heartbeats frequently at the same time, the resources consumed cannot be ignored.

The quality of the network (its latency, for example) also affects the choice of N and M. When latency is high, the gap between N and M should be larger (for example, M = 3N). Otherwise a misjudgment may occur: the TCP connection is still alive, but a heartbeat message arrives late because of the delay, and we wrongly conclude that the connection has dropped.

ESFramework has a built-in heartbeat mechanism. When a heartbeat times out, the server triggers the SomeOneTimeOuted event of IUserManager to notify our application.

On the server side, UserManager detects heartbeat timeouts through ESBasic.Threading.Application.HeartBeatChecker, whose SurviveSpanInSecs property sets the M value described above.

On the client side, ESPlus.Application.Basic.Passive.HeartBeater sends heartbeat messages periodically; its DetectSpanInSecs property sets the N value.

When we use the rapid engine, it has already assembled the heartbeat components for us. Because RapidServerEngine and RapidPassiveEngine do not expose HeartBeatChecker and HeartBeater, we cannot set M and N on those components directly. Instead, RapidServerEngine provides the HeartbeatTimeoutInSecs property and RapidPassiveEngine provides the HeartBeatSpanInSecs property, which set M and N indirectly.

3. The dropped TCP connection must be closed.

Whether the disconnection is an ordinary one (detected immediately) or a heartbeat timeout (not detected immediately), the corresponding TCP connection must be closed to release system resources. The ITcpServerEngine interface provides the CloseOneConnection method for closing the target connection.

 
/// <summary>
/// The SomeOneDisconnected event is triggered when the connection is closed.
/// </summary>
void CloseOneConnection(UserAddress address, DisconnectedType disconnectedType);

For an ordinary disconnection, ITcpServerEngine closes the TCP connection automatically. On a heartbeat timeout, however, the connection has to be closed manually. Fortunately, the components in the ESPlus.Application.Basic namespace automatically close connections that have timed out. So when we use the rapid engine, we do not need to close timed-out TCP connections ourselves.

In addition, when a TCP connection times out, a server that uses the rapid engine first triggers the SomeOneTimeOuted event of IUserManager and then its SomeOneDisconnected event (the latter is triggered because ESPlus calls the CloseOneConnection method).

4. UDP and "Heartbeat"

The previous sections are about TCP disconnection. Let's take a look at UDP.

Because UDP is a connectionless protocol, we almost certainly need a heartbeat mechanism when we use the ESFramework UDP engine. Heartbeat messages confirm that the client is still online, so that the server neither releases a live session nor keeps an expired session for a long time.

The heartbeat components in ESFramework are protocol-independent, so they can be used in both TCP and UDP applications.

If the P2P channel described in ESFramework Development Manual (04) - Reliable P2P is based on UDP, ESPlus also starts a heartbeat mechanism internally, so that it notices as early as possible when the UDP-based P2P channel breaks and can close the channel.

5. Disabling the heartbeat mechanism

In some cases the heartbeat mechanism can be disabled. For example, in a distributed system that communicates within a LAN, network latency is low and unexpected disconnections are rare, so the heartbeat is unnecessary. As another example, when we debug a client program at a breakpoint, the server would decide that the client's heartbeat has timed out and drop it; here too the heartbeat mechanism can be disabled. To disable it:

    • Set the HeartBeatSpanInSecs property of RapidPassiveEngine to 0. The client then stops sending periodic heartbeat messages.
    • Set the HeartbeatTimeoutInSecs property of RapidServerEngine to a value less than or equal to 0. The server then stops checking for heartbeat timeouts.

 

