Curator Source Code Analysis (IV): Existing Connection Problems in ZooKeeper


Curator's connection handling is said to be very good. Before analyzing Curator's connection and retry mechanisms, I want to first understand the connection problems in the native ZooKeeper client.

What follows is a summary of what I found. If you republish it, please credit the source: Jiq Technical Blog

Curator offers a high-level API that simplifies the use of ZooKeeper, but more importantly it encapsulates the complexity of managing connections to a ZooKeeper cluster and of retrying operations. We will examine in detail what Curator does in this area, but before that we need to understand how ZooKeeper itself currently handles connections.



Session creation

When you create a ZooKeeper object, the client establishes a session with a ZooKeeper server. (The ZooKeeper object is thread safe, so multiple threads can share one instance.) The session has a timeout, that is, an expiration time, which is passed in through the ZooKeeper constructor. The client keeps sending heartbeats to the server (the heartbeat interval is derived from the session timeout) to keep both the connection and the session alive.
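To make the relationship between the session timeout and the heartbeat concrete, here is a minimal sketch in plain Java with no ZooKeeper dependency. The class SessionTiming and its method names are hypothetical. The ratios used (read timeout at two thirds of the session timeout, heartbeat at half the read timeout) are my assumption, based on the behaviour of the 3.4.x Java client; check the ClientCnxn source of your version before relying on them.

```java
// Hypothetical model, not ZooKeeper code: derives client-side timers
// from the session timeout that is passed to the ZooKeeper constructor.
final class SessionTiming {
    private final int sessionTimeoutMs;

    SessionTiming(int sessionTimeoutMs) {
        if (sessionTimeoutMs <= 0) {
            throw new IllegalArgumentException("session timeout must be positive");
        }
        this.sessionTimeoutMs = sessionTimeoutMs;
    }

    // Assumption: the client treats the server as unreachable after
    // 2/3 of the session timeout without hearing from it.
    int readTimeoutMs() {
        return sessionTimeoutMs * 2 / 3;
    }

    // Assumption: an idle client sends a heartbeat after half the read timeout.
    int heartbeatIntervalMs() {
        return readTimeoutMs() / 2;
    }

    // Server-side view: the session is valid while the last heartbeat
    // is more recent than the session timeout.
    boolean isAlive(long lastHeartbeatMs, long nowMs) {
        return nowMs - lastHeartbeatMs < sessionTimeoutMs;
    }
}
```

Under these assumptions, a 30 second session timeout gives a heartbeat roughly every 10 seconds, so several heartbeats can be lost before the session is at risk.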

Connection lost

CONNECTION_LOSS occurs if the network between client and server is cut, if the ZooKeeper server the client is connected to goes down, or if the connection carrying the session has not yet been established. All of the client's watchers receive a Disconnected event, and the client's connection state changes from CONNECTED to CONNECTING.

Automatic re-connect

The client library automatically picks a server from the list of ZooKeeper servers and tries to reconnect to it. There are three possible outcomes:

A. If a TCP connection to a server is established before the session timeout has elapsed, the client receives a SyncConnected event and its connection state changes back to CONNECTED. The connection is back to normal, and neither the ephemeral nodes nor the registered watches are deleted. Even when the reconnection completes quickly, two events are delivered: Disconnected, then SyncConnected.

B. If no TCP connection to a server can be established for a long time, the client stays disconnected. It never receives the Expired event, only the Disconnected event, because the Expired event is generated by the server and can only be delivered once a connection exists.

C. If a TCP connection to a server is established, but only after the session timeout has already elapsed, the client receives an Expired event indicating that the session has been terminated (SESSION_EXPIRED). By this point the server has deleted every watch the client registered and every ephemeral node it created. The ZooKeeper handle held by the client is closed, and the only thing left to do is to construct a new ZooKeeper object. A watcher that experiences SESSION_EXPIRED sees the following state transitions:

'connected': the session is established and the client communicates normally with the ZooKeeper cluster

.... the client is partitioned from the cluster

'disconnected': the client has lost its connection to the ZooKeeper cluster

.... time elapses; once the 'timeout' period has passed, the cluster expires the session. The client sees nothing, because it is disconnected.

.... time elapses; the client regains network connectivity to the ZooKeeper cluster

'expired': the client finally reconnects to the cluster and is notified that the session has expired
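The three reconnection outcomes (A, B and C) and the transition sequence above can be modelled as a small state machine. This is a sketch in plain Java, not ZooKeeper code: ConnState, SessionEvent and SessionModel are hypothetical names, and for simplicity the model starts the session timer at the moment of disconnection, whereas a real server counts from the last heartbeat it received.

```java
import java.util.ArrayList;
import java.util.List;

enum ConnState { CONNECTED, CONNECTING, CLOSED }

enum SessionEvent { SYNC_CONNECTED, DISCONNECTED, EXPIRED }

// Hypothetical model of what a watcher observes on one client handle.
final class SessionModel {
    private final long sessionTimeoutMs;
    private final List<SessionEvent> events = new ArrayList<>();
    private ConnState state = ConnState.CONNECTED;
    private long disconnectedAtMs = -1;

    SessionModel(long sessionTimeoutMs) {
        this.sessionTimeoutMs = sessionTimeoutMs;
    }

    // The TCP connection drops: the watcher sees Disconnected at once.
    void connectionLost(long nowMs) {
        if (state != ConnState.CONNECTED) return;
        state = ConnState.CONNECTING;
        disconnectedAtMs = nowMs;
        events.add(SessionEvent.DISCONNECTED);
    }

    // A TCP connection to some server succeeds. Only now can the server
    // tell the client whether the session survived (A) or expired (C).
    // If this is never called, the client stays in CONNECTING (case B).
    void tcpReconnected(long nowMs) {
        if (state != ConnState.CONNECTING) return;
        if (nowMs - disconnectedAtMs < sessionTimeoutMs) {
            state = ConnState.CONNECTED;          // case A
            events.add(SessionEvent.SYNC_CONNECTED);
        } else {
            state = ConnState.CLOSED;             // case C: handle is unusable
            events.add(SessionEvent.EXPIRED);
        }
    }

    ConnState state() { return state; }

    List<SessionEvent> events() { return new ArrayList<>(events); }
}
```

With a 30 second timeout, a disconnection at t=1s followed by a reconnection at t=5s yields DISCONNECTED then SYNC_CONNECTED, while a reconnection at t=40s yields DISCONNECTED then EXPIRED and leaves the model in CLOSED.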

Handling connection loss:

CONNECTION_LOSS means that the TCP connection between client and server was broken; it does not mean that the request failed. If a create request is in flight and the connection breaks after the request has reached the server but before the response is returned, the create succeeds. If the connection breaks before the packet goes onto the wire, the create fails. Unfortunately, after CONNECTION_LOSS the client has no way of knowing which of the two happened. The developer has to determine whether the operation succeeded and whether it needs to be retried, for example by checking whether the corresponding znode exists or whether its value has been modified.
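The check-then-retry approach described above can be sketched as follows. This is plain Java against an in-memory stand-in, not the ZooKeeper API: FlakyStore, ConnectionLossException and SafeCreate are hypothetical names, and the exception is unchecked here only to keep the example short (ZooKeeper's own KeeperException is checked).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the CONNECTION_LOSS error.
class ConnectionLossException extends RuntimeException { }

// Hypothetical in-memory server that can drop a request or a response.
final class FlakyStore {
    private final Map<String, String> nodes = new HashMap<>();
    private boolean loseNextRequest;
    private boolean loseNextResponse;

    void loseNextRequest() { loseNextRequest = true; }
    void loseNextResponse() { loseNextResponse = true; }

    void create(String path, String data) {
        if (loseNextRequest) {            // packet never reached the server
            loseNextRequest = false;
            throw new ConnectionLossException();
        }
        if (nodes.containsKey(path)) {
            throw new IllegalStateException("node exists: " + path);
        }
        nodes.put(path, data);            // the write is applied...
        if (loseNextResponse) {           // ...but the reply is lost
            loseNextResponse = false;
            throw new ConnectionLossException();
        }
    }

    String get(String path) { return nodes.get(path); } // null if absent
}

final class SafeCreate {
    // Returns the number of create calls issued before success was confirmed.
    static int createWithCheck(FlakyStore store, String path, String data, int maxAttempts) {
        if (maxAttempts < 1) throw new IllegalArgumentException("maxAttempts must be >= 1");
        ConnectionLossException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                store.create(path, data);
                return attempt;
            } catch (ConnectionLossException e) {
                last = e;
                // The outcome is unknown: look at the node before retrying.
                if (data.equals(store.get(path))) return attempt;
            }
        }
        throw last;
    }
}
```

Note the limits of this sketch: finding a node with the expected data does not prove that this client created it (another client could have), and against a real cluster the existence check can itself fail with CONNECTION_LOSS.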


Handling session expiration:

SESSION_EXPIRED automatically closes the ZooKeeper handle. In a correctly operating ZooKeeper cluster, session expiration should be rare, but it is bound to occur when the client's connection is forcibly cut for longer than the session timeout, because the server then considers the client dead. So what should be done when the session does expire?
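One possible answer at the application level is to remember what the session owned, build a new handle on expiration, and recreate that state. The sketch below shows the idea in plain Java with an in-memory stand-in; FakeHandle and RecoveringClient are hypothetical names, and nothing here is claimed to be what Curator does.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical stand-in for a ZooKeeper handle and its server-side session.
final class FakeHandle {
    private final Set<String> ephemerals = new LinkedHashSet<>();
    private boolean closed;

    void createEphemeral(String path) {
        if (closed) throw new IllegalStateException("handle closed");
        ephemerals.add(path);
    }

    // Session expiration: the server drops the ephemeral nodes
    // and the handle becomes unusable.
    void expire() {
        closed = true;
        ephemerals.clear();
    }

    boolean isClosed() { return closed; }

    Set<String> ephemerals() { return Collections.unmodifiableSet(ephemerals); }
}

// Remembers which ephemeral nodes the application wants to exist,
// so they can be recreated on a fresh handle.
final class RecoveringClient {
    private final Supplier<FakeHandle> factory;
    private final Set<String> wanted = new LinkedHashSet<>();
    private FakeHandle handle;

    RecoveringClient(Supplier<FakeHandle> factory) {
        this.factory = factory;
        this.handle = factory.get();
    }

    void createEphemeral(String path) {
        handle.createEphemeral(path);
        wanted.add(path);
    }

    // Call this when the watcher receives the Expired event.
    void onExpired() {
        handle = factory.get();               // the old handle cannot be reused
        for (String path : wanted) {
            handle.createEphemeral(path);     // restore the lost ephemeral nodes
        }
    }

    FakeHandle handle() { return handle; }
}
```

Watches would need the same treatment: keep a record of each registration and register it again on the new handle. Whether automatic recreation is safe depends on the application, since other clients may already have reacted to the ephemeral nodes disappearing.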

The next article will analyze whether Curator has solved these problems:

(1) Handling of CONNECTION_LOSS during a request: whether Curator can tell if the request was executed successfully, and whether it retries when it was not.

(2) Handling of SESSION_EXPIRED: whether the ephemeral nodes and registered watches can be kept from being lost.

