ZooKeeper, Part 1: Function and Essence

Source: Internet
Author: User
Tags: zookeeper

ZooKeeper (ZK) is no stranger to us. Yet many people's understanding of it is formulaic, and some even resort to memorizing parts of it. Once you grasp its essence, the concepts and features can be derived rather than memorized, and the cleverness of its design becomes apparent. Seeing through the phenomena to the essence greatly improves both technical skill and technical understanding. Let's look at the function and essence of ZK.

Definition and purpose of ZooKeeper

Let us first understand the official definition.

Apache ZooKeeper is an effort to develop and maintain an open-source server which enables highly reliable distributed coordination.

That is, Apache ZooKeeper is a project that develops and maintains an open-source server for highly reliable distributed coordination.

ZooKeeper is a high-performance coordination service for distributed applications. It exposes common services - such as naming, configuration management, synchronization, and group services - in a simple interface so you don't have to write them from scratch.

That is, ZooKeeper is a high-performance coordination service for distributed applications. It provides a set of common services, such as naming, configuration management, synchronization, and group services, behind a simple interface, so you don't have to implement them from scratch.

The essential function of ZooKeeper

From the official definition we know that ZK is a server that specializes in distributed coordination. Let's analyze how these functions are implemented at their core.

ZK stores and organizes its data as znodes. As in a standard file system, they form a tree whose root node is '/'.

Each node in the tree is a znode, similar to a file in a file system, and each znode can store at most 1 MB of data. A znode can be persistent or ephemeral.
An ephemeral znode is deleted by ZK when the session of the client that created it times out. This is somewhat like a file lock, which is released as soon as the process holding it exits unexpectedly.

ZK's data model resembles a file system, and there is nothing special about it. Take a key-value (KV) store: if the values are themselves required to be KV maps, you get the same model as ZK's. The tree format simply makes hierarchical relationships easier to express.
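
The data model described above can be sketched as a small in-memory tree. This is a toy model for illustration only, not the ZooKeeper API: the class `ToyZnodeTree`, its method names, and the session identifiers are all hypothetical.

```python
class ToyZnodeTree:
    """In-memory stand-in for a znode tree (not the ZooKeeper API)."""

    MAX_DATA_BYTES = 1024 * 1024  # per-znode data limit mentioned in the text

    def __init__(self):
        # path -> (data, owner_session); owner_session is None for persistent znodes
        self.nodes = {"/": (b"", None)}

    def create(self, path, data=b"", session=None):
        """Create a znode; passing a session makes it ephemeral."""
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self.nodes:
            raise KeyError(f"parent {parent} does not exist")
        if self.nodes[parent][1] is not None:
            raise ValueError("ephemeral znodes cannot have children")
        if path in self.nodes:
            raise KeyError(f"{path} already exists")
        if len(data) > self.MAX_DATA_BYTES:
            raise ValueError("data exceeds 1 MB")
        self.nodes[path] = (data, session)

    def children(self, path):
        """Return the names of the direct children of path."""
        prefix = path.rstrip("/") + "/"
        return sorted(
            p[len(prefix):]
            for p in self.nodes
            if p != "/" and p.startswith(prefix) and "/" not in p[len(prefix):]
        )

    def expire_session(self, session):
        """Delete every ephemeral znode owned by the expired session."""
        for p in [p for p, (_, owner) in self.nodes.items() if owner == session]:
            del self.nodes[p]


tree = ToyZnodeTree()
tree.create("/app", b"config-v1")              # persistent znode
tree.create("/app/lock", session="session-1")  # ephemeral znode
before = tree.children("/app")
tree.expire_session("session-1")               # simulated session timeout
after = tree.children("/app")
```

The persistent znode `/app` survives the session expiry, while the ephemeral `/app/lock` disappears with the session that created it.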

What makes ZK special is:

    1. Its internal leader election and data write mechanism. The node chosen by more than half of the nodes in the ZK cluster becomes the cluster's leader; it handles writes and synchronizes them to the other nodes, the followers. The underlying protocol is ZAB (ZooKeeper Atomic Broadcast).
    2. Ephemeral znodes. They make lock-style operations easy to implement and handle the timeout state in distributed processing.
    3. Watches. A client can set a watch on a znode; when the znode changes (its version number changes), ZK actively notifies the client that set the watch. This lets the client learn of znode changes, in order, without polling.

Naming, configuration management, synchronization, grouping, and other functions are all achieved by combining points 2 and 3. If we had to implement them in an in-house system, we could come up with these ideas ourselves, or implement them in a similar way.
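
As a rough illustration of how ephemeral znodes and watches combine, here is a toy group-membership service. It is a sketch under stated assumptions, not the ZooKeeper API: `ToyGroup` and its methods are hypothetical names. It models watches as one-shot notifications that carry no data, so the client re-reads the member list and re-registers its watch on every change.

```python
class ToyGroup:
    """Toy group membership: ephemeral members plus one-shot watches."""

    def __init__(self):
        self.members = {}   # member name -> owning session id
        self.watchers = []  # callbacks waiting for the next membership change

    def watch(self, callback):
        """Register a one-shot watch and return the current members."""
        self.watchers.append(callback)
        return sorted(self.members)

    def _notify(self):
        # Each watch fires once; clients must re-register to keep watching.
        fired, self.watchers = self.watchers, []
        for callback in fired:
            callback()

    def join(self, name, session):
        """Add an ephemeral member owned by the given session."""
        self.members[name] = session
        self._notify()

    def expire_session(self, session):
        """Remove every member owned by the expired session."""
        dead = [n for n, s in self.members.items() if s == session]
        for name in dead:
            del self.members[name]
        if dead:
            self._notify()


group = ToyGroup()
seen = []  # every membership view the client has observed


def on_change():
    # Re-read the members and re-arm the watch in one step.
    seen.append(group.watch(on_change))


seen.append(group.watch(on_change))  # initial read, sets the first watch
group.join("worker-a", session=1)
group.join("worker-b", session=2)
group.expire_session(1)              # worker-a's client disappears
```

The client never polls: it sees each membership change through a notification, and a crashed member is removed without any explicit cleanup on its part.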

The internal leader election and write mechanism (point 1) is different: it is not easy to come up with, and in practice can only be implemented by following the published papers. It is the part most worth studying: the method is distinctive, hard to invent, and hard to understand.

Differences from existing in-house systems

In our in-house systems, the component that provides ZK-like functionality is most like a configuration center (hereinafter CC).

A CC is typically implemented with one master and multiple slaves: the master node handles writes and the slave nodes are read-only. The master replicates to the slaves through a binlog, which guarantees eventual consistency.

The master node has two write paths:

1. The configuration table is updated through the configuration center's management console.

2. Clients report service status through the client API, updating each client node's load and health status.

Configuration updates are delivered to clients in the response packets to their heartbeats and change reports.

If a slave node crashes, the cluster's service is unaffected; its clients simply find another slave node to read from.
If the master node crashes, the CC serves reads only until the master is restored manually.

Impact: during the failure, configuration items and load information cannot be added or modified.

If CC is implemented with ZK, normal operation is the same as with the in-house CC. When the leader crashes, however, ZK re-elects a leader with its election algorithm, transparently to clients, which reduces the chance that writes stop because the master is down. On the other hand, if half or more of the nodes die, the entire ZK cluster becomes unavailable.

Comparison:

| | In-house CC | ZK |
|--------|--------|--------|
| Master selection | Manually configured; if the master dies the cluster becomes read-only and needs manual intervention to recover | Elected by negotiation within the cluster; service resumes automatically |
| Cluster completely unavailable when | All nodes are dead | Half or more of the nodes are dead (a network partition may also break ZK's internal synchronization even though the nodes themselves can still serve) |
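
The comparison can be made concrete with a small sketch. The two functions below are hypothetical simplifications of the behavior described in this section, not code from either system: the in-house CC stays readable as long as any node is alive, while ZK serves only while a strict majority of its nodes is alive.

```python
def inhouse_cc_status(master_alive, slaves_alive):
    """Single write master, read-only slaves, no automatic failover."""
    if master_alive:
        return "read-write"
    return "read-only" if slaves_alive > 0 else "unavailable"


def zk_status(total_nodes, alive_nodes):
    """ZK serves only while a strict majority of nodes is alive."""
    quorum = total_nodes // 2 + 1
    return "read-write" if alive_nodes >= quorum else "unavailable"


# Five machines in each deployment.
cc_master_down = inhouse_cc_status(master_alive=False, slaves_alive=4)
zk_one_down = zk_status(total_nodes=5, alive_nodes=4)
cc_three_down = inhouse_cc_status(master_alive=False, slaves_alive=2)
zk_three_down = zk_status(total_nodes=5, alive_nodes=2)
```

With one machine lost (the master or leader), ZK keeps accepting writes while the CC drops to read-only; with three of five lost, the CC can still serve reads while ZK cannot serve at all.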

ZK's leader election does not beat the one-master, multi-slave approach in every scenario:

    1. The algorithm is complex and not easy to understand or implement.
    2. For some critical workloads, when the master's writes fail, consistency requires manual intervention to recover; automatically switching to a newly elected master would lose data.
    3. Remedies tailored to the business scenario can reduce the risk of a single write master. For example, deploy several CC clusters that are written in parallel and all serve clients; small inconsistencies in node health and load between them are acceptable to the business. A cache can also be added on the business side to buy enough time to recover from the death of the master.

These are the reasons the in-house systems above did not adopt ZAB or Paxos. Now that ZK exists, a business that needs this capability can build its cluster leader election directly on ZK.

Precautions

When building services on ZK, keep in mind that the whole cluster becomes unavailable once half of its nodes are dead. Make sure the business has a fallback that lets it keep serving, as far as possible, if the ZK cluster stops serving. For example, when ZK acts as the configuration center, clients can keep a local cache or fall back to a default configuration.
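
One way to sketch the cache-or-default fallback is shown below. Everything here is an assumption for illustration: `CachedConfigClient` is a hypothetical wrapper, and `fetch` stands in for whatever call actually reads a value from ZK and raises `ConnectionError` when the cluster is unreachable.

```python
class CachedConfigClient:
    """Serve config from the coordination service, falling back to a local cache."""

    def __init__(self, fetch, defaults):
        self._fetch = fetch           # hypothetical callable that reads from ZK
        self._cache = dict(defaults)  # start from the built-in default config

    def get(self, key):
        try:
            self._cache[key] = self._fetch(key)  # refresh the cache on success
        except ConnectionError:
            pass  # service unreachable: keep serving the last known value
        return self._cache[key]


# Simulated backend: a dict plus a flag that marks the cluster as down.
store = {"timeout_ms": 500}
state = {"up": True}


def fetch(key):
    if not state["up"]:
        raise ConnectionError("coordination service unavailable")
    return store[key]


client = CachedConfigClient(fetch, defaults={"timeout_ms": 1000})
first = client.get("timeout_ms")   # read from the backend and cached
state["up"] = False                # the cluster loses its quorum
store["timeout_ms"] = 800          # a change the client cannot see
second = client.get("timeout_ms")  # served from the cache
```

The trade-off is staleness: while the cluster is down the client keeps serving the last value it saw, which is acceptable only if the business tolerates slightly outdated configuration.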

To conserve resources, a ZK cluster should have an odd number of machines: a cluster of 2n+2 machines tolerates no more failures than one of 2n+1. More machines is not always better, though, because a larger cluster has a significant impact on performance: write synchronization and leader election both become slower.
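
The arithmetic behind the odd-size recommendation is short enough to write out. This sketch assumes only the strict-majority rule described earlier; the function names are illustrative.

```python
def quorum_size(voters):
    """Smallest strict majority of the voting nodes."""
    return voters // 2 + 1


def tolerated_failures(voters):
    """Number of nodes that can fail while a quorum still remains."""
    return voters - quorum_size(voters)


# (cluster size, quorum size, tolerated failures) for 3 to 7 machines
rows = [(n, quorum_size(n), tolerated_failures(n)) for n in range(3, 8)]
for size, quorum, failures in rows:
    print(f"{size} machines: quorum {quorum}, tolerates {failures} failure(s)")
```

Going from 3 to 4 machines, or from 5 to 6, raises the quorum size without raising the number of failures the cluster can survive, so the extra machine adds write cost but no extra fault tolerance.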

Workarounds:

    1. Read-heavy, write-light: add observer nodes to scale read performance. Observers do not take part in voting or leader election; they only synchronize data from the leader.
    2. Read-light, write-heavy: partition into multiple sets according to the characteristics of the business, and scale out in parallel.

Summary

Through the ZAB protocol, ZooKeeper implements leader election and write synchronization within the cluster, which improves the robustness of the service and guarantees the ordering of writes. This is the hard part of the implementation, and rigorous mathematical reasoning lies behind it.

By providing ephemeral znodes and active change notifications (watches), ZK lets a client learn promptly of changes to the nodes it watches, and automatically releases a client's ephemeral nodes after the client disconnects from ZK. This makes lock-style services and update-monitoring requirements easy to implement. These are the basis for services such as naming, configuration management, synchronization, and grouping.

ZooKeeper, Part 1: Function and Essence
Www.owenzhang.net/blog/121.html
