ZooKeeper Architecture Design and Key Points of Application

Source: Internet
Author: User

Question guidance:

1. What is the data model of ZooKeeper?
2. What are the pitfalls of ZooKeeper applications?
3. What is stored in each node (znode)?
4. What State structure does a znode maintain?
5. What is the structure of a znode?
6. What is the watch mechanism?
7. What are ZooKeeper's four built-in ACL schemes?






Preface

ZooKeeper is an open-source distributed service framework. It originated as a sub-project of the Apache Hadoop project and is mainly used to solve common problems in distributed application scenarios, such as unified naming services, state synchronization services, cluster management, and configuration management for distributed applications. It supports standalone and distributed modes. In distributed mode it provides high-performance, reliable coordination services for distributed applications, and it greatly simplifies the implementation of distributed coordination, which considerably reduces the cost of building such applications.



Overall architecture




A ZooKeeper cluster consists of a group of server nodes. One node in the group takes the role of leader, and all the others are followers. When a client connects to the cluster and issues write requests, those requests are sent to the leader, and the resulting data changes on the leader are then synchronized to the followers. On receiving a change request, the leader first writes the change to local disk so that it can be recovered; only after the write has been persisted to disk is the change applied in memory.

ZooKeeper uses a custom atomic messaging protocol, which guarantees the consistency of node data and state across the whole coordination system. Based on this protocol, each follower keeps its local ZooKeeper data in sync with the leader and can then serve clients independently from its local storage.

When the leader fails, failover is fast: the messaging layer elects a new leader, which takes over as the center of the cluster, processes client write requests, and broadcasts data changes to the followers.
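
The write path described above (persist first, apply in memory second, then propagate to followers) can be illustrated with a small in-process model. This is only a sketch under simplifying assumptions: the names Server and Ensemble are invented for illustration, the "disk" is a Python list, and the real protocol's acknowledgements and leader election are not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """One ensemble member: a log (standing in for disk) plus in-memory state."""
    name: str
    log: list = field(default_factory=list)      # stands in for the on-disk transaction log
    memory: dict = field(default_factory=dict)   # the in-memory data

    def persist(self, txn):
        self.log.append(txn)                     # step 1: record the change durably

    def apply(self, txn):
        path, data = txn
        self.memory[path] = data                 # step 2: only then update memory


class Ensemble:
    """Hypothetical wrapper that routes writes through the leader."""

    def __init__(self, leader, followers):
        self.leader = leader
        self.followers = followers

    def write(self, path, data):
        # The leader handles the change first, then the followers receive it.
        txn = (path, data)
        for server in [self.leader, *self.followers]:
            server.persist(txn)
            server.apply(txn)

    def read(self, server, path):
        # Any server can answer a read from its own local copy.
        return server.memory.get(path)
```

After ens.write("/config", b"v1"), a read served by any follower returns the same bytes, and every server's log holds the transaction, which is the property the paragraph above relies on.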

 

ZooKeeper's design balances the following four goals. Each goal and the features that follow from it are described in detail below:

 

Simple

The processes of a distributed application coordinate with one another through ZooKeeper's namespace, which is shared and hierarchical. More importantly, its structure is simple enough to be understood easily, much like the directory structure of an ordinary file system.



In ZooKeeper, each node in the namespace is called a znode. Each znode has a path and associated metadata, as well as a list of its child nodes. Unlike a traditional file system, ZooKeeper keeps its data in memory, which gives the distributed synchronization service high throughput and low latency. The following points apply to the ZooKeeper data model:

 

1. Each node (znode) stores coordination-related data such as status information, configuration content, and location information. This is the original design intent of ZooKeeper, and the amount of data is small, on the order of bytes to kilobytes.
2. A znode maintains a State structure that includes version numbers, ACL changes, and timestamps. Each time the znode's data changes, the version number increases, and a client that reads the data also receives the version it corresponds to.
3. Each znode has an ACL that restricts who may access it.
4. Within a namespace, reads and writes of the data stored at a znode are atomic.
5. A client can set a watch on a znode. When the znode changes, ZooKeeper notifies the client, which triggers the logic attached to the watch.
6. Each client establishes a session when it connects to ZooKeeper. During its lifetime a session can be in one of three states: connecting, connected, or closed.
7. ZooKeeper supports ephemeral nodes, which are tied to sessions: when the session that created an ephemeral node ends, the node is deleted.

Redundant

ZooKeeper is designed as a replicated cluster. The data of each node is copied and propagated within the cluster so that all nodes hold synchronized, consistent data, which gives the service its reliability and availability. As mentioned above, ZooKeeper keeps data in memory to improve performance; to avoid a single point of failure, replicating that data for redundant storage is essential.
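
Of the znode properties listed in this section, the link between ephemeral nodes and sessions is perhaps the easiest to misread, so a minimal sketch may help. TinyStore and its methods are hypothetical names for a toy in-memory store, not part of any ZooKeeper client.

```python
class TinyStore:
    """Toy in-memory store that tracks which session owns each ephemeral node."""

    def __init__(self):
        self.nodes = {}        # path -> data
        self.owners = {}       # path -> session id, for ephemeral nodes only

    def create(self, path, data, session_id=None):
        self.nodes[path] = data
        if session_id is not None:          # ephemeral: remember the owning session
            self.owners[path] = session_id

    def close_session(self, session_id):
        """Delete every ephemeral node created by the session that just ended."""
        doomed = [p for p, owner in self.owners.items() if owner == session_id]
        for path in doomed:
            del self.nodes[path]
            del self.owners[path]
        return sorted(doomed)
```

In this model, a worker that registers itself under an ephemeral path disappears from the namespace as soon as its session ends, while ordinary nodes and nodes owned by other sessions are untouched.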

 

Ordered

ZooKeeper stamps every transaction that changes state with a value that reflects its position in the overall order; in other words, the stamps put a group of transactions in a definite order. Building on this property, ZooKeeper can support higher-level abstractions such as synchronization primitives.

 

Fast

ZooKeeper serves two kinds of operations, reads and writes. For a distributed application built on ZooKeeper, performance is best in read-dominant scenarios, where reads are much more common than writes.



Data model

ZooKeeper has a hierarchical namespace that resembles the directory structure of a file system and is simple and intuitive. The znode, described earlier, is its most important concept. The components related to znodes are watches, ACLs, ephemeral nodes, and sequence nodes.
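
The hierarchical namespace can be pictured as a tree in which every node carries data as well as children. The following is a minimal sketch of that idea only; the classes Znode and Namespace and their methods are invented for illustration and are much simpler than a real server's data tree.

```python
class Znode:
    """A node in the hierarchical namespace: data plus named children."""

    def __init__(self, data=b""):
        self.data = data
        self.children = {}


class Namespace:
    def __init__(self):
        self.root = Znode()

    def _walk(self, path):
        node = self.root
        for part in [p for p in path.split("/") if p]:
            node = node.children[part]      # KeyError if a component is missing
        return node

    def create(self, path, data=b""):
        parent_path, _, name = path.rpartition("/")
        parent = self._walk(parent_path)    # the parent must already exist
        if name in parent.children:
            raise ValueError(f"{path} already exists")
        parent.children[name] = Znode(data)

    def get_data(self, path):
        return self._walk(path).data

    def get_children(self, path):
        return sorted(self._walk(path).children)
```

Unlike a file system, there is no distinction here between "files" and "directories": /app can hold data of its own and still have children such as /app/config.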



Znode structure

In ZooKeeper, a zxid (ZooKeeper transaction ID) identifies each change to node data. Every zxid acts as a stamp on one change, so the transactions behind different changes are ordered: a change with a smaller zxid happened before a change with a larger one. The structure of a znode is as follows:

 

czxid - The zxid of the change that caused this znode to be created.
mzxid - The zxid of the change that last modified this znode.
ctime - The time in milliseconds from epoch when this znode was created.
mtime - The time in milliseconds from epoch when this znode was last modified.
version - The number of changes to the data of this znode.
cversion - The number of changes to the children of this znode.
aversion - The number of changes to the ACL of this znode.
ephemeralOwner - The session ID of the owner of this znode if the znode is an ephemeral node. If it is not an ephemeral node, it will be zero.
dataLength - The length of the data field of this znode.
numChildren - The number of children of this znode.
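
To make the field list concrete, here is a small sketch of how these fields might evolve when a znode is created and its data is then changed. The field names follow the list above; the helper functions on_create and on_set_data are hypothetical and only model the fields touched by those two operations.

```python
import time
from dataclasses import dataclass


@dataclass
class Stat:
    czxid: int = 0
    mzxid: int = 0
    ctime: int = 0
    mtime: int = 0
    version: int = 0
    cversion: int = 0
    aversion: int = 0
    ephemeralOwner: int = 0
    dataLength: int = 0
    numChildren: int = 0


def on_create(zxid, data, session_id=0):
    """Stat of a freshly created znode; session_id is non-zero only for ephemeral nodes."""
    now = int(time.time() * 1000)           # milliseconds since the epoch
    return Stat(czxid=zxid, mzxid=zxid, ctime=now, mtime=now,
                ephemeralOwner=session_id, dataLength=len(data))


def on_set_data(stat, zxid, data):
    """Update the fields that a data change touches."""
    stat.mzxid = zxid
    stat.mtime = int(time.time() * 1000)
    stat.version += 1
    stat.dataLength = len(data)
    return stat
```

Note that czxid and ctime never change after creation, whereas mzxid, mtime, version, and dataLength move with every data update.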



Watches

A watch in ZooKeeper is triggered only once. That is, if a client sets a watch on a znode and the znode's data changes, ZooKeeper sends a change notification to the client and the watch event fires. If the data changes again and the client has not set a new watch after receiving the first notification, ZooKeeper sends no further notification.

ZooKeeper notifies watching clients asynchronously. It guarantees, however, that the notification is sent only after the change to the znode has taken effect, and that the client sees the watch event before it sees the changed data. Because of network latency, different clients may see a change at different times, but all of them see changes in the same order.

A znode can carry two kinds of watch: data watches (triggered by a change to the znode's data) and child watches (triggered by a change to the znode's children). The getData() and exists() methods set data watches, and the getChildren() method sets child watches. Calling setData() triggers the data watches registered on the znode. Calling create() to create a znode triggers the data watches of that znode, and calling create() to create a child of a znode triggers the child watches of the parent. Calling delete() on a znode triggers both its data watches and its child watches and, if the deleted znode has a parent, the child watches of the parent.

Finally, while a client is disconnected from the ZooKeeper server it receives no watch notifications; it can receive them again only after it has re-established the connection.
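
The one-shot behaviour of watches can be sketched in a few lines. WatchedStore is a hypothetical toy class that models data watches only; its get_data and set_data methods imitate the roles of getData() and setData() and are not the real client API.

```python
from collections import defaultdict


class WatchedStore:
    """Toy store with one-shot data watches."""

    def __init__(self):
        self.data = {}
        self.data_watches = defaultdict(list)   # path -> callbacks waiting to fire

    def get_data(self, path, watch=None):
        # Reading is the moment at which a watch can be registered.
        if watch is not None:
            self.data_watches[path].append(watch)
        return self.data.get(path)

    def set_data(self, path, value):
        self.data[path] = value
        # Remove the callbacks before firing them: each is delivered at most once.
        for callback in self.data_watches.pop(path, []):
            callback(path)
```

A client that wants to keep following a znode therefore has to read it again, and register a fresh watch, each time a notification arrives.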



Sequence nodes

When creating a znode, you can ask ZooKeeper to append a counter to the path name that you supply as a prefix. For the prefix qn-, for example, this produces a sequence such as qn-0000000001, qn-0000000002, qn-0000000003, qn-0000000004, qn-0000000005, qn-0000000006. The counter is unique with respect to the parent znode, and its maximum value is 2147483647.
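
The naming scheme can be reproduced with a per-parent counter and zero padding. This is a sketch only: SequenceParent is a hypothetical class, and the counter is started so that the first name matches the example above rather than to mirror any particular server's starting value.

```python
MAX_COUNTER = 2147483647   # largest value of a signed 32-bit integer


class SequenceParent:
    """Hands out sequential child names under one parent znode."""

    def __init__(self):
        self.counter = 0

    def next_name(self, prefix):
        self.counter += 1
        if self.counter > MAX_COUNTER:
            raise OverflowError("sequence counter exhausted")
        return f"{prefix}{self.counter:010d}"   # zero-padded to 10 digits
```

Because the counter is zero-padded to a fixed width, sorting the names as strings gives the same order as the order of creation, which is what queue and lock recipes built on sequence nodes depend on.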



ACLs (Access Control Lists)

ACLs control access to the nodes of ZooKeeper. An ACL applies only to the specific znode it is set on, not to the children of that znode. There are five permissions:

 

CREATE - allows the creation of child nodes.
READ - allows getting the znode's data and its list of children.
WRITE - allows modifying the znode's data.
DELETE - allows deleting a child node.
ADMIN - allows setting permissions.
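
The five permissions combine naturally as bit flags. In the sketch below the numeric values are chosen to match the constants of the Java client (ZooDefs.Perms) as far as I am aware; the Perm class and the allowed helper are names introduced here purely for illustration.

```python
from enum import IntFlag


class Perm(IntFlag):
    READ = 1
    WRITE = 2
    CREATE = 4
    DELETE = 8
    ADMIN = 16


def allowed(granted, needed):
    """True if every bit in `needed` is present in `granted`."""
    return (granted & needed) == needed
```

For instance, an entry granting Perm.READ | Perm.WRITE | Perm.CREATE lets a client read and update a znode and create children under it, but not delete those children or change the ACL.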

 

ZooKeeper has four built-in ACL schemes:

 

world - has a single ID, which represents anyone.
auth - uses no ID; it represents any authenticated user.
digest - uses a username:password string to generate an MD5 hash, which is then used as the authentication ID.
ip - uses the client host's IP address as the authentication ID.

 

ZooKeeper sessions

A session is established when a client connects to the ZooKeeper cluster. The session state changes over its lifetime as follows.



While a connection is being established, the session state is connecting. Once the connection has been established successfully, the state changes to connected. As long as the session is healthy, its state can only be connecting or connected. When the session ends, for example because the client closes it or the session expires, the state changes to closed.
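
The session life cycle can be written down as a small state machine. The transition table below is an assumption made for illustration: in particular, the move from connected back to connecting (a dropped connection that the client tries to re-establish) is not spelled out in the text above, and the Session class is a hypothetical name.

```python
ALLOWED = {
    "connecting": {"connected", "closed"},
    "connected": {"connecting", "closed"},   # assumed: a dropped link goes back to connecting
    "closed": set(),                         # terminal: a closed session cannot be revived
}


class Session:
    def __init__(self):
        self.state = "connecting"

    def move_to(self, new_state):
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state
```

The important design point is that closed is terminal: once a session reaches it, the application needs a new session rather than a reconnect.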



Application traps

ZooKeeper is not a suitable basis for coordination services in every distributed application. Based on the documentation that ZooKeeper provides, the situations in which problems arise during use, and how to deal with them, are summarized below:

 

Change notifications on a znode are lost

After a client connects to the ZooKeeper server, a TCP connection is maintained between them. While in the connected state, the client can set a watch on a znode and receive notifications of changes to that node (which then trigger some application logic). However, if the client is disconnected from the ZooKeeper server because of a network fault, it cannot receive the change notifications that ZooKeeper sends for the znode during the disconnection. Therefore, if you use ZooKeeper watches, you must pay attention to the connected state of the client and, after a reconnection, check the watched znode again, so that data changes on it are not missed.

 

Invalid ZooKeeper cluster node list

When interacting with a ZooKeeper cluster, a client generally holds a list of the cluster's nodes, or a subset of that list. Two situations can occur. In the first, the nodes in the list (or subset) held by the client are all active and can provide coordination services, and the client can access the cluster without any problem. In the second, some nodes in the list have left the cluster because of a fault; if the client tries to connect to one of these invalid nodes, it cannot obtain service. Therefore, when an application uses a ZooKeeper cluster, it must take this into account: either skip invalid nodes and carry on with a valid one, or check the ZooKeeper cluster and restore it to a normal state.
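
Skipping invalid nodes amounts to walking the server list until one responds. The sketch below assumes the usual comma-separated host:port connection string; the is_reachable argument is a hypothetical probe supplied by the caller (a real application might attempt a TCP connection there), and both function names are invented for illustration.

```python
def parse_connect_string(connect_string):
    """Split 'host1:2181,host2:2181' into (host, port) pairs."""
    servers = []
    for entry in connect_string.split(","):
        host, _, port = entry.strip().rpartition(":")
        servers.append((host, int(port)))
    return servers


def first_reachable(servers, is_reachable):
    """Return the first server the probe accepts, skipping dead ones."""
    for server in servers:
        if is_reachable(server):
            return server
    raise ConnectionError("no ZooKeeper server in the list is reachable")
```

If every entry fails the probe, the function raises instead of returning, which corresponds to the case above where the cluster itself has to be checked and repaired.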



Improper Java heap settings

If the Java heap size is set improperly, ZooKeeper runs short of memory and data is swapped between memory and the file system. The performance of ZooKeeper then drops sharply, which may in turn affect the applications that depend on it. To avoid swapping, set a Java heap that is large enough for ZooKeeper while still leaving room in physical memory for the operating system and cache, so that no data has to be exchanged between memory and the file system, or at least limit swapping to a certain extent.

 

Performance of the transaction log storage device

ZooKeeper synchronizes transactions to a storage device. If that device is not dedicated but shares a disk with other I/O-intensive applications, ZooKeeper's efficiency suffers, because whenever a client requests a change to znode data, ZooKeeper writes the transaction log to the storage device before it responds. If the storage device is dedicated, the performance of the whole service, and of the applications that use it, improves considerably.



Storing large amounts of data in znodes causes performance problems

ZooKeeper is designed to store only small amounts of coordination data. If large amounts of data are stored, ZooKeeper has to write a large transaction to the storage device each time a node changes and also replicate and propagate it within the cluster, which leads to unavoidable latency and performance problems. Therefore, if large amounts of data are involved, store the data itself on other devices and keep only a simple mapping in ZooKeeper, such as a pointer or reference.
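
The "pointer or reference" idea can be sketched as follows. Everything here is hypothetical: znodes and blob_store are plain dictionaries standing in for ZooKeeper and an external store, MAX_INLINE_BYTES is an illustrative threshold rather than a ZooKeeper setting, and the one-byte tag is just one possible way to tell inline data from references.

```python
import hashlib

MAX_INLINE_BYTES = 1024   # illustrative threshold for this sketch only


def store_payload(znodes, blob_store, path, payload):
    """Keep small payloads in the znode; park large ones elsewhere and store a reference."""
    if len(payload) <= MAX_INLINE_BYTES:
        znodes[path] = b"I" + payload                  # tag "I": data stored inline
        return "inline"
    key = hashlib.sha256(payload).hexdigest()
    blob_store[key] = payload                          # e.g. a file system or object store
    znodes[path] = b"R" + key.encode("ascii")          # tag "R": znode holds only a reference
    return "reference"


def load_payload(znodes, blob_store, path):
    """Resolve a znode value back to the full payload."""
    value = znodes[path]
    tag, body = value[:1], value[1:]
    if tag == b"R":
        return blob_store[body.decode("ascii")]
    return body
```

With this layout a 5000-byte payload costs only 65 bytes in the znode, so changes to it are cheap to log and replicate, at the price of one extra lookup in the external store when reading.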

Article transferred from: http://www.aboutyun.com/thread-7731-1-1.html
