Hadoop learning notes (2): using and analyzing ZooKeeper

Source: Internet
Author: User
Tags zookeeper client

A distributed architecture is often a centralized design: a master machine coordinates multiple processing nodes, so keeping the master highly available is critical. Distributed locks are a good solution to this problem. ZooKeeper is a distributed lock management system that also maintains metadata with high reliability.

 

I. Application

1. Cluster Mode

In cluster mode, several ZooKeeper nodes are configured and started together as a ZooKeeper cluster. Based on the configuration, the nodes vote to elect one of them as the leader, which is the node that obtains the distributed lock.

Key configuration example:

# The cluster servers
server.1=192.168.1.10:2887:3887
server.2=192.168.1.11:2888:3888
server.3=192.168.1.12:2889:3889

The ZooKeeper nodes configured above vote among themselves until a leader is elected; the remaining nodes become followers.

2. Standalone Mode

If no cluster servers are configured, ZooKeeper starts in standalone mode and the current node automatically acts as the master node.

 

Using ZooKeeper:

First obtain a ZooKeeper client. ZooKeeper provides client interfaces for both C and Java, and the main framework is implemented in Java. With the Java client, the calling program specifies the ZooKeeper connection address and port. On instantiation, the client automatically creates a session and connects to the ZooKeeper cluster. The code is as follows:

// Arguments: connect string "ip:port", session timeout in milliseconds, default watcher (none here)
ZooKeeper zookeeper = new ZooKeeper("ip:port", timeout, null);

 

A client obtained this way can then perform ZooKeeper operations. For example, check whether a node exists; if it does not, create it with a value, otherwise overwrite its value:

// values is the data to store, already encoded as a byte array
Stat stat = zookeeper.exists(path, false);
byte[] bytes = values;
if (stat == null) {
    // The node does not exist yet: create it as a persistent node with an open ACL
    zookeeper.create(path, bytes, Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
} else {
    // The node exists: overwrite its data; version -1 matches any version
    zookeeper.setData(path, bytes, -1);
}

Delete a node:

zookeeper.delete(path, version);

Update the data of a node:

byte[] bytes = values;
zookeeper.setData(path, bytes, -1);

 

II. Voting algorithm

Consider the following cluster-mode configuration:

# The cluster servers
server.1=192.168.1.10:2887:3887
server.2=192.168.1.11:2888:3888
server.3=192.168.1.12:2889:3889

Each ZooKeeper node has a unique ID. The configuration file can specify which voting algorithm is used. When a node sends a vote to the other nodes, the vote carries both the recommended ID and a transaction sequence number (zxid). The transaction sequence has the highest priority: votes are compared on it first, and the largest one wins. If the transaction sequences are equal, the vote with the largest ID wins, and that node is taken as the leader. The other nodes become followers. Data operations go through the leader, and the followers synchronize their data from the leader.

Key code:

// A new vote wins if its zxid is larger, or if the zxids are equal and its ID is larger
if ((newZxid > curZxid) || ((newZxid == curZxid) && (newId > curId)))
    return true;
else
    return false;
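The comparison above can be wrapped in a small self-contained method for experimentation. This is only a sketch of the rule as described in these notes; the class name VoteComparison and the method name supersedes are made up for illustration and are not ZooKeeper identifiers.

```java
final class VoteComparison {
    private VoteComparison() {}

    /**
     * Returns true when the incoming vote (newId, newZxid) should replace
     * the current vote (curId, curZxid): the larger zxid wins, and on equal
     * zxids the larger server ID wins.
     */
    static boolean supersedes(long newId, long newZxid, long curId, long curZxid) {
        return (newZxid > curZxid) || (newZxid == curZxid && newId > curId);
    }

    public static void main(String[] args) {
        System.out.println(supersedes(1, 0x200, 3, 0x100)); // true: larger zxid wins
        System.out.println(supersedes(1, 0x100, 3, 0x100)); // false: same zxid, smaller ID
    }
}
```

Note that a vote never supersedes an identical vote, so repeated delivery of the same vote does not change a node's choice.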

 

At startup:

1. Each node sends a vote to every node in the cluster, including itself, recommending itself as the master.

2. Each node saves the votes received from the other nodes locally, adopts the largest recommended ID among them as its own vote, and counts how many votes that ID has received. If the ID has more than half of the votes, voting ends, that ID becomes the master, and the other nodes are notified. Otherwise, the node sends its updated recommendation to every node and repeats step 2.
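The two steps above can be simulated in memory to see why voting converges. The sketch below is a simplified model, not ZooKeeper's election code: it assumes at least one node, that every node receives every vote, that IDs are unique and positive, and it ranks recommendations by zxid first and ID second as described in the previous section. The names ElectionTally, majorityWinner, and elect are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

final class ElectionTally {
    private ElectionTally() {}

    /** Returns the ID that holds more than half of clusterSize votes, or -1 if none does. */
    static long majorityWinner(long[] votedIds, int clusterSize) {
        Map<Long, Integer> counts = new HashMap<>();
        for (long id : votedIds) {
            int count = counts.merge(id, 1, Integer::sum);
            if (count > clusterSize / 2) {
                return id;
            }
        }
        return -1;
    }

    /** Simulates the voting rounds for nodes with IDs ids[i] and last zxids zxids[i]. */
    static long elect(long[] ids, long[] zxids) {
        int n = ids.length;
        int[] choice = new int[n];      // choice[i]: index of the node that node i votes for
        for (int i = 0; i < n; i++) {
            choice[i] = i;              // step 1: every node recommends itself
        }
        while (true) {
            long[] voted = new long[n];
            for (int i = 0; i < n; i++) {
                voted[i] = ids[choice[i]];
            }
            long winner = majorityWinner(voted, n);
            if (winner != -1) {
                return winner;          // more than half agree: voting ends
            }
            // Step 2: all votes reach all nodes, so every node adopts the same best one.
            int best = choice[0];
            for (int i = 1; i < n; i++) {
                int c = choice[i];
                if (zxids[c] > zxids[best] || (zxids[c] == zxids[best] && ids[c] > ids[best])) {
                    best = c;
                }
            }
            Arrays.fill(choice, best);
        }
    }

    public static void main(String[] args) {
        // Nodes 1 and 2 share the largest zxid, so the larger ID is chosen.
        System.out.println(elect(new long[] {1, 2, 3}, new long[] {0x100, 0x100, 0xff}));
    }
}
```

Because every node sees every vote in this model, agreement is reached after a single exchange. In a real cluster, lost or delayed messages are what make repeated rounds necessary.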

III. Metadata

1. Metadata Data Model

2. Operation

In ZooKeeper, metadata is organized as a tree, and a node is addressed by a path of the form /A/B/C. The client provides the operations create, exists, delete, getData, setData, and getChildren.

create: creates a node.

exists: checks whether a node with the specified name exists.

delete: deletes a node.

getData: gets the value of a node, returned as a byte array.

setData: assigns a value to a node. The parameter is a byte array.

getChildren: gets the child nodes under the specified node.
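To make the tree model and the six operations concrete, here is a toy in-memory version. It is an illustrative sketch only: the class ToyZnodeTree is hypothetical, keeps everything in one sorted map, and has none of ZooKeeper's sessions, versions, ACLs, or watches. It mirrors two rules of the real service: a node can only be created under an existing parent, and a node that still has children cannot be deleted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ToyZnodeTree {
    // Full path -> data; the sorted map keeps children in name order.
    private final Map<String, byte[]> nodes = new TreeMap<>();

    ToyZnodeTree() {
        nodes.put("/", new byte[0]); // the root always exists
    }

    private static String parentOf(String path) {
        int i = path.lastIndexOf('/');
        return i == 0 ? "/" : path.substring(0, i);
    }

    void create(String path, byte[] data) {
        if (nodes.containsKey(path)) {
            throw new IllegalStateException("node exists: " + path);
        }
        if (!nodes.containsKey(parentOf(path))) {
            throw new IllegalStateException("parent missing for: " + path);
        }
        nodes.put(path, data.clone());
    }

    boolean exists(String path) {
        return nodes.containsKey(path);
    }

    void delete(String path) {
        if (!getChildren(path).isEmpty()) {
            throw new IllegalStateException("node has children: " + path);
        }
        nodes.remove(path);
    }

    byte[] getData(String path) {
        byte[] data = nodes.get(path);
        if (data == null) {
            throw new IllegalStateException("no such node: " + path);
        }
        return data.clone();
    }

    void setData(String path, byte[] data) {
        if (!nodes.containsKey(path)) {
            throw new IllegalStateException("no such node: " + path);
        }
        nodes.put(path, data.clone());
    }

    /** Returns the names (not full paths) of the direct children of path. */
    List<String> getChildren(String path) {
        String prefix = path.equals("/") ? "/" : path + "/";
        List<String> children = new ArrayList<>();
        for (String p : nodes.keySet()) {
            boolean below = p.startsWith(prefix) && p.length() > prefix.length();
            if (below && p.indexOf('/', prefix.length()) < 0) {
                children.add(p.substring(prefix.length()));
            }
        }
        return children;
    }
}
```

For example, after create("/A", ...) and create("/A/B", ...), getChildren("/A") returns a list containing "B", and delete("/A") is rejected until /A/B has been deleted.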

 

3. Metadata Storage

In terms of overall architecture, ZooKeeper can be seen as a distributed file system for storing metadata. However, each ZooKeeper node stores an independent, complete copy of all the data in the system; the follower nodes hold backups of the data on the leader node.

 

Once the leader and followers are determined, data operations are applied on the leader and then backed up to the followers. If the leader fails, a new leader is automatically elected from the followers, and because the followers' data is already up to date, availability is preserved as far as possible.

 

4. Communication Protocols

During voting, the ZooKeeper nodes need to communicate with each other. A node does not need to confirm that each vote was received by the other party, so to keep voting fast the votes are sent over UDP. For example:

// Receive buffer for one vote message, and a ByteBuffer view for decoding it
byte[] responseBytes = new byte[48];
ByteBuffer responseBuffer = ByteBuffer.wrap(responseBytes);
// Datagram that incoming data is read into, and the UDP socket bound to the election port
DatagramPacket responsePacket = new DatagramPacket(responseBytes, responseBytes.length);
mySocket = new DatagramSocket(port);
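As a rough illustration of sending a vote as a single datagram, the sketch below packs a server ID and a zxid into a byte array and passes it between two sockets on the loopback interface. The 16-byte layout (8-byte ID followed by 8-byte zxid) is an assumption made for this example and is not ZooKeeper's wire format; the names UdpVoteSketch, encode, decode, and roundTrip are hypothetical.

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.ByteBuffer;

final class UdpVoteSketch {
    // Assumed layout for this example only: 8-byte server ID followed by 8-byte zxid.
    static final int VOTE_BYTES = 16;

    private UdpVoteSketch() {}

    static byte[] encode(long serverId, long zxid) {
        return ByteBuffer.allocate(VOTE_BYTES).putLong(serverId).putLong(zxid).array();
    }

    /** Returns {serverId, zxid} read from the first 16 bytes of data. */
    static long[] decode(byte[] data) {
        ByteBuffer buffer = ByteBuffer.wrap(data);
        return new long[] {buffer.getLong(), buffer.getLong()};
    }

    /** Sends one vote over loopback UDP and returns what the receiver decoded. */
    static long[] roundTrip(long serverId, long zxid) throws Exception {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (DatagramSocket receiver = new DatagramSocket(0, loopback); // port 0: any free port
             DatagramSocket sender = new DatagramSocket()) {
            receiver.setSoTimeout(2000); // UDP gives no delivery guarantee, so do not wait forever
            byte[] out = encode(serverId, zxid);
            sender.send(new DatagramPacket(out, out.length, loopback, receiver.getLocalPort()));

            byte[] in = new byte[VOTE_BYTES];
            receiver.receive(new DatagramPacket(in, in.length));
            return decode(in);
        }
    }

    public static void main(String[] args) throws Exception {
        long[] vote = roundTrip(3, 0x100);
        System.out.println("id=" + vote[0] + " zxid=" + vote[1]);
    }
}
```

The receive timeout reflects the trade-off described above: the sender never learns whether a datagram arrived, so a receiver has to be prepared to give up and wait for the next round.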

 

5. Maintaining ZooKeeper node configuration information with JMX

To simplify node management, ZooKeeper uses JMX. For example, a node is registered in MBeanRegistry; to be registered, the node must implement the ZKMBeanInfo interface.

When node information needs to be read or modified, it can be obtained directly from the local JMX.
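The register-then-query pattern can be tried with the plain JDK JMX API. The sketch below does not use ZooKeeper's MBeanRegistry or ZKMBeanInfo; it registers a standard MBean on the platform MBean server as a stand-in. The class names, the attributes Role and LastZxid, and the object name domain example.zk are all invented for illustration.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

final class JmxNodeSketch {
    private JmxNodeSketch() {}

    /** Management interface; the standard MBean convention requires the "MBean" suffix. */
    public interface NodeStatusMBean {
        String getRole();
        long getLastZxid();
    }

    /** The managed object exposing one node's state as read-only attributes. */
    public static final class NodeStatus implements NodeStatusMBean {
        private final String role;
        private final long lastZxid;

        public NodeStatus(String role, long lastZxid) {
            this.role = role;
            this.lastZxid = lastZxid;
        }

        @Override
        public String getRole() {
            return role;
        }

        @Override
        public long getLastZxid() {
            return lastZxid;
        }
    }

    /** Registers the status object under a name derived from the server ID. */
    static ObjectName register(int serverId, NodeStatus status) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("example.zk:type=NodeStatus,id=" + serverId);
        server.registerMBean(status, name);
        return name;
    }

    public static void main(String[] args) throws Exception {
        ObjectName name = register(1, new NodeStatus("leader", 0x100));
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        // Read the attribute back through JMX rather than through the object itself.
        System.out.println(server.getAttribute(name, "Role"));
        server.unregisterMBean(name);
    }
}
```

Once registered, the same attributes are visible to any local JMX client such as jconsole, which is what makes this approach convenient for inspecting a running node.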

 

 

 

This post will be updated over time.
