ZooKeeper Series, Part 1: Getting Started with ZooKeeper

Source: Internet
Author: User
Tags apache solr

What is ZooKeeper?

ZooKeeper, as the name suggests, is the animal keeper: it looks after the elephant (Hadoop), the bee (Hive), and the pig (Pig). Apache HBase, Apache Solr, LinkedIn's Sensei, and other projects have also adopted ZooKeeper. ZooKeeper is an open source coordination service for distributed applications that provides synchronization, configuration maintenance, and naming services. It is often described as being based on the Fast Paxos algorithm, but it actually relies on its own atomic broadcast protocol, Zab, which is similar to Paxos without being identical to it.

How does ZooKeeper work?

ZooKeeper provides higher-level services on which distributed applications can build: synchronization, configuration maintenance, groups, and naming. The design itself is very simple. The data model closely resembles a file system directory tree, or the Windows registry: every node (znode) has a name, sits at a position in the tree, and holds a key/value pair. You can think of it as a tree-structured database, spread across several machines, that is used for name management.
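To make the tree-shaped data model concrete, here is a minimal sketch in Python. It is a toy in-memory model, not the ZooKeeper client API: the class name ZNodeTree and its methods create, get, and children are hypothetical names chosen for illustration.

```python
class ZNodeTree:
    """Toy in-memory model of a znode tree (not the real ZooKeeper API)."""

    def __init__(self):
        # Maps an absolute path to the bytes stored at that znode.
        self._data = {"/": b""}

    @staticmethod
    def _parent(path):
        # "/app/config" -> "/app", "/app" -> "/"
        head = path.rsplit("/", 1)[0]
        return head or "/"

    def create(self, path, data=b""):
        if not path.startswith("/") or path.endswith("/") or "//" in path:
            raise ValueError(f"invalid path: {path!r}")
        if path in self._data:
            raise KeyError(f"node already exists: {path}")
        if self._parent(path) not in self._data:
            raise KeyError(f"parent does not exist: {path}")
        self._data[path] = data

    def get(self, path):
        return self._data[path]

    def children(self, path):
        # Direct children only: strip the prefix and reject deeper paths.
        prefix = "/" if path == "/" else path + "/"
        return sorted(
            p[len(prefix):]
            for p in self._data
            if p != path and p.startswith(prefix) and "/" not in p[len(prefix):]
        )


tree = ZNodeTree()
tree.create("/app")
tree.create("/app/config", b"timeout=30")
print(tree.children("/app"), tree.get("/app/config"))
```

As in a file system, a node can only be created under a parent that already exists, and every node is addressed by its full path.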

ZooKeeper has two parts: the server side and the client. At any moment a client is connected to only one server in the ZooKeeper service. The client maintains a single TCP connection through which it sends requests, receives responses, receives watch events, and sends heartbeats. If this TCP connection is interrupted, the client tries to connect to another ZooKeeper server. When a client first connects to the ZooKeeper service, the server that accepts the connection establishes a session for it. When the client later connects to a different server, the new server re-establishes that session.
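The failover behaviour described above can be sketched as follows. This is a toy model under stated assumptions, not the real client: ToyClient, connect, and the is_up probe are hypothetical names, servers are tried in list order for simplicity, and the session id is a placeholder that the toy simply keeps across reconnects.

```python
class ToyClient:
    """Toy model of a client that holds one server connection at a time."""

    def __init__(self, servers):
        self.servers = list(servers)  # candidate servers, in the order to try
        self.session_id = None        # set on the first successful connect
        self.connected_to = None

    def connect(self, is_up):
        """Connect to the first reachable server.

        is_up(server) is a stand-in for a real TCP connection attempt.
        """
        for server in self.servers:
            if is_up(server):
                if self.session_id is None:
                    # Placeholder value; in a real deployment the server
                    # that accepts the first connection creates the session.
                    self.session_id = 1
                self.connected_to = server
                return server
        self.connected_to = None
        raise ConnectionError("no ZooKeeper server reachable")


up = {"zk1:2181", "zk2:2181", "zk3:2181"}
client = ToyClient(["zk1:2181", "zk2:2181", "zk3:2181"])
client.connect(lambda s: s in up)   # first connect creates the session
up.discard("zk1:2181")              # simulate losing the current server
client.connect(lambda s: s in up)   # reconnects elsewhere, same session id
```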

When a ZooKeeper server cluster starts, the servers elect a leader before doing any work. If the elected leader dies later on, the remaining servers detect that it is gone and elect a new leader. The purpose of the leader is to keep the data consistent in a distributed environment.
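A heavily simplified sketch of the election idea is shown below. It is not ZooKeeper's actual election code, which exchanges messages between servers and also compares election epochs. The sketch assumes that each live server reports the id of the last transaction it has seen (zxid) and its own server id, and that the most up-to-date server wins, with the higher server id breaking ties. Vote and elect_leader are hypothetical names.

```python
from typing import NamedTuple, Optional


class Vote(NamedTuple):
    zxid: int       # id of the last transaction this server has seen
    server_id: int  # unique id of the server


def elect_leader(live_votes, ensemble_size) -> Optional[int]:
    """Return the leader's server_id, or None if no majority is alive."""
    quorum = ensemble_size // 2 + 1
    if len(live_votes) < quorum:
        return None  # too few servers left to elect anyone
    # Tuples compare field by field: highest zxid first, then server_id.
    return max(live_votes).server_id


votes = [Vote(zxid=10, server_id=1), Vote(zxid=12, server_id=2), Vote(zxid=12, server_id=3)]
print(elect_leader(votes, ensemble_size=3))      # all three servers alive
print(elect_leader(votes[:1], ensemble_size=3))  # only one server alive
```

When the leader dies, the same function can be applied to the votes of the surviving servers, which mirrors the re-election described above.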

In addition, ZooKeeper supports the concept of a watch. A client can set a watch on any znode. When a watched znode changes on the server, the watch is triggered and the client that set it receives a notification that the node has changed. If a client is disconnected from its ZooKeeper server, other clients can also be notified, for example when they watch a znode that is tied to that client's session. One ZooKeeper server can serve many clients, and of course several ZooKeeper servers can serve many clients. You can also run a command (for example, zkServer.sh status) to see which ZooKeeper server is the leader and which are followers.

In my experiments, a ZooKeeper cluster should have at least 3 nodes. With only 2 nodes, whichever machine goes down, a single server is left, and one server out of two is not a majority. Clients that connect to it cannot do any work, and the remaining server keeps throwing exceptions. The client also throws exceptions saying that the connection was refused, and it keeps waiting to open a new socket connection, in this case to the ZooKeeper follower.
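The arithmetic behind this observation is short enough to write down. The sketch below assumes the usual majority rule, under which a cluster of n servers stays available only while more than half of them are running. The function names are hypothetical.

```python
def quorum_size(n):
    """Smallest number of servers that forms a majority of n."""
    return n // 2 + 1


def tolerated_failures(n):
    """How many servers can fail while a majority is still running."""
    return n - quorum_size(n)


for n in range(1, 6):
    print(f"servers={n} quorum={quorum_size(n)} tolerated_failures={tolerated_failures(n)}")
# A 2-node cluster needs both nodes (tolerates 0 failures);
# a 3-node cluster needs 2 nodes (tolerates 1 failure).
```

This also shows why odd cluster sizes are the common choice: going from 3 to 4 servers raises the quorum to 3 without letting the cluster survive any additional failure.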

I remember that around 2006 Google introduced Chubby to solve the distributed consensus problem. All servers in a cluster use Chubby to elect a master server, and that master then coordinates the work. In short, the principle is this: in a distributed system, a group of servers running the same program must agree on a value, namely which server's information is authoritative. Once a server has been chosen by a majority of N/2+1 votes, all processes on all machines are notified that this server is the master, and the information it provides is taken as authoritative. I would very much like to know how Chubby works internally, but Google has not open sourced it.

Three years later, in 2009, the long-silent Yahoo released a similar product, ZooKeeper, through Apache, with some improvements over Google's original Chubby design. ZooKeeper does not strictly follow the Paxos protocol. Instead it uses a protocol of its own design and optimization that works like a two-phase commit.

Like Chubby, ZooKeeper is used to store coordination information, generally less than 1 MB, in a hierarchical tree whose nodes hold the key/value data. When an event changes node data (for example an update, a creation, or a deletion), ZooKeeper calls the triggerWatch method and checks whether a watcher is registered for the current path. If there is one, ZooKeeper calls the watcher's process method, which executes the business logic it contains.
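The dispatch step described above can be sketched like this. It is a toy model, not ZooKeeper's server code: WatchManager, add_watch, trigger_watch, and RecordingWatcher are illustrative Python names that only mirror the triggerWatch and process methods mentioned in the text. The sketch also assumes that watches are one-shot, so a watcher is removed once it has been notified.

```python
class WatchManager:
    """Toy registry that maps znode paths to the watchers set on them."""

    def __init__(self):
        self._watchers = {}  # path -> list of watcher objects

    def add_watch(self, path, watcher):
        self._watchers.setdefault(path, []).append(watcher)

    def trigger_watch(self, path, event_type):
        """Notify and remove every watcher registered for path."""
        # pop() removes the entry first, which makes each watch one-shot.
        watchers = self._watchers.pop(path, [])
        for watcher in watchers:
            watcher.process(event_type, path)
        return len(watchers)


class RecordingWatcher:
    """Example watcher whose 'business logic' just records the event."""

    def __init__(self):
        self.events = []

    def process(self, event_type, path):
        self.events.append((event_type, path))


manager = WatchManager()
watcher = RecordingWatcher()
manager.add_watch("/app/config", watcher)
manager.trigger_watch("/app/config", "NodeDataChanged")  # watcher.process runs
manager.trigger_watch("/app/config", "NodeDataChanged")  # no watcher left
print(watcher.events)
```

A client that wants to keep observing a node in this model has to register its watcher again after each notification.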
