ZooKeeper: the name literally means "zoo keeper". It is a service framework for coordinating distributed applications. It got its name because its authors saw the job of coordinating distributed systems as being like that of a zoo keeper.
- Design Objectives
- Simple: ZooKeeper lets distributed processes coordinate with each other through a shared hierarchical namespace that is organized like a file system. The namespace is made up of data registers called znodes, which play the role of files and directories. Unlike a typical file system, however, ZooKeeper keeps its data in memory.
- Replicated: like the distributed processes it coordinates, ZooKeeper itself is replicated across a set of servers. The servers must all know about each other, and they maintain the same in-memory image of the state by applying the same transactions. As long as a majority of the servers are running, the ZooKeeper service is available. A client connects to a single server over TCP. If that server fails, the client automatically connects to another one.
- Ordered: ZooKeeper stamps each update with a number that reflects the order of all ZooKeeper transactions. Later operations can use this ordering to build higher-level abstractions, such as synchronization primitives.
- Fast: ZooKeeper is especially fast for read-dominated workloads, because reads can be served by many machines at the same time. It performs best when reads outnumber writes by about 10 to 1.
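The availability rule above (the service keeps working while most servers are running) comes down to simple majority arithmetic. The sketch below is only an illustration of that arithmetic; the function names `quorum_size` and `tolerated_failures` are made up for this example and are not part of any ZooKeeper API.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    if ensemble_size < 1:
        raise ValueError("ensemble must contain at least one server")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers may fail while a majority is still running."""
    return ensemble_size - quorum_size(ensemble_size)


# Print ensemble size, majority size, and tolerated failures.
for n in (3, 4, 5, 7):
    print(n, quorum_size(n), tolerated_failures(n))
```

Note that under this rule four servers tolerate no more failures than three do, which is one reason odd ensemble sizes are the usual choice.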
- Data Model and multi-level namespace
ZooKeeper multi-level namespace: the ZooKeeper namespace is much like that of a standard file system. A name is a sequence of path elements separated by slashes (/), and every node is identified by a unique path.
- Node
- Unlike in a standard file system, every node in ZooKeeper can have data associated with it as well as child nodes. ZooKeeper is designed to store coordination data, such as status information, configuration, and location information.
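To make the data model concrete, here is a minimal sketch of a tree in which every node holds both data and children. It is a toy model, not ZooKeeper code: the `ZNode` class, the `lookup` helper, and the paths `/app1` and `/app1/p_1` are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ZNode:
    data: bytes = b""
    children: dict = field(default_factory=dict)  # child name -> ZNode


def lookup(root: ZNode, path: str) -> ZNode:
    """Walk a slash-separated absolute path starting at the root."""
    node = root
    for name in path.strip("/").split("/"):
        if name:  # an empty name means the path was just "/"
            node = node.children[name]
    return node


root = ZNode()
root.children["app1"] = ZNode(data=b"status=ok")
root.children["app1"].children["p_1"] = ZNode(data=b"host-a:2181")

# "/app1" behaves like a file (it has data) and like a directory (it has children).
print(lookup(root, "/app1").data)
print(sorted(lookup(root, "/app1").children))
```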
- Conditional update and observation
ZooKeeper supports the concept of watches. A client can set a watch on a node. When the node changes, the watch is triggered and then removed. When a watch is triggered, the client receives a packet saying that the node has changed. If the connection between the client and the server is broken, the client receives a local notification.
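The one-shot behaviour (trigger once, then remove) is the part that is easiest to get wrong, so here is a small sketch of it. This is a toy model under stated assumptions: `WatchedNode` and its methods are invented for the example and only loosely mirror how a real client registers a watch while reading.

```python
class WatchedNode:
    """Toy node whose watches fire once and are then removed."""

    def __init__(self, data: bytes = b""):
        self.data = data
        self._watchers = []

    def get_data(self, watcher=None) -> bytes:
        # Reading with a watcher registers a one-time watch.
        if watcher is not None:
            self._watchers.append(watcher)
        return self.data

    def set_data(self, data: bytes) -> None:
        self.data = data
        # Detach the watches before firing so each one triggers only once.
        fired, self._watchers = self._watchers, []
        for watcher in fired:
            watcher("node changed")


events = []
node = WatchedNode(b"v1")
node.get_data(watcher=events.append)
node.set_data(b"v2")  # fires the watch
node.set_data(b"v3")  # no watch is left, so nothing fires
print(events)
```

A client that wants to keep observing a node therefore has to set a new watch after each notification.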
- Data Guarantee Mechanism
- ZooKeeper is very fast and very simple. Because its goal is to serve as a basis for building more complex services, such as synchronization, it provides a set of guarantees:
- Sequential consistency: updates from a client are applied in the order in which they were sent.
- Atomicity: an update either succeeds or fails; there are no partial results.
- Single system image: a client sees the same view of the service no matter which server it connects to.
- Reliability: once an update has been applied, it persists until a client overwrites it.
- Timeliness: the client's view of the system is guaranteed to be up to date within a certain time bound.
- Simple API
- One of ZooKeeper's design goals is to provide a very simple programming interface. It supports only the following operations:
- Create: creates a node at a location in the tree
- Delete: deletes a node
- Exists: tests if a node exists at a location
- Get data: reads the data from a node
- Set data: writes data to a node
- Get children: retrieves a list of children of a node
- Sync: waits for data to be propagated
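The operations listed above can be sketched with a small in-memory stand-in. This is not the real client library: `ToyZooKeeper`, its method names, and the `expected_version` parameter are assumptions made for illustration, and sync is left out because the toy has only one copy of the data. The version check shows one way a conditional update can work: the write is rejected if the node changed since it was read.

```python
class ToyZooKeeper:
    """In-memory stand-in for the listed operations (not the real client)."""

    def __init__(self):
        self._nodes = {"/": (b"", 0)}  # path -> (data, version)

    def exists(self, path):
        return path in self._nodes

    def create(self, path, data=b""):
        parent = path.rsplit("/", 1)[0] or "/"
        if path in self._nodes:
            raise KeyError(f"node exists: {path}")
        if parent not in self._nodes:
            raise KeyError(f"no parent: {parent}")
        self._nodes[path] = (data, 0)

    def delete(self, path):
        if self.get_children(path):
            raise ValueError(f"node has children: {path}")
        del self._nodes[path]

    def get_data(self, path):
        return self._nodes[path]  # (data, version)

    def set_data(self, path, data, expected_version=-1):
        _, version = self._nodes[path]
        # -1 means "any version" in this sketch; otherwise versions must match.
        if expected_version not in (-1, version):
            raise ValueError("version mismatch")
        self._nodes[path] = (data, version + 1)

    def get_children(self, path):
        prefix = path.rstrip("/") + "/"
        return sorted(
            p[len(prefix):] for p in self._nodes
            if p != path and p.startswith(prefix) and "/" not in p[len(prefix):]
        )


zk = ToyZooKeeper()
zk.create("/config", b"a=1")
zk.create("/config/db", b"host-a")
data, version = zk.get_data("/config")
zk.set_data("/config", b"a=2", expected_version=version)
print(zk.get_children("/"), zk.get_data("/config"))
```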
The ZooKeeper Components figure shows the high-level components of the ZooKeeper service. With the exception of the request processor, each server that makes up the ZooKeeper service keeps its own replica of every component.
ZooKeeper Components
The replicated database is an in-memory database that holds the entire data tree. Updates are logged to disk so that they can be recovered, and each write is serialized to disk before it is applied to the in-memory database. Every ZooKeeper server serves clients. A client connects to exactly one server and submits its requests there. Read requests are served from that server's local copy of the database. Requests that change the state of the service (write requests) are processed by an agreement protocol.
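The "log to disk first, then update memory" ordering can be sketched as follows. This is a toy write-ahead log, not ZooKeeper's actual on-disk format: the `LoggedStore` class, the JSON line format, and the file name `updates.log` are all assumptions made for the example.

```python
import json
import os
import tempfile


class LoggedStore:
    """Toy store that appends each update to a log file before applying it."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.tree = {}  # path -> data, held in memory
        if os.path.exists(log_path):
            self._replay()

    def _replay(self):
        # Rebuild the in-memory tree by re-applying the log in order.
        with open(self.log_path) as log:
            for line in log:
                entry = json.loads(line)
                self.tree[entry["path"]] = entry["data"]

    def write(self, path, data):
        with open(self.log_path, "a") as log:
            log.write(json.dumps({"path": path, "data": data}) + "\n")
            log.flush()
            os.fsync(log.fileno())  # make the record durable first
        self.tree[path] = data      # only then update memory

    def read(self, path):
        return self.tree[path]      # served from memory, no disk access


log_file = os.path.join(tempfile.mkdtemp(), "updates.log")
store = LoggedStore(log_file)
store.write("/config", "a=1")
store.write("/config", "a=2")

recovered = LoggedStore(log_file)  # simulate a restart
print(recovered.read("/config"))
```

Because the record reaches the disk before memory changes, a server that restarts can replay the log and end up with the same tree it had before.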
The agreement protocol works like this. All write requests from clients are forwarded to a single server, called the leader. The other ZooKeeper servers, called followers, receive proposals from the leader and apply them to stay in sync. The messaging layer makes sure that the followers have applied the updates and that their data is consistent with the leader's.
ZooKeeper uses a custom atomic messaging protocol. Because the messaging layer is atomic, ZooKeeper can guarantee that the local replicas on the followers never diverge. When the leader receives a write request, it calculates what the state of the system will be once the write is applied and turns this into a transaction that captures the new state.
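The idea of turning a write request into a transaction that records the resulting state can be sketched like this. It is a heavily simplified toy: the `Replica` and `Leader` classes and the `increment` request are invented for the example, and the sketch skips everything that makes the real protocol hard (proposals, acknowledgements from a majority, failures, and leader election).

```python
class Replica:
    """Toy server state: applies transactions strictly in id order."""

    def __init__(self):
        self.state = {}
        self.last_applied = 0

    def apply(self, txn):
        txn_id, path, new_value = txn
        if txn_id != self.last_applied + 1:
            raise ValueError("transaction out of order")
        self.state[path] = new_value
        self.last_applied = txn_id


class Leader(Replica):
    def __init__(self, followers):
        super().__init__()
        self.followers = followers

    def increment(self, path):
        # Compute the resulting state up front, so the transaction records
        # the new value rather than the "add one" request itself.
        new_value = self.state.get(path, 0) + 1
        txn = (self.last_applied + 1, path, new_value)
        self.apply(txn)
        for follower in self.followers:
            follower.apply(txn)
        return txn


followers = [Replica(), Replica()]
leader = Leader(followers)
leader.increment("/counter")
leader.increment("/counter")
print(all(f.state == leader.state for f in followers), leader.state)
```

Since every replica applies the same state-carrying transactions in the same order, they all end up with identical data.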
The ZooKeeper interface is deliberately simple, but you can use it to implement higher-level operations, such as synchronization primitives and group membership. Some distributed applications already do this.
ZooKeeper is designed for high performance. The following results come from the development team at Yahoo!. ZooKeeper performs especially well when reads outnumber writes, because write operations have to synchronize state across all ZooKeeper servers.
Zookeeper throughput as the read-write ratio varies
The following shows how the system behaves over time when failures are injected into a ZooKeeper service running on 7 servers. We ran the same saturation benchmark, but kept the share of writes at a constant 30%, which is a conservative ratio.
Reliability in the presence of errors
The figure above leads to a few important observations. First, if followers fail and recover quickly, ZooKeeper can sustain very high throughput despite the failure. Second, and more importantly, the leader election algorithm lets the system recover quickly enough that throughput does not drop substantially. ZooKeeper takes less than 200 ms to elect a new leader. Third, ZooKeeper can process requests again as soon as the failed servers have recovered.
ZooKeeper has been used successfully in many commercial projects. At Yahoo!, it is used for coordination and failure recovery.
Apache zookeeper overview Translation