In ZooKeeper, the total number of nodes is in theory limited only by memory, but the number of child nodes under a single node is limited by data size (the list of child znodes has to fit in a single response).
ZooKeeper's watch mechanism lets the server actively notify clients when data changes. A watch can be attached to any node, so an application with 10 million nodes may have 10 million watches (or even more) in ZooKeeper. Every time ZooKeeper completes a write to a node, it checks whether that node has a corresponding watch and, if so, notifies the watcher. See also: "Zookeeper-watcher mechanism and asynchronous calling principle".
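To make the check-and-notify step concrete, here is a minimal in-memory sketch in Python. It is a toy model, not ZooKeeper code: the class name ToyZNodeStore and its methods are hypothetical, and it only imitates the documented behavior that a data watch is a one-time trigger fired after a write.

```python
from collections import defaultdict


class ToyZNodeStore:
    """In-memory toy model of data nodes with one-shot data watches."""

    def __init__(self):
        self.data = {}                     # path -> bytes
        self.watches = defaultdict(list)   # path -> list of callbacks

    def get(self, path, watch=None):
        """Read a node; optionally register a one-shot watch on it."""
        if watch is not None:
            self.watches[path].append(watch)
        return self.data.get(path)

    def set(self, path, value):
        """Write a node, then notify and clear any watches set on it."""
        self.data[path] = value
        # pop() removes the watches, so each one fires at most once.
        for callback in self.watches.pop(path, []):
            callback(path)


store = ToyZNodeStore()
fired = []
store.set("/app/config", b"v1")
store.get("/app/config", watch=fired.append)
store.set("/app/config", b"v2")   # the write finds a watch and notifies it
store.set("/app/config", b"v3")   # watch already consumed, nothing fires
print(fired)                      # ['/app/config']
```

In this model the cost of a write grows with the number of watches on the node being written, not with the total number of nodes or watches in the store, which is the question the tests below examine on a real cluster.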
This article focuses on the following:
- Whether the performance of zookeeper is affected by the number of nodes
- Whether the performance of zookeeper is affected by the number of watches
Test Method
One ZooKeeper instance is deployed on each of three machines. The version is 3.4.3. Machine configuration:
Intel(R) Xeon(R) CPU E5-2430 0 @ 2.20GHz
16G
java version "1.6.0_32"
Java(TM) SE Runtime Environment (build 1.6.0_32-b05)
OpenJDK (Taobao) 64-Bit Server VM (build 20.0-b12-internal, mixed mode)
Most experiments use the default JVM heap size, which is 1/4 of RAM:
java -XX:+PrintFlagsFinal -version | grep HeapSize
The test client is zk-smoketest; the watch test is one I wrote myself. On top of zk-smoketest I wrote some scripts that run the tests automatically and extract the results. The scripts can be found here: https://github.com/kevinlynx/zk-benchmark
Test Results
Impact of the number of nodes on read/write Performance
The test runs up to a maximum number of nodes and measures the number of operations completed per second (OPS):
As can be seen, increasing the number of nodes does not affect the read/write performance of ZooKeeper.
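As a rough illustration of how an operations-per-second figure like the one above can be computed, here is a small Python sketch. The helper measure_ops and the fake_write stand-in are hypothetical names introduced for this example; a real run would call a ZooKeeper client inside the operation instead of writing to a dict.

```python
import time


def measure_ops(operation, count):
    """Run operation(i) for i in range(count) and return operations per second."""
    start = time.perf_counter()
    for i in range(count):
        operation(i)
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


# Stand-in for a real ZooKeeper write (assumption: replace with a client call).
fake_store = {}


def fake_write(i):
    fake_store["/bench/node-%d" % i] = b"x" * 100


ops = measure_ops(fake_write, 10000)
print("%.0f ops/sec" % ops)
```

Repeating the measurement at several node counts, with everything else held fixed, gives one point per node count to plot.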
Influence of node data size on read/write Performance
This conclusion can already be found online: the larger the data stored in a single node, the greater the impact on network throughput, so read/write performance is expected to drop as the data grows.
Written data has to be synchronized across the ZooKeeper cluster, so writes are slower overall than reads. This experiment required raising the timeout somewhat, and I also raised the maximum JVM heap size to 8 GB.
The test results are clear: node data size seriously affects the efficiency of ZooKeeper.
Impact of watch on read/write Performance
The latency test that ships with zk-smoketest has a parameter, --watch_multiple, which is meant to specify the number of watches but in fact only specifies the number of clients. Querying the server shows that each node still has only one watch:
echo wchp | nc 127.0.0.1 4181
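The same check can be scripted. The Python sketch below sends a four-letter command over a plain socket and counts the watching sessions per path. The function names are hypothetical, the session ids in the sample are made up, and the parser assumes the wchp reply lists each path on an unindented line followed by one indented session id per line.

```python
import socket


def four_letter_word(host, port, command, timeout=5.0):
    """Send a four-letter command to a ZooKeeper server and return its reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:          # server closes the connection when done
                break
            chunks.append(chunk)
    return b"".join(chunks).decode("utf-8", "replace")


def watches_per_path(wchp_output):
    """Count watching sessions per path (assumed layout: path, then indented ids)."""
    counts = {}
    current = None
    for line in wchp_output.splitlines():
        if not line.strip():
            continue
        if line[0] in " \t":
            if current is not None:
                counts[current] += 1
        else:
            current = line.strip()
            counts[current] = 0
    return counts


# Made-up sample reply, used here so the parser can run without a server.
sample = (
    "/bench/node-0\n"
    "\t0x1486fe9c8a40000\n"
    "/bench/node-1\n"
    "\t0x1486fe9c8a40000\n"
    "\t0x1486fe9c8a40001\n"
)
print(watches_per_path(sample))
```

Against a live server the call would look like watches_per_path(four_letter_word("127.0.0.1", 4181, "wchp")).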
In the test I wrote, multiple watches on a single node are simulated by creating multiple clients, which is closer to how real applications behave. The writes to the nodes are also issued from a separate, independent client, so that the ZooKeeper client implementation does not interfere with the test.
Each complete test run first adds a data watch to every node, then rewrites the data of those nodes from another client and records the time the rewrites take, in order to determine how the added watches affect the write operations.
In the figure, 0 watch means no watch is added to the nodes; 1 watch means one client watches each node; 3 watch means three clients each watch every node, and so on.
It can be seen that watches have a considerable impact on write operations; after all, each notification requires network transmission. The results also show that the total number of watches across ZooKeeper, like the total number of nodes, does not affect overall performance.
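The two-step procedure of the watch test (add watches from N clients, then time rewrites from a separate client) can be sketched in Python as follows. This is not the script from the repository above. The function names are hypothetical, and the clients are only assumed to offer get(path, watch=callback) and set(path, value) methods; FakeClient is an in-memory stand-in so the sketch runs without a cluster.

```python
import time


def add_watches(watch_clients, paths, on_change):
    """Step 1: every watching client sets a data watch on every node."""
    for client in watch_clients:
        for path in paths:
            client.get(path, watch=on_change)


def time_rewrites(writer, paths, payload):
    """Step 2: rewrite each node from a separate client; return seconds taken."""
    start = time.perf_counter()
    for path in paths:
        writer.set(path, payload)
    return time.perf_counter() - start


class FakeClient:
    """In-memory stand-in sharing one watch table; not a real ZooKeeper client."""

    def __init__(self, shared_watches):
        self.shared_watches = shared_watches   # path -> list of callbacks

    def get(self, path, watch=None):
        if watch is not None:
            self.shared_watches.setdefault(path, []).append(watch)

    def set(self, path, value):
        # Fire and clear the one-shot watches registered on this path.
        for callback in self.shared_watches.pop(path, []):
            callback(path)


shared = {}
paths = ["/bench/node-%d" % i for i in range(1000)]
notified = []
for n_watchers in (0, 1, 3):
    watchers = [FakeClient(shared) for _ in range(n_watchers)]
    add_watches(watchers, paths, notified.append)
    seconds = time_rewrites(FakeClient(shared), paths, b"new-data")
    print("%d watch: %.6f s" % (n_watchers, seconds))
```

With a real client library whose get and set methods have this shape (kazoo's KazooClient is one example), separate client instances would take the place of FakeClient, and the timings would then include the network cost of delivering the notifications.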
Overall conclusion
- Operations on a single node are not affected by the total number of nodes in zookeeper.
- Node data size has a great impact on ZooKeeper, in both performance and memory use.
- The number of watches of independent sessions on a single node has a certain impact on performance.
Address: http://codemacro.com/2014/09/21/zk-watch-benchmark/
Written by Kevin Lynx, posted at http://codemacro.com
Zookeeper node count and watch performance test