Article from: https://examples.javacodegeeks.com/enterprise-java/apache-hadoop/apache-hadoop-zookeeper-example/
Note: this article was translated with Google Translate; reading the original first is recommended.
In this example, we'll explore Apache ZooKeeper, starting with an introduction and then walking through the steps to set up ZooKeeper and get it running.
1. Introduction
Apache ZooKeeper is a building block for distributed systems. When designing a distributed system, there is always a need to develop and deploy something that can coordinate work across the cluster. This is where ZooKeeper comes into the picture. It is an open source project maintained by Apache for maintaining and coordinating distributed clusters. Some of the services provided by ZooKeeper include:
Naming service: A naming service maps a name to some data, which can then be accessed using that name. For example, a DNS server maps a host name to the IP address of a server, and the client can then use that name to reach the server. In distributed systems, we may need to check the status of a server or node using the name assigned to it. This can be done with the naming service interface that ZooKeeper provides by default.
Configuration management: ZooKeeper also provides the option to manage the configuration of a distributed system centrally. Configurations can be stored centrally on ZooKeeper, and any new node that joins the distributed system can pick up its configuration from ZooKeeper. This makes managing configurations easy and effortless.
Leader election: Distributed systems usually need an automatic failover strategy in case some nodes fail. ZooKeeper provides an option to do this through its leader election functionality.
Locking: Every distributed system has some shared resources, and multiple services may need access to them. To allow serialized access to such a resource, a locking mechanism is required. ZooKeeper provides this functionality.
Synchronization: Access to shared resources also needs to be synchronized in a distributed setup. ZooKeeper provides a simple interface for this as well.
2. How ZooKeeper Works
ZooKeeper follows the client-server model. The clients are the machines in the cluster, also called nodes, and they use the services provided by the servers. ZooKeeper coordinates a distributed system, but it is also a distributed system itself. A set of ZooKeeper servers running in distributed mode is called a ZooKeeper ensemble.
At any given time, a client can be connected to only one ZooKeeper server, but each ZooKeeper server can handle multiple clients at the same time. The client periodically sends a ping (heartbeat) to the server to let it know that the client is alive and connected. The server responds with an acknowledgement indicating that it, too, is alive and connected. The frequency of these pings/heartbeats can be set in the configuration file, which we will see in the next section.
If the client does not receive an acknowledgement from the server it is connected to within the specified time period, it tries to connect to another server from the pool, and on a successful connection the client session is transferred to the new ZooKeeper server.
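For a client to fail over like this, it has to know about more than one server. ZooKeeper clients accept a comma-separated list of host:port pairs for that purpose. The following is a minimal sketch of building such a list in the shell; the host names are hypothetical placeholders, not servers from this example.

```shell
# Hypothetical host names; replace them with the servers of your own ensemble.
servers="zk1.example.com zk2.example.com zk3.example.com"
port=2181

# Join the hosts into one comma-separated host:port list.
connect=""
for host in $servers; do
  connect="${connect:+$connect,}$host:$port"
done

echo "$connect"
```

The resulting string could then be passed to the command-line client, for example as bin/zkCli.sh -server "$connect", so that the client has other servers to try when one stops responding.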
ZooKeeper stores data in a hierarchy similar to a file system. Each data node in it is called a znode, a name derived from "ZooKeeper data node". Like a directory, each znode can have multiple children, and the hierarchy continues from there. To access a znode, ZooKeeper uses paths structured like file paths. For example, the path to the znode firstnode and its descendants can look like this: /firstnode/sub-node/sub-sub-node
3. ZooKeeper Setup
In this section, we will set up a ZooKeeper server on localhost for experimenting. The ZooKeeper package includes a standalone server that can be run directly on the machine.
3.1 System Requirements
Java, JDK 6 or later (we will use JDK 8)
Minimum 2GB RAM
Dual-core processor
Linux operating system. Linux is supported as both a development and a production system. Windows and MacOSX are supported only as development systems, not as production systems.
3.2 Installing Java
First, we'll check whether Java is installed on the system; if it is not, we need to install it first. To check whether Java is installed, use:
java -version
If this returns the Java version number, Java is installed. Make sure it is JDK 6 or later. If Java is not installed, use the following commands to install OpenJDK 8.
sudo apt-get update
sudo apt-get install openjdk-8-jre-headless
The first command refreshes the package index, and the second command installs the OpenJDK 8 runtime. (The original article shows the console output of these commands in a screenshot.)
To check that the installation was successful, run the version command again:
java -version
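The check-then-install decision can also be scripted. The following is a small sketch, assuming a POSIX shell; have_cmd is a hypothetical helper name introduced here for illustration.

```shell
# Hypothetical helper: succeeds if the named command is on the PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd java; then
  # java -version writes its report to stderr, so merge it into stdout.
  java -version 2>&1 || echo "java was found but could not be run"
else
  echo "Java not found; install it before continuing."
fi
```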
3.3 Download Zookeeper
The next step is to download a stable release of ZooKeeper from the release website. Download the stable version manually from the download section of the release site (the stable version is 3.4.6 at the time of this writing). We can use any of the mirrors listed on the website and unzip/extract the archive to the desired folder.
Or use the following commands to download and extract it:
wget http://www.eu.apache.org/dist/zookeeper/stable/zookeeper-3.4.6.tar.gz
tar -xvf zookeeper-3.4.6.tar.gz
cd zookeeper-3.4.6/
3.4 Data Directory
Next, we need a directory to store the data related to znodes and other ZooKeeper metadata. For that, we will create a new directory named zookeeper in /var/lib/:
sudo mkdir /var/lib/zookeeper
cd /var/lib
ls
Because this directory is created with sudo, it is owned by root by default. We need to change the owner to the user that ZooKeeper will run as, so that the ZooKeeper server can access the directory without any trouble. To change the owner, run the following commands:
cd /var/lib
sudo chown raman: zookeeper
Note: There is a space between ":" and zookeeper. Here we name only the user raman as the owner of the directory and leave the user group after the colon empty, so the user's default group is assigned to the zookeeper directory.
To make sure the owner has changed, open the properties of the /var/lib/zookeeper directory and check the permissions. It should be assigned to the user we set.
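The ownership can also be checked from the terminal. The sketch below assumes GNU stat (as found on typical Linux systems) and runs against a temporary directory so that it needs no sudo; in this tutorial the directory to inspect would be /var/lib/zookeeper.

```shell
# Throwaway directory for the demonstration; substitute /var/lib/zookeeper in practice.
dir="$(mktemp -d)"

# GNU stat: %U is the owning user, %G the owning group, %n the file name.
stat -c '%U:%G %n' "$dir"

# Compare numeric IDs to confirm that the current user owns the directory.
owner_uid="$(stat -c '%u' "$dir")"
if [ "$owner_uid" = "$(id -u)" ]; then
  echo "owned by the current user"
else
  echo "owned by uid $owner_uid"
fi
```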
3.5 Configuration Files
Now it is time to make the necessary changes to the configuration of the ZooKeeper server. The package already contains a sample configuration file that we will use as a template. The sample configuration file is in the folder zookeeper-3.4.6/conf/ and is named zoo_sample.cfg
First, let's rename the file to zoo.cfg, the name that the startup script looks for in the conf folder by default.
cd zookeeper-3.4.6/conf
mv zoo_sample.cfg zoo.cfg
Now, let's edit this zoo.cfg file. In this example, we use the nano editor, but you can use any editor you like.
nano zoo.cfg
Make sure the file contains the following settings:
tickTime=
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
clientPort=2181
Note: dataDir should be set to the directory we created in the previous step, i.e. /var/lib/zookeeper
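Because initLimit and syncLimit are counted in ticks, their wall-clock values depend on tickTime. The following sketch writes the settings to a temporary file and works out both windows in milliseconds. It assumes tickTime=2000, the value found in the stock zoo_sample.cfg, since the listing in this article leaves the value blank; cfg_get is a hypothetical helper introduced for illustration.

```shell
# Assumption: tickTime=2000 (milliseconds), as in the stock zoo_sample.cfg.
cfg="$(mktemp)"
cat > "$cfg" <<'EOF'
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
clientPort=2181
EOF

# Hypothetical helper: print the value of one key from a key=value file.
cfg_get() {
  sed -n "s/^$1=//p" "$2"
}

tick="$(cfg_get tickTime "$cfg")"
init_ms=$(( tick * $(cfg_get initLimit "$cfg") ))
sync_ms=$(( tick * $(cfg_get syncLimit "$cfg") ))

echo "initLimit window: ${init_ms} ms"
echo "syncLimit window: ${sync_ms} ms"

rm -f "$cfg"
```

With these assumed values, the initial synchronization window comes to 20000 ms and the request/acknowledgement window to 10000 ms.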
Let's briefly outline the implications of these configuration settings:
tickTime: The length of one tick in milliseconds. It is the basic time unit ZooKeeper uses for the heartbeats that check whether all nodes are alive and connected.
initLimit: The number of ticks that the initial synchronization phase can take.
syncLimit: The number of ticks that can pass between sending a request and receiving an acknowledgement.
dataDir: The directory where ZooKeeper stores the in-memory database snapshots and transaction logs.
clientPort: The port that will be used for client connections.
3.6 Start the server
It is time to start the ZooKeeper server. ZooKeeper comes with a script file that makes starting the server easy. This file is called zkServer.sh. To start the server, use the following commands:
cd zookeeper-3.4.6/
bin/zkServer.sh start
The console output should confirm that the server has started.
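One way to confirm that the server is answering is to send it the "ruok" four-letter command on the client port; a healthy server replies "imok". The sketch below assumes that nc (netcat) is installed and that the server listens on localhost:2181; zk_health is a hypothetical helper name. In newer ZooKeeper releases this command may first need to be enabled in the server configuration.

```shell
# Hypothetical helper: turn the reply to "ruok" into a readable status.
zk_health() {
  case "$1" in
    imok) echo "running" ;;
    *)    echo "not responding" ;;
  esac
}

# Assumes nc is installed; localhost:2181 matches the clientPort configured earlier.
reply="$(echo ruok | nc -w 2 localhost 2181 2>/dev/null || true)"
zk_health "$reply"
```

The bundled script offers a similar check through bin/zkServer.sh status.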
4. ZooKeeper Server Basic Interaction
4.1 Starting the CLI
Once the ZooKeeper server is running successfully, we can start the CLI (command-line interface) to interact with the server. Use the following commands:
cd zookeeper-3.4.6/
bin/zkCli.sh -server
The -server option expects the server address in host:port form; for the server configured above, that is localhost:2181. With this command, the console enters the ZooKeeper command-line mode, where we can interact with the server using ZooKeeper-specific commands.
4.2 Creating the first Znode
Let's start by creating a new node. The following is the ZooKeeper command to create a new znode with dummy data.
create /firstnode Helloworlddummytext
Here, firstnode is the name of the znode, which will be created directly under the root path because / denotes the root, and Helloworlddummytext is the dummy text stored in the znode.
4.3 Retrieving data from the first Znode
Just as we created a new znode, we can use the CLI to get the details and data of an existing znode. The following is the command to get data from a znode.
get /firstnode
Along with the data we stored in the znode at creation time, the server also returns some metadata related to this particular znode.
Some of the important fields in metadata are:
ctime: The time when this znode was created.
mtime: The time of the last modification.
dataVersion: The version of the data, which changes every time the data is modified.
dataLength: The length of the data stored in the znode. In this case, the data is Helloworlddummytext and its length is 19.
numChildren: The number of children of this particular znode.
4.4 Modifying the data in Znode
If we want to modify the data in a particular node, ZooKeeper provides a command for that as well. Here is how to modify the data in an existing znode:
set /firstnode HelloWorld
Here, firstnode is the existing znode and HelloWorld is the new data to be written to it. Setting new data replaces the old data.
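As a rough illustration of what a set changes in the metadata, here is a plain-shell model of two of the fields. It does not talk to ZooKeeper at all: it only mirrors the rules that dataLength is the size of the stored data (equal to the character count for ASCII text like this) and that dataVersion starts at 0 and goes up by one with each set.

```shell
# Plain-shell model of two znode fields; nothing here contacts a ZooKeeper server.
data='Helloworlddummytext'   # value given to "create"
version=0                    # dataVersion of a freshly created znode
echo "dataLength=${#data} dataVersion=$version"

data='HelloWorld'            # value given to "set"
version=$((version + 1))     # each successful set raises dataVersion by one
echo "dataLength=${#data} dataVersion=$version"
```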
Note that dataLength, mtime, and dataVersion are also updated when a new value is set.
4.5 Creating child nodes
Creating child nodes in an existing node is as simple as creating a new node. We only need to pass the full path of the new child node.
create /firstnode/subnode Subnodedata
get /firstnode/subnode
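The create command does not create missing parents, so every ancestor of the new path must already exist. The following sketch lists the ancestors of a znode path, outermost first; znode_ancestors is a hypothetical helper written for this illustration and is not part of ZooKeeper.

```shell
# Hypothetical helper: print every ancestor of a znode path, outermost first.
znode_ancestors() {
  p="${1%/}"                 # drop a trailing slash, if any
  out=""
  # Strip one path component per iteration until nothing is left.
  while p="${p%/*}"; [ -n "$p" ]; do
    out="$p${out:+ $out}"
  done
  echo "$out"
}

znode_ancestors /firstnode/subnode
znode_ancestors /firstnode/sub-node/sub-sub-node
```

For /firstnode/subnode the only ancestor is /firstnode, which is why the child could be created right after the first znode.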
4.6 Deleting nodes
A node is easy to remove with the rmr command in the ZooKeeper CLI. Deleting a node also deletes all of its child nodes. Here is the command to delete the firstnode we created in this example:
rmr /firstnode
5. Conclusion
This brings us to the conclusion of this introductory example of Apache ZooKeeper. In this example, we started with an introduction to ZooKeeper and its general architecture, and then learned how to set up ZooKeeper on a single machine. We also saw that the ZooKeeper CLI makes it easy to interact with the ZooKeeper service, and that there are commands for all the basic interactions.
6. Download configuration file
This is where you can download the configuration file used in this example, zoo.cfg: Zookeeper Configuration