ZooKeeper Cluster (Pseudo-Cluster) Setup Tutorial
What is zookeeper?
What can Zookeeper do?
As the name suggests, ZooKeeper is the zoo's administrator: Hadoop (an elephant), Hive (a bee), and Pig (a pig) are all "animals" in the ecosystem, and distributed systems such as Apache HBase and Apache Solr also rely on ZooKeeper. ZooKeeper itself is a distributed, open-source coordination service and a subproject of the Hadoop project.
1. Configuration Management
Besides code, our applications have various configurations, such as database connections. Usually these live in configuration files that the code reads. When there is only one configuration, used by one server and rarely modified, a configuration file works well. But when there are many configurations, many servers need them, and they change dynamically, configuration files are no longer a good idea. We then need centralized configuration management: the configuration is modified in one central place, and every service interested in it learns of the change. One option is to put the configuration in a database and have every service read it from there. However, since many services depend on this configuration to run at all, the service that provides it centrally must be highly reliable. We can use a cluster to improve that reliability, but then how do we guarantee the configuration stays consistent across the cluster? That requires a service implementing a consensus protocol. ZooKeeper is such a service: it uses the Zab protocol to provide consistency. Many open-source projects use ZooKeeper to maintain configuration. In HBase, the client first connects to ZooKeeper to obtain the necessary HBase cluster configuration before further operations can be performed. The open-source message queue Kafka uses ZooKeeper to maintain broker information. Alibaba's open-source SOA framework Dubbo also uses ZooKeeper widely to manage configuration for service governance.
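As a concrete sketch of centralized configuration, here is what it might look like with the zkCli.sh client that ships in ZooKeeper's bin directory, assuming a server on 127.0.0.1:2181; the /config/db.url path and the connection string are purely illustrative:

```
# connect with the bundled CLI
./zkCli.sh -server 127.0.0.1:2181

# inside the zkCli prompt: store a configuration value in a znode
create /config ""
create /config/db.url "jdbc:mysql://db1:3306/app"

# any client can read it back; a watch set on the znode notifies
# interested clients when the value changes
get /config/db.url
set /config/db.url "jdbc:mysql://db2:3306/app"
```

Clients that set a watch when reading the znode are notified on the next `set`, which is how "everyone interested in this configuration learns of the change" works in practice.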
2. Name Service
A name service is easy to understand. For example, to access a system over the network we need the other party's IP address, but IP addresses are very unfriendly to humans, so we use domain names instead. A computer cannot resolve a domain name by itself, though. We could keep a domain-name-to-IP mapping on every machine, which solves part of the problem, but what do we do when the IP address behind a name changes? Hence DNS: we only need to ask a well-known service what IP address a given name currently maps to. The same problem appears in our applications, especially when there are many services: keeping every service's address locally is very inconvenient to maintain, whereas a single well-known access point providing a unified entry is much easier to keep up to date.
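A minimal zkCli.sh sketch of such a name service: a service registers its current address under a well-known path, and consumers look it up by name instead of hard-coding the IP. The /services path, service name, and address below are illustrative assumptions:

```
# register a service address under a well-known path
create /services ""
create /services/order-service "192.168.132.140:8080"

# consumers resolve the name to the current address at call time;
# updating the znode repoints every consumer without redeploying them
get /services/order-service
```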
3. Distributed locks
As introduced at the start of this article, ZooKeeper is a distributed coordination service, so we can use it to coordinate activity among multiple distributed processes. For example, to improve reliability we deploy the same service on every server of a cluster. But if every server in the cluster performs the task, they must coordinate with one another and the programming becomes very complicated; if only one designated server performs it, we have a single point of failure. A common alternative is a distributed lock: at any moment only the service holding the lock does the work, and when that service fails, the lock is released and another service takes over immediately. Many distributed systems work this way, and the design has a better-known name: leader election. The HBase Master, for example, is elected through this mechanism. Note, however, that distributed locks are different from locks within a single process and should be used more sparingly and carefully.
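The standard building block for this is the ephemeral znode, which ZooKeeper deletes automatically when the creating client's session ends. A simplified zkCli.sh sketch (the /leader path and data are illustrative; real implementations typically use ephemeral *sequential* znodes to avoid a thundering herd):

```
# each candidate tries to create the same ephemeral znode; only one succeeds
create -e /leader "server1"

# the winner is the leader (lock holder); the losers set a watch on /leader.
# if the leader's session dies, ZooKeeper deletes the ephemeral znode,
# the watchers are notified, and one of them re-creates /leader to take over
stat /leader
```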
4. Cluster Management
In a distributed cluster, nodes often come and go for various reasons: hardware faults, software faults, network problems, new nodes joining, old nodes leaving. The other machines in the cluster need to perceive these changes and make decisions accordingly. For example, in a distributed storage system a central control node is responsible for allocating storage; when new data comes in, it must allocate storage nodes according to the current state of the cluster, so it needs to perceive that state dynamically. Likewise, in a distributed SOA architecture a service is provided by a cluster, and a consumer needs some mechanism to discover which nodes can currently provide it; this is also called service discovery (Dubbo, Alibaba's open-source SOA framework, uses ZooKeeper as its underlying service-discovery mechanism). The open-source Kafka queue also uses ZooKeeper to manage consumers going online and offline.
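Cluster membership is usually tracked with one ephemeral znode per live node. A zkCli.sh sketch, with illustrative paths and addresses (in reality each node would register itself from its own session):

```
# a parent znode holds the membership list
create /cluster ""

# each node registers itself as an ephemeral znode when it comes online;
# the znode disappears automatically if the node crashes or disconnects
create -e /cluster/node1 "192.168.132.132:8081"

# the control node (or a service consumer) lists the live members;
# a watch on /cluster fires whenever a node joins or leaves
ls /cluster
```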
Environment setup:
Environment preparation:
CentOS 6.5
JDK 1.7
Environment installation:
CentOS 6.5: omitted
JDK 1.7: omitted
Zookeeper installation:
1. Upload and decompress the zookeeper installation package
tar -zxf zookeeper-3.4.6.tar.gz
2. Create a solrcloud directory under /usr/local and place three copies of the extracted files in it, named zookeeper1, zookeeper2, and zookeeper3.
mkdir /usr/local/solrcloud
mv zookeeper-3.4.6 /usr/local/solrcloud/zookeeper1
cd /usr/local/solrcloud
cp -r zookeeper1 zookeeper2
cp -r zookeeper1 zookeeper3
3. Configure zookeeper
Create a data directory under each zookeeper folder, and create a myid file in that data directory. The file's content is the node's number (1, 2, or 3), matching the server.1/server.2/server.3 entries in zoo.cfg below.
mkdir zookeeper1/data
echo 1 > zookeeper1/data/myid
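The data-directory and myid steps can be done for all three nodes in one loop (BASE is the install directory created in step 2):

```shell
# create data/ and myid for zookeeper1..zookeeper3 in one pass;
# each myid contains that node's number, matching server.N in zoo.cfg
BASE=/usr/local/solrcloud
for i in 1 2 3; do
    mkdir -p "$BASE/zookeeper$i/data"
    echo "$i" > "$BASE/zookeeper$i/data/myid"
done
```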
4. Copy the zoo_sample.cfg file in zookeeper1's conf directory and rename the copy to zoo.cfg.
Edit the zoo.cfg file:
dataDir=/usr/local/solrcloud/zookeeper1/data/
clientPort=2181
server.1=192.168.132.132:2881:3881
server.2=192.168.132.132:2882:3882
server.3=192.168.132.132:2883:3883
2881 and 3881 can be any free ports as long as they do not conflict across nodes: the first is the quorum port (followers connect to the leader on it), and the second is the leader-election port.
Repeat the preceding steps for zookeeper2 and zookeeper3: myid values 2 and 3, dataDir pointing at each node's own data directory, and, since all three nodes share one machine, distinct client ports (e.g. 2182 and 2183).
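The whole step can be scripted for all three nodes at once. This sketch assumes the tutorial's layout and IP (192.168.132.132), client ports 2181-2183, and the tickTime/initLimit/syncLimit defaults from zoo_sample.cfg:

```shell
# generate zoo.cfg for zookeeper1..zookeeper3; each node gets its own
# dataDir and clientPort, while the server.N ensemble list is identical
BASE=/usr/local/solrcloud
for i in 1 2 3; do
    mkdir -p "$BASE/zookeeper$i/conf" "$BASE/zookeeper$i/data"
    cat > "$BASE/zookeeper$i/conf/zoo.cfg" <<EOF
tickTime=2000
initLimit=10
syncLimit=5
dataDir=$BASE/zookeeper$i/data
clientPort=218$i
server.1=192.168.132.132:2881:3881
server.2=192.168.132.132:2882:3882
server.3=192.168.132.132:2883:3883
EOF
done
```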
5. Start zookeeper
Go to the zookeeper1/bin directory
Start: ./zkServer.sh start
zookeeper1/bin/zkServer.sh start
zookeeper2/bin/zkServer.sh start
zookeeper3/bin/zkServer.sh start
Stop: ./zkServer.sh stop
Check status: ./zkServer.sh status
zookeeper1/bin/zkServer.sh status
zookeeper2/bin/zkServer.sh status
zookeeper3/bin/zkServer.sh status