SolrCloud + Tomcat + Zookeeper cluster configuration



Overview:

SolrCloud is a distributed search solution based on Solr and Zookeeper. Its main idea is to use Zookeeper as the cluster configuration information center.

It has several notable features:

1) Centralized configuration management

2) Automatic fault tolerance

3) Near-real-time search

4) Automatic load balancing of queries

Install zookeeper

SolrCloud is a distributed search solution based on Solr and Zookeeper. To deploy solrCloud + tomcat + zookeeper clusters, you must first install zookeeper.

Installation environment:

Linux: CentOS release 6.4

JDK: 1.7.0_55

I am working with the latest solr release, solr 4.8.0, which must run on JDK 1.7 or a later version.

1. What is zookeeper?

A: As the name suggests, zookeeper is the keeper of the zoo: it manages Hadoop (the elephant), Hive (the bee), and Pig (the pig), and it is also used in distributed clusters such as Apache HBase and Apache Solr. Zookeeper is a distributed, open-source coordination service and a subproject of the Hadoop project.

2. zookeeper pseudo cluster Installation

The installation demonstrated here is on a single machine, so it is set up as a pseudo-cluster. In a real environment, you would replace the pseudo-cluster's IP addresses with those of the actual machines; the steps are otherwise the same. Once you know how to install a pseudo-cluster, a multi-machine installation poses no problem.

Step 1: download the latest zookeeper release: http://www.apache.org/dyn/closer.cgi/zookeeper/


Step 2: to make the test realistic, I deployed three zookeeper services on my linux machine.

Create the zookeeper installation directory:

[root@localhost solrCloud]# mkdir /usr/solrcocould

Copy the downloaded zookeeper-3.3.6.tar.gz file to this directory, and create three new folders under the/usr/solrcocould directory:

[root@localhost solrcocould]# ls
service1 service2 service3 zookeeper-3.3.6.tar.gz

Decompress the zookeeper package inside each folder and create several additional directories. The overall structure is as follows:

[root@localhost service1]# ls
data datalog logs zookeeper-3.3.6
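The layout above can be created with a short script. A minimal sketch, assuming the zookeeper-3.3.6.tar.gz tarball sits in the current directory; BASE defaults to a /tmp path here so the sketch runs without root, whereas the article uses /usr/solrcocould:

```shell
#!/bin/sh
# Sketch of the per-instance directory layout described above.
# BASE and TARBALL are assumptions -- adjust them to your environment.
BASE="${BASE:-/tmp/solrcocould}"
TARBALL="${TARBALL:-zookeeper-3.3.6.tar.gz}"

for i in 1 2 3; do
  SVC="$BASE/service$i"
  # data, datalog and logs directories for each instance
  mkdir -p "$SVC/data" "$SVC/datalog" "$SVC/logs"
  # each instance gets its own unpacked copy of the distribution
  # (skipped when the tarball is absent, e.g. in a dry run)
  if [ -f "$TARBALL" ]; then
    tar -xzf "$TARBALL" -C "$SVC"
  fi
done
```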

First, go into each data directory and create a myid file containing a single number: write 1 to the myid file of service1, 2 for service2, and 3 for service3. Then go into the zookeeper/conf directory. In a freshly downloaded copy it contains three files: configuration.xsl, log4j.properties, and zoo_sample.cfg. The first thing we need to do is create a zoo.cfg configuration file in this directory; the simplest way is to rename (or copy) zoo_sample.cfg to zoo.cfg:

zoo.cfg of service1:

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=5
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=2
# The directory where the snapshot is stored.
dataDir=/usr/solrcocould/service1/data
dataLogDir=/usr/solrcocould/service1/datalog
# The port at which the clients will connect
clientPort=2181
server.1=192.168.238.133:2888:3888
server.2=192.168.238.133:2889:3889
server.3=192.168.238.133:2890:3890

zoo.cfg of service2:

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=5
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=2
# The directory where the snapshot is stored.
dataDir=/usr/solrcocould/service2/data
dataLogDir=/usr/solrcocould/service2/datalog
# The port at which the clients will connect
clientPort=2182
server.1=192.168.238.133:2888:3888
server.2=192.168.238.133:2889:3889
server.3=192.168.238.133:2890:3890

zoo.cfg of service3:

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=5
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=2
# The directory where the snapshot is stored.
dataDir=/usr/solrcocould/service3/data
dataLogDir=/usr/solrcocould/service3/datalog
# The port at which the clients will connect
clientPort=2183
server.1=192.168.238.133:2888:3888
server.2=192.168.238.133:2889:3889
server.3=192.168.238.133:2890:3890
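The three zoo.cfg files above differ only in clientPort and the dataDir/dataLogDir paths, so they can be generated from a single template together with the matching myid files. A minimal sketch: BASE defaults to a /tmp path here so it runs without root, the article's layout uses /usr/solrcocould, and in a real deployment each zoo.cfg belongs in that instance's zookeeper-3.3.6/conf directory:

```shell
#!/bin/sh
# Generate the three zoo.cfg files shown above from one template,
# plus the matching data/myid files. BASE is an assumption -- the
# article uses /usr/solrcocould.
BASE="${BASE:-/tmp/solrcocould}"

for i in 1 2 3; do
  SVC="$BASE/service$i"
  mkdir -p "$SVC/data" "$SVC/datalog"

  # myid must hold the same number X as this instance's server.X line
  echo "$i" > "$SVC/data/myid"

  # Only clientPort and the two data paths vary between instances;
  # the server.X lines are identical in all three files.
  cat > "$SVC/zoo.cfg" <<EOF
tickTime=2000
initLimit=5
syncLimit=2
dataDir=$SVC/data
dataLogDir=$SVC/datalog
clientPort=218$i
server.1=192.168.238.133:2888:3888
server.2=192.168.238.133:2889:3889
server.3=192.168.238.133:2890:3890
EOF
done
```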

Parameter description:

tickTime: the basic time unit used by zookeeper, in milliseconds.

initLimit: a zookeeper cluster contains multiple servers; one of them is the leader and the others are followers. initLimit sets the maximum time a follower may take to connect and synchronize with the leader during initialization. It is set to 5 here, meaning the limit is 5 times tickTime, that is 5 × 2000 ms = 10 s.

syncLimit: the maximum time allowed between the leader and a follower for sending a request and receiving an acknowledgement. It is set to 2 here, meaning the limit is 2 times tickTime, that is 4000 ms.

dataDir: the snapshot storage directory. It can be any directory; here each instance's data directory sits under its service directory.

dataLogDir: the transaction log directory, which can also be any directory. If it is not set, dataDir is used for the logs as well.

clientPort: the port on which the server listens for client connections.

server.X=A:B:C, where X is a number identifying the server (it must match that server's myid), A is the server's IP address, B is the port the server uses to exchange messages with the leader of the cluster, and C is the port used for leader election. Because this is a pseudo-cluster on one machine, B and C must be different for each server.

Configuration instructions:

Note that when multiple instances are deployed on one machine, each must use a different clientPort: for example, 2181 for service1, 2182 for service2, and 2183 for service3. Their dataDir and dataLogDir paths must also be distinct.

The only thing to watch in the last few lines is that the number X in server.X must correspond to the number in that instance's data/myid file. We wrote 1, 2, and 3 to the myid files of the three instances, so the zoo.cfg of each contains server.1, server.2, and server.3 entries. Because all instances run on the same machine, the two trailing ports of each server line must differ between instances, or a port conflict occurs: the first port is used for message exchange between cluster members (followers and the leader), and the second is used to elect a new leader when the current leader fails.
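The myid ↔ server.X rule above can be verified mechanically. A self-contained sketch: it builds a throwaway fixture under /tmp rather than reading a real installation, so the DEMO path and placeholder hostnames are illustrative only:

```shell
#!/bin/sh
# Check that each instance's myid number appears as a server.X entry
# in its zoo.cfg. A throwaway fixture is built under /tmp for
# illustration; point DEMO at real service directories in practice.
DEMO="${DEMO:-/tmp/zk-myid-check}"

# Build the fixture: three instances with myid 1..3 and identical
# server.X lists (hostnames are placeholders).
for i in 1 2 3; do
  mkdir -p "$DEMO/service$i/data"
  echo "$i" > "$DEMO/service$i/data/myid"
  printf 'server.1=h1:2888:3888\nserver.2=h1:2889:3889\nserver.3=h1:2890:3890\n' \
    > "$DEMO/service$i/zoo.cfg"
done

# The actual check: every myid must match a server.X line.
result=ok
for i in 1 2 3; do
  id=$(cat "$DEMO/service$i/data/myid")
  grep -q "^server\.$id=" "$DEMO/service$i/zoo.cfg" || result="service$i mismatch"
done
echo "$result"   # prints "ok" when every pairing matches
```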

And that is all there is to configuring zookeeper!

