How to build a Solr cluster in CentOS 6.7

Source: Internet
Author: User

1. Solr Cluster System Architecture

SolrCloud is the distributed search solution provided by Solr. Use SolrCloud when you need large-scale, fault-tolerant, distributed indexing and retrieval. It is not needed when a system's index data volume is small. When the index volume is large and search request concurrency is high, SolrCloud is the way to meet those requirements.

SolrCloud is a distributed search solution based on Solr and Zookeeper. Its main idea is to use Zookeeper as the cluster configuration information center.

For commercial use, please contact me at dijia478@163.com

It has several notable features:

1) Centralized configuration information

2) Automatic fault tolerance

3) Near real-time search

4) Automatic load balancing during queries

1. Physical Structure

Three Solr instances (each of which includes two cores) form a SolrCloud.

2. Logical Structure

The collection contains two shards (shard1 and shard2). Each shard consists of three cores: one leader and two replicas. The leader is elected through Zookeeper, and Zookeeper coordinates the consistency of the index data across the three cores of each shard, which addresses the high availability problem.

Index requests issued by users are served by shard1 and shard2 separately, which addresses the high concurrency problem.

2.1. Collection

A collection is a complete logical index in a SolrCloud cluster. It is usually divided into one or more shards, all of which use the same configuration information.

For example, you can create a collection for product information search.

Collection = shard1 + shard2 +... + shardX

2.2. Core

Each core is an independent operating unit in Solr and provides indexing and search services. A shard consists of one or more cores. Because a collection is composed of multiple shards, a collection is generally composed of multiple cores.

2.3. Master or Slave

Master is the master node (usually the master server) in a master-slave structure, and Slave is the slave node (usually the slave server) in that structure. Within the same shard, the master and the slaves store the same data, which is what provides high availability.

2.4. Shard

A shard is a logical slice of a collection. Each shard is made up of one or more replicas, one of which is elected as the leader.

3. Solr cluster architecture implemented in this tutorial

Zookeeper is used as a cluster management tool.

1. Cluster Management: Fault Tolerance and load balancing.

2. Centralized Management of configuration files

3. Cluster Portal

Zookeeper itself must be highly available, so it needs to run as a cluster. An odd number of nodes is recommended. This tutorial uses three Zookeeper servers.
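The recommendation for an odd number of nodes follows from Zookeeper needing a majority of nodes to stay available. The sketch below only does the arithmetic; the function names quorum_size and tolerated_failures are made up for this illustration.

```shell
# Majority quorum for an ensemble of n Zookeeper nodes.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Number of node failures the ensemble can survive while keeping a majority.
tolerated_failures() {
    echo $(( ($1 - 1) / 2 ))
}

# Compare a few ensemble sizes.
for n in 3 4 5; do
    echo "nodes=$n quorum=$(quorum_size "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

Three nodes and four nodes both tolerate a single failure, so the fourth node adds cost without adding fault tolerance.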

A Solr cluster built this way requires at least seven servers (three Zookeeper nodes plus four Solr nodes).

Because of environment restrictions, this article demonstrates a pseudo-distributed setup instead, with all nodes on a single virtual machine (at least 1 GB of memory is recommended):

Three zookeeper nodes are required.

Four tomcat nodes are required.

This article deploys Solr on Tomcat instead of the Jetty server that comes with Solr.

4. System Environment

CentOS-6.7-i386-bin-DVD1

jdk-8u151-linux-i586

apache-tomcat-8.5.24

zookeeper-3.4.10

solr-7.1.0

Note: For Solr 6.0 and later, JDK 8 and Tomcat 8 are recommended. The setup steps differ slightly from those for versions earlier than Solr 6.

2. Zookeeper Cluster Construction

Step 1: Install the JDK environment.

The JDK installation process is omitted here. If you do not know how to do it, see this article:

Easy-to-understand getting started with JDK installation on CentOS 6.7

If you are reading a cluster setup tutorial, you most likely already know how to install the JDK. After the installation, it looks like this.

Step 2: Upload the zookeeper package to the server.

Step 3: Decompress the package.

The decompression process is omitted. I decompressed the package to /usr/share/

Step 4: Make three copies of zookeeper.

First create the directory /usr/local/solr-cloud
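A minimal sketch of step 4 follows. The instance names zookeeper01 to zookeeper03 match the startup script later in this article. The unpacked directory name /usr/share/zookeeper-3.4.10 is an assumption based on the version listed above, and the function name copy_zookeeper_instances is hypothetical.

```shell
# Copy one unpacked Zookeeper directory into three instance directories.
# $1: unpacked Zookeeper directory (assumed: /usr/share/zookeeper-3.4.10)
# $2: target directory (/usr/local/solr-cloud in this tutorial)
copy_zookeeper_instances() {
    local src=$1 dest=$2
    mkdir -p "$dest" || return 1
    for i in 01 02 03; do
        cp -r "$src" "$dest/zookeeper$i" || return 1
    done
}

# Example call (needs write access to /usr/local):
# copy_zookeeper_instances /usr/share/zookeeper-3.4.10 /usr/local/solr-cloud
```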

Step 5: Create a data directory under each zookeeper directory.

 

Step 6: Create a file named "myid" in each data directory. Its content is the id of that instance: 1, 2, and 3 respectively.

Only one screenshot is shown here. The other two files contain 2 and 3 respectively.

On second thought, in case some readers are unsure how to do this, here is what I executed under the solr-cloud directory.
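The original screenshot of those commands is not available here. As a rough equivalent of steps 5 and 6, the sketch below creates each data directory and writes the matching myid file. The function name init_myid_files is hypothetical, and the instance directory names are taken from the startup script later in this article.

```shell
# Create data/ and data/myid for the three instances under a base directory.
# $1: base directory (/usr/local/solr-cloud in this tutorial)
init_myid_files() {
    local base=$1
    local id=1
    for name in zookeeper01 zookeeper02 zookeeper03; do
        mkdir -p "$base/$name/data" || return 1
        # The file holds nothing but the instance id: 1, 2 or 3.
        echo "$id" > "$base/$name/data/myid" || return 1
        id=$((id + 1))
    done
}

# Example call:
# init_myid_files /usr/local/solr-cloud
```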

Step 7: Rename the zoo_sample.cfg file under the conf directory to zoo.cfg

This time I really demonstrate only one instance. In fact, steps 5 and 7 could be done before step 4, so that the copies would already include the results. That was an oversight on my part.
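Since only one instance is demonstrated, here is a sketch that applies step 7 to all three. It assumes the directory layout from the earlier steps, and the function name rename_sample_configs is hypothetical.

```shell
# Rename conf/zoo_sample.cfg to conf/zoo.cfg in each instance.
# $1: base directory (/usr/local/solr-cloud in this tutorial)
rename_sample_configs() {
    local base=$1
    for name in zookeeper01 zookeeper02 zookeeper03; do
        mv "$base/$name/conf/zoo_sample.cfg" "$base/$name/conf/zoo.cfg" || return 1
    done
}

# Example call:
# rename_sample_configs /usr/local/solr-cloud
```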

 

Step 8: Modify the configuration file zoo.cfg.

Only the first instance is demonstrated; modify the other two yourself. Change only the data directory and the port number (the first two red boxes in the screenshot). The content of the last red box is the same in all three configuration files.

The 1 in server.1 is the myid content from step 6 above. In real deployments each instance runs on a different server, so the IP addresses would differ. Here everything runs on one virtual machine, so the IP address is the same.
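The screenshot with the actual values is not available, so the sketch below shows one possible way to make these edits. The keys dataDir, clientPort and server.N are standard zoo.cfg settings. The concrete port numbers (2181 to 2183 for clients, 2881 to 2883 and 3881 to 3883 between servers), the IP address in the example call, and the function name configure_zookeeper_instance are all assumptions chosen for illustration.

```shell
# Adjust zoo.cfg for one instance. Run once per instance.
# $1: base directory, $2: instance number (1, 2 or 3), $3: IP shared by all instances
configure_zookeeper_instance() {
    local base=$1 id=$2 ip=$3
    local dir="$base/zookeeper0$id"
    local cfg="$dir/conf/zoo.cfg"
    local client_port=$((2180 + id))   # assumed ports: 2181, 2182, 2183

    # Rewrite dataDir and clientPort; every other line stays unchanged.
    sed -e "s|^dataDir=.*|dataDir=$dir/data|" \
        -e "s|^clientPort=.*|clientPort=$client_port|" \
        "$cfg" > "$cfg.tmp" && mv "$cfg.tmp" "$cfg" || return 1

    # The server list is identical in all three files.
    # Format: server.<myid>=<ip>:<quorum port>:<election port> (ports assumed).
    for n in 1 2 3; do
        echo "server.$n=$ip:$((2880 + n)):$((3880 + n))" >> "$cfg"
    done
}

# Example calls (replace the IP with the address of your own machine):
# configure_zookeeper_instance /usr/local/solr-cloud 1 192.168.1.10
# configure_zookeeper_instance /usr/local/solr-cloud 2 192.168.1.10
# configure_zookeeper_instance /usr/local/solr-cloud 3 192.168.1.10
```

Each instance needs its own port numbers only because all three share one machine. On separate servers the ports could be the same and only the IP addresses would differ.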

Step 9: Start each zookeeper instance

Starting them one by one from each directory is tedious, so here is a simple script.

cd /usr/local/solr-cloud/zookeeper01/bin/
./zkServer.sh start
cd /usr/local/solr-cloud/zookeeper02/bin/
./zkServer.sh start
cd /usr/local/solr-cloud/zookeeper03/bin/
./zkServer.sh start

The newly written script has no execute permission, so add it:

Run the script.
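The article does not give the script's file name, so the sketch below takes the path as an argument. The name start-all.sh in the example call and the function name make_executable_and_run are hypothetical.

```shell
# Give the owner execute permission on a script, then run it.
# $1: path to the script; include a slash (for example ./start-all.sh)
#     so that the shell runs that file instead of searching PATH.
make_executable_and_run() {
    chmod u+x "$1" && "$1"
}

# Example call, from the directory that holds the script:
# make_executable_and_run ./start-all.sh
```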

For verification, go to the bin directory of the three zookeeper instances to view the status of each instance.

(At first I put the status command in the same script so that the status would be shown right after startup, but it reported "not running" every time. I later realized the likely cause: the script runs so quickly that the status check happens before the instances have finished starting.)
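One way around that timing problem is to pause before querying each instance. The sketch below does this with zkServer.sh status. The default delay of 5 seconds is a guess that may need adjusting, and the function name check_zookeeper_status is hypothetical.

```shell
# Wait a moment, then print the status of each instance.
# $1: base directory (/usr/local/solr-cloud in this tutorial)
# $2: seconds to wait before checking (default 5, an assumed value)
check_zookeeper_status() {
    local base=$1 delay=${2:-5}
    sleep "$delay"
    for name in zookeeper01 zookeeper02 zookeeper03; do
        echo "== $name =="
        "$base/$name/bin/zkServer.sh" status
    done
}

# Example call, after the startup script has finished:
# check_zookeeper_status /usr/local/solr-cloud
```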

If the output shows one leader and two followers like this (which instance becomes the leader is not fixed; it is effectively random), the zookeeper cluster has been set up.

The Zookeeper cluster is now complete. The next part sets up the Solr cluster.
