How to build a Solr cluster on CentOS 6.7
1. Solr Cluster System Architecture
SolrCloud is the distributed search solution provided by Solr. It is used when you need large-scale, fault-tolerant, distributed indexing and retrieval. If a system's index data volume is small, SolrCloud is not needed; when the index volume is large and search concurrency is high, SolrCloud is the way to meet those requirements.
SolrCloud is a distributed search solution based on Solr and Zookeeper. Its main idea is to use Zookeeper as the cluster configuration information center.
It has several special features:
1) Centralized configuration information
2) Automatic fault tolerance
3) Near-real-time search
4) Automatic load balancing of queries
1. Physical Structure
Three Solr instances (each of which includes two cores) form a SolrCloud.
2. Logical Structure
The index collection includes two shards (shard1 and shard2). Each shard consists of three cores: one leader and two replicas. The leader is elected through Zookeeper, and Zookeeper keeps the index data of the three cores in each shard consistent, which solves the high-availability problem.
Index requests can be handled by shard1 and shard2 separately, which solves the high-concurrency problem.
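For reference, once a cluster is running, a layout like this (two shards with three replicas each, the leader counting as one replica) can be created with Solr's Collections API; the host, port, and collection name below are placeholders, not values from this tutorial:
# Create a 2-shard collection with 3 replicas per shard (placeholder host/port/name)
curl "http://<solr-host>:<port>/solr/admin/collections?action=CREATE&name=collection1&numShards=2&replicationFactor=3"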
2.1. Collection
A collection is a complete logical index in a SolrCloud cluster. It is often divided into one or more shards, all of which use the same configuration information.
For example, you can create a collection for product information search.
Collection = shard1 + shard2 + ... + shardX
2.2. Core
Each core is an independent running unit in Solr that provides indexing and search services. A shard consists of one or more cores; since a collection is made up of multiple shards, a collection generally consists of multiple cores.
2.3. Master or Slave
Master is the master node in a master-slave structure (usually the master server), and Slave is the slave node (usually the slave server). The master and slaves within the same shard store the same data in order to achieve high availability.
2.4. Shard
A shard is a logical slice of the collection. Each shard is made up of one or more replicas, and the leader among them is determined by election.
3. Solr cluster architecture implemented in this tutorial
Zookeeper is used as the cluster management tool. It is responsible for:
1. Cluster management: fault tolerance and load balancing.
2. Centralized management of configuration files.
3. Serving as the cluster entry point.
Zookeeper itself must be highly available, so it also needs to be deployed as a cluster. An odd number of nodes is recommended, so three Zookeeper servers are needed.
A full Solr cluster therefore requires at least seven servers.
Here, due to environment limitations, I demonstrate a pseudo-distributed setup on a single virtual machine (at least 1 GB of memory is recommended):
Three zookeeper nodes are required.
Four tomcat nodes are required.
This article deploys Solr in Tomcat rather than in the Jetty that ships with Solr.
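A rough sketch of the pseudo-distributed layout this plan implies; the Zookeeper directory names match the later steps, while the Tomcat directory names and all port numbers are my own illustrative assumptions:
/usr/local/solr-cloud/
    zookeeper01/    # Zookeeper instance 1, client port 2181 (assumed)
    zookeeper02/    # Zookeeper instance 2, client port 2182 (assumed)
    zookeeper03/    # Zookeeper instance 3, client port 2183 (assumed)
    tomcat01/ ... tomcat04/    # four Tomcat instances hosting Solr, HTTP ports 8080-8083 (assumed)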
4. System Environment
CentOS-6.7-i386-bin-DVD1
Jdk-8u151-linux-i586
apache-tomcat-8.5.24
Zookeeper-3.4.10
Solr-7.1.0
Note: For Solr 6.0 and later, JDK 8 and Tomcat 8 are recommended. The setup steps differ slightly from those for versions earlier than Solr 6.
2. Zookeeper Cluster Construction
Step 1: Install the JDK environment.
The JDK installation process is omitted here. If you don't know how, see my article:
Easy-to-understand getting started with JDK installation on CentOS 6.7
Anyone reading a cluster-building tutorial should already be able to install the JDK. After the installation, it looks like this:
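A quick way to confirm the JDK is in place; the exact version string depends on the build you installed:
java -version
# Should print something like: java version "1.8.0_151"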
Step 2: Upload the zookeeper package to the server.
Step 3: Decompress the package.
The decompression process is omitted. I decompressed the package to /usr/share/.
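For reference, the upload and extraction might look like this; the archive name follows the version listed above, and the upload destination on the server is my own assumption:
# Upload from your local machine (destination directory is an assumption)
scp zookeeper-3.4.10.tar.gz root@<server-ip>:/root/
# Extract on the server into /usr/share/
tar -zxvf /root/zookeeper-3.4.10.tar.gz -C /usr/share/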
Step 4: Make three copies of Zookeeper.
First create the directory /usr/local/solr-cloud.
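The copy commands might look like this; the directory names zookeeper01/02/03 match the startup script later in this article, and the source directory name is assumed from the extracted archive:
mkdir /usr/local/solr-cloud
cp -r /usr/share/zookeeper-3.4.10 /usr/local/solr-cloud/zookeeper01
cp -r /usr/share/zookeeper-3.4.10 /usr/local/solr-cloud/zookeeper02
cp -r /usr/share/zookeeper-3.4.10 /usr/local/solr-cloud/zookeeper03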
Step 5: Create a data directory under each zookeeper directory.
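A minimal sketch, run from /usr/local/solr-cloud:
# One data directory per Zookeeper instance
mkdir zookeeper01/data zookeeper02/data zookeeper03/data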
Step 6: Create a file named "myid" under each data directory. Its content is the id of that instance, for example 1, 2, 3.
I'll show a screenshot of the first one; the other two are the same except their contents are 2 and 3 respectively.
Actually, in case some of you are unsure, here is what I executed under the solr-cloud directory:
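A sketch of those commands, run from /usr/local/solr-cloud:
echo 1 > zookeeper01/data/myid   # the id must match the server.N entry in zoo.cfg
echo 2 > zookeeper02/data/myid
echo 3 > zookeeper03/data/myid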
Step 7: Rename the zoo_sample.cfg file in the conf directory to zoo.cfg.
This time I really only demonstrate the first one; do the other two yourselves. Steps 5 and 7 could actually have been done before the copying in step 4, which was an oversight on my part.
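For the first instance, the rename might look like this (repeat for zookeeper02 and zookeeper03):
cd /usr/local/solr-cloud/zookeeper01/conf
mv zoo_sample.cfg zoo.cfg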
Step 8: Modify the configuration file zoo.cfg.
Only the first instance is demonstrated; modify the other two yourselves. Only the data directory and the port number (the first two red boxes in the screenshot) differ between the three files; the content of the last red box, the server list, is identical in all three configuration files.
The 1 in server.1 is the myid value from step 6. In real work each instance runs on a different server, so the IP addresses that follow would differ; since I am demonstrating on a single virtual machine, the IP address is the same here.
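As a sketch of what zookeeper01's zoo.cfg might end up containing; the client port 2181, the peer ports 2881/3881 and so on, and the IP address are illustrative assumptions, so substitute the values from your own environment:
# /usr/local/solr-cloud/zookeeper01/conf/zoo.cfg (illustrative values)
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/usr/local/solr-cloud/zookeeper01/data
clientPort=2181                     # 2182 / 2183 for the other two instances (assumed)
# The server list is identical in all three files; IP and ports here are assumptions
server.1=192.168.25.128:2881:3881
server.2=192.168.25.128:2882:3882
server.3=192.168.25.128:2883:3883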
Step 9: Start each zookeeper instance
Starting them one by one from each directory is tedious, so I'll write a simple script for you.
cd /usr/local/solr-cloud/zookeeper01/bin/
./zkServer.sh start
cd /usr/local/solr-cloud/zookeeper02/bin/
./zkServer.sh start
cd /usr/local/solr-cloud/zookeeper03/bin/
./zkServer.sh start
After writing it, I found the script has no execute permission, so add it:
Run the script.
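For example, assuming the script is saved under /usr/local/solr-cloud as start-zk-all.sh (the file name is my own choice; the original name was not shown):
chmod u+x start-zk-all.sh
./start-zk-all.sh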
For verification, go to the bin directory of the three zookeeper instances to view the status of each instance.
(At first I put the status command in the script so the status would be displayed as soon as the instances were started, but it showed "not running" every time. Thinking about it later, the script simply runs too fast: the start command has been issued but the instance has not finished starting by the time the status is checked, hence "not running".)
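Checking each instance from its bin directory looks like this; in a healthy three-node ensemble one instance reports leader and the other two report follower:
/usr/local/solr-cloud/zookeeper01/bin/zkServer.sh status
/usr/local/solr-cloud/zookeeper02/bin/zkServer.sh status
/usr/local/solr-cloud/zookeeper03/bin/zkServer.sh status
# Expect one "Mode: leader" and two "Mode: follower"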
If you see one leader and two followers like this (which instance becomes the leader is random), the Zookeeper cluster has been set up successfully.
The first stage is complete. Next, set up the Solr cluster.