Distributed Indexing in SolrCloud and Integration with ZooKeeper


Wang, Josh.

I. Overview

Lucene is a full-text retrieval library written in Java and built on the inverted-index principle; Solr is a text search application server built on Lucene. SolrCloud is a ZooKeeper-based distributed search solution introduced in Solr 4.0. Its main idea is to use ZooKeeper as the configuration information center of the cluster. SolrCloud can also be seen as one way of deploying Solr: besides SolrCloud, Solr can be deployed in single-machine mode or in multi-machine master-slave mode. A distributed index is what you turn to when the index grows so large that a single machine can no longer meet the disk requirements, or when even a simple query takes a long time. In a distributed index, the original large index is divided into several small indexes; Solr merges the results returned by these small indexes and returns them to the client.

II. Basic concepts of SolrCloud

SolrCloud mode has several important concepts: cluster, node, collection, shard, leader core, replica core, and so on.

1. Cluster: a set of Solr nodes that is managed logically as a single unit. The entire cluster must use the same schema and solrconfig.

2. Node: a JVM instance running Solr.

3. Collection: a complete index in the logical sense within the SolrCloud cluster. It is usually divided into one or more shards that all use the same config set. If there is more than one shard, the index is a distributed index. SolrCloud lets clients refer to the index by collection name, so users do not need to care about the shard-related parameters that distributed retrieval requires.

4. Core: a Solr core. One Solr instance contains one or more Solr cores, each of which can provide indexing and query functions independently. Solr cores were introduced to increase management flexibility and to share resources. In SolrCloud the configuration that a core uses is stored in ZooKeeper, whereas a traditional Solr core keeps its configuration files in a configuration directory on disk.

5. Config set: the set of configuration files that a Solr core needs in order to provide service. Each config set has a name. At a minimum it includes solrconfig.xml and schema.xml; depending on the contents of these two files, other files may also be required, such as the dictionary files needed for Chinese indexing. Config sets are stored in ZooKeeper. They can be uploaded again or updated with the upconfig command, or initialized or updated through Solr's bootstrap_confdir startup parameter.

6. Shard: a logical slice of a collection. Each shard consists of one or more replicas, and an election determines which replica is the leader.

7. Replica: a copy of a shard. Each replica lives in one Solr core; in other words, one Solr core corresponds to one replica. For example, if a collection named "test" is created with numShards=1 and replicationFactor=2, two replicas are produced, which means two cores stored on different machines or Solr instances. One is named test_shard1_replica1 and the other test_shard1_replica2, and one of them is elected leader.

8. Leader: the shard replica that wins the election. Each shard has several replicas, and these replicas hold an election to determine a leader. An election can take place at any time, but it is usually triggered only when a Solr instance fails. When an index operation is performed, SolrCloud passes the request to the leader of the corresponding shard, and the leader then distributes it to all replicas of that shard.

9. ZooKeeper: ZooKeeper provides distributed locking, which SolrCloud requires; it is mainly responsible for handling leader election. Solr can run with an embedded ZooKeeper or use a standalone ZooKeeper, and three or more ZooKeeper hosts are recommended.
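The naming example in item 7 can be sketched as a small function. This is only an illustration of the `<collection>_shard<N>_replica<M>` pattern quoted above; the function name `replica_core_names` is hypothetical and is not part of Solr.

```python
from typing import List


def replica_core_names(collection: str, num_shards: int, replication_factor: int) -> List[str]:
    """List core names following the <collection>_shard<N>_replica<M> pattern."""
    return [
        f"{collection}_shard{shard}_replica{replica}"
        for shard in range(1, num_shards + 1)
        for replica in range(1, replication_factor + 1)
    ]
```

With numShards=1 and replicationFactor=2 for the collection "test", the function yields the two core names mentioned in item 7.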

III. Logical diagram of a complete index (collection) in SolrCloud


In SolrCloud mode, a collection is the entry point for accessing the cluster. What is this entry point for? A cluster has many machines, so there must be a single address through which the cluster is accessed, and the collection is that address. A collection is therefore a logical entity: it spans nodes and can be accessed on any node. A shard is also a logical entity, so a shard can span nodes as well. One shard can contain zero or more replicas (replication cores), but it must contain exactly one leader. If the leader of a shard goes down, a new leader is elected from the shard's replicas.

Note that in Solr 4.0 cores can be added and removed from the Solr Admin GUI. If the last core in a shard is deleted, the shard is not deleted automatically, which leaves the cluster in a faulty state. Likewise, if all cores in a shard go down, new records can no longer be inserted and queries are affected. In principle, when all cores under one shard are down, SolrCloud should allow documents to be inserted into the other surviving shards; later versions of Solr are expected to support this.

IV. Basic architecture diagram of SolrCloud index operations


The figure shows a cluster with 4 Solr nodes. The index is distributed across two shards; each shard contains two Solr nodes, one leader node and one replica node. The cluster also has an overseer node that maintains the cluster state information; it acts as the master controller.

All state information for the cluster is maintained in the ZooKeeper ensemble. The figure also shows that any node can receive an index creation or update request. The node forwards the request to the leader of the shard that the document belongs to; once the leader has applied the update, it forwards the version number and the document to the replica nodes of the same shard.
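The forwarding path described above can be sketched as a toy in-memory simulation. It is a simplification under stated assumptions: the classes `Core` and `ToyCluster`, the odd/even routing rule, and the global version counter are all hypothetical and only mimic the behaviour described in the text (entry node, then shard leader, then replicas).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Core:
    name: str
    shard: str
    is_leader: bool = False
    # doc id -> (version, document)
    index: Dict[str, Tuple[int, dict]] = field(default_factory=dict)


class ToyCluster:
    def __init__(self, cores: List[Core], route: Callable[[str], str]):
        self.cores = cores
        self.route = route      # hypothetical routing rule: doc id -> shard name
        self.version = 0        # toy version counter
        self.hops: List[Tuple[str, str]] = []   # (sender, receiver) per forward

    def leader_of(self, shard: str) -> Core:
        return next(c for c in self.cores if c.shard == shard and c.is_leader)

    def update(self, entry: Core, doc: dict) -> int:
        """Accept an update on any core and forward it the way the text describes."""
        shard = self.route(doc["id"])
        leader = self.leader_of(shard)
        if entry is not leader:
            self.hops.append((entry.name, leader.name))
        self.version += 1
        leader.index[doc["id"]] = (self.version, doc)
        # The leader forwards the version number and the document to its replicas.
        for core in self.cores:
            if core.shard == shard and core is not leader:
                self.hops.append((leader.name, core.name))
                core.index[doc["id"]] = (self.version, doc)
        return self.version


def build_cluster() -> ToyCluster:
    cores = [
        Core("s1_leader", "shard1", True), Core("s1_replica", "shard1"),
        Core("s2_leader", "shard2", True), Core("s2_replica", "shard2"),
    ]
    # Toy rule: odd numeric ids go to shard1, even ids to shard2.
    return ToyCluster(cores, lambda doc_id: "shard1" if int(doc_id) % 2 else "shard2")
```

Sending a document with id "1" to the shard2 replica records two hops: one to the shard1 leader and one from that leader to the shard1 replica.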

V. Working mode of SolrCloud

First, consider how the index maps to Solr entities.



A SolrCloud cluster contains multiple Solr instances, and each Solr instance contains multiple Solr cores. A Solr core corresponds to one accessible Solr index resource, and each Solr core is either a replica or a leader. When a Solr client accesses the cluster through a collection, it can locate the corresponding replica (that is, the Solr core) through the shard and thereby reach the indexed documents.



In SolrCloud mode, the configuration of all cores in the same cluster is unified. A core has one of two roles, leader or replica, and every core must belong to a shard. Whether a core acts as leader or replica within its shard is coordinated automatically by Solr through ZooKeeper.

The process of accessing SolrCloud is as follows: the Solr client asks ZooKeeper for the collection's address, and ZooKeeper returns the address of a live node for the client to access. When data is inserted, SolrCloud coordinates the data distribution internally (using hash-based routing).


VI. Creating and updating indexes in SolrCloud

<1>. Index storage details you need to know

When a Solr client sends an add/update request through CloudSolrServer, CloudSolrServer connects to ZooKeeper to obtain the current SolrCloud cluster state and registers watchers on /clusterstate.json and /live_nodes to monitor ZooKeeper and SolrCloud. This has the following two benefits:

1. Once CloudSolrServer knows the state of SolrCloud, it can send documents directly to the SolrCloud leaders, which reduces network forwarding overhead.

2. Registering watchers helps with load balancing during indexing. For example, if a leader node goes offline, CloudSolrServer learns of it immediately and stops sending documents to the offline leader.
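The two benefits can be illustrated with a minimal client-side cache. This is not the CloudSolrServer implementation; the class `ClusterStateCache` and its callback names are hypothetical, and the callbacks stand in for what a watcher on /clusterstate.json and /live_nodes would deliver.

```python
from typing import Dict, Iterable, Optional


class ClusterStateCache:
    """Client-side view of shard leaders and live nodes, refreshed by watch callbacks."""

    def __init__(self, leaders: Dict[str, str], live_nodes: Iterable[str]):
        self.leaders = dict(leaders)        # shard name -> leader node
        self.live_nodes = set(live_nodes)

    def on_live_nodes_changed(self, live_nodes: Iterable[str]) -> None:
        # Would be invoked by a watcher on /live_nodes.
        self.live_nodes = set(live_nodes)

    def on_cluster_state_changed(self, leaders: Dict[str, str]) -> None:
        # Would be invoked by a watcher on /clusterstate.json.
        self.leaders = dict(leaders)

    def target_for(self, shard: str) -> Optional[str]:
        """Return the leader to send to, or None while the known leader is offline."""
        leader = self.leaders.get(shard)
        return leader if leader in self.live_nodes else None
```

While the cached leader is missing from the live nodes, `target_for` returns None, so the client stops sending documents to it until a new leader appears in the cluster state.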

In addition, how does CloudSolrServer know which shard to send a document to? In an established SolrCloud cluster, every shard has a hash range. When a document is updated, SolrCloud computes the document's hash value and, based on this value and the shards' hash ranges, determines which shard the document should be sent to. Solr uses the DocRouter component to distribute documents. Solr currently has two subclasses of the DocRouter class, CompositeIdRouter (the Solr default) and ImplicitDocRouter, and we can of course also write a custom document routing component by extending DocRouter.

For example, when Solr shards are created, Solr assigns each shard a range of 32-bit hash values. If SolrCloud has two shards, A and B, then A's hash range is 80000000-ffffffff and B's hash range is 0-7fffffff. The default CompositeIdRouter hash strategy computes a hash value from the document ID and determines which shard's hash range that value falls into.
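The range assignment in this example can be reproduced with a short sketch. It assumes that the hash is treated as a signed 32-bit integer (as a Java int would be) and that the range is cut into equal-width slices starting at the smallest value; both are assumptions chosen because they reproduce the two ranges quoted above, and the function names are hypothetical.

```python
from typing import List, Tuple

MIN_HASH, MAX_HASH = -2**31, 2**31 - 1   # signed 32-bit bounds


def partition_hash_range(num_shards: int) -> List[Tuple[int, int]]:
    """Cut the signed 32-bit range into num_shards contiguous slices."""
    step = 2**32 // num_shards
    ranges = []
    start = MIN_HASH
    for i in range(num_shards):
        # The last slice absorbs any remainder so the whole range is covered.
        end = MAX_HASH if i == num_shards - 1 else start + step - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def to_hex(value: int) -> str:
    """Format a signed 32-bit value as unsigned hexadecimal."""
    return format(value & 0xFFFFFFFF, "x")


def shard_for_hash(hash_value: int, ranges: List[Tuple[int, int]]) -> int:
    """Return the index of the slice that contains hash_value."""
    for i, (lo, hi) in enumerate(ranges):
        if lo <= hash_value <= hi:
            return i
    raise ValueError("hash outside the signed 32-bit range")
```

For two shards the slices print as 80000000-ffffffff and 0-7fffffff, matching shards A and B in the example.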

SolrCloud places the following two requirements on the computation of the hash value:

1. The hash computation must be fast, because hashing is the first step of distributed indexing.

2. Hash values must be evenly distributed across the shards. If one shard holds more documents than another, a query takes longer on the former than on the latter. A SolrCloud query is a scatter-then-gather process: the result is assembled only after every shard has finished its query, so SolrCloud query speed is determined by the slowest shard.

Based on these two points, SolrCloud adopts the MurmurHash algorithm, which offers both fast hash computation and evenly distributed hash values.
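For reference, here is a pure-Python sketch of the 32-bit MurmurHash3 variant (x86_32). The text only names "MurmurHash"; which variant, seed, and input encoding Solr applies to the document ID are assumptions here, so treat this as an illustration of the algorithm family rather than Solr's exact routing hash.

```python
def murmur3_32(data: bytes, seed: int = 0) -> int:
    """MurmurHash3 x86_32; returns an unsigned 32-bit integer."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    mask = 0xFFFFFFFF
    h = seed & mask
    n = len(data)
    rounded = n - (n % 4)

    def mix_k(k: int) -> int:
        k = (k * c1) & mask
        k = ((k << 15) | (k >> 17)) & mask   # rotate left by 15
        return (k * c2) & mask

    # Body: process the input in 4-byte little-endian blocks.
    for i in range(0, rounded, 4):
        h ^= mix_k(int.from_bytes(data[i:i + 4], "little"))
        h = ((h << 13) | (h >> 19)) & mask   # rotate left by 13
        h = (h * 5 + 0xE6546B64) & mask

    # Tail: the remaining 1 to 3 bytes, if any.
    tail = data[rounded:]
    if tail:
        h ^= mix_k(int.from_bytes(tail, "little"))

    # Finalization: force all bits of the hash to avalanche.
    h ^= n
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & mask
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & mask
    h ^= h >> 16
    return h
```

The function uses only integer multiplications, shifts, and XORs, which is why it is fast, and the finalization step is what spreads similar IDs across the whole 32-bit range.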

<2>. Creating an index in Solr can be divided into 5 steps (as shown in the figure):

1. The user can submit a new document to any replica (Solr core).

2. If that replica is not the leader, it forwards the request to the leader of its own shard.

3. The leader routes the document to each replica of the shard.

4. If, according to the routing rule (such as the hash value), the document does not belong to the current shard, the leader forwards it to the leader of the corresponding shard.

5. That leader routes the document to each replica of its shard.

Note that routing a single document is very simple when adding to the index, but SolrCloud also supports adding documents in batches, which means that N documents may need to be routed at the same time. In that case SolrCloud separates the documents according to the routing rule, that is, it groups them by target shard, and then sends each group to the corresponding shard. This requires good concurrency.
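Grouping a batch by target shard can be sketched as follows. The hash here is `zlib.crc32`, used only as a readily available stand-in for the real routing hash, and the equal-width slicing of the unsigned 32-bit range is likewise an assumption; `shard_of` and `group_by_shard` are hypothetical names.

```python
import zlib
from collections import defaultdict
from typing import Dict, Iterable, List


def shard_of(doc_id: str, num_shards: int) -> int:
    """Map a document id to a shard index via an equal-width slice of the hash range."""
    h = zlib.crc32(doc_id.encode("utf-8")) & 0xFFFFFFFF   # stand-in hash, unsigned 32-bit
    return (h * num_shards) >> 32


def group_by_shard(docs: Iterable[dict], num_shards: int) -> Dict[int, List[dict]]:
    """Split a batch into one sub-batch per target shard."""
    groups: Dict[int, List[dict]] = defaultdict(list)
    for doc in docs:
        groups[shard_of(doc["id"], num_shards)].append(doc)
    return dict(groups)
```

Each sub-batch can then be sent to its shard leader in a single request, and the sub-batches can be sent concurrently.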





<3>. Key points of updating the index:

1. After the leader receives an update request, it first stores the update in its local update log and assigns the document a new version. For a document that already exists, the old version is discarded if the new version is higher. Finally the update is sent to the replicas.

2. Once the document has been validated and given a version, it is forwarded in parallel to all replicas that are online. SolrCloud does not concern itself with replicas that are already offline, because a recovery process brings them up to date when they come back online. If the replica being forwarded to is in the recovering state, it writes the update to its transaction log.

3. Only when the leader has received successful acknowledgements from all replicas does it report success to the client. As long as one replica in a shard is active, Solr continues to accept update requests. This strategy effectively sacrifices consistency in exchange for write availability. An important parameter here is leaderVoteWait: when only one replica is left, that replica enters the recovering state and waits for a period of time for the leader to come back online. If the leader does not come back within that time, the replica becomes the leader itself, in which case some documents may be missing. You can of course use a majority quorum to avoid this situation, as in ZooKeeper's leader election strategy: when a majority of the replicas are offline, client writes fail.

4. There are two types of index commit. One is the soft commit: a segment is created in memory and the documents become visible (searchable), but they are not written to disk, so the data is lost on power failure. The other is the hard commit, which writes the data directly to disk and also makes it visible.
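Points 1 to 3 can be sketched with two small classes. This is a toy model under stated assumptions: versions come from a simple counter (not from Solr's real versioning scheme), the update log is an in-memory list, and the names `Replica` and `Leader` are hypothetical.

```python
from typing import Dict, List, Tuple


class Replica:
    def __init__(self) -> None:
        self.ulog: List[Tuple[str, int]] = []          # update log: (doc id, version)
        self.docs: Dict[str, Tuple[int, dict]] = {}    # doc id -> (version, fields)

    def apply(self, doc_id: str, version: int, fields: dict) -> bool:
        """Log the update, then index it unless a newer or equal version exists."""
        self.ulog.append((doc_id, version))
        current = self.docs.get(doc_id)
        if current is not None and current[0] >= version:
            return False    # stale or duplicate update is discarded
        self.docs[doc_id] = (version, fields)
        return True


class Leader(Replica):
    def __init__(self, replicas: List[Replica]) -> None:
        super().__init__()
        self.replicas = replicas
        self._next_version = 0

    def update(self, doc_id: str, fields: dict) -> Tuple[int, bool]:
        """Assign a version, apply locally, forward to replicas, and report success."""
        self._next_version += 1
        version = self._next_version
        self.apply(doc_id, version, fields)
        acks = [replica.apply(doc_id, version, fields) for replica in self.replicas]
        # Success is reported only when every replica acknowledged the update.
        return version, all(acks)
```

A second update to the same document receives a higher version and replaces the first one on every replica; a late copy of the first update is logged but not indexed.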

<4>. Summary points on updating and creating indexes in Solr:

1. Leader forwarding rules

1) The request was forwarded by the leader: the replica only needs to write it to its local ulog; it does not need to forward it to the leader or to other replicas. If the replica is in a non-active state, it accepts the update request and writes it to the ulog but does not write it to the index. If a duplicate update is found, the older version of the update is discarded.

2) The request did not come from the leader, but the receiving node is the leader: the node writes the request locally and distributes it to the other replicas.

3) The request did not come from the leader, and the receiving node is not the leader, which means this is the original update request: the node writes the request to its local ulog and forwards it to the leader, which then distributes it. A new ulog is started after each commit; when a server goes down and its in-memory data is lost, the data can be recovered from the ulog.

2. It is best to use CloudSolrServer when building the index, because CloudSolrServer sends update requests directly to the leader and thus avoids extra network overhead.

3. When adding documents in batches, it is recommended that the client route the documents in advance, because routing documents inside SolrCloud is expensive.
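The three forwarding cases under item 1 can be condensed into a small decision function. This is one reading of the rules above, not Solr code: the action labels are hypothetical, and the assumption that an active replica also writes a leader-forwarded update to its index is inferred from the remark about non-active replicas.

```python
from typing import List


def update_actions(from_leader: bool, is_leader: bool, active: bool = True) -> List[str]:
    """Return the actions a node takes for an incoming update request."""
    actions = ["write_ulog"]                  # every case starts with the local update log
    if from_leader:
        # Case 1: forwarded by the leader; no further forwarding.
        if active:
            actions.append("write_index")     # assumption: non-active replicas only log
        return actions
    if is_leader:
        # Case 2: original request arriving at the leader.
        return actions + ["write_index", "distribute_to_replicas"]
    # Case 3: original request arriving at a non-leader.
    return actions + ["forward_to_leader"]
```

Case 3 is the extra hop that sending updates directly to the leader avoids.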

VII. Index retrieval in SolrCloud



Once the index has been built, retrieval in SolrCloud is relatively simple:

1. A user query can be sent to any Solr server that hosts the collection; Solr's internal processing logic hands it to one replica.

2. That replica starts a distributed query for the queried index: based on the number of shards in the index, it splits the query into multiple subqueries and sends each subquery to any one replica of the corresponding shard.

3. Each subquery returns its query results.

4. The initial replica merges the subquery results and returns the final result to the user.
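The four steps amount to a scatter-gather query, which can be sketched as follows. Each shard is modelled as a dictionary of precomputed scores, which is a stand-in for running a real subquery; `search_shard` and `distributed_search` are hypothetical names.

```python
import heapq
from itertools import islice
from typing import Dict, List, Tuple


def search_shard(shard_docs: Dict[str, float], rows: int) -> List[Tuple[float, str]]:
    """Subquery: return this shard's top rows as (score, doc id), best first."""
    ranked = sorted(((score, doc_id) for doc_id, score in shard_docs.items()), reverse=True)
    return ranked[:rows]


def distributed_search(shards: List[Dict[str, float]], rows: int) -> List[str]:
    """Send the subquery to every shard, then merge the sorted partial results."""
    partial = [search_shard(shard, rows) for shard in shards]
    merged = heapq.merge(*partial, reverse=True)   # inputs are already sorted descending
    return [doc_id for _score, doc_id in islice(merged, rows)]
```

Because every shard returns its own top `rows` hits, the merged top `rows` is the same as if the whole index had been searched on one machine.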

SolrCloud provides near-real-time (NRT) search:

SolrCloud supports near-real-time search, meaning that newly added documents become visible within a short time. This is based mainly on the soft commit mechanism (note: Lucene itself has no soft commit, only hard commit). As mentioned above, the data Solr indexes is written to disk at commit time, which is a hard commit; a hard commit ensures that data is not lost even on power failure. To provide more real-time retrieval, Solr offers the soft commit: data is committed to memory only, so the index is visible but has not yet been written to the index files on disk. A common design is to trigger a hard commit automatically every 1-10 minutes and a soft commit automatically every second. When a soft commit is performed, Solr opens a new searcher to make the new documents visible, and it also warms the caches and runs warm-up queries so that cached data is visible as well. The cache warm-up and warm-up queries must therefore take less time than the commit interval; otherwise too many searchers are opened and commits fail.

Finally, some experience with near-real-time search in our project. Near real time is relative: for some customers 1 minute counts as near real time, for others 3 minutes does. For Solr, the more frequent the soft commits, the heavier the load (the more frequent the commits, the more and smaller the segments, and the more often Solr merges them). The soft commit interval in our project is currently 3 minutes. It used to be 1 minute, which made Solr spend too many resources on indexing and significantly affected queries. Near real time is a real headache for us, because customers keep asking for fresher results; at present we compensate by adding a caching mechanism in the project.
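The difference between the two commit types can be illustrated with a toy index. The class `ToyIndex` is hypothetical and deliberately ignores the transaction log, so a crash here simply drops everything that was never hard-committed.

```python
from typing import List


class ToyIndex:
    def __init__(self) -> None:
        self.pending: List[str] = []      # added but not yet committed
        self.in_memory: List[str] = []    # soft-committed: searchable, not durable
        self.on_disk: List[str] = []      # hard-committed: searchable and durable

    def add(self, doc: str) -> None:
        self.pending.append(doc)

    def soft_commit(self) -> None:
        """Make pending documents searchable without writing them to disk."""
        self.in_memory += self.pending
        self.pending = []

    def hard_commit(self) -> None:
        """Write everything to disk; the documents stay searchable."""
        self.on_disk += self.in_memory + self.pending
        self.in_memory = []
        self.pending = []

    def visible(self) -> List[str]:
        return self.on_disk + self.in_memory

    def crash(self) -> None:
        """Simulate power loss: only hard-committed documents survive."""
        self.pending = []
        self.in_memory = []
```

A document added and soft-committed is searchable at once, but it survives the simulated crash only if a hard commit happened first.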

VIII. The shard splitting process in Solr

In general, increasing the number of shards and replicas improves the query performance and fault tolerance of SolrCloud. The right numbers, however, depend on the actual number of documents, the size of the documents, the indexing concurrency, the complexity of the queries, and the growth rate of the index. Solr relies on ZooKeeper for cluster management, and ZooKeeper holds a znode, /clusterstate.json, which stores the state of the entire cluster at the current moment. A cluster has exactly one overseer at any time; if the current overseer fails, SolrCloud elects a new one, in much the same way as a shard leader is elected.




The process of splitting a shard (the old shard is split into NewShard1 and NewShard2) can be described as follows:

a. When the number of documents in a shard reaches a threshold, or when a user API command is received, Solr starts the shard splitting process.

b. The original shard continues to provide service. Solr reads the documents of the original shard and, according to the routing rules, indexes them into the new shards. Meanwhile, for newly added documents:

1, 2. The user can submit a document to any replica, which forwards it to the leader.

3. The leader routes the document to each replica of the original shard, and each replica indexes it.

III, V. At the same time, the document is routed to the leaders of the new shards.

IV, VI. The leaders of the new shards route the document to their own replicas, each of which indexes it. Once the original documents have been re-indexed, the system routes incoming documents to the corresponding new leaders and the original shard is closed. A shard is only a logical concept, so splitting a shard simply means dividing the replicas of the original shard evenly into more shards spread over more Solr nodes.
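The core of the split, dividing the old shard's hash range and re-routing its documents, can be sketched as follows. Splitting the range at its midpoint is an assumption made for illustration, and documents are keyed directly by their hash value to keep the example short; `split_range` and `split_shard` are hypothetical names.

```python
from typing import Dict, Tuple

HashRange = Tuple[int, int]


def split_range(lo: int, hi: int) -> Tuple[HashRange, HashRange]:
    """Cut an inclusive hash range into two adjacent halves."""
    mid = lo + (hi - lo) // 2
    return (lo, mid), (mid + 1, hi)


def split_shard(docs_by_hash: Dict[int, str], lo: int, hi: int):
    """Re-route the old shard's documents into two new shards by hash range."""
    left, right = split_range(lo, hi)
    new_shard1 = {h: d for h, d in docs_by_hash.items() if left[0] <= h <= left[1]}
    new_shard2 = {h: d for h, d in docs_by_hash.items() if right[0] <= h <= right[1]}
    return (left, new_shard1), (right, new_shard2)
```

For the range 0-7fffffff this produces the sub-ranges 0-3fffffff and 40000000-7fffffff, and every document of the old shard lands in exactly one of the two new shards.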

IX. ZooKeeper

<1>. SolrCloud uses ZooKeeper mainly for the following three functions:



1. Centralized storage and management of configuration.

2. Monitoring and notification when the cluster state changes.

3. Election of shard leaders.

<2>. Znodes and ephemeral nodes

ZooKeeper is organized like a file system. Each level is a znode, and each znode stores some metadata, such as creation time and modification time, plus a small amount of data. Note that ZooKeeper does not support storing large data; it only supports data smaller than 1 MB per znode, because for performance reasons ZooKeeper keeps the data in memory.

Another important ZooKeeper concept is the ephemeral node. When a ZooKeeper client establishes a session with ZooKeeper, it can create an ephemeral znode; the client keeps communicating with ZooKeeper to ensure that this znode continues to exist. If the client's session with ZooKeeper is broken, the znode disappears. In SolrCloud, /live_nodes holds the ephemeral nodes of all clients and thus indicates which Solr instances make up the SolrCloud cluster: the Solr hosts that maintain a session with ZooKeeper form the cluster. If one Solr instance's session is broken, /live_nodes loses a znode and the cluster loses a host, so ZooKeeper tells the remaining Solr instances that one instance is down, and later queries and leader data distribution no longer go through it. The failure is detected and announced through ZooKeeper's watch mechanism. The cluster state data maintained by ZooKeeper is stored in the solr/zoo_data directory.
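The behaviour of /live_nodes can be mimicked with a toy in-memory model. It is not a ZooKeeper client: the class `ToyZooKeeper`, the session ids, and the node names are hypothetical, and unlike real ZooKeeper watches, which fire once and must be registered again, the callbacks here stay registered to keep the example short.

```python
from typing import Callable, Dict, List


class ToyZooKeeper:
    def __init__(self) -> None:
        self.ephemeral: Dict[str, int] = {}     # znode path -> owning session id
        self.watchers: List[Callable[[List[str]], None]] = []

    def live_nodes(self) -> List[str]:
        """Children of /live_nodes, sorted by name."""
        return sorted(path.split("/")[-1] for path in self.ephemeral)

    def watch_live_nodes(self, callback: Callable[[List[str]], None]) -> None:
        self.watchers.append(callback)

    def _notify(self) -> None:
        for callback in self.watchers:
            callback(self.live_nodes())

    def create_ephemeral(self, session: int, name: str) -> None:
        """Register a node under /live_nodes, tied to the given session."""
        self.ephemeral["/live_nodes/" + name] = session
        self._notify()

    def close_session(self, session: int) -> None:
        """Ending a session removes every ephemeral znode it created."""
        self.ephemeral = {p: s for p, s in self.ephemeral.items() if s != session}
        self._notify()
```

When the session of one Solr instance ends, its znode vanishes and every watcher receives the reduced list of live nodes.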

<3>. Basic process of configuring a ZooKeeper cluster for SolrCloud

Example 1: a simple cluster with a single ZooKeeper node and 2 shards. The index data of one collection is distributed across two shards, and the two shards are assumed to be stored on two separate Solr servers.





The basic process of building the cluster:

Startup of the first Solr server:

1. Start an embedded ZooKeeper server to act as the manager of the cluster state information.

2. Register this node under the /node_states/ directory.

3. Also register it under the /live_nodes/ directory.

4. Create /overseer_elect/leader in preparation for the subsequent overseer election, and elect a new overseer.

5. Update the cluster state information, in JSON format, in /clusterstate.json.

6. The local machine updates its cluster state information from ZooKeeper so that it stays consistent with the cluster state held in ZooKeeper.

7. Upload the local configuration files to ZooKeeper for use by the other Solr nodes in the cluster.

8. Start the local Solr server.

9. Once Solr has started, the overseer learns that the first node has joined a shard, updates the shard state information, sets the local node as the leader of shard1, and publishes the latest cluster state information to the entire cluster.

10. The local machine updates its cluster state information from ZooKeeper, and startup of the first Solr server is complete.

Next, the boot process of the second Solr server:

1. The local machine connects to the ZooKeeper of the cluster.

2. Register this node under the /node_states/ directory.

3. Also register it under the /live_nodes/ directory.

4. The local machine updates its cluster state information from ZooKeeper so that it stays consistent with the cluster state held in ZooKeeper.

5. Load the configuration information that Solr needs from the configuration files stored in the cluster.

6. Start the local Solr server.

7. Once Solr has started, the node registers itself in the cluster as a shard and is set as the leader of shard2.

8. The local machine updates its cluster state information from ZooKeeper, and startup of the second Solr server is complete.
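The outcome of the two boot sequences (first node becomes the shard1 leader, second node the shard2 leader) can be sketched as a round-robin assignment. The round-robin rule for further nodes, the function name `assign_nodes`, and the dictionary layout are assumptions for illustration; the layout is not the real /clusterstate.json format.

```python
from typing import Dict, List


def assign_nodes(nodes: List[str], num_shards: int) -> Dict[str, dict]:
    """Assign nodes to shards in start order; the first node in a shard leads it."""
    state = {
        "shard%d" % (i + 1): {"leader": None, "replicas": []}
        for i in range(num_shards)
    }
    for position, node in enumerate(nodes):
        shard = state["shard%d" % (position % num_shards + 1)]
        if shard["leader"] is None:
            shard["leader"] = node
        else:
            shard["replicas"].append(node)
    return state
```

With two nodes and two shards, each node leads one shard; starting two more nodes under this rule gives each shard one replica.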

Example 2: a cluster with a single ZooKeeper node and 2 shards, with a replica node in each shard.



As shown in the figure, the cluster contains 2 shards. Each shard has two Solr nodes, one leader and one replica, but there is still only one ZooKeeper.

Because of the replica nodes, the cluster is now fault tolerant. In essence, the overseer monitors the leader node of every shard. If a leader node goes down, an automatic failover mechanism starts and a new leader is elected from the other replica nodes in the same shard. Even if the overseer node itself goes down, a new overseer is started automatically on another node, which ensures high availability of the cluster.
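The failover step can be sketched with the common ZooKeeper election recipe, in which each candidate holds a sequence number and the live candidate with the lowest number becomes leader. Whether Solr's election follows exactly this rule is an assumption here; the function name `elect_leader` and the core names are hypothetical.

```python
from typing import Dict, Optional, Set


def elect_leader(candidates: Dict[str, int], live: Set[str]) -> Optional[str]:
    """Pick the live candidate with the lowest election sequence number."""
    alive = {name: seq for name, seq in candidates.items() if name in live}
    if not alive:
        return None     # no replica left: the shard has no leader
    return min(alive, key=alive.get)
```

When the current leader drops out of the live set, calling the function again returns the next replica in sequence order.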

Example 3: a cluster with 2 shards, shard replicas, and a ZooKeeper ensemble






The problem with example 2 is that, although the Solr servers are fault tolerant, only one ZooKeeper server in the cluster maintains the cluster state information, and a single point like this is a source of instability. If this ZooKeeper server goes down, distributed queries still work, because every Solr server keeps in memory the cluster state it most recently obtained from ZooKeeper; however, new nodes can no longer join the cluster and changes in cluster state are no longer detected.

To solve this problem, the ZooKeeper servers must themselves form a cluster so that ZooKeeper is also highly available and fault tolerant. There are two options: provide an external, independent ZooKeeper ensemble, or start an embedded ZooKeeper server with each Solr server and combine these ZooKeeper servers into an ensemble.
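How many ZooKeeper servers such an ensemble needs follows from the majority rule: the ensemble keeps working only while more than half of its servers are up. The two helper functions below are a hypothetical sketch of that arithmetic.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers may fail while a majority remains."""
    return ensemble_size - quorum_size(ensemble_size)
```

A single server tolerates no failure, three servers tolerate one, and five tolerate two; four servers tolerate no more failures than three, which is why odd ensemble sizes are the usual choice.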

Summary: as the introduction above shows, SolrCloud adds many new features compared with plain Solr to ensure high availability of the whole Solr application.

1. Centralized configuration information

ZooKeeper (ZK) is used for centralized configuration. At startup, you can specify that Solr's configuration files be uploaded to ZooKeeper and shared by multiple machines. The configuration in ZK is not cached locally; Solr reads it directly from ZK. All machines also notice changes to the configuration files. In addition, some Solr tasks are published through ZK as a medium, for fault tolerance: if a machine receives a task and crashes while executing it, the unfinished task can be executed again after the machine restarts or after the cluster elects a replacement.

2. Automatic fault tolerance: SolrCloud shards the index and creates multiple replicas for each shard. Every replica can serve external requests. The failure of one replica does not affect the indexing service. Even better, SolrCloud can automatically rebuild the replicas of a failed machine on other machines and put them into service.

3. Near-real-time search: replication is push-based and immediate (slow push is also supported), so new index content can be retrieved within seconds.

4. Automatic query load balancing: the multiple replicas of a SolrCloud index can be distributed across several machines to balance the query load. If the query load is high, you can add machines and replicas to relieve it.

5. Automatic distribution of the index and its shards: documents can be sent to any node, and SolrCloud forwards them to the correct node.

6. Transaction log: the transaction log ensures that updates are not lost, even if the documents have not yet been indexed to disk.

In addition, SolrCloud offers some other features:

1. The index can be stored on HDFS.

2. Batch index creation through MapReduce (MR).

3. A powerful RESTful API.

4. An excellent management interface: the main information is visible at a glance, the deployment and distribution of SolrCloud can be viewed graphically, and there is of course an indispensable debug function.


