Building and using SolrCloud with ZooKeeper: experience and lessons


The company's online Solr servers are split into almost 10 cores serving the needs of different departments. Since I took over, I have run into plenty of problems and headaches. The first problem is search accuracy; the second is data synchronization. Search accuracy has been addressed with the ANSJ tokenizer plus continually tuned custom dictionaries and stop-word lists, an ongoing optimization that takes a long time to show significant results. The data synchronization problem covers quickly standing up new search cores, load balancing of search traffic, the synchronization itself, and recovery after downtime. Previously these were handled by a patchwork of separate schemes: I wrote my own REST web service to handle data synchronization, downtime recovery could only be done by hand, and load balancing was left to the operations department. Most of that has now been retired in favor of moving straight to SolrCloud.

It took almost a week to get SolrCloud working. The actual setup takes about two hours; the rest of the time went into understanding the principles and the more advanced usage, and into all kinds of problems and headaches. I put in a lot of overtime this week, and it was genuinely hard. Sometimes I think I was asking for trouble: the old scheme worked perfectly well and I had written it by hand, yet I tore it all up to rebuild it and understand it deeply, and any random problem could burn hours. I have worked with Nutch and Hadoop, but the headaches SolrCloud gave me were unprecedented. There is plenty of material online about SolrCloud plus ZooKeeper, but ten thousand articles could be merged into one: almost all of them only describe how to set it up and then stop, copied from one another in every way. This is a bad habit in the Chinese open-source community: we share the well-trodden, existing techniques over and over, but nobody wants to share the difficult parts. When I went looking for information I turned dozens of pages of Baidu results and the content was very poor, and annoyingly Google is unreachable. The wiki documentation is mostly in English and rarely translated; thanks to those who translated the SolrCloud wiki. Having rambled this much, what I really want to say to fellow developers is: do more original work, read more documentation, and understand the principles rather than copying. With that said, here are my experiences and lessons from the process.
1. ZooKeeper installation.

Either a standalone instance or a distributed ensemble will do, and there is plenty of material online; it is easier to set up than Hadoop. If I find the time later I may write a script for one-click installation of the ZooKeeper service.
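As a minimal sketch (the data directory here is an assumption, not part of the original setup), a standalone ZooKeeper only needs a few lines in conf/zoo.cfg before it can be started:

tickTime=2000
dataDir=/usr/local/zookeeper/data
clientPort=2181

Start and check it with bin/zkServer.sh start and bin/zkServer.sh status. For an ensemble, add one server.N=host:2888:3888 line per node and write the matching number into dataDir/myid on each machine.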

2. Solr installation.

Note: when installing on Linux, do not edit solr.xml on Windows and then upload it. It sounds trivial, but it easily causes an obscure bug, and this is a lesson learned the hard way: when ZooKeeper synchronizes state it rewrites solr.xml, and because Windows and Linux use different line endings the rewrite fails on Linux. The result is that after a restart the Solr service cannot load its cores and the node shows as down. If a physical Solr node in your SolrCloud is stuck in the down state, open solr.xml in vi and check whether it is full of ^M characters (anyone who uses Linux regularly will know what that means). It is also advisable to remove all comments from solr.xml. There is no need to declare cores in solr.xml; we have mighty ZooKeeper for that.
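As a quick sketch of checking for and stripping Windows line endings (assuming GNU sed or dos2unix is available on the server):

cat -A solr.xml | head
sed -i 's/\r$//' solr.xml

cat -A shows a carriage return as ^M at the end of each line; the sed command (or, equivalently, dos2unix solr.xml) strips them in place. Restart the Solr node afterwards and check whether it comes back as active.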
3. Upload the core configuration files.

This is the step where understanding matters; once you understand it, everything else is easy to explain. The file system inside ZooKeeper is similar to HDFS, and you can browse it with ZooKeeper's client: zkCli.sh -server localhost:2181
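As a small sketch of such a browsing session (assuming the default client port 2181):

zkCli.sh -server localhost:2181
ls /configs
ls /collections
quit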
From there you can view ZooKeeper's distributed file structure: all the core configuration files live under /configs, and all the cores (SolrCloud uses the concept of a collection instead) under /collections. The command to upload a configuration set is as follows:
java -classpath .:/usr/local/tomcat7/webapps/solr/WEB-INF/lib/* org.apache.solr.cloud.ZkCLI -cmd upconfig -zkhost hadoop34:2181,hadoop36:2181 -confdir /usr/local/soft/solr-space/alpha_wenuser/conf -confname alpha_wenuser
This uploads the configuration set for the alpha_wenuser collection (core) to the two-node ZooKeeper ensemble. If your configuration references the DataImportHandler, you will get an error here, or later when creating the collection, unless the DataImport-related jars are placed in the Solr webapp's lib directory under Tomcat, rather than under dist in solr_home as with a standalone Solr.
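As a rough sketch (the jar names and the dist path depend on the Solr version and are assumptions here), copying the DataImportHandler jars into the webapp looks like this:

cp /usr/local/soft/solr-4.x/dist/solr-dataimporthandler-*.jar /usr/local/tomcat7/webapps/solr/WEB-INF/lib/

Restart Tomcat afterwards so the jars are picked up.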
4. Create the collection.

Once the configuration set is uploaded, you can create collections as you like; there is no painful local fiddling, a single command is enough. The command is as follows:
curl 'http://hadoop36/solr/admin/collections?action=CREATE&name=alpha_wenuser&numShards=1&replicationFactor=1&collection.configName=alpha_wenuser'
Briefly: numShards is the number of shards, replicationFactor is the number of replicas per shard, and collection.configName is the name of the configuration set uploaded in step 3; the details are easy to look up, so I will not go on about them.
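As a hedged way to verify the result (assuming a Solr 4.x-style deployment, where the cluster state lives in a single znode), you can look at the Cloud graph in the Solr admin UI, or read the cluster state straight from ZooKeeper:

zkCli.sh -server localhost:2181
get /clusterstate.json

The new collection should appear there with its shards and an active leader.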
5. Downtime test.

With the steps above done, SolrCloud is up. What, you say it cannot be that simple? It really is that simple; the hard part is the bugs, and if you follow my steps above, basically all of them can be avoided. Now shut down any one of the Solr machines and check whether the leader switches over successfully, then restart that machine and see whether the node comes back as active or stays down (a sketch of this test loop follows at the end of this step). If it stays down, well, brother, go look at solr.xml: it almost certainly has the problem I just described, and the fix is to delete it and drop in the solr.xml from the example directory. Then delete the collection; the command is as follows:
curl 'http://hadoop36/solr/admin/collections?action=DELETE&name=alpha_wenuser'
Then repeat steps 4 and 5.
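As a rough sketch of the failover test mentioned above (the Tomcat paths come from this setup, but the exact stop/start mechanism on your servers is an assumption):

/usr/local/tomcat7/bin/shutdown.sh
/usr/local/tomcat7/bin/startup.sh

Run the shutdown on one Solr node, check in the admin UI Cloud graph (or via get /clusterstate.json in zkCli.sh) whether another replica has taken over as leader (this requires a replicationFactor greater than 1), then start it again; the restarted node should show as active. If it stays down, check solr.xml for ^M characters as described in step 2.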


The above is rather messy; in fact, this article is not meant to be a detailed guide to building SolrCloud, but a record of the problems that can come up while setting it up and using it, and the places to pay attention to.
Bug fix area:


1. java.lang.UnsupportedOperationException
The fix is to check whether the _version_ field in schema.xml is defined with something other than the long type: newer versions of Solr no longer allow _version_ to be a string type and will report this error if it is still defined that way.
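For reference, the standard definition from the example schema.xml shipped with Solr 4.x is:

<field name="_version_" type="long" indexed="true" stored="true"/>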

Resources:
http://shiyanjun.cn/archives/100.html
This one is reliable and trustworthy; of course you still need to understand it first, and it leaves a few things out.
http://blog.csdn.net/natureice/article/details/9109351
Same as above, also well worth referencing.
http://www.cnblogs.com/guozk/p/3498844.html
This one has a variety of operational commands, quite detailed.
Thanks to everyone who shared these experiences; the Internet really is a fine thing!

Because of problems migrating my ITeye blog to CSDN, I had to pick out the best posts by hand and copy them over; this is absolutely my own original work.
