Integrating the Solr search engine with MongoDB


Environment:

Ubuntu 12.04, Solr 5.1.0, MongoDB v2.0.4

1. Solr configuration and MongoDB installation

Installing and configuring Solr is now very simple; see the official document at http://lucene.apache.org/solr/quickstart.html. The official document uses the cloud example (selected with -e), but I ended up using the techproducts example. The basic commands are as follows:

/:$ ls solr*
solr-5.1.0.zip
/:$ unzip -q solr-5.1.0.zip
/:$ cd solr-5.1.0/
/:$ bin/solr start -e techproducts -noprompt

According to the official documentation, to stop Solr and clear this sample's data once you are done with it, run:

bin/solr stop -all; rm -rf example/techproducts/

I list these commands because starting, stopping, and clearing are all you need throughout the process.

Here I really have to complain about the amateur configuration guides floating around online. Guides with a huge number of steps are usually best avoided; follow one all the way through and you are the one left suffering at the end.

Installing MongoDB on Ubuntu is practically foolproof; apt-get is the easiest way.

2. Integrating Solr with MongoDB

Solr's official quickstart shows that it can index and search XML, JSON, CSV, and many other document types, but it says nothing about integrating with MongoDB. Still, resourceful people can always find a way to bring the two together; perhaps some almighty being is at work.

Fortunately, my team lead said it could be done and casually tossed me a link: http://www.cnblogs.com/sysuys/p/3403670.html. When I saw that the Solr version there was 4.5 my heart sank, but I could not find anything better.

So, in the spirit of fearing neither hardship nor death, I gritted my teeth and read on. Suddenly an oasis appeared before my eyes: the article gave a GitHub address, and the project hosted there was exactly what it was using.

Yes, that is mongo-connector: https://github.com/10gen-labs/mongo-connector/wiki/Getting-Started. Look, another official document, and this one seems reliable. Even so, the end result convinced me that I still had to write up a configuration guide suited to my own development needs.

Enough talk; here are the steps:

1). Set up a MongoDB replica set.

After MongoDB is installed with apt-get, the system starts the MongoDB server automatically, so we kill it first:

pkill mongod

Then start it again with a replica set name:

mongod --replSet mydevreplset

Stopping and starting MongoDB really is that simple. Started as above, it runs in the foreground, and you stop it with Ctrl+C. If you start it with a trailing &, it runs in the background, and you then have to stop it with pkill or kill.

The replica set is then initialized in the mongo shell:

~/solr-5.1.0$ mongo
MongoDB shell version: 2.0.4
connecting to: test
PRIMARY> rs.initiate()

That settles the MongoDB side; all it takes is adding a replica set.

2). Install mongo-connector

For installation, see https://github.com/10gen-labs/mongo-connector. It takes a single command:

pip install mongo-connector

If you are told that pip is missing, install python-pip with apt-get. mongo-connector is good middleware, but do not rush to run it yet: it reads Solr's configuration, so a few things in Solr have to be changed first. After that, using it is just one command.

3). Configuration on the Solr side

If you follow the mongo-connector documentation and other articles for this part, you can fall into a big pit.

A>

First, they tell you to modify Solr's schema.xml. But while getting familiar with Solr I never saw so much as a trace of this file, so where was I supposed to change it?

There was nothing for it but to go looking, using the powerful find command:

~/solr-5.1.0$ find . -name "schema.xml"
./server/solr/configsets/basic_configs/conf/schema.xml
./server/solr/configsets/sample_techproducts_configs/conf/schema.xml
./example/example-DIH/solr/tika/conf/schema.xml
./example/example-DIH/solr/mail/conf/schema.xml
./example/example-DIH/solr/solr/conf/schema.xml
./example/example-DIH/solr/rss/conf/schema.xml
./example/example-DIH/solr/db/conf/schema.xml
./example/techproducts/solr/techproducts/conf/schema.xml

OK. For our techproducts sample, two of the results look relevant because they contain that keyword, so it is probably one of those two. Experience says to modify the first one, ./server/solr/configsets/sample_techproducts_configs/conf/schema.xml. As mentioned earlier, the example data can be cleared, which means the second schema.xml (under example/techproducts) can be deleted. That gave me a bad feeling: the file must be generated automatically when Solr starts. How is it generated? Probably by copying the schema.xml from the first location. I checked, and that is indeed the case.

Now to the point. Having found the file to change, modify it.

Open it:

vi ./server/solr/configsets/sample_techproducts_configs/conf/schema.xml

Replace

<uniqueKey>id</uniqueKey>

with

<uniqueKey>_id</uniqueKey>

Then add:

<field name="_id" type="string" indexed="true" stored="true"/>
<field name="_ts" type="long" indexed="true" stored="true"/>
<field name="ns" type="string" indexed="true" stored="true"/>

It is best to keep these additions together and mark them with a comment so they are easy to find later. There is one more pit to avoid here:

Comment out the original

<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false"/>

Otherwise, adding a JSON or XML document to Solr fails because the id field is missing, since it is declared required="true". I only found this out after running into the problem.

That is all for the schema.xml changes.
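
For anyone who would rather script the three schema.xml changes, here is a minimal Python sketch. It runs against a tiny made-up schema string (SAMPLE_SCHEMA is hypothetical, not the real techproducts file), and patch_schema and CONNECTOR_FIELDS are names invented for illustration. It differs from the manual steps in two ways: it removes the old id field instead of commenting it out, and ElementTree drops XML comments when parsing, so on the real file hand edits in vi are probably the safer route.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed stand-in for the techproducts schema.xml.
SAMPLE_SCHEMA = """<schema name="example" version="1.5">
  <field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false"/>
  <field name="name" type="text_general" indexed="true" stored="true"/>
  <uniqueKey>id</uniqueKey>
</schema>"""

# The three fields the article adds for mongo-connector.
CONNECTOR_FIELDS = [
    {"name": "_id", "type": "string", "indexed": "true", "stored": "true"},
    {"name": "_ts", "type": "long", "indexed": "true", "stored": "true"},
    {"name": "ns", "type": "string", "indexed": "true", "stored": "true"},
]


def patch_schema(schema_xml: str) -> str:
    """Return schema_xml with the article's schema.xml changes applied."""
    root = ET.fromstring(schema_xml)

    # 1. Point uniqueKey at _id instead of id.
    unique_key = root.find("uniqueKey")
    if unique_key is None:
        raise ValueError("schema has no <uniqueKey> element")
    unique_key.text = "_id"

    # 2. Drop the required "id" field (the article comments it out by hand).
    for field in root.findall("field"):
        if field.get("name") == "id":
            root.remove(field)

    # 3. Add the connector fields, skipping any that already exist.
    existing = {f.get("name") for f in root.findall("field")}
    for attrs in CONNECTOR_FIELDS:
        if attrs["name"] not in existing:
            root.insert(0, ET.Element("field", attrs))

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(patch_schema(SAMPLE_SCHEMA))
```

Because step 3 skips fields that already exist, running the function twice on the same schema gives the same set of fields.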

B>

There is another pit here: solrconfig.xml also needs to be modified. As for how I found all this out, it was by falling into each pit and feeling my way back out.

Open it:

vi ./server/solr/configsets/sample_techproducts_configs/conf/solrconfig.xml

Make sure the following request handler is enabled, uncommenting it if necessary:

<requestHandler name="/admin/luke" class="solr.admin.LukeRequestHandler"/>

Explanation: mongo-connector uses this handler. mongo-connector sends a request to obtain the schema defined in schema.xml above, and this handler is what serves that request, so it is very important. The official documents seem to assume it is enabled by default and never mention it, which is why I call it a giant pit!
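
To check that the handler really is reachable before blaming mongo-connector, something like the following sketch may help. It assumes the techproducts core on the default port and, as far as I can tell, queries the Luke handler the same way mongo-connector does (show=schema with JSON output). The function names are invented for illustration, and fetch_declared_fields needs a running Solr.

```python
import json
import urllib.request

# Assumed core URL: the techproducts example on the default port.
SOLR_CORE_URL = "http://localhost:8983/solr/techproducts"


def luke_schema_url(core_url: str) -> str:
    """URL that asks the Luke handler for the schema as JSON."""
    return core_url.rstrip("/") + "/admin/luke?show=schema&wt=json"


def declared_fields(luke_json: str) -> list:
    """Sorted field names listed under schema.fields in a Luke response."""
    return sorted(json.loads(luke_json)["schema"]["fields"])


def fetch_declared_fields(core_url: str = SOLR_CORE_URL) -> list:
    """Ask a running Solr for its fields; an HTTP error here usually
    means the /admin/luke handler is not enabled for this core."""
    with urllib.request.urlopen(luke_schema_url(core_url), timeout=10) as resp:
        return declared_fields(resp.read().decode("utf-8"))


if __name__ == "__main__":
    print(fetch_declared_fields())
```

If the schema.xml changes took effect, the printed list should include _id, _ts, and ns.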

C>

Those are the two configuration files to change. This part is fairly involved, and I cannot guarantee that it applies to other versions of Solr.

Finally, stop Solr as described earlier, clear the example/techproducts directory, and start Solr again. Restarting the techproducts sample produces some errors because schema.xml was modified: the uniqueKey is now _id, not id. These errors can be ignored; if anything, the absence of errors would indicate a problem. You will then find that the two configuration files have been copied into the example/techproducts sample's configuration, as described above.

OK, the configuration on the Solr side is complete.

4). Use mongo-connector to connect Solr with MongoDB

If you follow the official documentation on GitHub once more, congratulations, you have fallen into another pit. For the setup in this article, run:

mongo-connector --auto-commit-interval=1 -d solr_doc_manager -t http://localhost:8983/solr/techproducts

Note that what differs from the official documentation is the URL after -t: because we are using the techproducts sample, its name has to be appended. At first I had no idea what this URL was for and made all kinds of mistakes; I only worked it out by clicking around. As for --auto-commit-interval, my experiments suggest that when it is set to 0 the data in MongoDB is not written to Solr at all, which seems to be the exact opposite of what the official documentation says. This still needs further verification.
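
To make the point about the -t URL concrete, here is a small Python sketch that assembles the same command from a core name. Only the flags shown in this article are used; connector_command and run_connector are hypothetical helper names, and the default base URL is an assumption that matches the setup above.

```python
import subprocess


def connector_command(core: str,
                      solr_base: str = "http://localhost:8983/solr",
                      commit_interval: int = 1) -> list:
    """Build the mongo-connector argument list used in this article.

    The core name is appended to the Solr base URL, which is the part
    that is easy to miss.
    """
    target = f"{solr_base.rstrip('/')}/{core}"
    return [
        "mongo-connector",
        f"--auto-commit-interval={commit_interval}",
        "-d", "solr_doc_manager",
        "-t", target,
    ]


def run_connector(core: str = "techproducts") -> int:
    """Run mongo-connector in the foreground and return its exit code."""
    return subprocess.call(connector_command(core))


if __name__ == "__main__":
    print(" ".join(connector_command("techproducts")))
```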


The end result: add a document to MongoDB, and about 1 second later the update shows up when you query Solr with *:*.
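
The *:* check can also be done from a script. This is a rough sketch that assumes the techproducts core on the default port and Solr's standard /select handler with JSON output; the function names are mine. count_documents needs a running Solr, so call it before and after inserting a document into MongoDB and compare the two numbers.

```python
import json
import urllib.parse
import urllib.request


def select_all_url(core_url: str, rows: int = 10) -> str:
    """Build a /select URL for the match-all query *:* with JSON output."""
    query = urllib.parse.urlencode({"q": "*:*", "wt": "json", "rows": rows})
    return core_url.rstrip("/") + "/select?" + query


def count_documents(core_url: str = "http://localhost:8983/solr/techproducts") -> int:
    """Return how many documents the core currently holds (numFound)."""
    # rows=0 asks only for the count, not the documents themselves.
    with urllib.request.urlopen(select_all_url(core_url, rows=0), timeout=10) as resp:
        return json.load(resp)["response"]["numFound"]


if __name__ == "__main__":
    print(count_documents())
```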
