Searching for data in MongoDB (2)

Source: Internet
Author: User

In most deployments MongoDB serves as the data storage layer, and as a database it generally should not take on additional tasks.

Text search in particular is usually better handed off to a dedicated search engine.

The commonly used search engines usually have a ready-made tool for pulling data from MongoDB, so the two can be combined easily.

1. Sphinx and mongodb-sphinx

Sphinx is a text search engine written in C++. It integrates very well with MySQL, and importing data from MySQL is very easy.

Sphinx has no native support for other databases, but it provides the xmlpipe2 interface: any program that emits data in the xmlpipe2 format can feed Sphinx.

For MongoDB, mongodb-sphinx (https://github.com/georgepsarakis/mongodb-sphinx) is one implementation of the xmlpipe2 interface.
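To make the xmlpipe2 idea concrete, here is a minimal sketch of a program that turns a list of records into an xmlpipe2 stream. It is not the code of mongodb-sphinx; the function name `to_xmlpipe2` and the in-memory sample record are assumptions made for illustration, and field names are assumed to be valid XML element names.

```python
from xml.sax.saxutils import escape, quoteattr


def to_xmlpipe2(docs, text_fields, attributes, id_field="post_id"):
    """Render an iterable of dicts as an xmlpipe2 document stream.

    text_fields: names of the full-text fields to index.
    attributes:  mapping of attribute name -> Sphinx attribute type.
    """
    out = ['<?xml version="1.0" encoding="utf-8"?>', "<sphinx:docset>"]

    # Schema section: declares which elements are indexed fields
    # and which are stored attributes.
    out.append("<sphinx:schema>")
    for name in text_fields:
        out.append("<sphinx:field name=%s/>" % quoteattr(name))
    for name, attr_type in attributes.items():
        out.append("<sphinx:attr name=%s type=%s/>"
                   % (quoteattr(name), quoteattr(attr_type)))
    out.append("</sphinx:schema>")

    # One <sphinx:document> element per record, keyed by an integer id.
    for doc in docs:
        out.append("<sphinx:document id=%s>" % quoteattr(str(int(doc[id_field]))))
        for name in list(text_fields) + list(attributes):
            value = escape(str(doc.get(name, "")))
            out.append("<%s>%s</%s>" % (name, value, name))
        out.append("</sphinx:document>")

    out.append("</sphinx:docset>")
    return "\n".join(out)


# Hypothetical record standing in for a document read from MongoDB.
sample = [{"post_id": 1, "link": "a < b", "last_activity_date": 1366045854}]
xml = to_xmlpipe2(sample, ["link"], {"last_activity_date": "timestamp"})
print(xml)
```

A real importer would read the records from a MongoDB cursor and write the stream to standard output, where Sphinx's indexer picks it up.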

mongodb-sphinx ships with StackOverflow sample data and a sample set of run parameters. Import the sample data into MongoDB, then run the following command to import it into Sphinx:

./mongodb-sphinx.py -d stackoverflow -c posts --text-fields profile_image link --attributes last_activity_date _id --attribute-types timestamp string --timestamp-from=1366045854 --id-field=post_id

Common parameters include the following:

-d specifies the database, -c the collection, -h the MongoDB address, and -p the MongoDB port.

-f is the start timestamp, -u the end timestamp, and -t the fields on which to build the search index.

-a lists attributes that are not indexed; --attribute-types gives the type of each attribute listed in -a, such as string, timestamp, or integer.

--id-field is the field used as the document ID, and --threads is the number of threads.
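The sample command can also be assembled programmatically, which keeps attribute names and their types paired. The following is only a sketch: the helper `build_import_command` is hypothetical, and it uses only the flags that appear in the sample command above.

```python
import shlex


def build_import_command(database, collection, text_fields, attributes,
                         timestamp_from, id_field):
    """Assemble the mongodb-sphinx.py argument list.

    attributes maps attribute name -> type, so names and types stay aligned.
    """
    argv = ["./mongodb-sphinx.py", "-d", database, "-c", collection]
    argv += ["--text-fields"] + list(text_fields)
    argv += ["--attributes"] + list(attributes.keys())
    argv += ["--attribute-types"] + list(attributes.values())
    argv.append("--timestamp-from=%d" % timestamp_from)
    argv.append("--id-field=%s" % id_field)
    return argv


argv = build_import_command(
    "stackoverflow", "posts",
    ["profile_image", "link"],
    {"last_activity_date": "timestamp", "_id": "string"},
    1366045854, "post_id",
)
# Quote each argument so the printed line is safe to paste into a shell.
command = " ".join(shlex.quote(arg) for arg in argv)
print(command)
```

The list form can be passed to `subprocess.run(argv)` directly, which avoids shell quoting issues altogether.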

One important caveat: by default, mongodb-sphinx assumes that the _id of each MongoDB document is an ObjectId, which is an ID that embeds time information. If you use your own ID scheme, the time-based filtering will not work correctly and you will need to modify the code yourself.
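The reason an ObjectId can drive time-based filtering is that its first 4 bytes are a big-endian Unix timestamp in seconds. The sketch below decodes that timestamp from the hex form of an ObjectId using only the standard library. The function names and the sample ObjectId value are assumptions for illustration, and this is not necessarily how mongodb-sphinx implements its filter.

```python
from datetime import datetime, timezone


def objectid_timestamp(oid_hex):
    """Return the creation time (seconds since the Unix epoch) stored in an ObjectId."""
    if len(oid_hex) != 24:
        raise ValueError("an ObjectId is 12 bytes, i.e. 24 hex characters")
    # The first 4 bytes (8 hex characters) are a big-endian timestamp in seconds.
    return int(oid_hex[:8], 16)


def is_after(oid_hex, timestamp_from):
    """Mimic a --timestamp-from style filter using only the _id value."""
    return objectid_timestamp(oid_hex) >= timestamp_from


oid = "507f1f77bcf86cd799439011"  # illustrative ObjectId, not from the sample data
seconds = objectid_timestamp(oid)
print(seconds, datetime.fromtimestamp(seconds, tz=timezone.utc).isoformat())
print(is_after(oid, 1366045854))
```

A custom ID such as an auto-incrementing integer carries no such timestamp, which is why a separate date field and a code change are needed in that case.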

2. Elasticsearch and mongo-connector

In ES 2.0 and earlier, mongodb-river was the tool most often used to pull data from MongoDB into Elasticsearch.

From ES 5 onward, however, plugins can no longer be installed the way they were in earlier versions, so the mongodb-river tutorials found online no longer work.

In addition, mongodb-river has not been updated for several years, so its support for ES 5 is likely to be weaker than that of other tools.

MongoDB officially provides a similar tool, mongo-connector (https://github.com/mongodb-labs/mongo-connector).

Installation is very simple: pip install mongo-connector

mongo-connector supports several different search engines. For ES it supports versions 1.x, 2.x, and 5.x; you only need to install the corresponding doc manager.

Alternatively, pip install 'mongo-connector[elastic5]' installs mongo-connector together with the ES 5 doc manager in one step.

Before use, MongoDB must be switched to replica set mode so that it records an oplog, which mongo-connector reads to follow changes.

$ mongod --replSet singleNodeRepl
$ mongo
> rs.initiate()
# MongoDB is now running on port 27017

After that, create a configuration file, for example to set the password and similar options:

{"authentication": {"password": XXX}}

The official project ships with a sample configuration file:

{
    "__comment__": "Configuration options starting with '__' are disabled",
    "__comment__": "To enable them, remove the preceding '__'",

    "mainAddress": "localhost:27017",
    "oplogFile": "/var/log/mongo-connector/oplog.timestamp",
    "noDump": false,
    "batchSize": -1,
    "verbosity": 0,
    "continueOnError": false,

    "logging": {
        "type": "file",
        "filename": "/var/log/mongo-connector/mongo-connector.log",
        "__format": "%(asctime)s [%(levelname)s] %(name)s:%(lineno)d - %(message)s",
        "__rotationWhen": "D",
        "__rotationInterval": 1,
        "__rotationBackups": 10,

        "__type": "syslog",
        "__host": "localhost:514"
    },

    "authentication": {
        "__adminUsername": "username",
        "__password": "password",
        "__passwordFile": "mongo-connector.pwd"
    },

    "__comment__": "For more information about SSL with MongoDB, please see http://docs.mongodb.org/manual/tutorial/configure-ssl-clients/",
    "__ssl": {
        "__sslCertfile": "Path to certificate to identify the local connection against MongoDB",
        "__sslKeyfile": "Path to the private key for sslCertfile. Not necessary if already included in sslCertfile.",
        "__sslCACerts": "Path to concatenated set of certificate authority certificates to validate the other side of the connection",
        "__sslCertificatePolicy": "Policy for validating SSL certificates provided from the other end of the connection. Possible values are 'required' (require and validate certificates), 'optional' (validate but don't require a certificate), and 'ignored' (ignore certificates)."
    },

    "__fields": ["field1", "field2", "field3"],

    "__namespaces": {
        "excluded.collection": false,
        "excluded_wildcard.*": false,
        "*.exclude_collection_from_every_database": false,
        "included.collection1": true,
        "included.collection2": {},
        "included.collection4": {
            "includeFields": ["included_field", "included.nested.field"]
        },
        "included.collection5": {
            "rename": "included.new_collection5_name",
            "includeFields": ["included_field", "included.nested.field"]
        },
        "included.collection6": {
            "excludeFields": ["excluded_field", "excluded.nested.field"]
        },
        "included.collection7": {
            "rename": "included.new_collection7_name",
            "excludeFields": ["excluded_field", "excluded.nested.field"]
        },
        "included_wildcard1.*": true,
        "included_wildcard2.*": true,
        "renamed.collection1": "something.else1",
        "renamed.collection2": {
            "rename": "something.else2"
        },
        "renamed_wildcard.*": {
            "rename": "new_name.*"
        },
        "gridfs.collection": {
            "gridfs": true
        },
        "gridfs_wildcard.*": {
            "gridfs": true
        }
    },

    "docManagers": [
        {
            "docManager": "elastic_doc_manager",
            "targetURL": "localhost:9200",
            "__bulkSize": 1000,
            "__uniqueKey": "_id",
            "__autoCommitInterval": null
        }
    ]
}

  

Then run mongo-connector -c config.json to start data synchronization.
