In most cases MongoDB serves as the data storage layer, and as a database it generally should not take on additional tasks.
From an engineering point of view, it is often a better choice to hand full-text search over to a dedicated search engine.
For the commonly used search engines, ready-made tools usually exist that make it easy to combine them with MongoDB.
1. Sphinx and mongodb-sphinx
Sphinx is a text search engine written in C++. It integrates very well with MySQL, and importing data from MySQL is very easy.
Sphinx does not natively support other databases, but it provides the xmlpipe2 interface: any program can feed data to Sphinx as long as it implements that interface.
For MongoDB, mongodb-sphinx (https://github.com/georgepsarakis/mongodb-sphinx) is an implementation of the xmlpipe2 interface.
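To make the xmlpipe2 idea more concrete, here is a minimal sketch of what a program implementing the interface has to produce: an XML stream with a schema block followed by one element per document. This is not the code of mongodb-sphinx; the function name `to_xmlpipe2`, its parameters, and the sample field names are hypothetical, and the documents are plain dictionaries rather than records read from MongoDB.

```python
from xml.sax.saxutils import escape, quoteattr


def to_xmlpipe2(docs, text_fields, attributes, id_field="id"):
    """Render documents as a Sphinx xmlpipe2 stream (returned as one string).

    docs        -- iterable of dicts, e.g. records read from any data store
    text_fields -- names of the fields to full-text index
    attributes  -- mapping of attribute name -> Sphinx attribute type
    id_field    -- key holding the integer document ID (hypothetical default)
    """
    out = ['<?xml version="1.0" encoding="utf-8"?>', "<sphinx:docset>"]

    # The schema block declares the indexed fields and the stored attributes.
    out.append("<sphinx:schema>")
    for name in text_fields:
        out.append("<sphinx:field name=%s/>" % quoteattr(name))
    for name, attr_type in attributes.items():
        out.append(
            "<sphinx:attr name=%s type=%s/>" % (quoteattr(name), quoteattr(attr_type))
        )
    out.append("</sphinx:schema>")

    # One <sphinx:document> element per record; values are XML-escaped.
    for doc in docs:
        out.append("<sphinx:document id=%s>" % quoteattr(str(doc[id_field])))
        for name in list(text_fields) + list(attributes):
            out.append("<%s>%s</%s>" % (name, escape(str(doc.get(name, ""))), name))
        out.append("</sphinx:document>")

    out.append("</sphinx:docset>")
    return "\n".join(out)


if __name__ == "__main__":
    sample = [{"id": 1, "title": "a < b", "ts": 1366045854}]
    print(to_xmlpipe2(sample, ["title"], {"ts": "timestamp"}))
```

In a real setup the program would write this stream to standard output, and Sphinx would run it through the xmlpipe_command setting of an xmlpipe2 source.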
mongodb-sphinx ships with StackOverflow sample data and a sample set of run parameters. Simply import the sample data into MongoDB and run the following command to import the data into Sphinx:
./mongodb-sphinx.py -d stackoverflow -c posts --text-fields profile_image link --attributes last_activity_date _id --attribute-types timestamp string --timestamp-from=1366045854 --id-field=post_id
Common parameters include the following:
-d specifies the database, -c the collection, -h the MongoDB host address, and -p the MongoDB port.
-F is the start timestamp, -u is the end timestamp, and -T lists the fields on which the search index is built.
-a lists the attributes that are stored but not indexed; --attribute-types gives the type of each attribute in -a, such as string, timestamp, or integer.
--id-field is the field used as the document ID, and --threads is the number of threads.
One important point: by default, mongodb-sphinx assumes that the _id of the MongoDB data is an ObjectId, which is an ID that carries time information. If you use your own ID scheme, the time-based filtering will not work correctly and you will need to modify the code yourself.
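The reason an ObjectId can be used for time filtering is that its first 4 bytes are the creation time in seconds since the Unix epoch. The sketch below shows that relationship on the 24-character hex form of an ObjectId. The function names are hypothetical and this is not code taken from mongodb-sphinx.

```python
from datetime import datetime, timezone


def objectid_timestamp(oid_hex):
    """Return the Unix timestamp stored in the first 4 bytes of an ObjectId."""
    if len(oid_hex) != 24:
        raise ValueError("an ObjectId is 12 bytes, i.e. 24 hex characters")
    return int(oid_hex[:8], 16)


def objectid_floor(unix_seconds):
    """Build the smallest ObjectId hex string for a given second.

    Such a value can serve as a lower bound when selecting documents
    by creation time through their _id.
    """
    return format(unix_seconds, "08x") + "0" * 16


if __name__ == "__main__":
    boundary = objectid_floor(1366045854)  # the timestamp used in the sample command
    print(boundary)
    print(datetime.fromtimestamp(objectid_timestamp(boundary), tz=timezone.utc))
```

A custom ID such as an auto-increment integer carries no such timestamp, which is why a time range cannot be derived from it without changing the code.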
2. Elasticsearch and mongo-connector
In ES 2.0 and earlier versions, mongodb-river was the tool most often used to bring in data from MongoDB.
From ES 5 onward, however, plugins can no longer be installed the way they were in earlier versions, so the mongodb-river tutorials found online no longer work.
In addition, mongodb-river has not been updated for several years, so its support for ES 5 is likely to be weaker than that of other tools.
MongoDB officially offers a similar tool, mongo-connector (https://github.com/mongodb-labs/mongo-connector).
Installation is very simple: pip install mongo-connector
mongo-connector supports a variety of search engines. For ES it supports the 1.x, 2.x, and 5.x versions; you only need to install the corresponding doc manager.
The doc manager can also be installed in the same step: pip install 'mongo-connector[elastic5]'
Before use, you need to switch MongoDB to replica set mode so that MongoDB records an oplog.
$ mongod --replSet singlenoderepl
$ mongo
> rs.initiate()
# MongoDB is now running on port 27017
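The oplog matters because it is the stream of changes that a synchronization tool follows. Each oplog entry records an operation type ("i" for insert, "u" for update, "d" for delete), the namespace, and the affected document. The sketch below shows how such entries could be turned into search index actions. It is only an illustration under simplified assumptions: the function name, the action names, and the sample entries are hypothetical, and this is not how mongo-connector itself is implemented.

```python
def oplog_to_action(entry):
    """Map one oplog entry to an (action, namespace, payload) tuple.

    Returns None for entries a search index does not care about,
    such as no-ops and commands.
    """
    op = entry.get("op")
    ns = entry.get("ns")
    if op == "i":  # insert: "o" holds the full document
        return ("index", ns, entry["o"])
    if op == "u":  # update: "o2" identifies the document, "o" holds the change
        return ("update", ns, {"_id": entry["o2"]["_id"], "change": entry["o"]})
    if op == "d":  # delete: "o" holds the _id of the removed document
        return ("delete", ns, {"_id": entry["o"]["_id"]})
    return None


if __name__ == "__main__":
    # Simplified, hand-written entries; real entries carry more fields.
    entries = [
        {"op": "i", "ns": "stackoverflow.posts", "o": {"_id": 1, "title": "hello"}},
        {"op": "u", "ns": "stackoverflow.posts", "o2": {"_id": 1},
         "o": {"$set": {"title": "hi"}}},
        {"op": "d", "ns": "stackoverflow.posts", "o": {"_id": 1}},
        {"op": "n", "ns": "", "o": {"msg": "periodic noop"}},
    ]
    for e in entries:
        print(oplog_to_action(e))
```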
After that, edit a configuration file, for example to set the password:
{"authentication": {"password": "XXX"}}
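If the configuration is generated by a script rather than edited by hand, a small helper like the following could build it. This is a sketch only: the function `build_config` and its parameters are hypothetical, the key names follow mongo-connector's sample configuration file, and the doc manager name depends on which doc manager package is actually installed.

```python
import json


def build_config(mongo_address, es_url, password=None,
                 doc_manager="elastic_doc_manager"):
    """Build a minimal mongo-connector style configuration as a dict.

    doc_manager is an assumption taken from the sample configuration;
    use the name that matches the doc manager you installed.
    """
    config = {
        "mainAddress": mongo_address,
        "docManagers": [{"docManager": doc_manager, "targetURL": es_url}],
    }
    # Only add the authentication section when a password is supplied.
    if password is not None:
        config["authentication"] = {"password": password}
    return config


if __name__ == "__main__":
    # Print the JSON; redirect it to config.json to use it.
    print(json.dumps(build_config("localhost:27017", "localhost:9200", "XXX"), indent=4))
```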
The official repository comes with a sample configuration file:
{
    "__comment__": "Configuration options starting with '__' are disabled",
    "__comment__": "To enable them, remove the preceding '__'",

    "mainAddress": "localhost:27017",
    "oplogFile": "/var/log/mongo-connector/oplog.timestamp",
    "noDump": false,
    "batchSize": -1,
    "verbosity": 0,
    "continueOnError": false,

    "logging": {
        "type": "file",
        "filename": "/var/log/mongo-connector/mongo-connector.log",
        "__format": "%(asctime)s [%(levelname)s] %(name)s:%(lineno)d - %(message)s",
        "__rotationWhen": "D",
        "__rotationInterval": 1,
        "__rotationBackups": 10,

        "__type": "syslog",
        "__host": "localhost:514"
    },

    "authentication": {
        "__adminUsername": "username",
        "__password": "password",
        "__passwordFile": "mongo-connector.pwd"
    },

    "__comment__": "For more information about SSL with MongoDB, please see http://docs.mongodb.org/manual/tutorial/configure-ssl-clients/",
    "__ssl": {
        "__sslCertfile": "Path to certificate to identify the local connection against MongoDB",
        "__sslKeyfile": "Path to the private key for sslCertfile. Not necessary if already included in sslCertfile.",
        "__sslCACerts": "Path to concatenated set of certificate authority certificates to validate the other side of the connection",
        "__sslCertificatePolicy": "Policy for validating SSL certificates provided from the other end of the connection. Possible values are 'required' (require and validate certificates), 'optional' (validate but don't require a certificate), and 'ignored' (ignore certificates)."
    },

    "__fields": ["field1", "field2", "field3"],

    "__namespaces": {
        "excluded.collection": false,
        "excluded_wildcard.*": false,
        "*.exclude_collection_from_every_database": false,
        "included.collection1": true,
        "included.collection2": {},
        "included.collection4": {
            "includeFields": ["included_field", "included.nested.field"]
        },
        "included.collection5": {
            "rename": "included.new_collection5_name",
            "includeFields": ["included_field", "included.nested.field"]
        },
        "included.collection6": {
            "excludeFields": ["excluded_field", "excluded.nested.field"]
        },
        "included.collection7": {
            "rename": "included.new_collection7_name",
            "excludeFields": ["excluded_field", "excluded.nested.field"]
        },
        "included_wildcard1.*": true,
        "included_wildcard2.*": true,
        "renamed.collection1": "something.else1",
        "renamed.collection2": {
            "rename": "something.else2"
        },
        "renamed_wildcard.*": {
            "rename": "new_name.*"
        },
        "gridfs.collection": {
            "gridfs": true
        },
        "gridfs_wildcard.*": {
            "gridfs": true
        }
    },

    "docManagers": [
        {
            "docManager": "elastic_doc_manager",
            "targetURL": "localhost:9200",
            "__bulkSize": 1000,
            "__uniqueKey": "_id",
            "__autoCommitInterval": null
        }
    ]
}
Then run the command mongo-connector -c config.json to start synchronizing data.
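The sample configuration marks disabled options with a leading "__". To see which options of a given file are actually in effect, the parsed JSON can be filtered as in the sketch below. The helper `enabled_options` is hypothetical and only illustrates the naming convention; it is not mongo-connector's own parsing code.

```python
import json


def enabled_options(node):
    """Recursively drop every key that starts with '__' (a disabled option)."""
    if isinstance(node, dict):
        return {
            key: enabled_options(value)
            for key, value in node.items()
            if not key.startswith("__")
        }
    if isinstance(node, list):
        return [enabled_options(item) for item in node]
    return node


if __name__ == "__main__":
    # A shortened excerpt in the style of the sample configuration.
    raw = """
    {
        "__comment__": "disabled keys start with '__'",
        "mainAddress": "localhost:27017",
        "logging": {"type": "file", "__type": "syslog"},
        "__fields": ["field1", "field2"],
        "docManagers": [
            {"docManager": "elastic_doc_manager",
             "targetURL": "localhost:9200",
             "__uniqueKey": "_id"}
        ]
    }
    """
    print(json.dumps(enabled_options(json.loads(raw)), indent=4))
```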