I am writing these notes down because I will need them again soon.
1. Updating complex nested objects from MongoDB to Solr
I have recently been using mongo-connector to sync MongoDB with Solr. For simple, flat JSON, updates work without any problem. The problem this time is what happens when the value of a field is an array, or when an object is nested inside it. For example, suppose we insert the following document into MongoDB:
{ "_id": "555df36ec6cd08ea807a4324", "name": "Xiaomi Phone", "comments": [ { "text": "The phone is genuine" }, { "text": "Logistics is so damn fast" } ] }
I was nervous about whether a document like this could be synced to Solr at all, and after many attempts the result was, unfortunately, a resounding failure.
With no other options left, I went back to the official documentation, https://github.com/10gen-labs/mongo-connector/wiki/Usage%20with%20Solr, which has a section called "Key Names and Document flattening". Until a few days ago that section was poorly written, which cost me a lot of time. According to it, mongo-connector converts the JSON above into the following flattened form:
{ "_id": "555df36ec6cd08ea807a4324", "name": "Xiaomi Phone", "comments.0.text": "The phone is genuine", "comments.1.text": "Logistics is so damn fast" }
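To make the flattening rule concrete, here is a minimal Python sketch of it. This is my own illustration of the dotted-key convention described in the documentation, not mongo-connector's actual code, and the function name `flatten` is made up for the example.

```python
def flatten(doc, prefix=""):
    """Flatten nested dicts and lists into one level with dotted keys."""
    flat = {}
    # List positions are used as key segments: "comments.0", "comments.1", ...
    items = doc.items() if isinstance(doc, dict) else enumerate(doc)
    for key, value in items:
        path = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, (dict, list)):
            # Recurse into nested objects and arrays, extending the dotted path.
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


doc = {
    "_id": "555df36ec6cd08ea807a4324",
    "name": "Xiaomi Phone",
    "comments": [
        {"text": "The phone is genuine"},
        {"text": "Logistics is so damn fast"},
    ],
}
flat_doc = flatten(doc)
print(flat_doc)
```

Running it on the example document produces the keys "comments.0.text" and "comments.1.text", matching the flattened form above.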
The flattened document is then submitted to Solr. The official documentation does not mention anything else that needs to be configured, so does it just work with the default installation? Oh no, it does not.
To see why, look at what the official documentation says about schema.xml, which turns out to be the key point. Roughly, mongo-connector reads this configuration and, before submitting data to Solr, removes every field that is not declared in schema.xml.
So a field such as "comments.0.text" is not declared in schema.xml, gets removed, and never shows up in Solr. The remedy follows directly: since the field is not declared, we declare it by adding this to schema.xml:
<field name="comments.0.text" type="text_mmseg4j_complex" indexed="true" stored="true"/>
The path of schema.xml is given in the previous article. Restart Solr, keep mongo-connector running, and re-insert the JSON above into MongoDB. Now "comments.0.text" is visible in the Solr front end. The first step of the long march is done!
After this success, you could make "comments.1.text" show up too by declaring it in the same way. But that raises a question: if the array holds many objects, potentially an unbounded number, do I have to declare every possible field?
That is a very good question, and once again schema.xml has the answer.
This time we need another schema.xml element, dynamicField. Judging from the existing examples in the file, it lets a single declaration cover multiple field names through a wildcard. So, for our needs, we add this:
<dynamicField name="comments*" type="text_mmseg4j_complex" indexed="true" stored="true"/>
This declares every field whose name starts with "comments", so all the comments* fields are covered in one sweep.
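The effect of the two kinds of declaration can be sketched in a few lines of Python. This is only a simulation of the filtering behaviour described above, not the code in mongo-connector; `keep_declared` is a hypothetical helper, and I am assuming that `_id` and `name` are already declared as ordinary fields.

```python
from fnmatch import fnmatchcase


def keep_declared(flat_doc, declared_fields, dynamic_patterns):
    """Keep only keys that match a declared field or a dynamic field pattern."""
    kept = {}
    for key, value in flat_doc.items():
        is_exact = key in declared_fields
        is_dynamic = any(fnmatchcase(key, p) for p in dynamic_patterns)
        if is_exact or is_dynamic:
            kept[key] = value
    return kept


flat_doc = {
    "_id": "555df36ec6cd08ea807a4324",
    "name": "Xiaomi Phone",
    "comments.0.text": "The phone is genuine",
    "comments.1.text": "Logistics is so damn fast",
}

# Only <field name="comments.0.text" .../> is declared: comments.1.text is dropped.
only_exact = keep_declared(flat_doc, {"_id", "name", "comments.0.text"}, [])

# <dynamicField name="comments*" .../> is declared: every comments* key survives.
with_dynamic = keep_declared(flat_doc, {"_id", "name"}, ["comments*"])

print(sorted(only_exact))
print(sorted(with_dynamic))
```

With only the exact field declared, the second comment disappears; with the wildcard pattern, both survive, no matter how long the array grows.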
If you want to explore further how mongo-connector does this, see the Python file:
/usr/local/lib/python2.7/dist-packages/mongo_connector/doc_managers/solr_doc_manager.py
With that, the update side of the job is done.
2. Querying nested objects and arrays in Solr
As described in section 1, the updated fields are now visible in the Solr front end and have been indexed by Solr. So how do we query them? For example, we want to use the keyword "genuine" to hit the value "The phone is genuine" in any field of the form "comments.*.text". Regrettably, although Solr gives us many ways to query values, it gives us no way to specify the target fields with a pattern like "comments.*.text", so we cannot name such a set of fields to search in.
Once again schema.xml comes to the rescue, this time with another element, copyField. You can read about it in the comments in schema.xml. Roughly, when the index is built, the value of source is also added to dest, so querying dest also searches the source values. The key point is that more than one source can be copied to a single dest, which is exactly what I need. After all that, we only need to add one line to schema.xml:
<copyField source="comments*" dest="text"/>
The dest field "text" must be declared with multiValued="true", otherwise mongo-connector reports an error. That makes sense, since so many sources are copied into one dest. Don't ask how I know; I found this out by trial and error, and I will spare you the details.
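To see why the dest field has to hold multiple values, here is a small Python simulation of what the copy rule does to our flattened document. In reality the copying happens inside Solr at index time; `apply_copy_field` is a hypothetical helper written only for this illustration.

```python
from fnmatch import fnmatchcase


def apply_copy_field(flat_doc, source_pattern, dest):
    """Return a copy of flat_doc with all matching source values collected in dest."""
    copied = [
        value for key, value in flat_doc.items()
        if fnmatchcase(key, source_pattern)
    ]
    result = dict(flat_doc)
    if copied:
        # One dest receives one value per matching source field,
        # which is why dest must be able to store a list of values.
        result[dest] = copied
    return result


flat_doc = {
    "name": "Xiaomi Phone",
    "comments.0.text": "The phone is genuine",
    "comments.1.text": "Logistics is so damn fast",
}
indexed = apply_copy_field(flat_doc, "comments*", "text")
print(indexed["text"])
```

Two comments end up as two values in "text", so a single-valued dest cannot represent the result.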
With this in place, searching the text field also searches the comments* fields. Set Solr's df parameter to text and q to the keyword, run the query, and you will get what you want.
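As a sketch, the query URL can be built like this in Python. The host, port and core name in the base URL are assumptions for illustration, so replace them with your own; `build_select_url` is a made-up helper, and `wt=json` is added only to get a JSON response.

```python
from urllib.parse import urlencode


def build_select_url(base_url, keyword, default_field="text"):
    """Build a Solr select URL with q as the keyword and df as the default field."""
    params = {"q": keyword, "df": default_field, "wt": "json"}
    return f"{base_url}/select?{urlencode(params)}"


# Assumed local Solr instance and core name; adjust to your installation.
url = build_select_url("http://localhost:8983/solr/collection1", "genuine")
print(url)
```

Opening the resulting URL in a browser, or fetching it with any HTTP client, runs the search against the text field.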
Well, that is a lot of typing for only a few key points. There are many pitfalls along the way, so be careful!
How mongo-connector updates JSON arrays and nested objects from MongoDB to the Solr engine