Elasticsearch stores each piece of data as a document.
That's what I need.
The data can be loaded with the bulk API.
First, transform the raw data file data.json into new_data.json.
Then run this to import the data into Elasticsearch:
curl -XPOST 'localhost:9200/_bulk' --data-binary @new_data.json
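The same request can also be prepared from Python. This is only a minimal sketch using the standard library: the function name build_bulk_request, the http:// scheme, and the Content-Type header are assumptions that are not part of the command above (recent Elasticsearch versions expect a Content-Type header on requests with a body, and application/x-ndjson is the type documented for _bulk).

```python
import urllib.request


def build_bulk_request(body, host="localhost:9200"):
    """Build a POST request that sends `body` (bytes) to the _bulk endpoint.

    `build_bulk_request` is a hypothetical helper name, not an Elasticsearch API.
    """
    # The bulk API requires the final line to end with a newline character.
    if not body.endswith(b"\n"):
        body += b"\n"
    return urllib.request.Request(
        url="http://" + host + "/_bulk",  # assumed scheme and default host
        data=body,                        # sent unchanged, like --data-binary
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )


# Usage (needs a running Elasticsearch node and an existing new_data.json):
#   with open("new_data.json", "rb") as f:
#       request = build_bulk_request(f.read())
#   with urllib.request.urlopen(request) as response:
#       print(response.read().decode("utf-8"))
```

Building the request separately from sending it makes it possible to inspect the URL and body before anything reaches the server.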
For example, I now have a raw JSON data file like the following.
The file data.json:
{"Key1": "Valuea_row_1", "Key2": "Valueb_row_1", "Key3": "Valuec_row_1"}
{"Key1": "Valuea_row_2", "Key2": "Valueb_row_2", "Key3": "Valuec_row_2"}
{"Key1": "Valuea_row_3", "Key2": "Valueb_row_3", "Key3": "Valuec_row_3"}
I need to import this data into Elasticsearch, so I have to rework the file by naming the index and type for each document.
This creates a new file, new_data.json:
{"index": {"_index": "myindex1", "_type": "Mytype1"}}
{"Key1": "Valuea_row_1", "Key2": "Valueb_row_1", "Key3": "Valuec_row_1"}
{"index": {"_index": "myindex1", "_type": "Mytype1"}}
{"Key1": "Valuea_row_2", "Key2": "Valueb_row_2", "Key3": "Valuec_row_2"}
{"index": {"_index": "myindex1", "_type": "Mytype1"}}
{"Key1": "Valuea_row_3", "Key2": "Valueb_row_3", "Key3": "Valuec_row_3"}
In new_data.json, every data line is preceded by an action line that carries its metadata (the index and the type).
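The transformation from the raw file to the bulk file can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: the names to_bulk_lines and convert_file are hypothetical, and it assumes that every raw line is one complete JSON object and that all documents go to the same index and type. Note that recent Elasticsearch versions no longer accept the _type field; it is kept here only because the file format above uses it.

```python
import json


def to_bulk_lines(raw_lines, index, doc_type):
    """Yield an action line followed by the document line for every raw JSON line."""
    # The same action line is reused because all documents share index and type.
    action = json.dumps({"index": {"_index": index, "_type": doc_type}})
    for line in raw_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        json.loads(line)  # raises ValueError if the line is not valid JSON
        yield action
        yield line


def convert_file(src_path, dst_path, index, doc_type):
    """Write the bulk version of src_path to dst_path, one JSON object per line."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for out_line in to_bulk_lines(src, index, doc_type):
            dst.write(out_line + "\n")  # the bulk API needs newline-terminated lines


# Usage: convert_file("data.json", "new_data.json", "myindex1", "Mytype1")
```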
If the JSON data file contains data that does not all belong to the same _index or _type, just change the {"index": {"_... line above the affected documents.
An example of a valid JSON file for Elasticsearch.
full_data.json:
{"index": {"_index": "myindex1", "_type": "Mytype1"}}
{"Key1": "Value1", "Key2": "value2", "Key3": "Value3"}
{"index": {"_index": "myindex1", "_type": "Mytype1"}}
{"Key1": "ABCDE", "Key2": "EFG", "Key3": "KLM"}
{"index": {"_index": "myindex2", "_type": "Mytype2"}}
{"Newkey": "NewValue"}
Notice that there are 2 indexes in the file above: myindex1 and myindex2.
The data schema in index myindex2 is different from the one in index myindex1.
That's why it is so important to have a {"index": {"_... line for every document in the new data file.
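Because every document depends on the action line directly above it, it can help to check a bulk file before sending it. The following is a minimal sketch under stated assumptions: the function name count_documents_per_index is hypothetical, and the file is assumed to contain only index actions (the bulk API also has other actions, which this check rejects).

```python
import json
from collections import Counter


def count_documents_per_index(bulk_lines):
    """Check that action and document lines alternate.

    Returns a dict mapping (index, type) to the number of documents.
    """
    counts = Counter()
    lines = [line.strip() for line in bulk_lines if line.strip()]
    if len(lines) % 2 != 0:
        raise ValueError("every document line needs its own action line")
    # Even positions hold action lines, odd positions hold the documents.
    for action_line, doc_line in zip(lines[0::2], lines[1::2]):
        action = json.loads(action_line)
        if "index" not in action:
            raise ValueError("expected an index action, got: " + action_line)
        json.loads(doc_line)  # the document itself must be valid JSON
        meta = action["index"]
        counts[(meta.get("_index"), meta.get("_type"))] += 1
    return dict(counts)


# Usage:
#   with open("full_data.json", encoding="utf-8") as f:
#       print(count_documents_per_index(f))
```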
-----
Now I am writing a Python script to manipulate some raw JSON data files.
Let's assume every line of the JSON data file has the same schema. I'll do this to extract the schema.
example_raw_data.json
import sys


def get_schema():
    """Return the schema of the raw JSON data (stub: not implemented yet)."""
    return None


if __name__ == "__main__":
    print(get_schema())
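One possible way to fill in such a function, assuming (as stated above) that every line of the file has the same schema, is to read the first non-empty line and record the type of each value. This is a hypothetical sketch, not the script's actual implementation; the name infer_schema and the choice to report Python type names are assumptions.

```python
import json


def infer_schema(lines):
    """Map every key of the first non-empty line to the name of its Python type.

    Only the first record is inspected, because all lines are assumed to share
    one schema. Returns None when there is no non-empty line.
    """
    for line in lines:
        if line.strip():
            record = json.loads(line)
            return {key: type(value).__name__ for key, value in record.items()}
    return None


# Usage:
#   with open("example_raw_data.json", encoding="utf-8") as f:
#       print(infer_schema(f))
```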