Elasticsearch Data Importing

Source: Internet
Author: User

Elasticsearch stores each piece of data in a document.

That's what I need.

I import the data with the bulk API.

First, transform the raw data file data.json into new_data.json.

Then run this to import the data into Elasticsearch:

curl -XPOST 'localhost:9200/_bulk' --data-binary @new_data.json
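The same request can be sketched in Python with only the standard library. This is a rough sketch, not a definitive client: the function names build_bulk_request and send_bulk are hypothetical, and the Content-Type header is an assumption that matters on newer Elasticsearch versions, which reject bulk requests sent without a content type.

```python
import urllib.request


def build_bulk_request(path, host="localhost:9200"):
    """Build (but do not send) a POST request for the _bulk endpoint."""
    # Read the file as raw bytes, like curl's --data-binary does.
    with open(path, "rb") as f:
        body = f.read()
    # The bulk body must end with a newline.
    if not body.endswith(b"\n"):
        body += b"\n"
    return urllib.request.Request(
        url="http://%s/_bulk" % host,
        data=body,
        # Assumption: newer Elasticsearch versions expect this header.
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )


def send_bulk(path, host="localhost:9200"):
    """Send the file to Elasticsearch and return the raw response text."""
    with urllib.request.urlopen(build_bulk_request(path, host)) as resp:
        return resp.read().decode("utf-8")
```

Calling send_bulk("new_data.json") needs a running Elasticsearch node on localhost:9200. The response is JSON text, and it is worth checking its "errors" field, because a bulk request can succeed as a whole while individual documents fail.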

For example, I now have a raw JSON data file as follows:

The file data.json

{"Key1": "Valuea_row_1", "Key2": "Valueb_row_1", "Key3": "Valuec_row_1"}
{"Key1": "Valuea_row_2", "Key2": "Valueb_row_2", "Key3": "Valuec_row_2"}
{"Key1": "Valuea_row_3", "Key2": "Valueb_row_3", "Key3": "Valuec_row_3"}

I need to import this data into Elasticsearch, so I have to transform the file by adding the index and type for each record.

This creates a new file, new_data.json:

{"index": {"_index": "myindex1", "_type": "mytype1"}}
{"Key1": "Valuea_row_1", "Key2": "Valueb_row_1", "Key3": "Valuec_row_1"}
{"index": {"_index": "myindex1", "_type": "mytype1"}}
{"Key1": "Valuea_row_2", "Key2": "Valueb_row_2", "Key3": "Valuec_row_2"}
{"index": {"_index": "myindex1", "_type": "mytype1"}}
{"Key1": "Valuea_row_3", "Key2": "Valueb_row_3", "Key3": "Valuec_row_3"}


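In new_data.json, every data line is preceded by an action line that names the index and type the record goes into. Each action line and each data line must sit on its own line.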
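The transformation from data.json to new_data.json can be sketched in a few lines of Python. This is a minimal sketch under the assumption that the raw file holds one JSON object per line and that every record goes into the same index and type; the function name add_action_lines is hypothetical.

```python
import json


def add_action_lines(src_path, dst_path, index, doc_type):
    """Write dst_path with an action line before every record of src_path."""
    action = json.dumps({"index": {"_index": index, "_type": doc_type}})
    count = 0
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            json.loads(line)  # fail early if a record is not valid JSON
            dst.write(action + "\n")
            dst.write(line + "\n")
            count += 1
    return count  # number of records written
```

For the example above, the call would be add_action_lines("data.json", "new_data.json", "myindex1", "mytype1"). Each record is written back unchanged, so the data lines in the new file are identical to the raw ones.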

If the JSON data file contains records that do not belong to the same _index or _type, just change the {"index": {...}} line above those records.

An example of a valid bulk file for Elasticsearch:

full_data.json

{"index": {"_index": "myindex1", "_type": "mytype1"}}
{"Key1": "Value1", "Key2": "value2", "Key3": "Value3"}
{"index": {"_index": "myindex1", "_type": "mytype1"}}
{"Key1": "ABCDE", "Key2": "EFG", "Key3": "KLM"}
{"index": {"_index": "myindex2", "_type": "mytype2"}}
{"Newkey": "NewValue"}


Notice that there are 2 indexes in the file above: myindex1 and myindex2.

The data schema in index myindex2 is different from the one in index myindex1.

That's why it is so important to have an {"index": {...}} line before every record in the new data file: it tells Elasticsearch where each record belongs.
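A quick way to check which indexes a bulk file touches is to pair each action line with the data line after it. The sketch below is only an illustration: the function name count_docs_per_index is hypothetical, and it assumes the file contains nothing but "index" actions, each followed by exactly one data line.

```python
import json
from collections import Counter


def count_docs_per_index(bulk_lines):
    """Count documents per _index in an iterable of bulk-file lines."""
    counts = Counter()
    lines = [ln for ln in bulk_lines if ln.strip()]
    # Lines come in pairs: an action line followed by its document.
    for action_line, doc_line in zip(lines[0::2], lines[1::2]):
        action = json.loads(action_line)
        json.loads(doc_line)  # the document itself must also be valid JSON
        counts[action["index"]["_index"]] += 1
    return counts
```

Passing an open file works, for example count_docs_per_index(open("full_data.json")). For the example file above it counts 2 documents for myindex1 and 1 for myindex2.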

-----

Now I am writing a Python script to process some raw JSON data files.

Let's assume every line of the JSON data file has the same schema. I'll start with this to extract the schema.

example_raw_data.json

import sys


def get_schema():
    """Return the schema of the raw JSON data file (not implemented yet)."""
    return None


if __name__ == "__main__":
    print(get_schema())
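One possible way to fill in the stub is sketched below. It is an assumption about where the script is heading, not the finished version: it takes the file path as a parameter (the stub takes none), treats the "schema" as a mapping from each key to the Python type name of its value, and reads only the first non-blank line, relying on the assumption that every line has the same schema.

```python
import json


def get_schema(path):
    """Return {key: type name} taken from the first non-blank line of path."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                record = json.loads(line)
                return {key: type(value).__name__
                        for key, value in record.items()}
    return None  # empty file: no schema to report
```

Nested objects show up as "dict" and arrays as "list"; this sketch does not descend into them.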
