Elasticsearch indexing and document operations

Source: Internet
Author: User

List all indexes

Now let's take a look at our indexes.

GET /_cat/indices?v

The returned content is as follows:

health status index   uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   .kibana XYZPR5XGQGWj8YlyZ1et_w   1   1          1            0      3.1kb          3.1kb

You can see that there is an index in the cluster.
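
If you want to work with this listing in a script rather than read it by eye, the table can be split into one record per index. The following is a minimal sketch, not part of the original tutorial: the function name parse_cat_table is hypothetical, and it assumes that every column has a value in every row, which holds for the output shown above.

```python
def parse_cat_table(text):
    """Parse the whitespace-aligned table printed by `GET /_cat/indices?v`.

    Assumption: no column is empty in any row, so splitting on runs of
    whitespace keeps values lined up with the header names.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    # Every line after the header describes one index.
    return [dict(zip(header, line.split())) for line in lines[1:]]


sample = """\
health status index   uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   .kibana XYZPR5XGQGWj8YlyZ1et_w   1   1          1            0      3.1kb          3.1kb
"""

rows = parse_cat_table(sample)
for row in rows:
    print(row["index"], row["health"], row["docs.count"])
```

All values come back as strings, so convert counts with int() before doing arithmetic on them.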

Create an index

Now let's create an index named customer and list all the indexes again.

PUT /customer?pretty
GET /_cat/indices?v

The first command returns the following content. Here we use the PUT verb to create an index named customer; the pretty parameter asks Elasticsearch to pretty-print any JSON it returns.

{
  "acknowledged": true,
  "shards_acknowledged": true
}

The second command returns the following content. It shows that an index named customer now exists, with five primary shards and one replica (the default), and that it contains no documents yet.

health status index    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   .kibana  XYZPR5XGQGWj8YlyZ1et_w   1   1          1            0      3.1kb          3.1kb
yellow open   customer M8i1ZxhsQJqk7HomOA7c_Q   5   1          0            0       650b           650b

You may have noticed that the health of the customer index is marked as yellow. As discussed earlier, yellow means that some replica shards of the index are not yet allocated. Elasticsearch creates one replica for the index by default, and because we have only one node at this point, that replica cannot be allocated (for high availability, it has to live on a different node from its primary) until another node joins the cluster. Once the replica is allocated to a second node, the health of the index changes to green.
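
The reasoning above can be written down as a tiny rule. This is a deliberately simplified sketch: the function expected_index_health is hypothetical, it assumes all primary shards are already assigned, and it ignores everything else that can block allocation (disk space, allocation settings, and so on). It only encodes the fact that a replica is never placed on the same node as its primary.

```python
def expected_index_health(data_nodes, replicas_per_shard):
    """Toy model of index health, assuming all primaries are assigned.

    A replica never shares a node with its primary, so each shard needs
    one node per copy (the primary plus every replica).
    """
    if data_nodes < 1:
        raise ValueError("a cluster needs at least one data node")
    copies_per_shard = 1 + replicas_per_shard
    return "green" if data_nodes >= copies_per_shard else "yellow"


print(expected_index_health(1, 1))  # the single-node setup in this tutorial
print(expected_index_health(2, 1))  # after a second node joins
```

With one node and one replica the rule gives yellow, which matches the listing above; adding a second node gives green.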

Index and query documents

Next, let's put something into the customer index. As mentioned earlier, to index a document we must tell Elasticsearch which type within the index it belongs to. Here we index a simple document into the customer index with the type external and the ID 1:

PUT /customer/external/1?pretty
{
  "name": "John Doe"
}

The returned content is as follows:

{
  "_index": "customer",
  "_type": "external",
  "_id": "1",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "created": true
}

From the above, we can see that a new customer document was successfully created under the external type of the customer index, and that its internal ID is 1, the value we specified when indexing.

It is worth noting that Elasticsearch does not require you to create an index explicitly before you index a document into it. In the previous example, if the customer index had not existed, Elasticsearch would have created it automatically.

Let's take a look at the document we just indexed.

GET /customer/external/1?pretty

The returned content is as follows:

{
  "_index": "customer",
  "_type": "external",
  "_id": "1",
  "_version": 1,
  "found": true,
  "_source": {
    "name": "John Doe"
  }
}

Two fields deserve attention here. The found field states that a document with the requested ID 1 was found, and the _source field returns the full JSON document that we indexed in the previous step.
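
A client usually checks found before touching _source. The following is a minimal sketch of that check using only the Python standard library. The helper name source_if_found is hypothetical, and the shape of the "missing" response (found set to false and no _source field) is an assumption for illustration rather than something shown in this article.

```python
import json


def source_if_found(response_text):
    """Return the stored document from a GET response, or None if not found."""
    body = json.loads(response_text)
    if not body.get("found", False):
        return None
    return body["_source"]


# Response copied from the example above.
found = (
    '{"_index": "customer", "_type": "external", "_id": "1", '
    '"_version": 1, "found": true, "_source": {"name": "John Doe"}}'
)
# Assumed shape of a response for an ID that does not exist.
missing = '{"_index": "customer", "_type": "external", "_id": "3", "found": false}'

print(source_if_found(found))
print(source_if_found(missing))
```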

Delete an index

Now let's delete the index we just created and list all the indexes again.

DELETE /customer?pretty
GET /_cat/indices?v

The first command returns the following content:

{
  "acknowledged": true
}

The second command returns the following content:

health status index   uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   .kibana XYZPR5XGQGWj8YlyZ1et_w   1   1          1            0      3.1kb          3.1kb

From the above content, we can see that our customer index has been deleted.

Before moving on, let's take a quick look back at the API commands we have used in this section.

PUT /customer
PUT /customer/external/1
{
  "name": "John Doe"
}
GET /customer/external/1
DELETE /customer

If you read the preceding commands carefully, you will notice the pattern that Elasticsearch uses to access data, which can be summarized as follows:

<REST Verb> /<Index>/<Type>/<ID>

This REST access pattern is used throughout the API commands. If you can remember it, you have a good head start on understanding Elasticsearch.
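
To make the pattern concrete, the commands from this section can be split back into their verb, index, type, and ID. This is only an illustrative sketch: DocRequest and parse_request_line are hypothetical names, and the parser handles just the pattern shown above, not special endpoints such as _cat, _update, or _bulk.

```python
from typing import NamedTuple, Optional


class DocRequest(NamedTuple):
    verb: str
    index: str
    doc_type: Optional[str]
    doc_id: Optional[str]


def parse_request_line(line):
    """Split '<REST Verb> /<Index>/<Type>/<ID>' into its components."""
    verb, path = line.split(maxsplit=1)
    path = path.split("?", 1)[0]  # drop a query string such as ?pretty
    parts = [part for part in path.strip().split("/") if part]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"not an /<Index>/<Type>/<ID> path: {path!r}")
    parts += [None] * (3 - len(parts))  # type and ID are optional
    return DocRequest(verb.upper(), *parts)


for command in (
    "PUT /customer",
    "PUT /customer/external/1",
    "GET /customer/external/1",
    "DELETE /customer",
):
    print(parse_request_line(command))
```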

Modify data

Elasticsearch provides near real-time data manipulation and search capabilities. By default, you can expect a delay of about one second (the refresh interval) between the moment you index, update, or delete data and the moment the change appears in search results. This is an important difference from platforms such as SQL databases, where data is available immediately after a transaction completes.
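
The effect of the refresh delay can be pictured with a toy in-memory model. This is purely an illustration of the behaviour described above and says nothing about how Elasticsearch is implemented internally; the class ToySearchIndex and its methods are hypothetical, and the refresh is triggered by hand instead of by a timer.

```python
class ToySearchIndex:
    """In-memory stand-in that separates 'written' from 'searchable'."""

    def __init__(self):
        self._pending = {}     # written, but not yet visible to search
        self._searchable = {}  # visible to search

    def index(self, doc_id, source):
        self._pending[doc_id] = source

    def refresh(self):
        # Stands in for the periodic refresh (about once a second by default).
        self._searchable.update(self._pending)
        self._pending.clear()

    def search_by_name(self, name):
        return sorted(
            doc_id
            for doc_id, source in self._searchable.items()
            if source.get("name") == name
        )


idx = ToySearchIndex()
idx.index("1", {"name": "John Doe"})
before = idx.search_by_name("John Doe")  # written, but not refreshed yet
idx.refresh()
after = idx.search_by_name("John Doe")   # now visible to search
print(before, after)
```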

Index/replace document

We have already seen how to index a single document. Let's recall that command:

PUT /customer/external/1?pretty
{
  "name": "John Doe"
}

The above command indexes the specified document into the customer index, under the external type, with the ID 1. If we execute the command again with a different (or the same) document body, Elasticsearch replaces the existing document with the new one (that is, it reindexes it).

PUT /customer/external/1?pretty
{
  "name": "Jane Doe"
}

The above operation changes the name field of the document with ID 1 from "John Doe" to "Jane Doe". If, on the other hand, we use a different ID, a new document is indexed and the documents already in the index remain unchanged.

PUT /customer/external/2?pretty
{
  "name": "Jane Doe"
}

The above operation indexes a new document with id 2.
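
The difference between reusing an ID and using a new one can be mimicked with a plain dictionary. This is a toy model under stated assumptions, not Elasticsearch code: index_document is a hypothetical helper, and the age field is added here only to show that a replacement discards fields that are missing from the new body.

```python
def index_document(store, doc_id, source):
    """Store `source` under `doc_id`, replacing any earlier document whole."""
    previous = store.get(doc_id)
    version = previous["_version"] + 1 if previous else 1
    store[doc_id] = {"_version": version, "_source": dict(source)}
    return "updated" if previous else "created"


store = {}
first = index_document(store, "1", {"name": "John Doe", "age": 20})
second = index_document(store, "1", {"name": "Jane Doe"})  # same ID: replaced
third = index_document(store, "2", {"name": "Jane Doe"})   # new ID: new document

print(first, second, third)
print(store["1"])
```

After the second call, document 1 holds only the name field and its version has moved from 1 to 2, while document 2 is a separate entry at version 1.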

When you index a new document, the ID is optional. If it is not specified, Elasticsearch generates a random ID for the document. The ID that was actually used is returned in the response to the index API call.

The following example shows how to index a document without specifying a document ID:

POST /customer/external?pretty
{
  "name": "Jane Doe"
}

The returned content is as follows:

{
  "_index": "customer",
  "_type": "external",
  "_id": "AVyc9L6dtgHksqXKpTlM",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "created": true
}

Note: because no ID is specified in the above example, we use the POST verb instead of the PUT verb.
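
The choice between the two verbs can be captured in a small helper. This is a sketch only: index_request is a hypothetical function that builds the pieces of the request and sends nothing over the network.

```python
import json


def index_request(index, doc_type, source, doc_id=None):
    """Return (verb, path, body) for indexing one document.

    With an explicit ID we PUT to /<Index>/<Type>/<ID>; without one we
    POST to /<Index>/<Type> and let Elasticsearch generate the ID.
    """
    if doc_id is None:
        verb, path = "POST", f"/{index}/{doc_type}"
    else:
        verb, path = "PUT", f"/{index}/{doc_type}/{doc_id}"
    return verb, path, json.dumps(source)


print(index_request("customer", "external", {"name": "Jane Doe"}, doc_id="2"))
print(index_request("customer", "external", {"name": "Jane Doe"}))
```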

Update document

In addition to indexing and replacing documents, we can also update them. Note that Elasticsearch does not update a document in place. Whenever we perform an update, Elasticsearch deletes the old document and then indexes a new document with the update applied. The following example updates our previous document (ID 1) by changing its name field to "Jane Doe":

POST /customer/external/1/_update?pretty
{
  "doc": { "name": "Jane Doe" }
}

The following example updates the same document (ID 1) by changing the name field to "Jane Doe" and adding an age field at the same time:

POST /customer/external/1/_update?pretty
{
  "doc": { "name": "Jane Doe", "age": 20 }
}

You can also use a simple script to execute updates. The following example uses a script to increase the age by 5:

POST /customer/external/1/_update?pretty
{
  "script" : "ctx._source.age += 5"
}

In the preceding example, ctx._source refers to the source document that is about to be updated. Note that, at the time of writing, an update can only be applied to a single document at a time. In the future, Elasticsearch may provide the ability to update multiple documents that match a query condition (like an SQL UPDATE-WHERE statement).
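
The three updates above can be followed step by step on a plain dictionary. This is a toy model and not Elasticsearch code: apply_partial_update is a hypothetical helper that merges the fields of a partial document into a copy of the stored source, assuming nested objects are merged field by field, and the last line plays the role of the script.

```python
import copy


def apply_partial_update(source, partial):
    """Return a new document with the fields of `partial` merged into `source`."""
    merged = copy.deepcopy(source)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Assumption: nested objects are merged rather than replaced.
            merged[key] = apply_partial_update(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


original = {"name": "John Doe"}
updated = apply_partial_update(original, {"name": "Jane Doe", "age": 20})
updated["age"] += 5  # what the script ctx._source.age += 5 does to the source

print(original)
print(updated)
```

The original dictionary is left untouched, which mirrors the point made earlier: an update produces a new document rather than modifying the old one in place.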

Delete document

The following example shows how to delete the document with ID 2 from the customer index:

DELETE /customer/external/2?pretty

To delete all documents that match a specific query, see the Delete By Query API. It is worth noting that deleting a whole index is much more efficient than deleting all of its documents through the Delete By Query API.

Batch Processing

In addition to indexing, updating, and deleting individual documents, Elasticsearch can perform any of these operations in batches through the _bulk API. This is important because it provides a very efficient mechanism for performing multiple operations as quickly as possible with as few network round trips as possible. For example, the following call indexes two documents in one bulk operation:

POST /customer/external/_bulk?pretty
{"index":{"_id":"1"}}
{"name": "John Doe" }
{"index":{"_id":"2"}}
{"name": "Jane Doe" }

The returned content is as follows:

{
  "took": 27,
  "errors": false,
  "items": [
    {
      "index": {
        "_index": "customer",
        "_type": "external",
        "_id": "1",
        "_version": 1,
        "result": "created",
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "created": true,
        "status": 201
      }
    },
    {
      "index": {
        "_index": "customer",
        "_type": "external",
        "_id": "2",
        "_version": 1,
        "result": "created",
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "created": true,
        "status": 201
      }
    }
  ]
}

The following example updates the first document and deletes the second document in one operation:

POST /customer/external/_bulk?pretty
{"update":{"_id":"1"}}
{"doc": { "name": "John Doe becomes Jane Doe" } }
{"delete":{"_id":"2"}}

The returned content is as follows:

{
  "took": 25,
  "errors": false,
  "items": [
    {
      "update": {
        "_index": "customer",
        "_type": "external",
        "_id": "1",
        "_version": 2,
        "result": "updated",
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "status": 200
      }
    },
    {
      "delete": {
        "found": true,
        "_index": "customer",
        "_type": "external",
        "_id": "2",
        "_version": 2,
        "result": "deleted",
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "status": 200
      }
    }
  ]
}

Note that the delete action above has no source document line after it, because a delete requires only the ID of the document to be deleted.
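
Bulk bodies are easy to get wrong by hand, so it can help to generate them. The following is a minimal sketch: build_bulk_body is a hypothetical helper that produces one JSON object per line, an action line followed by a payload line for everything except delete, and ends the body with a newline. It builds the text only and sends nothing.

```python
import json


def build_bulk_body(operations):
    """Build a newline-delimited bulk body.

    `operations` is a list of (action, doc_id, payload) tuples, where
    payload must be None for "delete" and a dict for other actions.
    """
    lines = []
    for action, doc_id, payload in operations:
        lines.append(json.dumps({action: {"_id": doc_id}}))
        if action == "delete":
            if payload is not None:
                raise ValueError("a delete action takes no payload line")
        else:
            if payload is None:
                raise ValueError(f"{action} needs a payload line")
            lines.append(json.dumps(payload))
    # One JSON object per line, with a newline after the last line as well.
    return "\n".join(lines) + "\n"


body = build_bulk_body([
    ("update", "1", {"doc": {"name": "John Doe becomes Jane Doe"}}),
    ("delete", "2", None),
])
print(body, end="")
```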

The bulk API does not stop when one of the actions fails. If a single action fails for whatever reason, it continues to process the remaining actions. When the bulk API returns, it provides a status for each action (in the same order in which the actions were sent), so you can check whether a specific action succeeded.
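
Because the statuses come back in request order, a client can walk the items list and collect the actions that need attention. This is a sketch under stated assumptions: failed_items is a hypothetical helper, the sample response is made up for illustration (it is not output from this tutorial), and treating every status of 300 or above as a failure is a simplification.

```python
def failed_items(bulk_response):
    """Return (position, action, doc_id, status) for each item that failed."""
    failures = []
    for position, item in enumerate(bulk_response.get("items", [])):
        # Each item holds a single key naming the action: index, update, delete.
        action, result = next(iter(item.items()))
        status = result.get("status", 0)
        if status >= 300:  # simplification: anything outside 2xx counts as failed
            failures.append((position, action, result.get("_id"), status))
    return failures


# Hypothetical response: the first action succeeded, the second did not.
sample = {
    "took": 25,
    "errors": True,
    "items": [
        {"index": {"_index": "customer", "_type": "external", "_id": "1", "status": 201}},
        {"update": {"_index": "customer", "_type": "external", "_id": "9", "status": 404}},
    ],
}

print(failed_items(sample))
```

The position in each tuple is the index of the action in the original request, which makes it straightforward to retry or log only the actions that failed.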

Official documentation

https://www.elastic.co/guide/en/elasticsearch/reference/current/_exploring_your_cluster.html

https://www.elastic.co/guide/en/elasticsearch/reference/current/_modifying_your_data.html

References

https://github.com/13428282016/elasticsearch-CN/wiki/es-gettting-started
