Elasticsearch index (company): adding, deleting, and modifying documents with CURL on CentOS

Source: Internet
Author: User

Back to the series directory: http://www.cnblogs.com/hanyinglong/p/5464604.html

1. Elasticsearch index description

A. The previous posts covered installing and configuring Elasticsearch, its basic concepts, and how to communicate with it. That is enough to start using it and to apply it in projects. Starting with this post, we work through a simple tutorial that shows what Elasticsearch can do and how easy it is to use. For more in-depth material, experiment on your own.

B. The tutorial is built around the Employee entity (the entity class defined in chapter 2). Since we have an entity class, the first task is to store the company's employee data, with each document representing one employee. In Elasticsearch, the act of storing data is called indexing. Before indexing, however, we need to decide where the data should be stored.

B.1 In Elasticsearch, a document belongs to a type (employee), and types live inside an index (company). Comparing the storage hierarchy of a relational database with that of Elasticsearch makes the structure easier to understand:

(1) Relational DB --> Databases (e.g. Company) --> Tables (e.g. Employee) --> Rows --> Columns

(2) Elasticsearch --> Indices --> Types --> Documents --> Fields

Note: An Elasticsearch cluster can contain multiple indices (databases). Each index can contain multiple types (tables), each type contains multiple documents (rows), and each document contains multiple fields (columns). This maps directly onto the relational model.
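
The hierarchy above can be pictured as nested mappings. The following is a minimal sketch in Python, purely for illustration: the dictionary layout and the get_field helper are assumptions made for this example and say nothing about how Elasticsearch stores data internally, and the field name "name" is hypothetical.

```python
# Illustrative only: index -> type -> document id -> fields.
cluster = {
    "company": {                                       # index    (~ database)
        "employee": {                                  # type     (~ table)
            "e449576b-2125-49e2-99ee-5985212cf502": {  # document (~ row)
                "name": "yangqi",                      # field    (~ column), hypothetical name
            },
        },
    },
}


def get_field(index, doc_type, doc_id, field):
    """Look up one field of one document in the toy structure."""
    return cluster[index][doc_type][doc_id][field]


value = get_field("company", "employee",
                  "e449576b-2125-49e2-99ee-5985212cf502", "name")
print(value)
```

Reading the lookup from left to right follows the same path as the URL /{index}/{type}/{id} used in the rest of this post.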

C. The word "index" comes up constantly in Elasticsearch and can be confusing, because it has several meanings depending on context. Briefly:

C.1 Index (noun): an index is like a database in a traditional relational database. It is where related documents are stored. The plural of index is indices or indexes.

C.2 Index (verb): to index a document is to store it in an index (noun) so that it can be retrieved and queried.

C.3 Inverted index: relational databases add an index to specific columns to speed up lookups. Elasticsearch and Lucene use a data structure called an inverted index for the same purpose.

D. By default, every field in a document is indexed (has an inverted index), which is what makes the field searchable.
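
To make the idea of an inverted index concrete, here is a small sketch in Python. It is a toy model under simple assumptions (whitespace tokenization, lowercasing, made-up sample texts), not Lucene's actual implementation: it maps each term to the documents that contain it, so a term lookup does not need to scan every document.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase term to the sorted list of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


# Hypothetical sample data: document id -> text of one field.
docs = {
    1: "reading code",
    2: "writing code",
    3: "reading newspaper",
}

inverted = build_inverted_index(docs)
print(inverted["code"])     # ids of documents containing "code"
print(inverted["reading"])  # ids of documents containing "reading"
```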

E. Let's build the employee directory for the company. To do so, we will:

E.1 Index one document per employee, containing all of that employee's information.

E.2 Give each document the type employee. The type employee lives in the index company, and the index company is stored in the Elasticsearch cluster.

F. Next, we add, modify, and delete documents in the index.

2. Indexing documents in Elasticsearch (initialization)

A. Documents are indexed, that is, stored and made searchable, through the index API. As described earlier, a document is uniquely identified by its _index, _type, and _id. We define _index and _type ourselves; _id can either be supplied by us or generated by the index API. The syntax for indexing a document is:

curl -XPUT 'http://192.168.37.133:9200/{index}/{type}/{id}?pretty' -d '{
  "field": "value"
}'

A.1 The path /{index}/{type}/{id}?pretty has four parts: index is the index name, type is the type name, id is the ID of the company employee, and pretty asks Elasticsearch to pretty-print the returned JSON. No administrative work is needed beforehand, such as creating the index or declaring the data type of each field; we can index the document directly. Elasticsearch ships with defaults for everything, so all of that management happens transparently.
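
For readers who prefer to script the request, the same call can be assembled with Python's standard library. This is a sketch under assumptions: the helper name build_index_request is made up for this example, the document ID "1" is a placeholder, and the host is the one used in the curl commands of this post. The request is only built here, not sent, so the snippet runs without a live node.

```python
import json
import urllib.request

ES_HOST = "http://192.168.37.133:9200"  # host from this post; adjust for your node


def build_index_request(index, doc_type, doc_id, document):
    """Build (but do not send) a PUT request for /{index}/{type}/{id}?pretty."""
    url = f"{ES_HOST}/{index}/{doc_type}/{doc_id}?pretty"
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# "1" is a placeholder id; {"field": "value"} mirrors the syntax template.
req = build_index_request("company", "employee", "1", {"field": "value"})
print(req.get_method(), req.full_url)

# To send it against a running node, uncomment:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```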

B. Using your own ID

B.1 If your document has a natural identifier (similar to a primary key in a database, such as the Id in Employee), you can supply it as the _id. For example, we add one record to the index company with type employee and the ID e449576b-2125-49e2-99ee-5985212cf502, using a PUT request of the form shown above.

     

Note: The response indicates that the document was indexed successfully. It contains the metadata _index, _type, and _id, plus _version. Every document in Elasticsearch has a version number, and every change to a document (including deletion) increments it.

C. Auto-generated ID

C.1 If the data has no natural ID, Elasticsearch can generate one for us. The request changes in two ways: the POST method replaces PUT, and the URL contains only the _index and _type. We add another record this way.

    

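Note: The automatically generated ID is a URL-safe, Base64-encoded string that acts as a universally unique identifier (UUID). It is often described as 22 characters long, but the exact length depends on the Elasticsearch version: the auto-generated ID used in the delete example later in this post (AVRksJ2CE1KWUdOka6rK) has 20 characters.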
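
The length of such an ID follows from the encoding. The sketch below is not Elasticsearch's ID generator; it only illustrates, under the assumption of unpadded URL-safe Base64, that 16 random bytes (a standard UUID) encode to 22 characters and 15 bytes encode to 20 characters, the length of the auto-generated ID AVRksJ2CE1KWUdOka6rK that appears in the delete example later in this post. The helper name url_safe_id is made up for this example.

```python
import base64
import os
import uuid


def url_safe_id(raw: bytes) -> str:
    """Encode raw bytes as URL-safe Base64 and strip the '=' padding."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


id_from_uuid = url_safe_id(uuid.uuid4().bytes)  # 16 random bytes
id_from_15 = url_safe_id(os.urandom(15))        # 15 random bytes

print(len(id_from_uuid))
print(len(id_from_15))
```

Because the alphabet uses "-" and "_" instead of "+" and "/", the resulting string can be placed in a URL path without escaping.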

D. At this point the index exists: we have an index named company with the type employee. Next, we look at adding, modifying, and deleting documents in it.

3. Creating Elasticsearch documents

A. The index and type were created in section 2, so now we create new documents. This was already touched on above, but a few issues remain, so they get a short section of their own.

B. When we index a document, how can we be sure we are creating a new document rather than overwriting an existing one? Remember that _index, _type, and _id together uniquely identify a document. The simplest way to guarantee a new document is to use the POST method and let Elasticsearch generate a unique _id. If we want to use our own _id, however, we must tell Elasticsearch to accept the request only if no document with the same _index, _type, and _id already exists. There are two ways to do this:

B.1 Use the op_type query-string parameter:

curl -XPUT 'http://192.168.37.133:9200/company/employee/22dd91d9-e92d-4fe7-a5e0-48fbbdd130f7?op_type=create&pretty' -d '{
  (the employee object: same fields as above, with different values to make later queries easier)
}'

B.2 Append the _create endpoint to the URL:

curl -XPUT 'http://192.168.37.133:9200/company/employee/fc6304c9-a257-4920-a756-f02fee7ac157/_create?pretty' -d '{
  (the employee object: same fields as above, with different values to make later queries easier)
}'

Note: If the request succeeds in creating a new document, Elasticsearch returns the usual metadata, with created set to true.
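
The two create-only variants differ only in how the URL is formed. As a small sketch (the function names are made up for this example, and the host is the one used in the curl commands of this post), the two URL shapes can be generated like this:

```python
ES_HOST = "http://192.168.37.133:9200"  # host from this post; adjust for your node


def create_url_op_type(index, doc_type, doc_id):
    """Variant 1: op_type=create passed as a query-string parameter."""
    return f"{ES_HOST}/{index}/{doc_type}/{doc_id}?op_type=create&pretty"


def create_url_endpoint(index, doc_type, doc_id):
    """Variant 2: the _create endpoint appended to the document path."""
    return f"{ES_HOST}/{index}/{doc_type}/{doc_id}/_create?pretty"


url_1 = create_url_op_type(
    "company", "employee", "22dd91d9-e92d-4fe7-a5e0-48fbbdd130f7")
url_2 = create_url_endpoint(
    "company", "employee", "fc6304c9-a257-4920-a756-f02fee7ac157")
print(url_1)
print(url_2)
```

Either URL is then used with the PUT method and the document as the request body.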

C. If a document with the same _index, _type, and _id already exists, Elasticsearch instead returns an HTTP 409 response whose body contains an error message that makes the cause clear.
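
The create-only rule can be modelled in a few lines of Python. This is an in-memory sketch of the behaviour described above, not Elasticsearch code: the store dictionary, the ConflictError class, the document ID "42", and the field name "name" are all assumptions made for illustration.

```python
class ConflictError(Exception):
    """Stands in for the HTTP 409 response described above."""
    status = 409


store = {}  # (index, type, id) -> document source


def create_document(index, doc_type, doc_id, document):
    """Accept the write only if no document with the same key exists yet."""
    key = (index, doc_type, doc_id)
    if key in store:
        raise ConflictError(f"document already exists: {key}")
    store[key] = document
    return {"_index": index, "_type": doc_type, "_id": doc_id, "created": True}


first = create_document("company", "employee", "42", {"name": "yangqi"})

second_status = None
try:
    create_document("company", "employee", "42", {"name": "someone else"})
except ConflictError as err:
    second_status = err.status  # the second create is rejected

print(first["created"], second_status)
```

The rejected request leaves the stored document untouched, which is the whole point of asking for create-only semantics.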

   

D. We have now covered both ways of adding documents, and several documents are already in the index. At this point we add a few more records to use in later tests.

4. Updating an entire document in Elasticsearch

A. So far we have created an index and added some documents. Once data exists, we will want to update and delete it. This section explains how Elasticsearch updates a whole document; partial updates are covered in a later post.

B. Documents in Elasticsearch are immutable: once indexed, they cannot be changed in place. To update an existing document, we re-index or replace it with the index API described above. In other words, we retrieve the document and modify the data; the old document is then deleted and the new document is indexed in its place.

B.1 The index already contains an employee named "yangqi". We change the name to "yangxia", the hobbies to "newspaper, code, and writing", the formal-employee flag to true, and the account email to "yangdianfeng@live.cn", and re-index the whole document under the same ID.

C. Analysis:

C.1 In the response, _version has been incremented, which shows that the modification succeeded. A separate post will cover _version and Elasticsearch version control.

C.2 created is false because a document with the same ID already exists under the same index and type. Do not take created being false to mean that the modification failed; the increase in _version tells you it succeeded.

C.3 Internally, the command above marks the old document as deleted and adds the new document. The old version does not disappear immediately, but it can no longer be accessed. Elasticsearch removes it later, under the conditions described in D.

C.4 The update API, discussed later under "partial update", lets you modify part of a document. Under the hood, though, Elasticsearch follows the same process as above: retrieve the data from the old document, change it, delete the old document, and index the new document. The only difference is that the update API completes all of this in a single client request, with no separate get and index requests.
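
The retrieve, modify, re-index cycle and its effect on created and _version can be sketched with an in-memory model. This is a simplified illustration, not Elasticsearch code: the store layout, the helper names, the document ID "42", and the field name "name" are assumptions made for this example.

```python
store = {}  # (index, type, id) -> {"_version": int, "_source": dict}


def index_document(index, doc_type, doc_id, source):
    """Store the whole document; bump the version if the key already exists."""
    key = (index, doc_type, doc_id)
    previous = store.get(key)
    version = previous["_version"] + 1 if previous else 1
    store[key] = {"_version": version, "_source": dict(source)}
    return {"_index": index, "_type": doc_type, "_id": doc_id,
            "_version": version, "created": previous is None}


def get_source(index, doc_type, doc_id):
    """Return a copy of the stored document body."""
    return dict(store[(index, doc_type, doc_id)]["_source"])


r1 = index_document("company", "employee", "42", {"name": "yangqi"})

doc = get_source("company", "employee", "42")            # 1. retrieve
doc["name"] = "yangxia"                                  # 2. modify
r2 = index_document("company", "employee", "42", doc)    # 3. re-index whole document

print(r1["created"], r1["_version"])
print(r2["created"], r2["_version"])
```

In this model, as in the response analysed above, the second write reports created as false while _version goes up, which is how a successful replacement looks.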

D. As noted above, a document marked as deleted is not removed immediately but only under specific conditions: Elasticsearch periodically merges segments according to Lucene's merge policy, and documents marked as deleted are physically removed when the segments containing them are merged. You normally do not need to worry about this or take any action. Until they are merged away, deleted documents still consume resources such as JVM heap and the operating system's file cache.
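
The "mark now, purge on merge" behaviour can be illustrated with a deliberately simplified model. This sketch assumes that a segment is just a list of documents carrying a deleted flag and that a merge combines all segments into one; real Lucene merge policies are far more involved. All names and sample values here are hypothetical.

```python
def merge_segments(segments):
    """Combine all segments into one, dropping documents flagged as deleted."""
    merged = []
    for segment in segments:
        for doc in segment:
            if not doc["deleted"]:
                merged.append(doc)
    return [merged]


segments = [
    [{"id": "42", "name": "yangqi", "deleted": True}],    # old version, only marked
    [{"id": "42", "name": "yangxia", "deleted": False}],  # replacement document
]

stored_before = sum(len(segment) for segment in segments)
segments = merge_segments(segments)
stored_after = sum(len(segment) for segment in segments)

print(stored_before, stored_after)
```

Before the merge, both versions occupy space even though only one is visible; after the merge, only the live document remains.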

5. Deleting Elasticsearch documents

A. The syntax for deleting a document follows the same pattern as before, except that it uses the DELETE method:

curl -XDELETE 'http://192.168.37.133:9200/company/employee/AVRksJ2CE1KWUdOka6rK/?pretty'

B. After the command runs, if the document is found, Elasticsearch returns a success status and a response body; note that _version has been incremented. If the document is not found, the response body has found set to false.

    

C. If you repeatedly run the delete command for a document that does not exist, you will find that even though the document is missing (found is false), _version still increases. This is part of the internal bookkeeping that ensures changes are applied in the correct order across multiple nodes.

D. As mentioned in the section on updating, deleting a document does not remove it from disk immediately; it is only marked as deleted. Elasticsearch removes it later under specific conditions, described in part D of the update section.
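
The version behaviour on delete can be pictured with a small in-memory model. It is an illustration of the behaviour described above, not Elasticsearch code: the two dictionaries, the delete_document helper, the document ID "42", and the field name "name" are assumptions made for this example.

```python
versions = {}  # (index, type, id) -> last version number handed out
live = {}      # (index, type, id) -> document source


def delete_document(index, doc_type, doc_id):
    """Remove the document if present; bump the version either way."""
    key = (index, doc_type, doc_id)
    found = key in live
    live.pop(key, None)
    versions[key] = versions.get(key, 0) + 1
    return {"found": found, "_version": versions[key]}


# One indexed document at version 1.
key = ("company", "employee", "42")
live[key] = {"name": "yangqi"}
versions[key] = 1

first = delete_document(*key)   # document exists
second = delete_document(*key)  # document is already gone

print(first)
print(second)
```

In this model the second delete reports found as false yet still receives a higher version number, mirroring the observation in C.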


In this post we learned how to use CURL to index documents in Elasticsearch and to create, modify, and delete them. In the next post, we will use CURL to run simple queries against the indexed documents.

 

A little progress every day.

If you find any problems in this article, please point them out and I will correct them as soon as possible.
