Elasticsearch: How to Add and Retrieve Data

Source: Internet
Author: User

Elasticsearch is a distributed document store. It can store and retrieve complex data structures, serialized as JSON documents, in real time. In other words, once a document is stored in Elasticsearch, it can be retrieved from any node in the cluster.

Of course, we need not only to store data but also to query it quickly and in bulk. Many NoSQL solutions already let us store objects as documents, but they still leave us to work out how we want to query the data and which fields need to be indexed for fast retrieval.

Most entities or objects in a program can be serialized as JSON objects that contain key-value pairs. A key is the name of a field or property. A value can be a string, a number, a Boolean, another object, an array of values, or some other special type, such as a string that represents a date or an object that represents a geographic location.
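As a minimal sketch of that idea, the following Python snippet serializes an application object into a JSON document. The User class and all of its field names are hypothetical and chosen only to show one value of each kind mentioned above.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class User:
    """Hypothetical application object; every field name here is illustrative."""
    name: str                                       # a string
    age: int                                        # a number
    active: bool                                    # a Boolean
    joined: str                                     # a date represented as a string
    location: dict                                  # another object (a geographic location)
    interests: list = field(default_factory=list)   # an array of values


user = User(
    name="John Smith",
    age=42,
    active=True,
    joined="2015/07/16",
    location={"lat": 51.5, "lon": -0.12},
    interests=["sports", "music"],
)

# asdict() turns the object into plain key-value pairs,
# and json.dumps() serializes those pairs as a JSON document.
document = json.dumps(asdict(user), indent=2)
print(document)
```

The resulting string is the kind of JSON body that could be sent to Elasticsearch as a document.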

Document metadata

A document is not just data. It also contains metadata: information about the document. The three required metadata elements are:

Element   Description
_index    Where the document is stored
_type     The class of object that the document represents
_id       The unique identifier of the document

_index

An index is similar to a "database" in a relational database system: it is where we store and index related data.

In fact, our data is stored and indexed in shards, and an index is simply a logical namespace that groups one or more shards together. However, this is an internal detail: our program does not need to care about shards at all. As far as our program is concerned, documents are stored in an index. Elasticsearch takes care of the rest.

We will explore how to create and manage indices later, but for now we will let Elasticsearch create the index for us. All we need to do is choose an index name. The name must be all lowercase, cannot begin with an underscore, and cannot contain commas. Let's use website as the index name.
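The three naming rules above can be checked with a small helper. This is only a sketch: the function name is hypothetical, and it covers just the rules stated in this article, whereas a real cluster may enforce additional restrictions.

```python
def is_valid_index_name(name: str) -> bool:
    """Check a candidate index name against the three rules stated in the text.

    Hypothetical helper: it does not reproduce Elasticsearch's own validation.
    """
    return (
        bool(name)                     # an empty name is never usable
        and name == name.lower()       # must be all lowercase
        and not name.startswith("_")   # cannot begin with an underscore
        and "," not in name            # cannot contain commas
    )


for candidate in ["website", "Website", "_website", "web,site"]:
    print(candidate, is_valid_index_name(candidate))
```

Only the first candidate, website, passes all three checks.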

_type

In an application, we use objects to represent "things" such as a user, a blog post, a comment, or an email. Each object belongs to a class, which defines the properties or data associated with the object. Objects of a user class may have a name, a gender, an age, and an email address.

In relational databases, we usually store objects of the same class in the same table because they share the same structure. Similarly, in Elasticsearch, we use the same type for documents that represent the same class of "thing" because they share the same data structure.

Each type has its own mapping, or schema definition, much like the columns of a traditional database table. Documents of all types are stored in the same index, but the mapping of each type tells Elasticsearch how documents of that type should be indexed. We will explore how to define and manage mappings in the section on mappings, but for now we will rely on Elasticsearch to work out the data structure automatically.

A _type name can be uppercase or lowercase, but it cannot begin with an underscore or contain a comma. We will use blog as the type name.

_id

The ID is a string that, when combined with the _index and the _type, uniquely identifies a document in Elasticsearch. When creating a document, you can supply your own _id or let Elasticsearch generate one for you.

Note: There are other metadata elements as well, which will be introduced later.

Use your own ID

If your document has a natural identifier (such as a user_account field or some other value that identifies the document), you can provide your own _id, using this form of the index API:

PUT /{index}/{type}/{id}
{"key": "value"...}

For example:

PUT /website/blog/123
{
  "title": "My blog entry",
  "text": "Chinese works too.",
  "date": "2015/07/16"
}

Elasticsearch responds with something like:

{
  "_index": "website",
  "_type": "blog",
  "_id": "123",
  "_version": 5,
  "created": false
}
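From client code, a request of this form could be assembled roughly as follows. This is a sketch using only the Python standard library. The base URL (a node on localhost, port 9200) and the helper name are assumptions, and the request is only built here, not sent.

```python
import json
import urllib.request

# Assumption: a node reachable on localhost at port 9200.
BASE_URL = "http://localhost:9200"


def build_index_request(index, doc_type, doc_id, document):
    """Build (but do not send) a PUT /{index}/{type}/{id} request.

    Hypothetical helper for illustration only.
    """
    url = f"{BASE_URL}/{index}/{doc_type}/{doc_id}"
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


request = build_index_request(
    "website",
    "blog",
    "123",
    {"title": "My blog entry", "date": "2015/07/16"},
)
print(request.get_method(), request.full_url)

# With a node actually running, the request could be sent with:
#     urllib.request.urlopen(request)
```

Keeping the request construction separate from the network call makes it easy to inspect the method, URL, and body before anything is sent.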

Each document in Elasticsearch has a version number, and each time the document changes (including when it is deleted), the _version is incremented. Later we will explore how to use the _version number to make sure that one part of your program does not overwrite changes made by another part.

Auto-generated ID

If our data does not have a natural ID, we can let Elasticsearch generate one for us. The structure of the request changes: instead of the PUT method ("store this document at this URL"), we use the POST method ("store this document under this URL"). (Note: previously the document was saved to the location identified by a specific ID; now it is added under the _type.)

The URL now contains only the _index and the _type:

POST /website/blog/
{
  "title": "My second blog entry",
  "text": "Still trying this out...",
  "date": "2015/07/16"
}

The response is similar to the previous one, except that the _id field now holds an automatically generated value:

{
  "_index": "website",
  "_type": "blog",
  "_id": "AU6Vi9GsUzILmCnC2hkX",
  "_version": 1,
  "created": true
}
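A client that relies on generated IDs usually needs to read the _id back from the response. The sketch below builds the POST request and then parses a response body copied from the example in this article. The base URL and the helper name are assumptions, and nothing is sent over the network.

```python
import json
import urllib.request

# Assumption: a node reachable on localhost at port 9200.
BASE_URL = "http://localhost:9200"


def build_auto_id_request(index, doc_type, document):
    """Build (but do not send) a POST /{index}/{type}/ request.

    Hypothetical helper: the URL carries no ID, so the server chooses one.
    """
    url = f"{BASE_URL}/{index}/{doc_type}/"
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


request = build_auto_id_request(
    "website",
    "blog",
    {
        "title": "My second blog entry",
        "text": "Still trying this out...",
        "date": "2015/07/16",
    },
)

# Response body copied from the example in the text, not from a live cluster.
sample_response = """
{
  "_index": "website",
  "_type": "blog",
  "_id": "AU6Vi9GsUzILmCnC2hkX",
  "_version": 1,
  "created": true
}
"""
reply = json.loads(sample_response)
generated_id = reply["_id"]
print(request.get_method(), request.full_url, generated_id)
```

The generated ID is what the client would store if it needs to fetch, replace, or delete the same document later.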

Update entire document

Documents in Elasticsearch are immutable: we cannot modify them. If we need to update an existing document, we reindex or replace it, using the same index API that was used to index the document in the first place.

PUT /website/blog/123
{
  "title": "My first blog entry",
  "text": "I am starting to get the hang of this...",
  "date": "2014/01/02"
}

In the response, we can see that Elasticsearch has incremented the _version number:

{
  "_index": "website",
  "_type": "blog",
  "_id": "123",
  "_version": 2,
  "created": false <1>
}
    • <1> The created flag is false because a document with the same index, type, and ID already exists.

Internally, Elasticsearch has marked the old document as deleted and added an entirely new document. The old version of the document does not disappear immediately, but you can no longer access it. Elasticsearch cleans up deleted documents in the background as you continue to index more data.

Later we will discuss the update API, which appears to let you modify part of a document in place. In fact, Elasticsearch follows exactly the same process as described above:

    1. Retrieve the JSON from the old document
    2. Change it
    3. Delete the old document
    4. Index a new document

The only difference is that the update API does all of this within a single client request, instead of requiring separate get and index requests.
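The four steps can be pictured with a toy in-memory model. This is only an illustration of the sequence described above, not how Elasticsearch is implemented. The store layout and the function names are hypothetical.

```python
import copy

# Toy store: document ID -> {"_version": int, "_source": dict}.
# Hypothetical layout used only to illustrate the four steps.
store = {}


def index_document(doc_id, source):
    """Store a complete document, bumping the version if the ID already exists."""
    previous = store.get(doc_id)
    version = previous["_version"] + 1 if previous else 1
    store[doc_id] = {"_version": version, "_source": copy.deepcopy(source)}
    return version


def partial_update(doc_id, changes):
    """Apply a partial change by replacing the whole document."""
    old = store[doc_id]
    source = copy.deepcopy(old["_source"])    # 1. retrieve the JSON from the old document
    source.update(changes)                    # 2. change it
    del store[doc_id]                         # 3. delete the old document
    store[doc_id] = {                         # 4. index a new document
        "_version": old["_version"] + 1,
        "_source": source,
    }
    return store[doc_id]["_version"]


first_version = index_document(
    "123", {"title": "My blog entry", "date": "2015/07/16"}
)
second_version = partial_update("123", {"title": "My first blog entry"})
print(first_version, second_version, store["123"]["_source"])
```

Even though the caller changed only the title, the stored entry is a completely new document with a higher version number, and the untouched date field was carried over by copying.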


Delete a document

The syntax for deleting a document follows the same pattern as before, except that it uses the DELETE method:

DELETE /website/blog/1234

If the document is found, Elasticsearch returns a 200 OK status code and the following response body. Note that the _version number has been incremented.

{
  "found": true,
  "_index": "website",
  "_type": "blog",
  "_id": "1234",
  "_version": 3
}

If the document is not found, we get a 404 Not Found status code and a response body like this:

{
  "found": false,
  "_index": "website",
  "_type": "blog",
  "_id": "1234",
  "_version": 4
}

Even though the document does not exist (the value of "found" is false), the _version number has still been incremented. This is part of the internal bookkeeping that ensures changes are applied in the correct order across multiple nodes. Deleting a document does not remove it from disk immediately; it is only marked as deleted. Elasticsearch cleans up deleted documents in the background as you continue to index more data.
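The version bookkeeping around deletes can be mimicked with a toy in-memory model. The class below is hypothetical and greatly simplified. It only shows a version counter that keeps increasing whether or not the document is found, and deleted documents that are marked first and cleaned up later.

```python
class ToyStore:
    """Hypothetical in-memory model of the delete bookkeeping described above."""

    def __init__(self):
        self.live = {}            # document ID -> current document
        self.versions = {}        # document ID -> last version, kept after deletion
        self.marked_deleted = {}  # document ID -> document awaiting cleanup

    def index(self, doc_id, document):
        self.versions[doc_id] = self.versions.get(doc_id, 0) + 1
        self.live[doc_id] = document
        return self.versions[doc_id]

    def delete(self, doc_id):
        found = doc_id in self.live
        # The version is incremented whether or not the document was found.
        self.versions[doc_id] = self.versions.get(doc_id, 0) + 1
        if found:
            # The document is only marked as deleted, not removed right away.
            self.marked_deleted[doc_id] = self.live.pop(doc_id)
        return {"found": found, "_id": doc_id, "_version": self.versions[doc_id]}

    def cleanup(self):
        """Stand-in for the background cleanup of documents marked as deleted."""
        self.marked_deleted.clear()


toy = ToyStore()
toy.index("1234", {"title": "draft"})     # version 1
toy.index("1234", {"title": "final"})     # version 2
first_delete = toy.delete("1234")         # document exists
second_delete = toy.delete("1234")        # document is already gone
print(first_delete)
print(second_delete)
toy.cleanup()
```

In this model the first delete reports found as true with version 3 and the second reports found as false with version 4, which mirrors the two response bodies shown above.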





Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
