[Elasticsearch] Distributed File Storage

Source: Internet
Author: User

This article is translated from the distributed document store chapter in the Elasticsearch official guide.

Distributed Document Storage

In the previous chapter, we looked at the ways to index data into Elasticsearch (ES) and to retrieve it. However, we glossed over many technical details about how data is distributed and fetched within the cluster. This was deliberate: you do not really need to know how data is distributed in ES in order to work with it. It just works.

In this chapter, we will go deep into these internal technical details to help you understand how your data is stored in a distributed system.


Routing a Document to a Shard

When you index a document, it is stored on a single primary shard. How does ES know which shard a document belongs to? When we create a new document, how does ES know whether to store it on shard 1 or shard 2?

This process cannot be random, because we may need to obtain this document in the future. In fact, this process is determined by a very simple formula:

shard = hash(routing) % number_of_primary_shards

The routing value is an arbitrary string. It defaults to the document's _id, but can also be set to a custom value. The routing string is passed through a hash function to produce a number, which is then divided by the number of primary shards in the index to obtain the remainder. The remainder always falls in the range 0 to number_of_primary_shards - 1, and it is the number of the shard on which the document is stored.
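
The formula can be sketched in a few lines of Python. This is only an illustration: zlib.crc32 stands in for the hash function (the text does not say which hash ES uses internally), and route_to_shard is a hypothetical helper name.

```python
import zlib


def route_to_shard(routing: str, number_of_primary_shards: int) -> int:
    """Return the shard number for a routing value.

    zlib.crc32 is a stand-in hash chosen for this sketch; it is NOT
    claimed to be the function Elasticsearch uses internally.
    """
    hashed = zlib.crc32(routing.encode("utf-8"))
    return hashed % number_of_primary_shards


# The routing value defaults to the document's _id.
shard = route_to_shard("doc-123", number_of_primary_shards=2)
print(shard)  # either 0 or 1, and the same value on every run
```

Because the hash is deterministic, the same routing value always yields the same shard number, which is what lets a later get request find the document again.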

This explains why the number of primary shards can be set only when an index is created and can never be changed afterwards: if the number of primary shards changed, all previous routing results would become invalid and documents could no longer be found.
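
A small experiment makes the problem concrete. The sketch below assumes a stand-in hash (zlib.crc32) and made-up document IDs, and counts how many documents would be routed to a different shard if an index went from 2 to 3 primary shards.

```python
import zlib


def shard_for(doc_id: str, number_of_primary_shards: int) -> int:
    # crc32 is an assumed stand-in hash, used only for illustration.
    return zlib.crc32(doc_id.encode("utf-8")) % number_of_primary_shards


doc_ids = [f"doc-{i}" for i in range(1000)]  # hypothetical document IDs

# Count documents whose shard number differs between 2 and 3 primary shards.
moved = sum(1 for d in doc_ids if shard_for(d, 2) != shard_for(d, 3))
print(f"{moved} of {len(doc_ids)} documents would be looked up on the wrong shard")
```

Every document counted in moved was stored under the old shard number but would be searched for under the new one.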

Users sometimes think that a fixed number of primary shards makes it difficult to scale out an index later. In reality, there are techniques that make it easy to scale out as needed. We discuss these in Designing for scale.

All document APIs (get, index, delete, bulk, update, and mget) accept a routing parameter that can be used to customize the mapping from documents to shards. A custom routing value can ensure that all related documents, such as all documents belonging to the same user, are stored on the same shard. We explain in detail why you might want to do this in Designing for scale.
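
To illustrate the effect of a custom routing value, the sketch below routes documents by their owner instead of by _id. The hash (zlib.crc32), the document list, and the user names are all assumptions made for the example.

```python
import zlib
from collections import defaultdict


def shard_for(routing: str, number_of_primary_shards: int) -> int:
    # crc32 is an assumed stand-in for the real hash function.
    return zlib.crc32(routing.encode("utf-8")) % number_of_primary_shards


# Hypothetical documents: (document _id, owning user)
documents = [("1", "alice"), ("2", "bob"), ("3", "alice"), ("4", "alice")]

shards_by_user = defaultdict(set)
for doc_id, user in documents:
    # Route by the user name instead of the default _id.
    shards_by_user[user].add(shard_for(user, number_of_primary_shards=2))

# Each user maps to exactly one shard, however many documents they own.
print(dict(shards_by_user))
```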


How Primary and Replica Shards Interact

To explain this, assume that we have a cluster with three nodes. It contains one index named blogs that has two primary shards. Each primary shard has two replica shards. Copies of the same shard are never allocated to the same node, so the cluster might look like this:

[Figure: a three-node cluster holding the blogs index]

Creating, indexing, or deleting a document on the primary shard and its replica shards involves the following steps:

  1. The client sends a create, index, or delete request to node 1.
  2. The node uses the document's _id to determine that the document belongs to shard 0. It forwards the request to node 3, where the primary copy of shard 0 is currently allocated.
  3. Node 3 executes the request on the primary shard. If it succeeds, node 3 forwards the request in parallel to the replica shards on node 1 and node 2. Once all the replica shards report success, node 3 reports success to the requesting node (node 1), which reports success to the client.

By the time the client receives a successful response, the document change has been executed on the primary shard and on all of its replica shards. Your change is safe.
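
The write path can be modelled with a toy in-memory simulation. This is a sketch under stated assumptions, not how ES is implemented: ShardCopy and write are hypothetical names, only shard 0 is modelled, and replication runs in a plain loop rather than in parallel.

```python
from dataclasses import dataclass, field


@dataclass
class ShardCopy:
    node: str
    docs: dict = field(default_factory=dict)

    def apply(self, doc_id, source):
        # source=None models a delete; anything else is a create or index.
        if source is None:
            self.docs.pop(doc_id, None)
        else:
            self.docs[doc_id] = source
        return True


# Shard 0 as described in the text: primary on node 3, replicas on nodes 1 and 2.
primary = ShardCopy("node3")
replicas = [ShardCopy("node1"), ShardCopy("node2")]


def write(doc_id, source):
    """Run the request on the primary, then on every replica, then report."""
    if not primary.apply(doc_id, source):
        return False
    # Real replication happens in parallel; a loop keeps the sketch simple.
    return all(replica.apply(doc_id, source) for replica in replicas)


ok = write("1", {"title": "hello"})
print(ok, [copy.docs for copy in [primary, *replicas]])
```

Success is reported only after every copy has applied the change, which mirrors the guarantee described above.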

A number of optional request parameters allow you to influence this process, possibly increasing performance at the cost of data security. ES is already fast, so these options are seldom used, but they are explained here for the sake of completeness:

Replication

The default value for replication is sync. This causes the primary shard to wait for successful responses from the replica shards before reporting success to the requesting node. If you set replication to async, success is returned to the client as soon as the request has been executed on the primary shard. The request is still forwarded to the replica shards, but you will not know whether it succeeded there. This option is mentioned specifically to advise against using it. The default sync replication allows ES to exert back pressure on whatever system is feeding it with data. With async replication, it is possible to overload ES by sending too many requests without waiting for their completion.

Consistency

By default, the primary shard requires a quorum, that is, a majority of shard copies (where a shard copy can be a primary or a replica shard), to be available before it even attempts a write operation. This prevents data from being written to the "wrong side" of a network partition. A quorum is defined as follows:

int( (primary + number_of_replicas) / 2 ) + 1

The allowed values for consistency are one (just the primary shard), all (the primary shard and all replica shards), or the default quorum (a majority of shard copies).

Note that number_of_replicas is the number of replicas specified in the index settings, not the number of replicas that are currently active. If you have specified that an index should have three replicas, the quorum is:

int( (primary + 3 replicas) / 2 ) + 1 = 3

If you start only two nodes, there are not enough active shard copies to satisfy the quorum, and you will be unable to index or delete any documents.
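
The quorum arithmetic is easy to check in code. The function name quorum is an assumption made for this sketch; the formula itself is the one given above, with integer division playing the role of int().

```python
def quorum(number_of_replicas: int, primary: int = 1) -> int:
    """int( (primary + number_of_replicas) / 2 ) + 1"""
    return (primary + number_of_replicas) // 2 + 1


print(quorum(3))  # 3
print(quorum(2))  # 2
print(quorum(1))  # 2

# Two nodes can hold at most two copies of the same shard.
active_copies = 2
print(active_copies >= quorum(3))  # False: the write is not attempted
```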

Timeout

What happens if there are not enough shard copies available? ES waits, in the hope that more shards will appear. By default, it waits up to 1 minute. If needed, you can use the timeout parameter to make it give up sooner: 100 means 100 milliseconds, and 30s means 30 seconds.

NOTE: A new index has one replica by default, which means that two active shard copies should be required to satisfy the quorum. However, these defaults would prevent users from doing anything useful (such as indexing documents) on a single-node cluster. To avoid this problem, the quorum requirement is enforced only when number_of_replicas is greater than 1.
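
Putting the three consistency levels and the single-replica exception together gives the following sketch. The function required_active_copies is a hypothetical helper, and returning 1 for the exception case is this sketch's reading of "the quorum is not enforced".

```python
def required_active_copies(consistency: str, number_of_replicas: int) -> int:
    """How many shard copies must be active before a write is attempted."""
    total_copies = 1 + number_of_replicas  # one primary plus its replicas
    if consistency == "one":
        return 1
    if consistency == "all":
        return total_copies
    if consistency == "quorum":
        # Per the note above, the quorum is enforced only when
        # number_of_replicas is greater than 1.
        if number_of_replicas <= 1:
            return 1
        return total_copies // 2 + 1
    raise ValueError(f"unknown consistency level: {consistency}")


print(required_active_copies("quorum", 1))  # 1: default index, single node works
print(required_active_copies("quorum", 3))  # 3
print(required_active_copies("all", 2))     # 3
```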


Retrieving a Document

A document can be retrieved from the primary shard or from any of its replica shards.

Retrieving a document from either a primary or a replica shard involves the following steps:

  1. The client sends a get request to node 1.
  2. The node uses the document's _id to determine that the document belongs to shard 0. Copies of shard 0 (primary or replica) exist on all three nodes. This time, it forwards the request to node 2.
  3. Node 2 returns the document to node 1, which returns the document to the client.

For read requests, the requesting node chooses a different shard copy on every request in order to balance the load: it round-robins through all shard copies.
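
Round-robin selection can be sketched with itertools.cycle. The node names follow the example cluster; the starting point and ordering are assumptions, since the text only says that the copies are used in turn.

```python
from itertools import cycle

# Copies of shard 0 live on all three nodes (one primary, two replicas).
copies_of_shard_0 = ["node1", "node2", "node3"]
next_copy = cycle(copies_of_shard_0)

# Six consecutive read requests for documents in shard 0.
chosen = [next(next_copy) for _ in range(6)]
print(chosen)
# ['node1', 'node2', 'node3', 'node1', 'node2', 'node3']
```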

It is possible that, while a document is being indexed, it already exists on the primary shard but has not yet been copied to the replica shards. In this case, a replica might report that the document does not exist, while the primary would return it successfully. Once the index request has returned success to the user, the document is available on the primary shard and all replica shards.


Partial Update

The update API combines the read and write patterns explained above.

A partial update to a document involves the following steps:

  1. The client sends an update request to node 1.
  2. Node 1 forwards the request to node 3, where the primary shard is allocated.
  3. Node 3 retrieves the document from the primary shard, changes the JSON in the _source field, and tries to reindex the document on the primary shard. If the document has already been changed by another process, it retries step 3 up to retry_on_conflict times before giving up.
  4. If node 3 has managed to update the document successfully, it forwards the new version of the document in parallel to the replica shards on node 1 and node 2 to be reindexed. Once all replica shards report success, node 3 reports success to the requesting node (node 1), which reports success to the client.
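
The read, modify, write loop in step 3 can be sketched as follows. PrimaryShard, VersionConflict, and partial_update are hypothetical names, and the version counter is an assumption about how a concurrent change is detected; only the retry_on_conflict parameter comes from the text.

```python
class VersionConflict(Exception):
    pass


class PrimaryShard:
    def __init__(self):
        self.docs = {}  # doc_id -> (version, source)

    def get(self, doc_id):
        return self.docs[doc_id]

    def reindex(self, doc_id, expected_version, source):
        current_version, _ = self.docs[doc_id]
        if current_version != expected_version:
            # Another process changed the document after we read it.
            raise VersionConflict(doc_id)
        self.docs[doc_id] = (current_version + 1, source)


def partial_update(shard, doc_id, changes, retry_on_conflict=0):
    for attempt in range(retry_on_conflict + 1):
        version, source = shard.get(doc_id)   # read
        merged = {**source, **changes}        # modify _source
        try:
            shard.reindex(doc_id, version, merged)  # write
            return merged
        except VersionConflict:
            if attempt == retry_on_conflict:
                raise  # give up after the allowed number of retries


shard = PrimaryShard()
shard.docs["1"] = (1, {"title": "hello", "views": 0})
print(partial_update(shard, "1", {"views": 1}, retry_on_conflict=3))
```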

The update API also accepts the routing, replication, consistency, and timeout parameters.

Document-based replication: when a primary shard forwards changes to its replica shards, it does not forward the update request. Instead, it forwards the new version of the full document. Remember that these changes are forwarded to the replica shards asynchronously, and there is no guarantee that they will arrive in the same order in which they were sent. If ES forwarded just the change, changes might be applied in the wrong order, resulting in a corrupt document.
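
The benefit of forwarding whole documents can be shown with a toy replica. The sketch assumes that each forwarded document carries a version number and that a replica ignores anything older than what it already holds; the text does not spell out this mechanism, so treat it as one plausible reading.

```python
def apply_full_document(replica, version, source):
    """Keep the incoming document only if it is newer than what is stored."""
    stored_version, _ = replica.get("doc", (0, None))
    if version > stored_version:
        replica["doc"] = (version, source)


# Two successive versions of the same document, as sent by the primary.
v1 = (1, {"views": 1})
v2 = (2, {"views": 2})

in_order, out_of_order = {}, {}
for version, source in (v1, v2):
    apply_full_document(in_order, version, source)
for version, source in (v2, v1):  # arrives in the wrong order
    apply_full_document(out_of_order, version, source)

print(in_order == out_of_order)  # True
```

Both replicas end up holding version 2, whatever the arrival order. A partial change such as "set views to 1" carries no such self-contained state, so applying it late would overwrite newer data.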


Multi-Document Patterns

The patterns for the mget and bulk APIs are similar to those for individual documents. The difference is that the requesting node knows in which shard each document lives. It breaks up the multi-document request into one multi-document request per shard, and forwards these in parallel to each participating node.

Once it receives answers from each node, it collates their responses into a single response, which it returns to the client.

Retrieving multiple documents with a single mget request involves the following steps:

  1. The client sends an mget request to node 1.
  2. Node 1 builds a multi-get request per shard (primary or replica) and forwards these requests in parallel to the nodes hosting the required shards. Once all replies have been received, node 1 builds the response and returns it to the client.

A routing parameter can be set for each document in the docs array.
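
The split and collate pattern for mget can be sketched like this. The hash (zlib.crc32), the fetch_from_shard stub, and the shape of the returned documents are assumptions made for the example.

```python
import zlib
from collections import defaultdict


def shard_for(doc_id, number_of_primary_shards=2):
    # crc32 is an assumed stand-in for the real hash function.
    return zlib.crc32(doc_id.encode("utf-8")) % number_of_primary_shards


def fetch_from_shard(shard, doc_ids):
    # Hypothetical stand-in for the per-shard multi-get request.
    return {doc_id: {"_id": doc_id, "shard": shard} for doc_id in doc_ids}


def mget(doc_ids):
    # Split the request into one list of IDs per shard.
    per_shard = defaultdict(list)
    for doc_id in doc_ids:
        per_shard[shard_for(doc_id)].append(doc_id)

    found = {}
    for shard, ids in per_shard.items():  # sent in parallel in a real cluster
        found.update(fetch_from_shard(shard, ids))

    # Assemble one response in the order the client asked for.
    return [found[doc_id] for doc_id in doc_ids]


print(mget(["1", "2", "3"]))
```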

Executing multiple create, index, delete, and update requests within a single bulk request involves the following steps:

  1. The client sends a bulk request to node 1.
  2. Node 1 builds a bulk request per shard (primary shards only) and forwards these requests in parallel to the nodes hosting each involved primary shard.
  3. The primary shard executes each action serially, one after another. As each action succeeds, the primary forwards the new document (or the deletion) to its replica shards in parallel, and then moves on to the next action. Once all replica shards report success for all actions, the node reports success to the requesting node, which collates the responses and returns them to the client.

The bulk API also accepts the replication and consistency parameters at the top level for the whole bulk request, and the routing parameter for each individual request.
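
Step 3 of the bulk flow, seen from a single primary shard, can be sketched as below. Shards are modelled as plain dictionaries, the action tuples are a made-up format rather than the real bulk syntax, and replication runs in a loop instead of in parallel.

```python
def run_bulk_on_primary(primary, replicas, actions):
    """Execute actions one by one; replicate each before starting the next."""
    results = []
    for op, doc_id, source in actions:
        if op == "delete":
            primary.pop(doc_id, None)
        else:  # "index" or "create"
            primary[doc_id] = source
        for replica in replicas:  # forwarded in parallel in a real cluster
            if op == "delete":
                replica.pop(doc_id, None)
            else:
                replica[doc_id] = source
        results.append((op, doc_id, "ok"))
    return results


primary, replicas = {}, [{}, {}]
actions = [
    ("index", "1", {"title": "first"}),
    ("index", "2", {"title": "second"}),
    ("delete", "1", None),
]
print(run_bulk_on_primary(primary, replicas, actions))
print(primary)  # {'2': {'title': 'second'}}
```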


