[Elasticsearch] Principles of distributed document storage (distributed document store)


In the previous article we learned how to store data in an index and how to retrieve it, but we glossed over many technical details of how the data is distributed across the cluster and fetched from it.

1. Routing a document to a shard

When you index a document, it is stored on a single primary shard. But how does Elasticsearch know which shard a document belongs to? When we create a new document, how does it decide whether to store it on shard 1 or on shard 2?

Documents are not assigned to shards at random; the assignment follows a rule, because we need to be able to find the document on the same shard later. Which shard a document is stored on is determined by the following formula:

shard = hash(routing) % number_of_primary_shards

The routing value is an arbitrary string; it defaults to the document's _id but can be set to a user-defined value. The routing string is passed through a hash function to produce a number, which is divided by number_of_primary_shards (the number of primary shards in the index); the remainder, which always falls in the range 0 to number_of_primary_shards - 1, is the number of the primary shard the document is stored on.
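As a rough sketch of this calculation (Elasticsearch internally uses a Murmur3 hash of the routing value; the hash used below is only a self-contained stand-in):

```python
import hashlib

def pick_shard(routing: str, number_of_primary_shards: int) -> int:
    """Sketch of shard = hash(routing) % number_of_primary_shards."""
    # Stand-in hash; Elasticsearch actually uses Murmur3 on the routing value.
    hash_value = int(hashlib.md5(routing.encode("utf-8")).hexdigest(), 16)
    return hash_value % number_of_primary_shards

# The routing value defaults to the document's _id, so the same _id
# always maps to the same primary shard.
print(pick_shard("my-doc-id", 2))
print(pick_shard("my-doc-id", 2))  # identical result every time
```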

This explains why the number of primary shards cannot be changed after an index has been created: if it could be changed, all previously computed routing values would become invalid and previously indexed documents might never be found again.

All document APIs (get, index, delete, bulk, update, and mget) accept a routing parameter that customizes the document-to-shard mapping. A custom routing value can be used to ensure that all related documents (for example, all documents belonging to the same user) are stored on the same shard.
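As an illustrative sketch using the Python Elasticsearch client (the index name, user IDs, and fields here are made up, and exact parameter names vary slightly between client versions):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local cluster

# Index two documents for the same (hypothetical) user with a custom
# routing value so that they end up on the same shard.
es.index(index="blog", id="1", routing="user_123",
         body={"title": "first post", "user": "user_123"})
es.index(index="blog", id="2", routing="user_123",
         body={"title": "second post", "user": "user_123"})

# The same routing value has to be supplied when fetching the document,
# otherwise the wrong shard would be asked for it.
doc = es.get(index="blog", id="1", routing="user_123")
print(doc["_source"])
```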

2. How primary shards interact with replica shards

Suppose we have a cluster of three nodes containing an index called blog with two primary shards, each of which has two replica shards. Copies of the same shard are never allocated to the same node, as shown in the following illustration:

We can send our requests to any node in the cluster, and every node is capable of handling them: every node knows where each document in the cluster lives, so it can forward the request directly to the node that holds the data.

In the following examples we send all requests to Node 1, which we will call the coordinating node.

2.1 Creating, indexing, and deleting documents

Create, index, and delete requests are write operations: they must complete successfully on the primary shard before they can be copied to the associated replica shards.

The interactive process is shown in the following figure:

The following are the steps required to successfully create, index, or delete a document on the primary and replica shards:

1. The client sends a create, index, or delete request to Node 1.
2. Node 1 uses the document's _id to determine that the document belongs to shard 0, and that the primary shard of shard 0 (P0) lives on Node 3, so Node 1 forwards the request to Node 3.
3. Node 3 executes the request on the primary shard. If it succeeds, Node 3 forwards the request in parallel to the replica shards (R0) on Node 1 and Node 2.
4. Once all replica shards report that they executed the request successfully, Node 3 reports success to the coordinating node (Node 1), and the coordinating node reports success to the client.

By the time the client receives a successful response, the document change has been applied on the primary shard and on all replica shards: the change is safe.
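One way to observe this is to look at the _shards section of the write response, which reports how many shard copies acknowledged the operation (a sketch with the Python client; the exact numbers depend on your index settings):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local cluster

response = es.index(index="blog", id="1",
                    body={"title": "hello", "body": "a test document"})

# 'total' is the number of shard copies the write should reach,
# 'successful' is how many acknowledged it, 'failed' how many did not.
print(response["_shards"])
# e.g. {'total': 3, 'successful': 3, 'failed': 0} for 1 primary + 2 replicas
```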

There are optional request parameters that allow you to influence this process, possibly improving performance at the cost of data safety. These options are rarely used, because Elasticsearch is already fast, but they are explained here for completeness.

2.1.1 Consistency

By default, the primary shard requires a quorum, or majority, of shard copies (where a shard copy can be the primary shard or a replica shard) to be available before attempting a write operation. This is to prevent writing data to the "wrong side" of a network partition. A quorum is defined as follows:

int((primary + number_of_replicas) / 2) + 1

The allowed values for consistency are one (just the primary shard), all (the primary and all replicas), and the default quorum, a majority of the shard copies.

Note that number_of_replicas is the number of replicas specified in the index settings, not the number of replicas currently active. If the index is configured with three replicas, the quorum is:

int((primary + 3 replicas) / 2) + 1 = 3
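The same arithmetic as a tiny sketch:

```python
def quorum(number_of_replicas: int) -> int:
    """int((primary + number_of_replicas) / 2) + 1, with one primary shard."""
    return (1 + number_of_replicas) // 2 + 1

print(quorum(3))  # 3 -> with three replicas, three shard copies must be active
print(quorum(1))  # 2 -> but see the note below about the single-node case
```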

However, if you start only two nodes, there will not be enough active shard copies to satisfy the quorum, and you will be unable to index or delete any documents.

2.1.2 Timeout

What happens if there are not enough replica shards? Elasticsearch waits, hoping that more shards will appear. By default it waits up to 1 minute. If you want, you can use the timeout parameter to make it give up earlier: 100 means 100 milliseconds, and 30s means 30 seconds.
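For example, with the Python client (a sketch; the timeout value is only an illustration):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local cluster

# Give up after 100 milliseconds instead of the default 1 minute if not
# enough shard copies are available for the write.
es.index(index="blog", id="1", body={"title": "hello"}, timeout="100ms")
```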

Note

A new index has 1 replica shard by default, which would mean that two active shard copies are needed to satisfy the quorum. These default settings, however, would prevent us from doing anything useful on a single-node cluster. To avoid this problem, the quorum requirement is only enforced when number_of_replicas is greater than 1.

2.2 Retrieving documents

We can retrieve a document from the primary shard or from any of its replica shards, as shown in the following diagram:

The following are the steps required to retrieve a document from a primary or replica shard:

1. The client sends a GET request to Node 1.
2. Node 1 uses the document's _id to determine that the document belongs to shard 0. All three nodes hold a copy of shard 0 (R0 on Node 1, R0 on Node 2, P0 on Node 3). This time it forwards the request to Node 2.
3. Node 2 returns the document to Node 1, and Node 1 returns the document to the client.
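From the client's point of view this whole path is a single call (a minimal sketch with the Python client):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local cluster

# The node that receives this request becomes the coordinating node and
# forwards it to either the primary or a replica copy of the shard.
doc = es.get(index="blog", id="1")
print(doc["found"], doc["_source"])
```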

For read requests, the coordinating node chooses a different shard copy for each request in order to balance the load: it round-robins through all of the shard copies.

It is possible that, at the moment a document is being retrieved, an already-indexed document exists on the primary shard but has not yet been copied to the replica shards. In that case a replica shard might report that the document does not exist while the primary shard returns it successfully. Once the index request has returned success to the user, the document is available on both the primary and the replica shards.

2.3 Partially updating a document

The update API combines the read and write patterns explained above, as shown in the following illustration:

Here are the steps required to partially update a document:

1. The client sends an update request to Node 1.
2. Node 1 uses the document's _id to determine that the document belongs to shard 0 and that the primary shard of shard 0 (P0) lives on Node 3, so it forwards the request to Node 3.
3. Node 3 retrieves the document from the primary shard (P0), changes the JSON in the _source field, and tries to re-index the document on the primary shard.
4. If another process has modified the document in the meantime, step 3 is retried; the request is abandoned after retry_on_conflict unsuccessful attempts.
5. If Node 3 updates the document successfully, it forwards the new version of the document in parallel to the replica shards on Node 1 and Node 2, which re-index it. Once all replica shards report success, Node 3 reports success to the coordinating node, and the coordinating node reports success to the client.
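A sketch of a partial update with the Python client, including the retry_on_conflict setting mentioned in step 4 (the field names are made up, and parameter names vary slightly between client versions):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local cluster

# Partially update one field. If another write modifies the document
# between the read and the re-index on the primary shard, retry up to
# three times before giving up.
es.update(index="blog", id="1",
          body={"doc": {"title": "an updated title"}},
          retry_on_conflict=3)
```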

Document-based replication

When the primary shard forwards changes to its replica shards, it does not forward the update request; instead it forwards the new version of the whole document. Keep in mind that these changes are forwarded to the replica shards asynchronously, and there is no guarantee that they arrive in the same order in which they were sent. If Elasticsearch forwarded only the change request, the changes could be applied in the wrong order, resulting in a corrupted document.

2.4 Multi-document mode

The mget and bulk APIs follow a pattern similar to the single-document APIs. The difference is that the coordinating node knows which shard each document lives in: it breaks the multi-document request into a multi-document request per shard and forwards these in parallel to every participating node.

Once it has received an answer from every node, it collates the responses into a single response and returns it to the client.

2.4.1 Mget

As shown in the following illustration:

The following are the steps required to retrieve multiple documents with a single mget request:

1. The client sends an mget request to Node 1.
2. Node 1 builds a multi-document get request per shard and forwards these requests in parallel to the nodes hosting each required primary or replica shard.
3. Once all responses have been received, Node 1 assembles them into a single response and returns it to the client.
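A sketch of such a request with the Python client (document IDs are made up):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local cluster

# Fetch several documents in one round trip; the coordinating node fans
# the request out per shard and merges the per-shard answers.
resp = es.mget(index="blog", body={"ids": ["1", "2", "3"]})
for doc in resp["docs"]:
    print(doc["_id"], doc.get("found"))
```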

2.4.2 Bulk

The bulk API allows multiple create, index, delete, and update requests to be performed in a single bulk request, as shown in the following illustration:

The bulk API is executed in the following sequence of steps:

1. The client sends a bulk request to Node 1.
2. Node 1 builds a bulk request per shard and forwards those requests in parallel to the nodes hosting the involved primary shards.
3. Each primary shard executes its operations serially, one after another. As each operation succeeds, the primary shard forwards the new document (or the deletion) to its replica shards in parallel, and then moves on to the next operation. Once all replica shards report success for all operations, the node reports success to the coordinating node, which collates the responses and returns them to the client.

The bulk API can also accept the consistency parameter at the top level of the whole bulk request, and the routing parameter in the metadata of each individual request.
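A sketch of a bulk body that carries a routing value in each action's metadata (index names, IDs, and fields are made up; older Elasticsearch versions spell the metadata key _routing rather than routing):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local cluster

# Action metadata and document source alternate in the bulk body; each
# action can carry its own routing value.
actions = [
    {"index": {"_index": "blog", "_id": "1", "routing": "user_123"}},
    {"title": "first post", "user": "user_123"},
    {"update": {"_index": "blog", "_id": "1", "routing": "user_123"}},
    {"doc": {"title": "first post, edited"}},
    {"delete": {"_index": "blog", "_id": "2", "routing": "user_456"}},
]
resp = es.bulk(body=actions)
print(resp["errors"])  # False if every operation succeeded
```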


