MongoDB Shard Key

Source: Internet
Author: User
Tags: mongodb

MongoDB partitions the documents of a collection according to the shard key and then distributes them across the shards of the sharded cluster.

The shard key is either a single indexed field or a set of fields covered by a compound index, and it must exist in every document of the collection.

MongoDB splits the data in the collection into ranges of shard key values. The ranges do not overlap, and each range is associated with one chunk.
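
The mapping from a shard key value to a chunk can be pictured as a lookup in a sorted list of range boundaries. The following is a minimal sketch under assumed values: the split points and the function names chunk_for and chunk_range are hypothetical and only illustrate the idea of non-overlapping ranges, not how MongoDB stores its metadata.

```python
from bisect import bisect_right

# Hypothetical split points for a numeric shard key (assumption for illustration).
# They define four non-overlapping chunks:
#   [minKey, 100), [100, 200), [200, 300), [300, maxKey)
SPLIT_POINTS = [100, 200, 300]


def chunk_for(key, split_points=SPLIT_POINTS):
    """Return the index of the chunk whose half-open range contains key."""
    return bisect_right(split_points, key)


def chunk_range(index, split_points=SPLIT_POINTS):
    """Return the (low, high) bounds of a chunk; None stands for minKey or maxKey."""
    low = split_points[index - 1] if index > 0 else None
    high = split_points[index] if index < len(split_points) else None
    return low, high
```

Because the ranges are half-open and sorted, every key value belongs to exactly one chunk, which is the property the paragraph above describes.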

Choosing a shard key

Choose a shard key that distributes chunks as evenly as possible across the shards of the cluster. An uneven distribution hurts cluster performance:

    • If all chunks are assigned to a single shard, the capacity of the whole cluster is the capacity of that one shard.
    • If chunks are unevenly distributed and concentrated on one shard, that shard becomes a bottleneck, because the total time of an operation depends on the slowest shard.

To choose a good shard key, you also need to understand the following properties of the shard key:

    • Cardinality
    • Frequency
    • Monotonic change

Cardinality

The cardinality of the shard key determines the maximum number of chunks that the balancer can create.

At any given moment, a unique shard key value can exist in at most one chunk. If a shard key has a cardinality of 4, the cluster can hold at most 4 useful chunks, each storing one unique shard key value. Adding more shards beyond that brings no benefit.

High cardinality alone does not guarantee that data is distributed evenly across the cluster, however. Distribution also depends on frequency and monotonicity, so all three factors have to be taken into account when choosing a shard key.
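
The cardinality limit can be illustrated with a small calculation. This is only a sketch: the function names and the sample "region" values are assumptions made up for the example.

```python
def max_useful_chunks(shard_key_values):
    """Upper bound on chunks: at most one chunk per distinct shard key value."""
    return len(set(shard_key_values))


def busy_shards(shard_key_values, shard_count):
    """Number of shards that can hold data, given that a value lives in only one chunk."""
    return min(shard_count, max_useful_chunks(shard_key_values))


# Hypothetical shard key with a cardinality of 4.
regions = ["north", "south", "east", "west"] * 1000
```

With these sample values, max_useful_chunks(regions) is 4, so busy_shards(regions, 10) is also 4: the remaining six shards of a ten-shard cluster would stay empty.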

Frequency

The frequency of the shard key refers to how often a given shard key value occurs in the data.

If most documents contain only a small subset of the possible shard key values, the chunks that store those documents become a bottleneck for the cluster. If most documents share a single shard key value, the corresponding chunk grows large and cannot be split, which reduces the performance of the cluster.

If your data model requires sharding on a key with high-frequency values, consider using a compound index that includes a unique or low-frequency value instead.
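
A rough way to see the effect of frequency is to measure how large a share of the documents carries the most common key value. The sketch below uses invented data (a hypothetical customer id and order number) and a hypothetical helper name, largest_value_share.

```python
from collections import Counter


def largest_value_share(shard_key_values):
    """Fraction of documents that carry the most frequent shard key value."""
    counts = Counter(shard_key_values)
    return max(counts.values()) / len(shard_key_values)


# Hypothetical data: 900 of 1000 documents share one customer id.
customer_ids = ["acme"] * 900 + [f"cust-{i}" for i in range(100)]

# Pairing the frequent field with a unique one (a hypothetical order number)
# gives every document its own compound key value.
compound_keys = [(cid, order_no) for order_no, cid in enumerate(customer_ids)]
```

Here the single-field key puts 90% of the documents under one value, so they end up in one indivisible chunk, while the compound key brings the largest share down to one document in a thousand.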

Monotonically changing shard keys

A monotonically changing shard key is one whose value only ever increases or only ever decreases. With such a key, inserts are more likely to be routed to a single shard in the cluster rather than spread evenly.

This happens because every sharded collection has two chunks that capture out-of-range shard key values: one with an upper bound of maxKey, which captures values above the current maximum, and one with a lower bound of minKey, which captures values below the current minimum.

If the shard key increases monotonically, then after some time all new inserts are routed to the chunk whose upper bound is maxKey (positive infinity). Similarly, inserts with a monotonically decreasing shard key are routed to the chunk whose lower bound is minKey (negative infinity). The shard that holds that chunk becomes the bottleneck for write operations.
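
The hot spot can be reproduced with a small simulation of range-based routing. The split points, key values, and the function name inserts_per_chunk are assumptions chosen for illustration.

```python
from bisect import bisect_right
from collections import Counter


def inserts_per_chunk(keys, split_points):
    """Count how many inserted keys fall into each chunk of a range-sharded collection."""
    return Counter(bisect_right(split_points, key) for key in keys)


# Hypothetical chunk boundaries that existed before the new inserts:
#   chunk 0: [minKey, 1000), chunk 1: [1000, 2000),
#   chunk 2: [2000, 3000),   chunk 3: [3000, maxKey)
split_points = [1000, 2000, 3000]

# Monotonically increasing keys, all larger than the last split point.
new_keys = range(3001, 4001)

hits = inserts_per_chunk(new_keys, split_points)
```

Every one of the 1000 new keys lands in chunk 3, the chunk bounded above by maxKey, so the shard that owns it takes all of the write load.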

Unique index

MongoDB can guarantee uniqueness across shards only when the unique index contains the entire shard key as its prefix [2].
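
The prefix rule can be expressed as a simple check on field names. This is a sketch, not a MongoDB API: the function name and the field names in the usage note are hypothetical.

```python
def can_be_unique_across_shards(index_fields, shard_key_fields):
    """Return True when the unique index starts with the complete shard key."""
    prefix_length = len(shard_key_fields)
    return list(index_fields[:prefix_length]) == list(shard_key_fields)
```

For a collection sharded on a hypothetical user_id field, a unique index on (user_id, email) passes the check, while a unique index on email alone does not.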

Hashed sharding

Hashed sharding uses a hashed index of a single field as the shard key to partition the data.

Hashed sharding distributes data more evenly across the cluster at the expense of query isolation. Documents with adjacent shard key values are unlikely to be on the same shard, so mongos is more likely to broadcast a range query to all shards. An equality query, however, can still be routed by mongos to a single shard.
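
The trade-off can be sketched with a toy router. Note the assumptions: MD5 truncated to 64 bits is used here only as a stand-in hash, and the modulo placement is a simplification, since a real cluster assigns ranges of hash values to chunks. The function name shard_for is hypothetical.

```python
import hashlib
from collections import Counter


def shard_for(value, shard_count):
    """Map a key to a shard through a hash of its value.

    Stand-in only: this is not MongoDB's internal hash function, and real
    clusters map hash ranges to chunks instead of taking a modulo.
    """
    digest = hashlib.md5(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# Monotonically increasing keys spread over four hypothetical shards.
distribution = Counter(shard_for(key, 4) for key in range(10_000))

# Shards that a range query over 20 adjacent key values would have to visit.
touched = {shard_for(key, 4) for key in range(100, 120)}
```

The counts in distribution come out roughly equal even though the keys are monotonic, while touched contains several shards for a narrow range of adjacent values. An equality lookup always hashes to the same shard, which is why it can still be targeted.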

Hashed shard key

The field you choose as a hashed shard key should have high cardinality. Without high cardinality, the data is concentrated on a few shards instead of being spread evenly across all of them, and the overloaded shards become bottlenecks.

Hashed sharding is ideal for shard keys whose values change monotonically, such as ObjectId values or timestamps.

Shard key size limit

The size of a shard key cannot exceed 512 bytes.

Shard key index type

The shard key index can be an ascending index on the shard key, a compound index that starts with the shard key and specifies ascending order for the shard key fields, or a hashed index.

The shard key index cannot be a multikey index, a text index, or a geospatial index on the shard key fields.
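
These rules can be summarized as a small validation function. It is a sketch under stated assumptions: the index is represented as an ordered list of (field, kind) pairs, which is a representation invented for this example, and the function name is hypothetical. Whether an index is multikey depends on the stored data (array values), so that case is not covered here.

```python
# Index kinds that the text rules out for shard key fields.
UNSUPPORTED_KINDS = {"text", "2d", "2dsphere"}


def is_valid_shard_key_index(index_spec, shard_key_fields):
    """Check an index against the shard key index rules.

    index_spec is an ordered list of (field, kind) pairs, where kind is
    1 (ascending), -1 (descending), "hashed", "text", "2d" or "2dsphere".
    """
    prefix_length = len(shard_key_fields)
    prefix = index_spec[:prefix_length]

    # The index must start with the shard key fields, in order.
    if [field for field, _ in prefix] != list(shard_key_fields):
        return False

    kinds = [kind for _, kind in prefix]
    if any(kind in UNSUPPORTED_KINDS for kind in kinds):
        return False

    # A hashed index on a single shard key field is allowed.
    if kinds == ["hashed"]:
        return True

    # Otherwise every shard key field must be indexed in ascending order.
    return all(kind == 1 for kind in kinds)
```

For example, [("user_id", 1), ("ts", -1)] is accepted for a shard key on user_id because only the shard key prefix has to be ascending, while [("user_id", -1)] is rejected.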

The shard key cannot be changed

If you must change the shard key:

    • Export all data.
    • Drop the original sharded collection.
    • Configure sharding with the new shard key.
    • Pre-split the shard key range to ensure that the initial distribution is even.
    • Import the data.
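
For the pre-splitting step in the list above, one possible approach is to derive split points from a sample of the exported shard key values so that each initial chunk receives a similar share of the data. The function name presplit_points and the sample values are hypothetical, and the sketch assumes a non-empty sample and only computes the boundaries; applying them to a cluster is a separate step.

```python
def presplit_points(sample_keys, chunk_count):
    """Pick up to chunk_count - 1 split points from a sample of shard key values.

    The sample is sorted and cut at evenly spaced positions, so each
    resulting range covers roughly the same number of sampled documents.
    """
    ordered = sorted(sample_keys)
    step = len(ordered) / chunk_count
    points = [ordered[int(i * step)] for i in range(1, chunk_count)]
    # Remove repeated boundaries, which would otherwise describe empty chunks.
    return sorted(set(points))
```

For a sample of the integers 0 to 99 and four chunks, the function returns the boundaries 25, 50 and 75. If the sample is dominated by one value, fewer boundaries come back, which is a hint that the new key has a frequency problem.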

Shard key values in a document cannot be changed

You cannot modify the value of the shard key fields in an existing document.

References

    1. https://docs.mongodb.com/manual/core/sharding-shard-key/#shard-key
    2. https://docs.mongodb.com/manual/reference/limits/#sharded-clusters
