MongoDB Reading Notes (via 3.0) (05) _ [Sharding] (03) _ A Little Theory on Shard Keys and Hashing

Tags: mongodb, sharding

The concepts of the shard key and hashing need to be written down first.

Shard Keys

The most important thing in MongoDB sharding is the shard key. Data is partitioned by the shard key, and chunks are split and moved according to the configured chunk size (described later). Characteristics of the shard key:

The Shard Key cannot be a multikey index.


What is a multikey index? An index on an array field, such as an addr[] array, is a so-called multikey index. The shard key can be made up of several fields (a compound key), but it cannot be a multikey index on an array field.
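The rule above can be sketched as a small local check. This is only an illustration, not a MongoDB API: the helper name is_array_free_shard_key and the sample document (including the name and zip fields) are hypothetical, and only the addr array comes from the text.

```python
def is_array_free_shard_key(doc, key_fields):
    """Return True if none of the candidate shard key fields holds an array.

    Hypothetical helper for illustration; MongoDB performs its own checks.
    """
    for field in key_fields:
        if isinstance(doc.get(field), (list, tuple)):
            return False  # an array value would require a multikey index
    return True


# Hypothetical sample document; "addr" is an array field.
doc = {"_id": 1, "name": "alice", "zip": "100000", "addr": ["Beijing", "Shanghai"]}

single_ok = is_array_free_shard_key(doc, ["name"])           # single field
compound_ok = is_array_free_shard_key(doc, ["name", "zip"])  # compound key
array_ok = is_array_free_shard_key(doc, ["addr"])            # array field
print(single_ok, compound_ok, array_ok)
```

A compound key over scalar fields passes the check, while any key that includes the array field does not.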

Hash Shard Key

Anything that involves hashing serves one purpose: scattering data evenly while keeping it quick to locate.
The hash algorithm spreads the data out evenly, and the key is used to retrieve the value quickly.
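To make the "even scattering" idea concrete, here is a minimal sketch that hashes sequential keys into a few buckets. MD5 and the modulo step are stand-ins chosen for illustration, and bucket_for and the bucket count are hypothetical names; this is not meant to reproduce how MongoDB places hashed keys internally.

```python
import hashlib


def bucket_for(key, num_buckets):
    """Map a key to a bucket by hashing it (illustrative stand-in hash)."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_buckets


NUM_BUCKETS = 4
counts = [0] * NUM_BUCKETS
for key in range(10000):  # sequential, neighbouring keys
    counts[bucket_for(key, NUM_BUCKETS)] += 1

print(counts)  # roughly equal counts per bucket
```

The same key always hashes to the same bucket, which is what makes retrieval by key fast, while neighbouring keys end up in unrelated buckets.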

Horizontal scaling of writes

The shard key has two important properties. One is cardinality: the range of distinct values should be wide, and every document in the collection must have the key. The other is "monotonicity", as with a timestamp or MongoDB's own ObjectId. Suppose we use a timestamp as the shard key. It is monotonic and has high cardinality. However, because such keys only ever increase, all new data keeps landing in the same chunk, that is, on the same shard. We may have 10 shard nodes, yet a single node takes all the writes while the rest sit idle.
Therefore, the shard key should also have some randomness. In other words, write operations should be able to put all shards to work together, spreading the load much like a load balancer (LB) does.
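The hot-shard effect described above can be sketched with a small simulation. Everything here is an assumption made for illustration: the fixed split points, the helper names range_shard and hashed_shard, and the use of MD5 with a modulo as a stand-in for hashed placement. Chunk splitting and balancing are ignored.

```python
import bisect
import hashlib
from collections import Counter

NUM_SHARDS = 10
# Hypothetical fixed ranges: shard i owns [i*1000, (i+1)*1000),
# and the last shard owns everything from 9000 upward.
SPLIT_POINTS = [1000 * i for i in range(1, NUM_SHARDS)]


def range_shard(key):
    """Pick the shard whose key range contains the key."""
    return bisect.bisect_right(SPLIT_POINTS, key)


def hashed_shard(key):
    """Pick a shard from a hash of the key (illustrative stand-in)."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


# Ever-increasing keys, like timestamps arriving after the ranges were set.
new_keys = range(20000, 25000)

range_load = Counter(range_shard(k) for k in new_keys)
hashed_load = Counter(hashed_shard(k) for k in new_keys)

print("range :", dict(range_load))   # every write lands on one shard
print("hashed:", dict(hashed_load))  # writes spread over all shards
```

With the range-based placement every new key falls into the last range, so one shard receives all 5000 writes. With the hashed placement the same keys are spread across all 10 shards.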

Query in Shard

For queries, mongos provides the interface that applications call. When the application issues a query, mongos looks up the corresponding metadata from the config server to locate the shard, and then runs the query there. The shard key therefore directly affects query performance and speed.

About query isolation in a Shard

Generally, the fastest queries are those that mongos can route to a single shard. If a query does not contain the shard key, mongos has to send it to every shard. This is comparable to a full table scan without an index in an ordinary RDB. Because there are many shard instances, such a query works like a distributed operation followed by a second aggregation step, which is very time-consuming. If the shard key appears in the query conditions, then even when a few shards have to be queried, the result is much better, because mongos can target the query and knows which shards hold the matching key values.

Therefore, when choosing a shard key for a collection, first determine which fields are used most often in queries, and find the most effective combination among them. For example, a single field may not be selective enough to isolate the data on its own. In that case, consider whether another field can be combined with it to widen the range of values and increase the selectivity of the key.
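The difference between a targeted query and a query sent to every shard can be sketched with a toy router. This is an assumption-laden illustration, not mongos itself: the chunk map, the shard names, the user_id shard key and the helper shards_for_query are all hypothetical, and only equality queries are handled.

```python
import bisect

SHARD_KEY = "user_id"  # hypothetical shard key field
# Hypothetical chunk map: split points on the shard key, and the shard
# that owns each of the resulting four ranges.
SPLIT_POINTS = [100, 200, 300]
CHUNK_OWNERS = ["shard0", "shard1", "shard2", "shard0"]
ALL_SHARDS = sorted(set(CHUNK_OWNERS))


def shards_for_query(query):
    """Return the shards a router would have to contact for an equality query."""
    if SHARD_KEY not in query:
        # No shard key in the query: every shard must be asked.
        return ALL_SHARDS
    chunk_index = bisect.bisect_right(SPLIT_POINTS, query[SHARD_KEY])
    return [CHUNK_OWNERS[chunk_index]]


targeted = shards_for_query({"user_id": 150})
broadcast = shards_for_query({"name": "alice"})
print(targeted)   # one shard
print(broadcast)  # all shards
```

The query that includes the shard key touches a single shard, while the query on another field has to be sent to all of them and the results combined afterwards.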

About sorting

Sorting across shards is really a sort performed at merge time. After each shard returns its results, they are merged together and a final sort is carried out over the combined result.
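The merge step can be sketched as follows, assuming (this is an assumption of the sketch) that each shard hands back its own results already sorted on the sort field. The per-shard result lists and the score field are made up for illustration.

```python
import heapq

# Hypothetical per-shard results, each already sorted by "score".
shard_results = [
    [{"score": 3}, {"score": 9}, {"score": 12}],
    [{"score": 1}, {"score": 7}],
    [{"score": 4}, {"score": 5}, {"score": 20}],
]

# Merge the sorted streams into one globally sorted stream.
merged = list(heapq.merge(*shard_results, key=lambda doc: doc["score"]))
scores = [doc["score"] for doc in merged]
print(scores)
```

The extra merge is work that a single unsharded server would not need to do, which is why sorted queries spanning many shards cost more.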

Indivisible Chunk

Some chunks cannot be split, which is mostly caused by choosing a shard key whose granularity is too coarse.
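Why a coarse shard key leads to chunks that cannot be split can be sketched with a toy split-point search. The helper find_split_point and its strategy (try the median, then the next larger value) are assumptions made for illustration and do not describe MongoDB's actual splitting logic.

```python
def find_split_point(keys):
    """Return a key value that divides the chunk into two non-empty halves.

    Documents with key < split point go left, the rest go right.
    Returns None when every document shares the same key value.
    """
    if not keys:
        return None
    ordered = sorted(keys)
    median = ordered[len(ordered) // 2]
    if median != ordered[0]:
        return median
    # The median equals the minimum: fall back to the first larger value.
    for key in ordered:
        if key > median:
            return key
    return None  # only one distinct value, the chunk is indivisible


fine_grained = find_split_point([1, 2, 3, 4])
skewed = find_split_point(["CN"] * 5 + ["US"])
coarse = find_split_point(["CN"] * 1000)
print(fine_grained, skewed, coarse)
```

When all documents in a chunk carry the same shard key value there is no value to split on, however large the chunk grows.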

I'm taking the high-speed train tomorrow. (-。-;)... I'll continue writing later.
P.S. Today I submitted another bug to MongoDB's JIRA.
Last time I was foolish and got it wrong myself: I had a command wrong, and someone replied very politely that the mistake was mine. But today I spotted a simple slip in the documentation. Haha, this time I'm not mistaken.

There are already too many other...
