MongoDB Reading Notes (via 3.0) (05) _ "Sharding" (03) _ A short ramble on shard keys and the theory of hashing

Source: Internet
Author: User

Something urgent came up, so I can't write at length today. I'll keep it short and start with a few small concepts about shard keys and hashing, then continue later.

Shard Keys

The most important thing about MongoDB's sharding is the shard key. Data is partitioned by the shard key, and chunks are split and moved according to the configured chunk size (described later). Some characteristics of the shard key:

A shard key cannot be a multikey index

What is a multikey index? An index on a field that holds an array, such as an addr array, is a multikey index. A shard key can be a compound key made up of several fields, but it cannot be an array-based index of the multikey type.
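The restriction can be pictured with a small check, given here as a sketch only. The function name `multikey_fields` and the sample document are hypothetical and are not part of MongoDB; the code simply reports which candidate shard key fields hold an array in a given document.

```python
def multikey_fields(doc, key_fields):
    """Return the candidate shard key fields whose value in `doc` is an array."""
    return [field for field in key_fields if isinstance(doc.get(field), list)]


# Hypothetical sample document: addr holds an array, the other fields are scalars.
doc = {"user_id": 42, "city": "Osaka", "addr": ["home", "office"]}

# A compound key over scalar fields has no array-valued component.
print(multikey_fields(doc, ["user_id", "city"]))

# Including addr would require a multikey index, which a shard key cannot be.
print(multikey_fields(doc, ["user_id", "addr"]))
```

Any field reported by the check would have to be left out of the shard key.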

Hashed shard key

Anything with "hash" in its name serves one purpose: scattering data evenly while keeping each lookup tightly focused.
The hash algorithm spreads entries evenly across buckets, and the key is then used to retrieve its value quickly.
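As a rough illustration of "evenly scattered", the sketch below hashes sequential keys into a fixed number of buckets and counts how many land in each. The helper `bucket_for` and the bucket count are assumptions made for the example; this is a generic hash-and-modulo scheme, not MongoDB's internal hashing.

```python
import hashlib
from collections import Counter


def bucket_for(key, n_buckets):
    """Map a key to a bucket by hashing it (illustrative, not MongoDB's hash)."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % n_buckets


# Sequential keys still end up spread over all buckets.
counts = Counter(bucket_for(k, 4) for k in range(10_000))
print(sorted(counts.items()))
```

The same key always maps to the same bucket, which is what makes a later lookup by key quick.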

For write scaling

Two properties of a shard key matter most. One is cardinality, meaning the key covers a wide range of distinct values across the collection. The other is whether the key is monotonic, as with a timestamp or MongoDB's own ObjectId. Suppose we use a timestamp as the shard key. It gives us high cardinality and monotonic order, but because consecutive key values are stored next to each other, new data keeps landing in the same chunk, and therefore on the same shard. We may have 10 shard nodes, yet a single node takes every write while the rest sit idle.
A shard key therefore also needs a degree of randomness, so that writes keep all the shards busy together, much like a load balancer spreading load.
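The hot-shard effect can be reproduced with a toy model. Everything below is an assumption made for illustration: 10 shards, range boundaries at multiples of 1000, and an MD5-based stand-in for a hashed key. It compares where a burst of new, ever-increasing timestamps would be routed under range partitioning and under hashing.

```python
import bisect
import hashlib
from collections import Counter

N_SHARDS = 10
# Hypothetical range boundaries on the timestamp: 9 split points give 10 ranges.
SPLIT_POINTS = [1000 * i for i in range(1, N_SHARDS)]


def range_shard(ts):
    """Shard index under range partitioning on the raw timestamp."""
    return bisect.bisect_right(SPLIT_POINTS, ts)


def hashed_shard(ts):
    """Shard index under a hash of the timestamp (illustrative hash only)."""
    digest = hashlib.md5(str(ts).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % N_SHARDS


# New writes carry timestamps larger than every existing split point.
new_writes = range(20_000, 21_000)
range_counts = Counter(range_shard(t) for t in new_writes)
hashed_counts = Counter(hashed_shard(t) for t in new_writes)

print("range :", dict(range_counts))
print("hashed:", dict(sorted(hashed_counts.items())))
```

In this model every new write under range partitioning falls into the one range that covers the largest values, while the hashed variant touches all ten shards.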

About queries in a sharded cluster

For queries, mongos provides the interface that the application calls. When the application issues a query, mongos looks up the relevant metadata on the config servers to locate the right shards and then performs the lookup there. The choice of shard key therefore directly affects both the results and the speed of queries.

About single-shard lookups

In general, the fastest query is one that mongos can send to a single shard. If a query does not contain the shard key, mongos has to go through every shard, which has much the same effect as an unindexed full table scan in an ordinary relational database. And because there are many shards, such a query scatters the work and then has to gather and combine the results, which is time consuming. If the shard key does appear in the search criteria, then even when the data is spread over several shards, mongos knows exactly where to look and the result is much better. So when choosing a shard key for a collection, first determine which fields are used most often in searches and find the ones those operations depend on most. If one field alone does not identify the data well enough to narrow the search, consider whether adding another field would make the key more selective.
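The difference between a targeted query and one that visits every shard can be sketched with a toy router. The chunk boundaries, shard names, the `user_id` shard key, and the function `target_shards` are all hypothetical; the real mongos routing logic is more involved, and this model only understands equality and `$in` conditions.

```python
import bisect

# Hypothetical chunk layout on shard key "user_id": three split points, four chunks.
SPLIT_POINTS = [100, 200, 300]
CHUNK_OWNERS = ["shardA", "shardB", "shardC", "shardA"]
ALL_SHARDS = sorted(set(CHUNK_OWNERS))


def target_shards(query, shard_key="user_id"):
    """Return the shards a query must visit in this simplified model."""
    if shard_key not in query:
        return ALL_SHARDS  # no shard key in the query: every shard is asked
    cond = query[shard_key]
    if isinstance(cond, dict):
        if set(cond) != {"$in"}:
            return ALL_SHARDS  # other operators are not modelled in this sketch
        values = cond["$in"]
    else:
        values = [cond]
    owners = {CHUNK_OWNERS[bisect.bisect_right(SPLIT_POINTS, v)] for v in values}
    return sorted(owners)


print(target_shards({"user_id": 150}))                  # one shard
print(target_shards({"user_id": {"$in": [50, 250]}}))   # a few known shards
print(target_shards({"name": "alice"}))                 # all shards
```

Even the `$in` case, which touches more than one shard, stays cheaper than the last query because the router knows in advance which shards hold matching ranges.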

About sorting

Sorting across shards is really a sort followed by a merge. It is easy to picture: each shard returns its own sorted result, and the results are merged together into one sorted whole.
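A minimal sketch of that sort-then-merge idea, using plain Python lists to stand in for shards. The function names and the `score` field are made up for the example; `heapq.merge` performs the merge of already sorted streams.

```python
import heapq
import itertools


def shard_find_sorted(docs, sort_field):
    """What one shard does: sort its own documents locally."""
    return sorted(docs, key=lambda d: d[sort_field])


def merged_sort(shards, sort_field, limit=None):
    """What the router does: merge the per-shard sorted streams into one."""
    streams = [shard_find_sorted(docs, sort_field) for docs in shards]
    merged = heapq.merge(*streams, key=lambda d: d[sort_field])
    if limit is not None:
        merged = itertools.islice(merged, limit)
    return list(merged)


# Three hypothetical shards, each holding part of the collection.
shards = [
    [{"score": 5}, {"score": 1}],
    [{"score": 3}],
    [{"score": 4}, {"score": 2}],
]
print([d["score"] for d in merged_sort(shards, "score")])
print([d["score"] for d in merged_sort(shards, "score", limit=2)])
```

Because each stream is already sorted, the merge step only compares the current head of every stream instead of re-sorting everything.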

Chunks that cannot be split

Later on, some chunks may turn out to be impossible to split. This is largely caused by choosing a shard key whose granularity is too coarse.
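Why a coarse key leaves a chunk unsplittable can be shown with a toy split routine. This is an assumption-laden sketch, not MongoDB's actual split algorithm: `find_split_point` is a hypothetical helper that looks for a key value dividing a chunk into two non-empty halves and gives up when every document shares the same key.

```python
def find_split_point(chunk_keys):
    """Return a key that splits the chunk into two non-empty parts, or None.

    Documents with key < split go to the left chunk, the rest to the right.
    """
    keys = sorted(chunk_keys)
    if not keys:
        return None
    smallest = keys[0]
    median = keys[len(keys) // 2]
    if median != smallest:
        return median
    # The median equals the smallest key: fall back to the first larger key.
    for key in keys:
        if key != smallest:
            return key
    return None  # every document has the same key value: nothing to split on


print(find_split_point([1, 2, 3, 4]))                 # fine-grained key
print(find_split_point(["JP", "JP", "JP", "US"]))     # coarse key, still splittable
print(find_split_point(["JP"] * 1000))                # single value: cannot split
```

All documents with the same shard key value must stay in the same chunk, so a chunk holding only one value can grow but never divide.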

I'm taking the high-speed train tomorrow, so I'll pick this up again afterwards.
PS: today I filed another bug in MongoDB's Jira.
Last time I foolishly got a command wrong, and someone replied very politely to tell me the mistake was mine. Today, though, I spotted a simple error in the documentation. Haha, this time I'm not the one who is wrong.

To be continued???
