MongoDB Shard Keys

Source: Internet
Author: User
Tags: hash, mongodb

1. Sharding

What sharding is. Sharding is the storage of data across multiple machines. When a data set outgrows a single server, that server's memory and disk I/O become a performance bottleneck. At that point there are two options: vertical scaling and horizontal scaling (sharding).

Vertical scaling means adding CPU and storage capacity to one machine. The price of high-end hardware rises out of proportion to the performance gained, so scaling up is expensive and has an upper limit.

Horizontal scaling (sharding) distributes the data across multiple servers. Each server is a separate database, and together the servers form one logical database. Writes and other operations are spread over different servers, which increases both capacity and throughput.

MongoDB documents are schemaless, with no fixed structure, so the data can only be partitioned horizontally. When a chunk exceeds the configured size, or holds more documents than the maximum allowed, MongoDB tries to split it. If the split fails, the chunk is marked as a jumbo chunk so that the split is not retried over and over. Chunks are split on the shard key; the common types of shard key are described below.
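The splitting behaviour described above can be sketched as a toy model. This is only an illustration, not MongoDB's implementation: the document-count limit `MAX_DOCS_PER_CHUNK`, the `unsplittable` flag, and the function names are all hypothetical, and a real cluster decides by chunk size rather than by a small fixed count.

```javascript
// Toy model of chunk splitting. MAX_DOCS_PER_CHUNK is a hypothetical limit
// that stands in for the real size threshold.
const MAX_DOCS_PER_CHUNK = 4;

// Split a chunk at (or just after) its median shard key value.
function splitChunk(chunk) {
  const keys = [...chunk.keys].sort((a, b) => a - b);
  const mid = Math.floor(keys.length / 2);
  // The split point must be greater than the smallest key,
  // otherwise the lower half would be empty.
  const splitPoint = keys.find((k, i) => i >= mid && k > keys[0]);
  if (splitPoint === undefined) {
    // Every document has the same shard key value: no split is possible.
    return [{ ...chunk, unsplittable: true }];
  }
  return [
    { min: chunk.min, max: splitPoint, keys: keys.filter(k => k < splitPoint) },
    { min: splitPoint, max: chunk.max, keys: keys.filter(k => k >= splitPoint) },
  ];
}

// Route a key to the chunk whose range [min, max) contains it,
// then split that chunk if it has grown past the limit.
function insertKey(chunks, key) {
  const i = chunks.findIndex(c => key >= c.min && key < c.max);
  chunks[i].keys.push(key);
  if (chunks[i].keys.length > MAX_DOCS_PER_CHUNK && !chunks[i].unsplittable) {
    chunks.splice(i, 1, ...splitChunk(chunks[i]));
  }
  return chunks;
}

const demo = [{ min: -Infinity, max: Infinity, keys: [] }];
for (const k of [1, 2, 3, 4, 5]) insertKey(demo, k);
// The single initial chunk is now two chunks: [-Infinity, 3) and [3, Infinity).
console.log(demo.map(c => [c.min, c.max, c.keys.length]));
```

Inserting the same key value five times instead leaves one chunk flagged as unsplittable, which is why a shard key needs enough distinct values.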

2. Shard Key Types

A shard key is a single field or a compound indexed field of a document. Once set, it cannot be changed. The shard key decides how data is split across shards, so the choice of shard key directly affects cluster performance.

MongoDB first divides the data into chunks according to the shard key. When a chunk exceeds the configured size (64 MB by default), it is split, and chunks are then migrated to other shards. The shard key types are as follows:

Note: The shard key is also an index that queries commonly use.

(1) Ascending shard key

This kind of shard key is common: timestamps, dates, auto-increment primary keys, ObjectId, _id, and so on. With such a key, writes concentrate on a single shard server instead of being dispersed, so that server is under heavy pressure and may become the performance bottleneck. Splitting chunks, on the other hand, is relatively easy.

Creating an ascending shard key: shard the bar collection of the foo database on the timestamp field.

mongos> use foo
mongos> db.bar.createIndex({"timestamp": 1})
mongos> sh.enableSharding("foo")
{"ok": 1}
mongos> sh.shardCollection("foo.bar", {"timestamp": 1})
{"collectionsharded": "foo.bar", "ok": 1}
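Why an ascending key concentrates writes can be seen in a small simulation. This is a sketch under assumptions: the chunk boundaries, the four shard names, and the helper functions are hypothetical, and the routing is a simplified stand-in for what mongos does.

```javascript
// Hypothetical chunk layout: the boundaries cut the key space into ranges,
// and each range (chunk) is owned by one shard.
const boundaries = [1000, 2000, 3000];
const owners = ["shard0000", "shard0001", "shard0002", "shard0003"];

// Find the shard that owns the chunk containing the key.
function shardFor(key) {
  let i = boundaries.findIndex(b => key < b);
  if (i === -1) i = boundaries.length; // beyond the last boundary: last chunk
  return owners[i];
}

// Count how many writes each shard receives.
function countWrites(keys) {
  const counts = {};
  for (const k of keys) {
    const shard = shardFor(k);
    counts[shard] = (counts[shard] || 0) + 1;
  }
  return counts;
}

// New timestamps are always larger than every existing boundary,
// so each insert lands in the last chunk.
const newTimestamps = Array.from({ length: 100 }, (_, i) => 3000 + i);
console.log(countWrites(newTimestamps)); // { shard0003: 100 }
```

All 100 writes go to the shard that holds the highest range, while the other three shards stay idle until chunks are migrated to them.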

(2) Hashed shard key

The advantage of using a hashed index field as the shard key is that data is distributed evenly across the nodes: writes are spread randomly over the shard servers, so the load on each server is balanced. Reads are random as well, however, and may hit more shards. In general, shard keys with random values (such as passwords, hashes, or MD5 digests) give relatively poor query isolation.

Creating a hashed shard key: shard the GridFS chunks collection on a hash of files_id.

mongos> use foo
mongos> db.fs.chunks.createIndex({"files_id": "hashed"})
mongos> sh.enableSharding("foo")
{"ok": 1}
mongos> sh.shardCollection("foo.fs.chunks", {"files_id": "hashed"})
{"collectionsharded": "foo.fs.chunks", "ok": 1}
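The effect of hashing on write distribution can be illustrated with a short sketch. It rests on several assumptions: FNV-1a is used only as a convenient stand-in (it is not the hash function MongoDB uses), the cluster size `SHARD_COUNT` is hypothetical, and taking the hash modulo the shard count simplifies the real scheme, in which the hashed values are range-partitioned into chunks.

```javascript
// FNV-1a (32-bit) string hash. A stand-in only, not MongoDB's hash function.
function fnv1a(str) {
  let hash = 2166136261;
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i);
    hash = Math.imul(hash, 16777619) >>> 0; // keep an unsigned 32-bit value
  }
  return hash;
}

const SHARD_COUNT = 4; // hypothetical cluster size

// Simplified placement: hash the key, then pick a shard by modulo.
function shardIndexFor(key) {
  return fnv1a(String(key)) % SHARD_COUNT;
}

// Count how many keys land on each shard.
function distribution(keys) {
  const counts = new Array(SHARD_COUNT).fill(0);
  for (const k of keys) counts[shardIndexFor(k)] += 1;
  return counts;
}

// Sequential ids, which would all hit one shard with an ascending key,
// are scattered over every shard once they are hashed.
const sequentialIds = Array.from({ length: 1000 }, (_, i) => i);
console.log(distribution(sequentialIds));
```

The same scattering is what hurts range queries: neighbouring key values end up on different shards, so a query over a range of keys has to ask many shards.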

(3) Compound shard key

When the database has no suitable single shard key, or the intended shard key has too low a cardinality (that is, too few distinct values; a day-of-week field, for example, has only 7), you can combine it with another field to form a compound shard key, or even add a redundant field for the purpose. The usual pattern is a coarse-grained field plus a fine-grained field.

Creating a compound shard key: shard the GridFS chunks collection on files_id combined with n.

mongos> sh.enableSharding("foo")
{"ok": 1}
mongos> sh.shardCollection("foo.fs.chunks", {"files_id": 1, "n": 1})
{"collectionsharded": "foo.fs.chunks", "ok": 1}
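How the fine-grained second field helps can be sketched as follows. The helper names and the sample documents are hypothetical, and the median-based split point is a simplification: the point is only that documents sharing one files_id cannot be separated by files_id alone, but can be once n is part of the key.

```javascript
// Compare two compound keys field by field, in shard key order.
function compareKeys(a, b, fields) {
  for (const f of fields) {
    if (a[f] < b[f]) return -1;
    if (a[f] > b[f]) return 1;
  }
  return 0;
}

// Return the median key as a split point, or null when the median equals
// the smallest key (all keys are the same, so nothing can be split).
function findSplitPoint(docs, fields) {
  const sorted = [...docs].sort((a, b) => compareKeys(a, b, fields));
  const median = sorted[Math.floor(sorted.length / 2)];
  return compareKeys(median, sorted[0], fields) > 0 ? median : null;
}

// Four pieces of one large file: same files_id, different n.
const pieces = [0, 1, 2, 3].map(n => ({ files_id: "fileA", n }));

console.log(findSplitPoint(pieces, ["files_id"]));      // null
console.log(findSplitPoint(pieces, ["files_id", "n"])); // { files_id: 'fileA', n: 2 }
```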

(4) Tag-aware sharding

To store data on specific shard servers, add a tag to each shard and then assign ranges of the shard key to those tags. For example, to keep 10.*.*.* (tag T) on shard0000 and 11.*.*.* (tag Q) on shard0001 or shard0002, use tags to tell the balancer how to distribute the chunks.

Creating tagged shards:

mongos> sh.addShardTag("shard0000", "T")
mongos> sh.addShardTag("shard0001", "Q")
mongos> sh.addShardTag("shard0002", "Q")
mongos> sh.addTagRange("foo.ips", {"ip": "010.000.000.000"}, {"ip": "011.000.000.000"}, "T")
mongos> sh.addTagRange("foo.ips", {"ip": "011.000.000.000"}, {"ip": "012.000.000.000"}, "Q")
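The tag rules above amount to a lookup from key range to tag to shards, which can be sketched as below. The data structures and the helpers `padIp` and `eligibleShards` are hypothetical, not part of any MongoDB API. The sketch assumes each range includes its lower bound and excludes its upper bound, and that IP addresses are stored zero-padded so that string order matches numeric order.

```javascript
// Tags assigned to each shard, mirroring the shell commands above.
const shardTags = { shard0000: ["T"], shard0001: ["Q"], shard0002: ["Q"] };

// Shard key ranges assigned to each tag: min is inclusive, max is exclusive.
const tagRanges = [
  { min: "010.000.000.000", max: "011.000.000.000", tag: "T" },
  { min: "011.000.000.000", max: "012.000.000.000", tag: "Q" },
];

// Zero-pad every octet so that string comparison orders addresses correctly.
function padIp(ip) {
  return ip.split(".").map(octet => octet.padStart(3, "0")).join(".");
}

// Shards that are allowed to hold a document with this (padded) ip value.
function eligibleShards(ip) {
  const range = tagRanges.find(r => ip >= r.min && ip < r.max);
  if (!range) return Object.keys(shardTags); // no tag range: any shard will do
  return Object.keys(shardTags).filter(s => shardTags[s].includes(range.tag));
}

console.log(eligibleShards(padIp("10.1.2.3")));  // [ 'shard0000' ]
console.log(eligibleShards(padIp("11.200.0.1"))); // [ 'shard0001', 'shard0002' ]
```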

3. Shard Key Selection Strategy

With a rough understanding of the shard key types, how do you choose a shard key? There are only two things to consider: queries and writes. Ideally a query hits as few shards as possible while writes are spread randomly across all shards, so the real question is how to weigh query performance against load distribution.

Choosing a key mainly comes down to the following questions:

(1) First, determine which fields are queried frequently.

(2) Identify the key points that affect the performance of these operations.
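The query side of this trade-off can be made concrete by counting the shards a query must contact. This is a sketch with a hypothetical chunk layout and a hypothetical helper `shardsTouched`: a query that filters on the shard key only needs the shards whose chunks overlap the requested range, while a query on any other field has to ask every shard.

```javascript
// Hypothetical chunk layout on a numeric shard key; each chunk covers [min, max).
const chunkMap = [
  { min: 0, max: 100, shard: "shard0000" },
  { min: 100, max: 200, shard: "shard0001" },
  { min: 200, max: 300, shard: "shard0002" },
];

// Shards a query must contact. `range` is { from, to } on the shard key
// (both inclusive), or null when the query does not filter on the shard key.
function shardsTouched(range) {
  const hit = range === null
    ? chunkMap
    : chunkMap.filter(c => c.min <= range.to && range.from < c.max);
  return [...new Set(hit.map(c => c.shard))];
}

console.log(shardsTouched({ from: 120, to: 150 })); // [ 'shard0001' ]
console.log(shardsTouched({ from: 90, to: 110 }));  // [ 'shard0000', 'shard0001' ]
console.log(shardsTouched(null));                   // all three shards
```

A field that appears in most queries is therefore a natural shard key candidate, provided it also spreads writes well enough.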
