This article describes the main principles of sharding clusters.
Frankly speaking, when I first saw the sharding architecture it looked a bit intimidating, and American author Kyle Banker's "MongoDB in Action" did not make it much clearer to me at first. From other material, though, sharding turns out to be a database cluster scheme for scaling massive data horizontally: the data is split into shards stored on the individual shard nodes, and with a little configuration you can easily set up a distributed MongoDB cluster.
I. Role Description
Three roles are required to build a MongoDB sharding cluster:
Shard Server: stores the actual data shards. Each shard can be a single mongod instance or a group of mongod instances forming a Replica Set (the Replica Set described in previous blogs). To achieve auto-failover within each shard, MongoDB officially recommends that each shard be a Replica Set. To store a specific collection across multiple shards, you must specify a shard key for that collection, for example {age: 1}; the shard key determines which chunk a record belongs to (chunks are detailed later).
Config Server: stores the configuration information of all shard nodes: the shard key range of each chunk, the distribution of chunks across the shards, and the sharding configuration of every DB and collection in the cluster.
Route Process: a front-end router (mongos) that clients connect to. It asks the config servers which shard the record to be queried or saved belongs to, connects to that shard to perform the operation, and returns the result to the client. The client simply sends the queries and updates it would otherwise send to mongod to the routing process, without caring which shard the record is stored on.
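To make the shard key concrete, here is a minimal mongo shell sketch, assuming a hypothetical database named test with a users collection (names chosen only for illustration); it enables sharding for the database and then shards the collection on {age: 1}:

// Connect to the mongos routing process, not to an individual mongod.
// Enable sharding for the (hypothetical) database "test".
sh.enableSharding("test")

// Shard the "users" collection on the shard key {age: 1}.
// Every document must contain the shard key field; its value decides
// which chunk (and therefore which shard) the document lands in.
sh.shardCollection("test.users", { age: 1 })

// sh.status() then prints the shards, databases, and chunk
// distribution known to the config servers.
sh.status()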
II. Framework Structure
If you build a sharded cluster on a single physical machine, the structure is as follows:
Each server process listens on a different port.
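As a sketch of that single-machine layout (the ports below are arbitrary examples, not values from the original diagram), shards started on different ports are registered with the cluster through the mongos shell:

// Run against the mongos process; each shard below is a mongod
// started beforehand on its own port of the same machine.
sh.addShard("localhost:27018")
sh.addShard("localhost:27019")

// If a shard is a Replica Set (the officially recommended setup),
// it is added with its set name and member addresses instead:
sh.addShard("rs0/localhost:27020,localhost:27021,localhost:27022")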
III. Framework Description
Since the sharding cluster is fairly abstract, here are some supplementary notes gathered from other material:
A: Sharding is the process of partitioning a database and distributing it across multiple servers.
B: Querying a user involves two queries: the first goes to the config database to find the shard where that user lives; the second goes directly to the shard that contains the user's data.
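mongos performs that config lookup automatically, but the mapping it consults can be inspected by hand. A small sketch, again using the hypothetical test.users collection, with the chunk metadata field names of classic MongoDB versions (newer releases key chunks by collection UUID rather than namespace):

// What mongos does internally: consult the config metadata to find which
// chunk, and therefore which shard, covers a given shard key value.
db.getSiblingDB("config").chunks.find(
    { ns: "test.users" },          // chunk metadata for our collection
    { min: 1, max: 1, shard: 1 }   // each chunk's key range and owning shard
)

// From the application's point of view, the two steps collapse into one
// ordinary query sent to mongos:
db.getSiblingDB("test").users.find({ age: 25 })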
C: Sharding mainly solves the problems of capacity scaling and load balancing.
D: A well-known framework for manual shard management is Twitter's Gizzard (see: http://mng.bz/4qvd).
E: Indicators that determine when the system needs sharding: disk activity, system load, and, most importantly, the ratio of working set size to available memory.
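As a rough sketch of checking that last ratio from the mongo shell (exact fields and their relevance vary with storage engine and version):

// Data plus index size of the current database, versus the memory the
// mongod process is using; a working set far larger than RAM is a hint
// that it may be time to shard.
var stats = db.stats()
var mem = db.serverStatus().mem
print("data + index bytes:", stats.dataSize + stats.indexSize)
print("resident memory MB:", mem.resident)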
F: The concept of chunks: a chunk is a contiguous range of shard key values located on a single shard. Chunks are logical, not physical.
G: Shard key: MongoDB sharding is range-based, which means every document in a sharded collection must fall within some range of values of the specified key. The shard key is what places each document within these ranges, as sketched below.
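To illustrate notes F and G, the comment below shows a purely hypothetical chunk layout for the {age: 1} shard key (the real boundaries are chosen by MongoDB, not by you); sh.status() prints the actual ranges for a live cluster:

// Hypothetical chunk ranges for test.users sharded on {age: 1}:
//   chunk 1: { age: MinKey } --> { age: 21 }     on shard0000
//   chunk 2: { age: 21 }     --> { age: 45 }     on shard0000
//   chunk 3: { age: 45 }     --> { age: MaxKey } on shard0001
// A document with { age: 25, ... } falls within chunk 2's range and is
// therefore stored on shard0000.
sh.status()   // prints the actual chunk ranges and their shards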
H: Splitting and migration
These are two quite different operations. Splitting divides a chunk into two once it grows beyond a certain size, with each half holding the same number of documents. Splitting is a purely logical operation; it does not affect the physical placement of documents in the sharded collection.
Migration is managed by the balancer, whose job is to keep data evenly distributed across the nodes. It does this by tracking the number of chunks on each shard; generally, once the difference in chunk counts between shards exceeds 8, the balancer starts migrating chunks (a sketch of the related shell helpers follows).
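Both operations normally happen automatically, but the mongo shell exposes helpers for them; a minimal sketch, again with the hypothetical test.users collection and example shard names:

// Manually split the chunk containing a given shard key value in two.
sh.splitAt("test.users", { age: 30 })

// Manually migrate the chunk containing { age: 25 } to another shard
// (shard names here are examples; sh.status() lists the real ones).
sh.moveChunk("test.users", { age: 25 }, "shard0001")

// The balancer that performs migrations automatically can be
// inspected and toggled:
sh.getBalancerState()
sh.stopBalancer()
sh.startBalancer()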
I: Suggested Framework