[MongoDB] Building a MongoDB sharding system on Windows (1)

Source: Internet
Author: User
Tags mongodb sharding

This article describes the main principles of sharding clusters.

Frankly, when I first encountered the sharding system it seemed rather intimidating, and I still did not fully understand it after reading "MongoDB in Action" by the American author Kyle Banker. Let us start with how the data is partitioned. From other books, we can see that sharding is a database cluster system that scales massive data horizontally. Because the data is split up and stored across the shard nodes, a distributed MongoDB cluster is easy to configure.

I. Role description

Three roles are required to build a MongoDB sharding cluster:

Shard Server: stores the actual data chunks. Each shard can be a single mongod instance or a group of mongod instances forming a Replica Set (the Replica Set described in previous blog posts). To get automatic failover within each shard, MongoDB officially recommends that every shard be a Replica Set.

Config Server: to store a collection across multiple shards, you need to specify a shard key for the collection, for example {age: 1}. The shard key determines which chunk a record belongs to (chunks are covered in detail later). The config servers store the configuration information of all shard nodes, the shard key range of each chunk, the distribution of chunks across the shards, and the sharding configuration of every database and collection in the cluster.

Route Process: a front-end router that clients connect to. It asks the config servers which shard should be queried or should store a record, connects to that shard to perform the operation, and finally returns the result to the client. The client simply sends the query or update requests it would have sent to mongod to the routing process instead, without caring which shard holds the record.
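The interplay of the three roles can be sketched with a small in-memory model. This is only an illustration, not MongoDB code: the class names `ConfigServer` and `Router`, the shard names, and the chunk boundaries 30 and 60 on the `age` shard key are all hypothetical.

```python
from bisect import bisect_right


class ConfigServer:
    """Toy metadata store: chunk boundaries on the shard key and the shard owning each chunk."""

    def __init__(self, boundaries, owners):
        # Boundaries [30, 60] define three chunks: below 30, 30 to 59, and 60 and above.
        assert len(owners) == len(boundaries) + 1
        self.boundaries = boundaries
        self.owners = owners

    def shard_for(self, key_value):
        # Find the chunk whose key range contains key_value and return its shard.
        return self.owners[bisect_right(self.boundaries, key_value)]


class Router:
    """Toy route process: clients talk to it instead of talking to a shard directly."""

    def __init__(self, config, shards, shard_key):
        self.config = config
        self.shards = shards          # shard name -> list of documents
        self.shard_key = shard_key

    def insert(self, doc):
        name = self.config.shard_for(doc[self.shard_key])
        self.shards[name].append(doc)
        return name

    def find(self, key_value):
        # Step 1: ask the config metadata for the shard. Step 2: query only that shard.
        name = self.config.shard_for(key_value)
        return [d for d in self.shards[name] if d[self.shard_key] == key_value]


config = ConfigServer(boundaries=[30, 60], owners=["shard_a", "shard_b", "shard_c"])
shards = {"shard_a": [], "shard_b": [], "shard_c": []}
router = Router(config, shards, shard_key="age")

for person in [{"name": "Ann", "age": 25}, {"name": "Bob", "age": 30}, {"name": "Cy", "age": 71}]:
    print(person["name"], "->", router.insert(person))
```

The client code only ever calls `router.insert` and `router.find`; which shard holds a document is decided entirely by the metadata, which mirrors why the client does not need to know where a record lives.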

II. Architecture

If you use a physical machine to build a sharded cluster, the structure is as follows:

The ports on each server are different.

III. Architecture description

Because the sharding cluster is fairly abstract, here are some supplementary notes gathered from other material:

A: Sharding distributes a database across multiple servers.

B: Querying a user takes two lookups. The first goes to the configuration database to find which shard holds the user. The second goes directly to the shard containing the user's data.

C: Sharding mainly solves the problems of scaling out and load balancing.

D: A well-known framework for manual shard management is Twitter's Gizzard (see: http://mng.bz/4qvd)

E: Indicators for deciding when to shard: disk activity, system load, and, most importantly, the ratio of working set size to available memory.

F: The concept of a chunk: a chunk is a contiguous range of shard key values located on a single shard. Chunks are logical, not physical.

G: Shard key: MongoDB sharding is range-based. That is, every document in a sharded collection must fall within some value range of the specified key. The shard key lets each document be located within these ranges.

H: Splitting and migration

These two concepts are completely different. Splitting divides a chunk into two once the data in the chunk reaches a certain size. The two resulting chunks hold roughly the same number of documents. Splitting is only a logical operation and does not affect the physical order of documents in the sharded collection.

Migration is managed by a software process called the balancer. Its task is to keep data evenly distributed across the shards, which it does by tracking the number of chunks on each shard. Generally, the balancer starts a balancing round when the difference between the shard with the most chunks and the shard with the fewest chunks is greater than 8.
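To make the difference between the two operations concrete, here is a minimal sketch in plain Python. It is not how MongoDB implements them: the function names `split_chunk` and `balance` are hypothetical, the split point is assumed to be the median key, and the balancer rule is assumed to compare the chunk counts of the fullest and emptiest shards against a threshold of 8, moving one chunk at a time.

```python
def split_chunk(keys):
    """Logically split one chunk's shard-key values at the median into two chunks."""
    keys = sorted(keys)
    mid = len(keys) // 2
    return keys[:mid], keys[mid:]


def balance(chunk_counts, threshold=8):
    """Move chunks from the fullest shard to the emptiest shard, one at a time,
    until the difference in chunk counts is no longer above the threshold."""
    counts = dict(chunk_counts)   # work on a copy; the input is left unchanged
    migrations = []
    while True:
        fullest = max(counts, key=counts.get)
        emptiest = min(counts, key=counts.get)
        if counts[fullest] - counts[emptiest] <= threshold:
            return counts, migrations
        counts[fullest] -= 1
        counts[emptiest] += 1
        migrations.append((fullest, emptiest))


# Splitting only changes which logical range a document belongs to.
left, right = split_chunk([5, 1, 9, 3])
print(left, right)

# Migration changes which shard owns a chunk.
before = {"shard_a": 20, "shard_b": 4, "shard_c": 6}
after, moves = balance(before)
print(after, len(moves))
```

Note that `split_chunk` never touches a shard, while `balance` never looks inside a chunk: splitting works on key ranges, migration works on chunk ownership.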

I: Suggested architecture
