This article describes the main principles of sharding clusters.
Frankly speaking, when I first saw the sharding architecture it looked a bit intimidating, and American author Kyle Banker's "MongoDB in Action" did not make it much clearer to me at first. From other material, though, sharding turns out to be a database cluster scheme for scaling massive data horizontally: the data is split into shards stored on the individual shard nodes, and with a little configuration you can easily set up a distributed MongoDB cluster.
I. Role Description
Three roles are required to build a MongoDB sharding cluster:
Shard Server: stores the actual data shards. Each shard can be a single mongod instance or a group of mongod instances forming a Replica Set (the Replica Set described in previous blogs). To achieve auto-failover within each shard, MongoDB officially recommends that each shard be a Replica Set. To store a specific collection across multiple shards, you must specify a shard key for that collection, for example {age: 1}; the shard key determines which chunk a record belongs to (chunks are detailed later).
Config Server: stores the configuration information of all shard nodes: the shard key range of each chunk, the distribution of chunks across the shards, and the sharding configuration of every DB and collection in the cluster.
Route Process: a front-end router (mongos) that clients connect to. It asks the config servers which shard the record to be queried or saved belongs to, connects to that shard to perform the operation, and returns the result to the client. The client simply sends the queries and updates it would otherwise send to mongod to the routing process, without caring which shard the record is stored on.
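To make the shard key concrete, here is a minimal mongo shell sketch, assuming a hypothetical database named test with a users collection (names chosen only for illustration); it enables sharding for the database and then shards the collection on {age: 1}:

// Connect to the mongos routing process, not to an individual mongod.
// Enable sharding for the (hypothetical) database "test".
sh.enableSharding("test")

// Shard the "users" collection on the shard key {age: 1}.
// Every document must contain the shard key field; its value decides
// which chunk (and therefore which shard) the document lands in.
sh.shardCollection("test.users", { age: 1 })

// sh.status() then prints the shards, databases, and chunk
// distribution known to the config servers.
sh.status()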
II. Framework Structure
If you build a sharded cluster on a single physical machine, the structure is as follows:
Each server process listens on a different port.
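As a sketch of that single-machine layout (the ports below are arbitrary examples, not values from the original diagram), shards started on different ports are registered with the cluster through the mongos shell:

// Run against the mongos process; each shard below is a mongod
// started beforehand on its own port of the same machine.
sh.addShard("localhost:27018")
sh.addShard("localhost:27019")

// If a shard is a Replica Set (the officially recommended setup),
// it is added with its set name and member addresses instead:
sh.addShard("rs0/localhost:27020,localhost:27021,localhost:27022")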
III. Framework Description
Since the sharding cluster is fairly abstract, here are some supplementary notes gathered from other material:
A: Sharding is the process of partitioning a database and distributing it across multiple servers.
B: Querying a user involves two queries: the first goes to the config database to find the shard where that user lives; the second goes directly to the shard that contains the user's data.
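mongos performs that config lookup automatically, but the mapping it consults can be inspected by hand. A small sketch, again using the hypothetical test.users collection, with the chunk metadata field names of classic MongoDB versions (newer releases key chunks by collection UUID rather than namespace):

// What mongos does internally: consult the config metadata to find which
// chunk, and therefore which shard, covers a given shard key value.
db.getSiblingDB("config").chunks.find(
    { ns: "test.users" },          // chunk metadata for our collection
    { min: 1, max: 1, shard: 1 }   // each chunk's key range and owning shard
)

// From the application's point of view, the two steps collapse into one
// ordinary query sent to mongos:
db.getSiblingDB("test").users.find({ age: 25 })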
C: Sharding mainly solves the problems of capacity scaling and load balancing.
D: A well-known framework for manual shard management is Twitter's Gizzard (see: http://mng.bz/4qvd).
E: Indicators that determine when the system needs sharding: disk activity, system load, and, most importantly, the ratio of working set size to available memory.
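As a rough sketch of checking that last ratio from the mongo shell (exact fields and their relevance vary with storage engine and version):

// Data plus index size of the current database, versus the memory the
// mongod process is using; a working set far larger than RAM is a hint
// that it may be time to shard.
var stats = db.stats()
var mem = db.serverStatus().mem
print("data + index bytes:", stats.dataSize + stats.indexSize)
print("resident memory MB:", mem.resident)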
F: The concept of chunks: a chunk is a contiguous range of shard key values located on a single shard. Chunks are logical, not physical.
G: Shard key: MongoDB sharding is range-based, which means every document in a sharded collection must fall within some range of values of the specified key. The shard key is what places each document within these ranges, as sketched below.
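To illustrate notes F and G, the comment below shows a purely hypothetical chunk layout for the {age: 1} shard key (the real boundaries are chosen by MongoDB, not by you); sh.status() prints the actual ranges for a live cluster:

// Hypothetical chunk ranges for test.users sharded on {age: 1}:
//   chunk 1: { age: MinKey } --> { age: 21 }     on shard0000
//   chunk 2: { age: 21 }     --> { age: 45 }     on shard0000
//   chunk 3: { age: 45 }     --> { age: MaxKey } on shard0001
// A document with { age: 25, ... } falls within chunk 2's range and is
// therefore stored on shard0000.
sh.status()   // prints the actual chunk ranges and their shards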
H: Splitting and migration
These are two quite different operations. Splitting divides a chunk into two once it grows beyond a certain size, with each half holding the same number of documents. Splitting is a purely logical operation; it does not affect the physical placement of documents in the sharded collection.
Migration is managed by the balancer, whose job is to keep data evenly distributed across the nodes. It does this by tracking the number of chunks on each shard; generally, once the difference in chunk counts between shards exceeds 8, the balancer starts migrating chunks (a sketch of the related shell helpers follows).
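Both operations normally happen automatically, but the mongo shell exposes helpers for them; a minimal sketch, again with the hypothetical test.users collection and example shard names:

// Manually split the chunk containing a given shard key value in two.
sh.splitAt("test.users", { age: 30 })

// Manually migrate the chunk containing { age: 25 } to another shard
// (shard names here are examples; sh.status() lists the real ones).
sh.moveChunk("test.users", { age: 25 }, "shard0001")

// The balancer that performs migrations automatically can be
// inspected and toggled:
sh.getBalancerState()
sh.stopBalancer()
sh.startBalancer()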
I: Suggested Framework