[MongoDB] Building a MongoDB sharded cluster on Windows (1)
This article describes the main principles of sharding clusters.
Frankly speaking, when I first looked at sharding it seemed rather intimidating, and even after reading the American author Kyle Banker's "MongoDB in Action" I did not fully understand it. So I will start by describing it with the pieces I have gathered. From other books, we can see that sharding is a database cluster system that scales massive data horizontally. The data is split up and stored across the shard nodes, and a distributed MongoDB cluster is fairly easy to configure.
I. Role description
Three roles are required to build a MongoDB sharded cluster:
- Shard Server: stores the actual data. Each shard can be a single mongod instance or a group of mongod instances that form a Replica Set (Replica Sets were described in a previous post). To get automatic failover inside each shard, MongoDB officially recommends that every shard be a Replica Set.
- Config Server: to store a collection across multiple shards, you need to specify a shard key for the collection, for example {age: 1}. The shard key determines which chunk a record belongs to (chunks are explained in detail later). The config servers store the configuration of all shard nodes: the shard key range of each chunk, the distribution of chunks across the shards, and the sharding configuration of every database and collection in the cluster.
- Route Process: the front-end router (the mongos process). The client connects to it; it asks the config servers which shard holds the record to be queried or saved, connects to that shard to perform the operation, and finally returns the result to the client. The client simply sends the query or update request it would otherwise have sent to mongod to the routing process, without worrying about which shard holds the record.
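To make the routing role more concrete, here is a minimal sketch in plain Python (no MongoDB involved). It assumes a collection sharded on {age: 1}; the chunk table, the shard names and the functions `find_shard` and `route` are all made up for illustration and are not MongoDB's real data structures or API.

```python
from bisect import bisect_right

# Hypothetical chunk table, as the config servers might record it for a
# collection sharded on {age: 1}. Each entry is (lower bound of the chunk,
# shard that owns it); a chunk covers [its lower bound, next lower bound).
CHUNKS = [
    (float("-inf"), "shard0"),
    (18, "shard1"),
    (40, "shard2"),
    (65, "shard0"),
]

def find_shard(age, chunks=CHUNKS):
    """Return the shard owning the chunk whose range contains `age`."""
    lower_bounds = [low for low, _ in chunks]
    # bisect_right gives the index of the first bound greater than age,
    # so the chunk just before it is the one that contains age.
    index = bisect_right(lower_bounds, age) - 1
    return chunks[index][1]

def route(query_age, shards):
    """Step 1: consult the chunk table. Step 2: query only that shard."""
    shard = find_shard(query_age)
    return shard, [doc for doc in shards[shard] if doc["age"] == query_age]
```

Calling `route(25, shards)` with `shards` as a dict of shard name to document list only touches the documents on "shard1", which is the point of the router: the client never needs to know where a record lives.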
II. Framework structure
If you build the sharded cluster on physical machines, the structure is as follows:
The ports on each server are different.
III. Framework description
Because a sharded cluster is rather abstract, I have collected some explanations from other material as a supplement:
A: Sharding is used to distribute a database across multiple servers.
B: Querying a user involves two queries. The first goes to the config database to find out which shard holds the user; the second goes directly to the shard that contains the user's data.
C: It mainly solves the problems of scaling capacity and balancing load.
D: The best-known framework for managing shards manually is Twitter's Gizzard (see: http://mng.bz/4qvd)
E: Deciding whether the current system needs sharding: look at disk activity, system load and, most importantly, the ratio of working set size to available memory.
F: The concept of a chunk: a chunk is a contiguous range of shard key values located on one shard. Chunks are logical, not physical.
G: Shard key: MongoDB sharding is range-based. That is, every document in a sharded collection must fall within some range of values of the specified key. The shard key lets each document be located within these ranges.
H: Splitting and migration
These two concepts are completely different. Splitting means dividing a chunk in two once its data reaches a certain size. The two chunks produced by a split hold roughly the same number of documents. Splitting is purely a logical operation and does not affect the physical order of documents in the sharded collection.
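A small sketch of the splitting idea, in plain Python. It is only an illustration under stated assumptions: the real trigger is data size, but here a document count (`max_docs`) stands in for it, shard key values are assumed to be unique, and the function `split_chunk` and the dict layout are hypothetical.

```python
def split_chunk(chunk, max_docs):
    """Split a chunk into two halves once it holds more than max_docs keys.

    `chunk` is a dict with "min" and "max" (the range [min, max)) and
    "keys", a sorted list of the shard key values currently in that range.
    Returns a list with either the original chunk or the two new chunks.
    """
    keys = chunk["keys"]
    if len(keys) <= max_docs:
        return [chunk]
    # The median key becomes the split point, so both halves end up with
    # roughly the same number of documents.
    middle = keys[len(keys) // 2]
    left = {"min": chunk["min"], "max": middle,
            "keys": [k for k in keys if k < middle]}
    right = {"min": middle, "max": chunk["max"],
             "keys": [k for k in keys if k >= middle]}
    return [left, right]
```

Note that nothing is moved here: the split only replaces one key range with two adjacent ranges, which matches the statement that splitting is a logical operation.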
Migration is managed by a piece of software called the balancer. Its task is to keep data evenly distributed across the shards, which it does by tracking the number of chunks on each shard. Generally, when the difference between the shard with the most chunks and the shard with the fewest is greater than 8, the balancer starts a balancing round.
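The balancing idea can be sketched like this, again in plain Python. The assumptions are mine: the threshold of 8 is read as the gap between the shard with the most chunks and the shard with the fewest, one chunk is moved at a time, and a round stops once the counts differ by at most one. The function `balance` is hypothetical and not the real balancer.

```python
def balance(chunk_counts, threshold=8):
    """Simulate one balancing round over a {shard name: chunk count} dict.

    A round starts only when the gap between the fullest and the emptiest
    shard exceeds `threshold`. It then moves one chunk at a time from the
    fullest to the emptiest shard until the counts differ by at most one.
    Returns (new counts, list of (source, destination) migrations).
    """
    counts = dict(chunk_counts)  # work on a copy, leave the input untouched
    migrations = []
    if max(counts.values()) - min(counts.values()) <= threshold:
        return counts, migrations
    while True:
        source = max(counts, key=counts.get)
        target = min(counts, key=counts.get)
        if counts[source] - counts[target] <= 1:
            break
        counts[source] -= 1
        counts[target] += 1
        migrations.append((source, target))
    return counts, migrations
```

With `{"a": 20, "b": 4, "c": 6}` the gap is 16, so the round runs and ends with 10 chunks on every shard; with `{"a": 10, "b": 4}` the gap is only 6 and nothing is migrated.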
I: Suggested framework
MongoDB eats memory. After MongoDB starts, memory usage climbs to 95%. I have 4 GB of memory and have upgraded mongo to 208, but the problem remains.
By default, the database uses a large amount of memory for caching, and at the moment there seems to be no way to limit this. We therefore recommend not running the database and other programs on the same machine.
Storage Capacity of MongoDB in 32-bit Systems
A data file can hold only 2 GB, but that limit applies to a single data file. Whether the operating system is 32-bit or 64-bit matters for memory; the data files stored on disk are not directly related to the operating system's bitness (they depend on the disk storage format and the file system).
A 32-bit operating system can address only a little over 3 GB of memory, and MongoDB relies heavily on memory: by default it maps the data of a database into memory. It is therefore subject to the memory access limits of the 32-bit operating system, which is why mongod prints a 32-bit warning at startup (the warning mentions a limit of about 2 GB of data).
However, MongoDB is not that naive. For data files stored on disk, mongo splits the data across files automatically. It does not wait until a file is completely full before creating the next one; files are allocated according to certain rules. So if your database is larger than 2 GB, you will simply see more data files.
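As a rough illustration of how the data ends up in several files, here is a small sketch. The numbers are an assumption on my part, based on the defaults of the older memory-mapped storage engine: the first file is 64 MB, each following file is twice as large, and no single file exceeds 2 GB. The function `data_file_sizes_mb` is hypothetical, and it ignores the extra file that is preallocated in advance.

```python
def data_file_sizes_mb(data_mb, first_mb=64, cap_mb=2048):
    """Sizes (in MB) of the numbered data files needed to hold `data_mb`.

    Assumption: each file is twice the size of the previous one, starting
    at `first_mb`, until the per-file cap is reached; every later file
    has the cap size.
    """
    sizes, size, total = [], first_mb, 0
    while total < data_mb:
        sizes.append(size)
        total += size
        size = min(size * 2, cap_mb)
    return sizes
```

Under these assumptions, about 3000 MB of data would be spread over six files, the largest of them 2 GB, which fits the observation that a database larger than 2 GB shows up as several files.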