Reprinted from http://www.cnblogs.com/spnt/
Replica sets give a website safe backups and seamless failover, but they do not solve the problem of storing massive amounts of data. Physical hardware has limits, after all. At that point the deployment has to become distributed, with part of the data saved on other machines. MongoDB's sharding technology is designed to meet exactly this requirement.
Understanding MongoDB's sharding architecture
What is sharding? Put simply, it is a cluster setup for scaling massive data horizontally: the data of a collection is split up and the pieces are stored on the individual shard nodes.
MongoDB divides the data into chunks. Each chunk is a contiguous range of records in the collection, 200 MB by default in the release used here (the default differs between MongoDB versions). When a chunk exceeds the limit, it is split and a new chunk is created.
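The idea of cutting ordered data into size-capped, contiguous chunks can be sketched in a few lines of Python. This is only a toy model under stated assumptions: every record has the same size, the keys arrive already sorted, and the function name split_into_chunks is made up for illustration. It is not MongoDB's actual split algorithm.

```python
def split_into_chunks(sorted_keys, record_bytes, max_chunk_bytes):
    """Group already-sorted shard keys into contiguous chunks.

    Toy model: each record is assumed to occupy `record_bytes`, and a
    chunk may hold at most `max_chunk_bytes` worth of records.
    """
    # How many fixed-size records fit under the chunk size limit.
    per_chunk = max(1, max_chunk_bytes // record_bytes)
    return [
        sorted_keys[start:start + per_chunk]
        for start in range(0, len(sorted_keys), per_chunk)
    ]


# 10000 records of 512 bytes each (an assumed size) with a 1 MB chunk limit.
chunks = split_into_chunks(list(range(10000)), 512, 1024 * 1024)
for number, chunk in enumerate(chunks):
    print(number, "keys", chunk[0], "to", chunk[-1])
```

Each resulting chunk covers one unbroken key range, which is what lets the cluster describe a chunk by its range alone.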
Three roles are required to build a sharded cluster:
Shard server: a shard server is a shard that stores the actual data. Each shard can be a single mongod instance or a group of mongod instances. To get automatic failover within each shard, MongoDB officially recommends that each shard be a replica set.
Config server: to store a collection across multiple shards, you need to specify a shard key for the collection, which determines the chunk each record belongs to. The config server stores the following information:
1. Configuration information of all shard nodes
2. Shard key range of each chunk
3. Distribution of chunks across the shards
4. Sharding configuration of all databases and collections in the cluster
Route process: a front-end router through which clients access the cluster. It first asks the config server which shard a record should be queried from or saved to, then connects to that shard to perform the operation, and finally returns the result to the client. The client simply sends the route process the same query or update requests it would otherwise send to mongod, without worrying about which shard holds the records.
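The lookup the route process performs can be illustrated with a small Python sketch. The chunk table below is hypothetical (the boundaries are invented for the example), and the real router and config server are far more involved; the point is only that a sorted list of range boundaries is enough to map a shard key to a shard.

```python
import bisect

# Hypothetical chunk table, as a config server might describe it:
# (inclusive lower bound of the shard key range, shard holding that chunk).
# Entries are sorted by lower bound; each range ends where the next begins.
CHUNK_TABLE = [
    (0, "shard0000"),
    (1000, "shard0001"),
    (5000, "shard0000"),
]


def route(shard_key):
    """Return the name of the shard whose chunk range contains shard_key."""
    lower_bounds = [lower for lower, _ in CHUNK_TABLE]
    # Index of the last chunk whose lower bound is <= shard_key.
    index = bisect.bisect_right(lower_bounds, shard_key) - 1
    if index < 0:
        raise KeyError("no chunk covers shard key %r" % (shard_key,))
    return CHUNK_TABLE[index][1]


for key in (42, 1000, 7000):
    print(key, "->", route(key))
```

Because the client only ever calls something like route() indirectly through the router, it never needs to know where a record physically lives.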
Building the sharded cluster
From the analysis above, building a sharded cluster requires at least four MongoDB processes: two shard servers (so there is something to shard across), one config server, and one route process. They are arranged as follows:
Process          Port    File directory
Shard Server 1   2000    mongodb5
Shard Server 2   2001    mongodb6
Config server    30000   mongodb7
Route process    40000   mongodb8
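As a rough guide to what this layout means in practice, the Python sketch below turns each row into a command line. Several things here are assumptions rather than facts from the article: that the "file directory" column is the data directory passed to --dbpath, that everything runs on localhost, and that the route process is started as mongos with --configdb and --chunkSize (spellings from older MongoDB releases, where 1 is the chunk size in MB used for testing here). Newer releases use different startup requirements, so treat the output as illustrative only.

```python
# Layout from the table above: (role, port, directory).
LAYOUT = [
    ("shard", 2000, "mongodb5"),
    ("shard", 2001, "mongodb6"),
    ("config", 30000, "mongodb7"),
    ("route", 40000, "mongodb8"),
]


def command_line(role, port, directory, config_host="localhost:30000", chunk_mb=1):
    """Build an argument list for one process (nothing is executed)."""
    if role == "shard":
        # Assumption: the directory column is used as the data path.
        return ["mongod", "--shardsvr", "--port", str(port), "--dbpath", directory]
    if role == "config":
        return ["mongod", "--configsvr", "--port", str(port), "--dbpath", directory]
    if role == "route":
        # The router stores no data itself; it only needs the config server.
        return ["mongos", "--port", str(port),
                "--configdb", config_host, "--chunkSize", str(chunk_mb)]
    raise ValueError("unknown role: %s" % role)


for row in LAYOUT:
    print(" ".join(command_line(*row)))
```

The argument lists are in the form subprocess.Popen accepts, should you want to launch the processes from a script instead of typing the commands by hand.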
Configure now:
1. Start the shard servers
Starting a shard server needs only one additional option: shardsvr. With this option, the process runs as a shard process.
2. Start the config server
The configsvr option is used to start the config server.
3. Start the route process
The chunk size is set to 1 MB here so that the effect of sharding is easy to observe in testing.
4. Configure sharding
After all the processes are started, what remains is to wire them together.
Open a new command prompt and connect to the route process, then use addshard to add each shard server to it.
With those two operations the whole architecture is wired together, but we are not finished: the cluster does not yet know which database to shard or which shard key to use.
Specify that the database to shard is friends, and then shard the frienduserattach collection on its _id field.
So far, the entire system has been configured.
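For reference, the configuration steps can be written down as the admin command documents sent to the route process. The Python sketch below only builds those documents. It uses the current command spellings addShard, enableSharding and shardCollection (older releases also used lowercase forms such as addshard), and it assumes the shard servers are reachable as localhost:2000 and localhost:2001. The helper names sharding_commands and configure are made up for illustration.

```python
def sharding_commands(shard_hosts, database, collection, key_field):
    """Build the admin commands, in order, that configure sharding."""
    # One addShard command per shard server.
    commands = [{"addShard": host} for host in shard_hosts]
    # Tell the cluster which database is sharded.
    commands.append({"enableSharding": database})
    # Tell the cluster which collection to shard and on which key.
    commands.append({
        "shardCollection": "%s.%s" % (database, collection),
        "key": {key_field: 1},
    })
    return commands


def configure(run_admin_command, commands):
    """Send each command through a caller-supplied function.

    Assumption: with the pymongo driver installed, run_admin_command could
    be MongoClient("localhost", 40000).admin.command.
    """
    for command in commands:
        run_admin_command(command)


commands = sharding_commands(
    ["localhost:2000", "localhost:2001"], "friends", "frienduserattach", "_id"
)
configure(print, commands)  # dry run: print instead of sending
```

Passing print as the runner gives a dry run, which is a convenient way to check the commands before pointing them at a real route process.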
To verify the sharding, I insert data with a program. Because this collection is one I actually use, typing the inserts into the command prompt would be too tedious, so I use the client driver to insert 10,000 documents.
Use the use command to switch to the friends database, and then run stats to view the current status.
Field description: sharded is true, which means the collection is sharded.
The shards section lists the two shard servers, "shard0000" and "shard0001". The count field under "shard0000" is 1016, meaning that 1016 documents are stored on that shard server. size is the size of the data stored on that shard server, in bytes.
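If you fetch the stats document through a driver, a few lines of Python can summarize the distribution. The sample below is not real output: only the count of 1016 for "shard0000" comes from the text above, the count for "shard0001" assumes all 10,000 inserted documents are present, and the size values are placeholders. The function name summarize_shards is hypothetical.

```python
def summarize_shards(stats):
    """Return per-shard document count, size in bytes, and share of documents."""
    if not stats.get("sharded"):
        return {}  # an unsharded collection has no per-shard breakdown
    shards = stats["shards"]
    total = sum(info["count"] for info in shards.values())
    return {
        name: {
            "count": info["count"],
            "size": info["size"],
            "share": info["count"] / total if total else 0.0,
        }
        for name, info in shards.items()
    }


# Illustrative sample only; sizes are made-up placeholder values.
sample_stats = {
    "sharded": True,
    "shards": {
        "shard0000": {"count": 1016, "size": 1016 * 100},
        "shard0001": {"count": 8984, "size": 8984 * 100},
    },
}

for name, info in summarize_shards(sample_stats).items():
    print(name, info["count"], "documents,", info["size"], "bytes")
```

A very uneven share between shards is normal right after a bulk insert, since chunks are moved between shards gradually.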