Building MongoDB Shards
http://gong1208.iteye.com/blog/1622078
Sharding Concepts
Sharding is a database clustering approach that scales large data sets horizontally: the data of a collection is spread across the shard nodes, and a distributed MongoDB cluster can be built with fairly simple configuration.
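MongoDB splits data into units called chunks. Each chunk is a contiguous range of records in a collection. The default maximum chunk size is 64MB in the 2.0 release used in this article (older releases used 200MB); once a chunk grows past the limit, it is split and a new chunk is created.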
To build a MongoDB sharded cluster, you need three roles:
Shard Server
These are the shards that store the actual data. Each shard can be a single mongod instance or a replica set made up of several mongod instances. To get automatic failover inside each shard, MongoDB officially recommends running each shard as a replica set. For details on installing and building a replica set, see my other article: http://gong1208.iteye.com/blog/1558355
Config Server
To store a collection across multiple shards, you specify a shard key for the collection, for example {age: 1}; the shard key determines which chunk a record belongs to. The config servers store the cluster metadata: the configuration of every shard node, the shard key range of each chunk, the distribution of chunks across shards, and the sharding configuration of every database and collection in the cluster.
Route Process
This is the front-end router (mongos) that clients connect to. It asks the config servers which shard a record should be read from or written to, connects to that shard to perform the operation, and returns the result to the client. The client simply sends the queries and updates it would otherwise send to mongod to the route process instead, without caring which shard holds the record.
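The lookup performed by the route process can be sketched in plain JavaScript. This is only an illustration: the `chunks` table, its boundaries, and the `routeForKey` function are hypothetical stand-ins for the metadata held by the config server, not MongoDB's actual data structures.

```javascript
// Hypothetical chunk table, as a config server might describe it for shard key {age: 1}.
// Each chunk covers the half-open range [min, max) and lives on exactly one shard.
var chunks = [
  { min: -Infinity, max: 30,       shard: "shard0000" },
  { min: 30,        max: 60,       shard: "shard0001" },
  { min: 60,        max: Infinity, shard: "shard0000" }
];

// Return the shard that owns the chunk containing the given shard key value.
function routeForKey(age) {
  for (var i = 0; i < chunks.length; i++) {
    if (age >= chunks[i].min && age < chunks[i].max) {
      return chunks[i].shard;
    }
  }
  throw new Error("no chunk covers key " + age);
}

console.log(routeForKey(25)); // shard0000
console.log(routeForKey(30)); // shard0001
```

The client never sees this table; it only talks to the router, which is why the shard layout can change without any change on the client side.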
Below we build a simple sharded cluster on a single physical machine.
The layout is as follows:
- Shard Server 1: 27017
- Shard Server 2: 27018
- Config Server: 27027
- Route Process: 40000
Implementation steps
Step One:
Start Shard Server
mkdir -p /opt/mongodb/data/shard/s0    # create the data directories
mkdir -p /opt/mongodb/data/shard/s1
mkdir -p /opt/mongodb/data/shard/log   # create the log directory
/opt/mongodb/bin/mongod --port 27017 --dbpath /opt/mongodb/data/shard/s0 --fork --logpath /opt/mongodb/data/shard/log/s0.log   # start shard server instance 1
/opt/mongodb/bin/mongod --port 27018 --dbpath /opt/mongodb/data/shard/s1 --fork --logpath /opt/mongodb/data/shard/log/s1.log   # start shard server instance 2
Step Two:
Start Config Server
mkdir -p /opt/mongodb/data/shard/config   # create the data directory
/opt/mongodb/bin/mongod --port 27027 --dbpath /opt/mongodb/data/shard/config --fork --logpath /opt/mongodb/data/shard/log/config.log   # start the config server instance
(Note that here we start these instances like an ordinary mongod service, without the --shardsvr and --configsvr options. In the MongoDB version used here, those two options essentially just change the default port, and we specify the ports ourselves.)
Step Three:
Start Route Process
/opt/mongodb/bin/mongos --port 40000 --configdb localhost:27027 --fork --logpath /opt/mongodb/data/shard/log/route.log --chunkSize 1   # start a route server instance
Among the mongos startup parameters, chunkSize specifies the chunk size in MB. The default is 64MB in the release used here (older releases used 200MB); to make the effect of sharding easy to observe in a test, we set it to 1MB. Chunks therefore begin to be split, and can then be moved between shards, after only about 1MB of data has been inserted.
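To get a feel for how small a 1MB chunk is, a rough back-of-the-envelope calculation helps. The sketch below is plain JavaScript; the `docsPerChunk` helper and the 100-byte average document size are assumptions made for illustration, and MongoDB's real split decisions rely on its own internal size estimates.

```javascript
// Estimate how many documents fit in one chunk before it exceeds the size limit.
// chunkSizeMB mirrors the chunk size given to mongos; avgDocBytes is an assumed figure.
function docsPerChunk(chunkSizeMB, avgDocBytes) {
  var chunkBytes = chunkSizeMB * 1024 * 1024;
  return Math.floor(chunkBytes / avgDocBytes);
}

console.log(docsPerChunk(1, 100));  // 10485
console.log(docsPerChunk(64, 100)); // 671088 (a 64MB chunk, for comparison)
```

With documents of that size, a test only needs on the order of tens of thousands of inserts before several chunks exist and there is something to balance.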
Step Four:
Configure Sharding
Next, we use the MongoDB shell to log in to mongos and add the shard nodes.
/opt/mongodb/bin/mongo admin --port 40000   # this operation requires connecting to the admin database
MongoDB shell version: 2.0.1
connecting to: 127.0.0.1:40000/admin
> db.runCommand({ addshard: "localhost:27017" })   // add a shard server
{ "shardAdded" : "shard0000", "ok" : 1 }
> db.runCommand({ addshard: "localhost:27018" })
{ "shardAdded" : "shard0001", "ok" : 1 }
> db.runCommand({ enablesharding: "test" })   // enable sharding for the database
{ "ok" : 1 }
> db.runCommand({ shardcollection: "test.users", key: { id: 1 } })   // name the collection to shard; a shard key is required, and the system creates an index on it automatically
{ "collectionsharded" : "test.users", "ok" : 1 }
Note that the shard key must be chosen carefully, according to the shape of your business data, and never arbitrarily. In particular, do not use an auto-incrementing _id as the shard key unless you know exactly why you are doing so. I will not analyze the reasons here; from experience, a more reasonable choice is "auto-increment field + query field" (yes, a shard key can be a combination of multiple fields).
The basic mechanism of sharding is also worth explaining: MongoDB always tries to distribute the existing data evenly across all shards. For example, with two shards and id as the shard key, where id is auto-incrementing from 1 to 10000, the data ends up split roughly in half: 1 to 5000 on shard A and 5001 to 10000 on shard B. The split will not necessarily be that exact, but it is kept as even as possible. The same applies with three shards, and so on.
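The even split described above can be illustrated with a small JavaScript sketch. The `splitEvenly` function is hypothetical and only shows the idealized outcome; the real balancer moves whole chunks and will not land on such exact boundaries.

```javascript
// Divide the inclusive id range [minId, maxId] into shardCount contiguous ranges
// of nearly equal size (an idealized picture of a balanced cluster).
function splitEvenly(minId, maxId, shardCount) {
  var total = maxId - minId + 1;
  var ranges = [];
  var start = minId;
  for (var i = 0; i < shardCount; i++) {
    // Any remainder is spread over the first few shards, one extra id each.
    var size = Math.floor(total / shardCount) + (i < total % shardCount ? 1 : 0);
    ranges.push({ shard: i, from: start, to: start + size - 1 });
    start += size;
  }
  return ranges;
}

console.log(splitEvenly(1, 10000, 2)); // shard 0: 1 to 5000, shard 1: 5001 to 10000
console.log(splitEvenly(1, 10000, 3)); // shard 0: 1 to 3334, shard 1: 3335 to 6667, shard 2: 6668 to 10000
```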
Also note that when data is first inserted, it all goes to a single shard. Only afterwards does MongoDB start moving data between the shards internally. This may not happen immediately; MongoDB decides on its own whether to move the data right away or later, depending on the current load.
Running db.users.stats() twice right after inserting the data lets you verify the behavior described above.
That completes the simple sharded cluster setup. Connect to mongos and start inserting data to verify it.
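A minimal sketch of that verification step, written so the helper works both in the mongo shell and in plain JavaScript. The `insertTestUsers` function, the `name` field, and the document count are assumptions made for illustration; only the collection's `insert` method and `stats()` come from the shell itself.

```javascript
// Insert `count` small documents with increasing id values into a collection.
// `collection` is any object with an insert(doc) method, e.g. db.users in the mongo shell.
function insertTestUsers(collection, count) {
  for (var i = 1; i <= count; i++) {
    collection.insert({ id: i, name: "user" + i });
  }
  return count;
}

// In a mongo shell connected to mongos, after switching to the sharded database
// (these lines only work inside the shell, so they are left as comments):
//   insertTestUsers(db.users, 100000);
//   db.users.stats();   // run it again a little later and compare the per-shard figures
```

Passing the collection in as a parameter keeps the helper independent of the shell's global `db` object, so it can be tried against a stub before running it on a real cluster.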