Why shard?
1. Reduce the request load on any single machine, increasing the total load the cluster can handle.
2. Reduce the storage needed on any single machine, increasing the cluster's total storage capacity.
A common MongoDB sharding server architecture
To build a MongoDB sharded cluster, you need three different roles:
1. Shard Server
A shard stores the actual data. Each shard can be a single mongod instance, or a replica set made up of mongod instances. To get auto-failover (automatic failover) within each shard, MongoDB officially recommends deploying each shard as a replica set.
2. Config Server
To store a collection across multiple shards, you must specify a shard key for the collection, such as {age: 1}; the shard key determines which chunk a record belongs to (sharding works in units of chunks, introduced below). The config servers store the configuration information for all shard nodes: the shard key range of every chunk, the distribution of chunks across shards, and the sharding configuration of every database and collection in the cluster.
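For a concrete picture of what the config servers hold, you can connect to a mongos and inspect the config database directly. This is a read-only look; the collection layout shown here matches the older MongoDB releases this article targets, where chunk metadata carries an ns field:
use config
db.shards.find()                       // the registered shard nodes
db.chunks.find({ ns: "test.users" })   // each chunk's shard key range and owning shard
db.collections.find()                  // per-collection sharding settings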
3.Route Process
This is a front-end routing, the client this access, and then ask config servers to which shard to query or save records, and then connect the corresponding shard to operate, and finally return the results to the client. The client simply sends the query or update request originally sent to the mongod to the routing Process without concern about which shard the record is stored on. (All operations can be done on MONGOs)
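To illustrate this transparency, here is a quick sketch using the ports and collection from this article's example setup: the query is exactly what you would run against a standalone mongod, only the connection target changes.
./bin/mongo --port 40000   # connect to the mongos instead of a mongod
> use test
> db.users.find({ id: 123 })   // mongos consults the config servers and routes this to the right shard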
Configuring a sharded cluster
Here we build a simple sharded cluster on a single physical machine:
Shard server 1: 27017
Shard server 2: 27018
Config server: 27027
Route process: 40000
Step 1: Start the Shard Servers
mkdir -p ./data/shard/s0 ./data/shard/s1  # create the data directories
mkdir -p ./data/shard/log  # create the log directory
./bin/mongod --port 27017 --dbpath /usr/local/mongodb/data/shard/s0 --fork --logpath /usr/local/mongodb/data/shard/log/s0.log  # start Shard Server instance 1
./bin/mongod --port 27018 --dbpath /usr/local/mongodb/data/shard/s1 --fork --logpath /usr/local/mongodb/data/shard/log/s1.log  # start Shard Server instance 2
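Before moving on, it is worth verifying that both mongod instances came up. One quick way, assuming the binaries and ports above, is to ping each one from the shell:
./bin/mongo --port 27017 --eval 'db.runCommand({ ping: 1 })'   # should print { "ok" : 1 }
./bin/mongo --port 27018 --eval 'db.runCommand({ ping: 1 })'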
Step 2: Start the Config Server
mkdir -p ./data/shard/config  # create the data directory
./bin/mongod --port 27027 --dbpath /usr/local/mongodb/data/shard/config --fork --logpath /usr/local/mongodb/data/shard/log/config.log  # start the Config Server instance
Note that we can start these just like an ordinary MongoDB service, without the --shardsvr and --configsvr parameters. Those two parameters merely change the default listening port, and here we specify the ports ourselves.
Step 3: Start the Route Process
./bin/mongos --port 40000 --configdb localhost:27027 --fork --logpath /usr/local/mongodb/data/shard/log/route.log --chunkSize=1  # start the Route Server instance
Among the mongos startup parameters, chunkSize specifies the chunk size in MB; the default is 200MB. To make the sharding effect easy to observe while testing, we set chunkSize to 1MB, which means chunk migration begins once the data inserted into a chunk exceeds 1MB.
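If you forget the flag at startup, the chunk size can also be changed afterwards through the config database; this is the documented approach for older releases, and the value is in MB:
use config
db.settings.save({ _id: "chunksize", value: 1 })   // applies cluster-wide; no mongos restart needed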
Step 4: Configure Sharding
# Log in to mongos with the MongoDB shell and add the shard nodes (this must connect to the admin database)
./bin/mongo admin --port 40000
> db.runCommand({ addshard: "localhost:27017" })  # add a Shard Server; sh.addShard() does the same, likewise below
{ "shardAdded": "shard0000", "ok": 1 }
> db.runCommand({ addshard: "localhost:27018" })
{ "shardAdded": "shard0001", "ok": 1 }
> db.runCommand({ enablesharding: "test" })  # enable sharding on the database
{ "ok": 1 }
> db.runCommand({ shardcollection: "test.users", key: { id: 1 } })  # set the collection to shard; a shard key must be given, the system creates an index on it automatically and routes records by it
{ "collectionsharded": "test.users", "ok": 1 }
> sh.status()  # view the shard status
> printShardingStatus(db.getSisterDB("config"), 1)  # view the shard status (verbose version)
> db.stats()  # view the status of all shard servers
Note the choice of shard key here. The key must be chosen according to the shape of your actual business data; never pick it casually. In particular, do not casually use _id as the shard key unless you are very clear about why you are doing so (I will not analyze the specific reasons here). From experience, a reasonable pattern for a shard key is "auto-increment field + query field"; yes, a shard key can combine multiple fields.
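As a sketch of that "auto-increment field + query field" pattern, with hypothetical field names uid (monotonically increasing) and username (the commonly queried field), for illustration only:
sh.shardCollection("test.users2", { uid: 1, username: 1 })   // a compound shard key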
One more point about the sharding mechanism: MongoDB does not scatter data evenly across shards at the level of individual documents. Instead, every n documents form a block, a "chunk", which is placed on one shard first; when the number of chunks on that shard exceeds another shard's by a large enough margin (>= 3), chunks are migrated to the other shard. Balance between shards is maintained in units of chunks.
That is, when you first insert data, it is inserted onto only one shard. After the inserts, MongoDB starts moving data between shards. This process may not be immediate: MongoDB is smart enough to decide, based on the current load, whether to migrate now or later.
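You can watch this decision-making from the shell: the balancer may be enabled yet not actively migrating at a given moment. Both helpers below are standard mongo shell functions:
sh.getBalancerState()     // true: the balancer is enabled
sh.isBalancerRunning()    // true: a migration round is in progress right now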
Right after inserting the data, run db.users.stats() twice, some time apart; comparing the two results verifies the behavior described above, as in the sketch below.
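A minimal way to reproduce that check, assuming the test.users collection sharded on { id: 1 } from Step 4:
use test
for (var i = 0; i < 100000; i++) { db.users.insert({ id: i, name: "user" + i }) }   // enough data to exceed the 1MB chunk size
db.users.stats()   // run twice, some time apart: the per-shard counts change as chunks migrate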
This automatic mechanism saves manual maintenance costs, but because data is first inserted onto one shard and chunks are moved only after an imbalance appears, and because chunks keep moving back and forth between shard instances as the data grows, it imposes a lot of IO overhead on the servers. The way to avoid this overhead is to pre-split the data manually beforehand.
Manual pre-splitting
Take the shop.user collection as an example:
sh.shardCollection('shop.user', { userid: 1 })  # shard the user collection with userid as the shard key
for (var i = 1; i <= 40; i++) { sh.splitAt('shop.user', { userid: i * 1000 }) }  # pre-split chunks at boundaries 1K, 2K ... 40K (the chunks are empty at this point); these chunks are then moved evenly onto each shard
Now add user data through mongos. The data is written into the pre-allocated chunks, and chunks no longer move back and forth.
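To confirm the pre-split worked, you can count the chunks for shop.user before inserting any data (again assuming the older config.chunks layout with an ns field):
use config
db.chunks.find({ ns: "shop.user" }).count()   // 40 split points should yield about 41 chunks
sh.status()                                   // shows the empty chunks already spread across the shards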
Replica Sets and Sharding
If a MongoDB deployment really grows to the point of needing sharding, the shard servers should definitely be replica sets. The deployment follows the same basic steps as above; only two points need attention:
sh.addShard(host)  # host is either server:port or setname/server:port; if the shard is a replica set, prefix the address with the replica set's name, as in the example below
sh.addShard('ras/192.168.42.168:27017')  # 27017 is the primary of the replica set
Also, when starting the mongod services it is best to write out the IP addresses; otherwise unpredictable errors may occur.
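Here is a minimal end-to-end sketch of adding a replica-set shard, assuming three mongod instances already started with --replSet ras; the set name and the first host come from the example above, while the second and third hosts are hypothetical:
// on one member of the future shard, initiate the replica set
rs.initiate({
    _id: "ras",
    members: [
        { _id: 0, host: "192.168.42.168:27017" },
        { _id: 1, host: "192.168.42.169:27017" },
        { _id: 2, host: "192.168.42.170:27017" }
    ]
})
// then, on the mongos, register the whole set as one shard
sh.addShard("ras/192.168.42.168:27017")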
Viewing the sharding configuration
1. List the databases that use sharding
To list the databases with sharding enabled, query the config database. If a database's partitioned field is true, that database uses sharding.
Connect to a mongos instance and run this command to find the databases with sharding enabled:
use config
db.databases.find({ "partitioned": true })
For example, the following command returns all databases in the cluster:
use config
db.databases.find()
If the result returned is:
{ "_id": "admin", "partitioned": false, "primary": "config" }
{ "_id": "mydb", "partitioned": true, "primary": "firstset" }
{ "_id": "test", "partitioned": false, "primary": "secondset" }
then only mydb has sharding enabled.
2. List all the shards
To list all the shards in the current cluster, use the listshards command:
use admin
db.runCommand({ listshards: 1 })
The result returned:
{
    "shards": [
        {
            "_id": "firstset",
            "host": "firstset/mongo01:10001,mongo01:10002,mongo01:10003"
        },
        {
            "_id": "secondset",
            "host": "secondset/mongo01:30001,mongo01:30002,mongo01:30003"
        }
    ],
    "ok": 1
}
3. View the cluster details
To view the details of the cluster, use either db.printShardingStatus() or sh.status(). Both methods return the same result.
For example, using sh.status() to view the information:
--- Sharding Status ---
sharding version: { "_id": 1, "version": 4, "minCompatibleVersion": 4, "currentVersion": 5, "clusterId": ObjectId("535a2dab0063b308757e1b70") }
shards:
    { "_id": "firstset", "host": "firstset/mongo01:10001,mongo01:10002,mongo01:10003" }
    { "_id": "secondset", "host": "secondset/mongo01:30001,mongo01:30002,mongo01:30003" }
databases:
    { "_id": "admin", "partitioned": false, "primary": "config" }
    { "_id": "mydb", "partitioned": true, "primary": "firstset" }
        mydb.test_collection
            shard key: { "name": 1 }
            chunks:
                secondset  6
                firstset   6
            { "name": { "$minKey": 1 } } -->> { "name": "cat" } on : secondset Timestamp(2, 0)
            { "name": "cat" } -->> { "name": "cow" } on : secondset Timestamp(3, 0)
            { "name": "cow" } -->> { "name": "dog" } on : secondset Timestamp(4, 0)
            { "name": "dog" } -->> { "name": "dragon" } on : secondset Timestamp(5, 0)
            { "name": "dragon" } -->> { "name": "elephant" } on : secondset Timestamp(6, 0)
            { "name": "elephant" } -->> { "name": "horse" } on : secondset Timestamp(7, 0)
            { "name": "horse" } -->> { "name": "lion" } on : firstset Timestamp(7, 1)
            { "name": "lion" } -->> { "name": "pig" } on : firstset Timestamp(1, 7)
            { "name": "pig" } -->> { "name": "rabbit" } on : firstset Timestamp(1, 8)
            { "name": "rabbit" } -->> { "name": "snake" } on : firstset Timestamp(1, 9)
            { "name": "snake" } -->> { "name": "tiger" } on : firstset Timestamp(1, 10)
            { "name": "tiger" } -->> { "name": { "$maxKey": 1 } } on : firstset Timestamp(1, 11)
    { "_id": "test", "partitioned": false, "primary": "secondset" }
(1) sharding version shows the version of the shard metadata.
(2) shards lists the mongod instances serving as shards in the cluster.
(3) databases lists all databases in the cluster, including those without sharding enabled.
(4) The chunk information shows how many chunks sit on each shard for the mydb database and the key range of each chunk.
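When the sh.status() output grows long, the same per-shard chunk counts can be computed directly from the metadata; this again assumes the older config.chunks layout with an ns field:
use config
db.chunks.aggregate([
    { $match: { ns: "mydb.test_collection" } },
    { $group: { _id: "$shard", nChunks: { $sum: 1 } } }
])   // e.g. { "_id" : "firstset", "nChunks" : 6 }, { "_id" : "secondset", "nChunks" : 6 }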