The sharding principle adopted by mongodb is actually very simple. To put it bluntly, a cake is very big and requires a very large box to be loaded. It is not convenient even if it is stored, therefore, the big cake is cut into small cakes for storage. this idea is also applied in other applications or databases. for example: mysql partition, fastdfs group 1, mongodb partition, two common types
The sharding principle adopted by mongodb is actually very simple. To put it bluntly, a cake is very big and requires a very large box to be loaded. It is not convenient even if it is stored, therefore, the big cake is cut into small cakes for storage. this idea is also applied in other applications or databases. for example: mysql partition, fastdfs group 1, mongodb partition, two common types
The sharding principle adopted by mongodb is actually very simple. To put it bluntly, a cake is very big and requires a very large box to be loaded. It is not convenient even if it is stored, therefore, the big cake is cut into small cakes for storage. this idea is also applied in other applications or databases. for example:Mysql Partition,Fastdfs Group
I. mongodb sharding, two common architectures
1. Each client system contains a router mongos,
Each client has a route
The number of servers has increased from a dozen to several hundred. The mongos route and the mongod shard server have established several hundred or even thousands of connections, and the load is very heavy. This means that each chunk is balanced (the balancing measures required by the MongoDB sharding cluster to maintain even data distribution) it takes a long time to transmit the chunk location information stored in the configuration database. This is because every mongos route must clearly know where each chunk exists in the cluster.
2. vro independent
Route Server
The routing independently solves the problem mentioned above. The fewer routes, the simpler the topology.
The router here includes the connection pool FunctionThe above two images are from the network
Ii. Server description
1. mongodb multipart Storage
Server1 dual: Shard1/Shard2 ip: 192.168.10.103
Server 2: Shard1/Shard2 ip: 192.168.10.219
2. Routing
Server3 ip: 192.168.10.209
From the configuration of the vro, we can see that only independent routing is used.
Iii. install and configure mongodb replica sets shard
1. mongodb Installation
: Http://www.mongodb.org/downloads
After decompression, you can use
2. Create directories on the three servers respectively.
# mkdir /usr/local/mongodb26/configsvr/ -p# mkdir /usr/local/mongodb26/log/ -p# mkdir /usr/local/mongodb26/shard1/ -p# mkdir /usr/local/mongodb26/shard2/ -p# mkdir /usr/local/mongodb26/key/ -p# mkdir /usr/local/mongodb26/conf/ -p
Why create a directory? Because the decompressed mongodb has a directory bin directory, which is too small.
3. Create a partition configuration file for 192.168.10.103 and 192.168.10.219.
# Cat/usr/local/mongodb26/conf/shard1.confdbpath =/usr/local/mongodb26/shard1 // database path directoryperdb = true // set that each database will be saved in a separate directory shardsvr = true // start the multipart replSet = shard1 // set the full set name to shard1, replSet is to let the server know that there are other machines in this "shard1" Replica set port = 20001 // port number oplogSize = 100 // copy the log size MBpidfilepath =/usr/local/mongodb26/shard1/ mongodb. pid // pid file path logpath =/usr/local/mongodb26/log/shard1.log // log Path logappend = true // write data to profile = 1 by append // database analysis, 1 indicates that only slow operations are recorded. slowms = 5 // The time settings are considered as slow queries. The root mysql slow query is a bit like fork = true // run as a daemon, create a server process
Shard2 root shard1 is basically the same
# cat /usr/local/mongodb26/conf/shard2.confdbpath = /usr/local/mongodb26/shard2directoryperdb = trueshardsvr = truereplSet = shard2port = 20002oplogSize = 100pidfilepath = /usr/local/mongodb26/shard2/mongodb.pidlogpath = /usr/local/mongodb26/log/shard2.loglogappend = trueprofile = 1slowms = 5fork = true
The two machines in the multipart storage, respectively, create these two files.
4, 192.168.10.209 route Server
The route has two configuration files.
# Cat/usr/local/mongodb26/conf/configsvr. conf // route configuration, configuration file pidfilepath =/usr/local/mongodb26/configsvr/mongodb. piddbpath =/usr/local/mongodb26/configsvrdirectoryperdb = trueconfigsvr = true // declare that this is the config service of a cluster. The default port is 27019, default directory/data/configdbport = 20000 logpath =/usr/local/mongodb26/log/configsvr. loglogappend = truefork = true
# Cat/usr/local/mongodb26/conf/mongos. conf // route configuration, configuration file configdb = 192.168.10.209: 20000 // configuration server for the listener, only one or three ports = 30000 chunkSize = 1 // use 100 or delete to generate an environment in mb, after deletion, the default value is 64 logpath =/usr/local/mongodb26/log/mongos. loglogappend = truefork = true
4. Start the mongodb Process
1. Start of the 192.168.10.103 and 192.168.10.219 multipart storage nodes
/usr/local/mongodb26/bin/mongod -f /usr/local/mongodb26/conf/shard1.conf/usr/local/mongodb26/bin/mongod -f /usr/local/mongodb26/conf/shard2.conf
2. Start on the 192.168.10.209 route Node
/usr/local/mongodb26/bin/mongod -f /usr/local/mongodb26/conf/configsvr.conf/usr/local/mongodb26/bin/mongos -f /usr/local/mongodb26/conf/mongos.conf
If the following error is reported:
#/Usr/local/mongodb26/bin/mongos-f/usr/local/mongodb26/conf/mongos. conf
BadValue need either 1 or 3 configdbs
Try '/usr/local/mongodb26/bin/mongos -- help' for more information
Causes and solutions:
The configdb parameter in the mongos. conf file is incorrect. Here, the number of routes can only be 1 or 3. Configdb = 192.168.10.209: 20000
5. Check whether the startup is successful.
1. Start of the 192.168.10.103 and 192.168.10.219 multipart storage nodes
[root@localhost conf]# netstat -tpnl |grep mongodtcp 0 0 0.0.0.0:20001 0.0.0.0:* LISTEN 21556/mongodtcp 0 0 0.0.0.0:20002 0.0.0.0:* LISTEN 21976/mongod
2. Start on the 192.168.10.209 route Node
[root@localhost mongodb26]# netstat -tpnl |grep montcp 0 0 0.0.0.0:30000 0.0.0.0:* LISTEN 3521/mongostcp 0 0 0.0.0.0:20000 0.0.0.0:* LISTEN 3946/mongod
If all listening ports have been started, it indicates that they have been added successfully.
Sat,Configure mongodb replica set Node
1. Configure port 20001
# /usr/local/mongodb26/bin/mongo --port 20001> use admin> config = {_id:"shard1", members: [ {_id: 0, host:"192.168.10.103:20001"}, {_id: 1, host:"192.168.10.219:20001"} ]};{ "_id" : "shard1", "members" : [ { "_id" : 0, "host" : "192.168.10.103:20001" }, { "_id" : 1, "host" : "192.168.10.219:20001" } ]}> rs.initiate(config);> exit;
2. Configure port 20002
# /usr/local/mongodb26/bin/mongo --port 20002> use admin> config = {_id:"shard2", members: [ {_id: 0, host:"192.168.10.103:20002"}, {_id: 1, host:"192.168.10.219:20002"} ]};{ "_id" : "shard2", "members" : [ { "_id" : 0, "host" : "192.168.10.103:20002" }, { "_id" : 1, "host" : "192.168.10.219:20002" } ]}> rs.initiate(config);> exit;
Perform the multipart, 192.168.10.103, and 192.168.10.219 operations on any machine. The two ports are configured once.
7. mongos route section configuration
# /usr/local/mongodb26/bin/mongo --port 30000mongos> use adminmongos> db.runCommand({addshard:"shard1/192.168.10.103:20001,192.168.10.219:20001",name:"shard1", maxsize:20480} )mongos> db.runCommand({addshard:"shard2/192.168.10.103:20002,192.168.10.219:20002",name:"shard2", maxsize:20480} );mongos> db.runCommand( {listshards : 1 } ){ "shards" : [ { "_id" : "shard1", "host" : "shard1/192.168.10.103:20001,192.168.10.219:20001" }, { "_id" : "shard2", "host" : "shard2/192.168.10.103:20002,192.168.10.219:20002" } ], "ok" : 1}
Here, the replica set + shard function is configured. Note: although the configuration is complete, you must declare that the database and table must be sharded.
8. declare that the database and table must be sharded.
Mongos> use admin // switch to the admin database mongos> db. runCommand ({enablesharding: "test2"}); // declare that the test2 database allows sharding {"OK": 1} mongos> db. runCommand ({shardcollection: "test2.books", key: {id: 1}); // declare that the books table is sharded {"collectionsharded": "test2.books", "OK ": 1} mongos> use test2 // switch to test2mongos> db. stats (); // view the database status mongos> db. books. stats (); // view the table status
If the following error is reported:
{"OK": 0, "errmsg": "sharding not enabled for db"}, because the library does not allow sharding
9. Test
1. Test script
mongos> for (var i = 1; i <= 20000; i++) db.books.save({id:i,name:"12345678",sex:"male",age:27,value:"test"});
2. Test Results
mongos> db.books.stats();{ "sharded" : true, "systemFlags" : 1, "userFlags" : 1, "ns" : "test2.books", "count" : 20000, "numExtents" : 10, "size" : 2240000, "storageSize" : 5586944, "totalIndexSize" : 1267280, "indexSizes" : { "_id_" : 678608, "id_1" : 588672 }, "avgObjSize" : 112, "nindexes" : 2, "nchunks" : 4, "shards" : { "shard1" : { "ns" : "test2.books", "count" : 5499, "size" : 615888, "avgObjSize" : 112, "storageSize" : 2793472, "numExtents" : 5, "nindexes" : 2, "lastExtentSize" : 2097152, "paddingFactor" : 1, "systemFlags" : 1, "userFlags" : 1, "totalIndexSize" : 367920, "indexSizes" : { "_id_" : 196224, "id_1" : 171696 }, "ok" : 1 }, "shard2" : { "ns" : "test2.books", "count" : 14501, "size" : 1624112, "avgObjSize" : 112, "storageSize" : 2793472, "numExtents" : 5, "nindexes" : 2, "lastExtentSize" : 2097152, "paddingFactor" : 1, "systemFlags" : 1, "userFlags" : 1, "totalIndexSize" : 899360, "indexSizes" : { "_id_" : 482384, "id_1" : 416976 }, "ok" : 1 } }, "ok" : 1}
3. Test results and data analysis
1), data records, all inserted to shard2
2), shard2 data remains unchanged, and shard1 Data begins to increase.
3), shard1 data remains unchanged, and shard2 data is reduced.
Data records are inserted at a time, which can be balanced in about 5 seconds. You can repeat this command in db. book. stats (), and you will find data flow changes.
The sharding principle adopted by mongodb is actually very simple. To put it bluntly, a cake is very big and requires a very large box to be loaded. It is not convenient even if it is stored, therefore, the big cake is cut into small cakes for storage. this idea is also applied in other applications or databases. for example: mysql partition, fastdfs group 1, mongodb sharding, two common architectures 1, each client system contains a router mongos, the number of servers has increased from a dozen to several hundred. The mongos route and the mongod shard server have established several hundred or even thousands of connections, and the load is very heavy. This means that each chunk is balanced (the balancing measures required by the MongoDB sharding cluster to maintain even data distribution) it takes a long time to transmit the chunk location information stored in the configuration database. This is because every mongos route must clearly know where each chunk exists in the cluster. 2. The independent router routing solves the problem mentioned above. The fewer routes, the simpler the topology. the router here includes the connection pool function. the above two images are from Network 2. Server description 1: mongodb sharded storage Server1 dual: Shard1/Shard2 ip: 192.168.10.103Server2 dual: Shard1/Shard2 ip: 192.168.10.2192, route Server3 ip: 192.168.10.209 from the configuration above, we can see that only the independent routing solution is used. iii. mongodb replica sets shard installation and configuration 1. mongodb installation: export mkdir/usr/local/mongodb26/configsvr/-p # mkdir/usr/local/mongodb26/log/-p # mkdir/usr/local/mongodb26/shard1/-p # mkdir/usr/local/mongodb26/shard2/-p # mkdir/usr/local/mongodb26/key/-p # mkdir [...]