I. Overview
Sharding is a method of distributing data across multiple machines. MongoDB uses sharding to support deployments with very large data sets and high-throughput operations. There are two approaches to system growth: vertical scaling and horizontal scaling.
Vertical scaling means increasing the capacity of a single server, such as using a more powerful CPU, adding more RAM, or adding more storage. The concurrency and storage capacity a single machine can support are limited by hardware cost and hardware performance, so vertical scaling has a practical upper limit.
Horizontal scaling means distributing the data set and the load across multiple servers, adding servers to increase capacity as needed. Although a single machine's speed or capacity may be modest, each machine handles a portion of the total workload, which can be more efficient than one high-speed, high-capacity server. Low-cost commodity machines can often be used, so the total cost may be lower than a single high-end machine, although it does add maintenance complexity.
MongoDB Version: 3.6
II. Sharded Cluster Architecture
1). Shards
Sharding means splitting the data and spreading it across different machines. MongoDB's sharding mechanism lets you build a cluster of many machines, with subsets of the data scattered across the cluster; each shard maintains a subset of the data set. A sharded cluster architecture gives applications greater data-handling capacity than a standalone server or a replica set.
Note: Each shard consists of a replica set. As of MongoDB 3.6, every shard must be a replica set.
2). Config Servers
The config servers are the brain of the whole cluster: they hold the metadata for the cluster and its shards, such as shard information, cluster database information, sharded-collection information, chunk information, balancer information, version information, the cluster operation log, and related settings. Config server data must therefore be stored on durable storage. Each config server should run on a separate physical machine, preferably geographically distributed, with journaling enabled.
Note: As of MongoDB 3.4, the config servers must also be deployed as a replica set.
3). The mongos Process
mongos provides the interface between client applications and the sharded cluster.
Deploying multiple mongos instances supports high availability and scalability. A common pattern is to run one mongos on each application server, which reduces network latency between the application and the router. Alternatively, you can deploy mongos on dedicated servers; large deployments typically use this approach because separating the application servers from the mongos tier gives better control over the number of connections to the mongod instances.
You can also deploy mongos on the servers that host the shards; mongos does not share memory with the mongod instances, but you need to be aware of the problems that memory contention may cause.
In theory you can deploy any number of mongos routers. However, because mongos communicates frequently with the config servers, monitor config server performance closely as you increase the number of mongos instances; if you see performance degradation, reduce the number of mongos routers.
III. Deploying the Shards
1). Environment
192.168.137.10:rs-a-1:27010;rs-a-2:27011;rs-a-3:27012
192.168.137.20:rs-b-1:28010;rs-b-2:28011;rs-b-3:28012
192.168.137.30:config-1:29010;config-2:29011;config-3:29012;mongos:30000
Rs-a Shard Replica Set configuration
-------------- rs-a-1 configuration --------------
pidfilepath=/rs-a-1/mongod.pid
logpath=/rs-a-1/data/log/mongod.log
dbpath=/rs-a-1/data/db
logappend=true
bind_ip=192.168.137.10,127.0.0.1
port=27010
fork=true
auth=true
replSet=rs-a
shardsvr=true
keyFile=/rs-a-1/autokey
-------------- rs-a-2 configuration --------------
pidfilepath=/rs-a-2/mongod.pid
logpath=/rs-a-2/data/log/mongod.log
dbpath=/rs-a-2/data/db
logappend=true
bind_ip=192.168.137.10,127.0.0.1
port=27011
fork=true
auth=true
replSet=rs-a
shardsvr=true
keyFile=/rs-a-2/autokey
-------------- rs-a-3 configuration --------------
pidfilepath=/rs-a-3/mongod.pid
logpath=/rs-a-3/data/log/mongod.log
dbpath=/rs-a-3/data/db
logappend=true
bind_ip=192.168.137.10,127.0.0.1
port=27012
fork=true
auth=true
replSet=rs-a
shardsvr=true
keyFile=/rs-a-3/autokey
Rs-b Shard Replica Set configuration
-------------- rs-b-1 configuration --------------
pidfilepath=/rs-b-1/mongod.pid
logpath=/rs-b-1/data/log/mongod.log
dbpath=/rs-b-1/data/db
logappend=true
bind_ip=192.168.137.20,127.0.0.1
port=28010
fork=true
auth=true
replSet=rs-b
shardsvr=true
keyFile=/rs-b-1/autokey
-------------- rs-b-2 configuration --------------
pidfilepath=/rs-b-2/mongod.pid
logpath=/rs-b-2/data/log/mongod.log
dbpath=/rs-b-2/data/db
logappend=true
bind_ip=192.168.137.20,127.0.0.1
port=28011
fork=true
auth=true
replSet=rs-b
shardsvr=true
keyFile=/rs-b-2/autokey
-------------- rs-b-3 configuration --------------
pidfilepath=/rs-b-3/mongod.pid
logpath=/rs-b-3/data/log/mongod.log
dbpath=/rs-b-3/data/db
logappend=true
bind_ip=192.168.137.20,127.0.0.1
port=28012
fork=true
auth=true
replSet=rs-b
shardsvr=true
keyFile=/rs-b-3/autokey
Config configuration server configuration
-------------- config-1 configuration --------------
pidfilepath=/config-1/mongod.pid
logpath=/config-1/data/log/mongod.log
dbpath=/config-1/data/db
logappend=true
bind_ip=192.168.137.30,127.0.0.1
port=29010
fork=true
auth=true
configsvr=true
replSet=config
keyFile=/config-1/autokey
-------------- config-2 configuration --------------
pidfilepath=/config-2/mongod.pid
logpath=/config-2/data/log/mongod.log
dbpath=/config-2/data/db
logappend=true
bind_ip=192.168.137.30,127.0.0.1
port=29011
fork=true
auth=true
configsvr=true
replSet=config
keyFile=/config-2/autokey
-------------- config-3 configuration --------------
pidfilepath=/config-3/mongod.pid
logpath=/config-3/data/log/mongod.log
dbpath=/config-3/data/db
logappend=true
bind_ip=192.168.137.30,127.0.0.1
port=29012
fork=true
auth=true
configsvr=true
replSet=config
keyFile=/config-3/autokey
mongos routing configuration
configdb=config/192.168.137.30:29010,192.168.137.30:29011,192.168.137.30:29012
port=30000
logpath=/mongos/log/route.log
bind_ip=192.168.137.30,127.0.0.1
logappend=true
fork=true
keyFile=/mongos/autokey
maxConns=20000
Note: This node layout is deliberately simplified for ease of understanding. In a real production environment, the data nodes should be placed on separate machines.
2). Shard Configuration
1. Start all shard replica sets (rs-a, rs-b) and the config server replica set (config)
You can refer to my earlier article on building a replica set.
MongoDB replica set setup: http://www.cnblogs.com/chenmh/p/8484049.html
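Step 1 can be sketched as follows. This is a minimal sketch that assumes the configuration files shown above are saved as `/rs-a-1/mongod.conf`, `/rs-a-2/mongod.conf`, and so on (the file names and paths are assumptions, not something fixed by MongoDB):

```shell
# Start the three rs-a members (repeat the same pattern for rs-b and config,
# each with its own configuration file)
mongod --config /rs-a-1/mongod.conf
mongod --config /rs-a-2/mongod.conf
mongod --config /rs-a-3/mongod.conf

# Initiate the rs-a replica set from any one of its members
mongo --port 27010 --eval '
rs.initiate({
  _id: "rs-a",
  members: [
    { _id: 0, host: "192.168.137.10:27010" },
    { _id: 1, host: "192.168.137.10:27011" },
    { _id: 2, host: "192.168.137.10:27012" }
  ]
})'
# For the config replica set, the rs.initiate() document additionally
# needs "configsvr: true".
```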
2. Start the mongos router
mongos --config /mongos/mongos.conf
3. Shard Configuration
Log in and authenticate
mongo --port 30000
use admin
db.auth("dba","dba")
Adding shards
sh.addShard("rs-a/192.168.137.10:27010,192.168.137.10:27011,192.168.137.10:27012");
sh.addShard("rs-b/192.168.137.20:28010,192.168.137.20:28011,192.168.137.20:28012");
Shard the collections
use admin
sh.enableSharding("test");
sh.shardCollection("test.person", {_id: 1});
sh.enableSharding("news");
sh.shardCollection("news.person", {"username": "hashed"});
Note: The shard key here can be specified in three ways: 1, -1, or "hashed".
1. If the collection to be sharded is empty, sh.shardCollection() creates the index on the shard key automatically; you do not need to create it in advance.
2. If the collection is non-empty, you must create the shard-key index manually before sharding.
3. If the collection has a unique index, the shard key must be the key of that unique index.
For details, see: https://docs.mongodb.com/manual/reference/method/sh.shardCollection/index.html
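The difference between the ranged key {_id: 1} and the hashed key on username matters for the monotonically increasing _id values inserted below. Here is a simplified Python sketch of the two strategies; it uses MD5 as a stand-in for MongoDB's internal 64-bit hash, so the chunk boundaries are illustrative only, not MongoDB's actual ones:

```python
import hashlib

NUM_SHARDS = 2

def ranged_shard(key, split_point=50000):
    # Ranged sharding on {_id: 1}: a split point divides the key space,
    # so contiguous key ranges live on the same shard.
    return 0 if key < split_point else 1

def hashed_shard(key):
    # Hashed sharding: hash the key first, then partition the hash space.
    # MD5 here is only a stand-in for MongoDB's internal hash function.
    h = int(hashlib.md5(str(key).encode()).hexdigest(), 16)
    return h % NUM_SHARDS

# Simulate inserting 100,000 monotonically increasing ids,
# as in the test data below.
ranged_counts = [0] * NUM_SHARDS
hashed_counts = [0] * NUM_SHARDS
for i in range(100000):
    ranged_counts[ranged_shard(i)] += 1
    hashed_counts[hashed_shard(i)] += 1

print("ranged:", ranged_counts)  # every *new* (highest) id lands on shard 1
print("hashed:", hashed_counts)  # roughly even split across both shards
```

With the ranged key, every newly inserted id is greater than the split point, so all new writes hit one shard; with the hashed key, consecutive ids scatter roughly evenly.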
Insert test data
use test
for (var i = 0; i < 100000; i++) {
    db.person.insert({"_id": i, "username": "User" + i, "createdate": new Date()})
}
use news
for (var i = 0; i < 100000; i++) {
    db.person.insert({"_id": i, "username": "User" + i, "createdate": new Date()})
}
3). Query
sh.status();
Note: You can see that the chunks of each database are evenly distributed across the two shards rs-a and rs-b. Next, look at the chunk distribution after adding a new shard.
4). Add a new Shard
Note: After a new shard is added, chunks are migrated again so that they remain evenly distributed across all shards. Chunk migration is asynchronous.
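To watch the rebalancing after adding a shard, you can run the standard helpers below in the mongos shell (the output will depend on your cluster's state):

```javascript
// In the mongos shell:
sh.getBalancerState()             // true if the balancer is enabled
sh.isBalancerRunning()            // true while a migration round is in progress
use test
db.person.getShardDistribution()  // per-shard document and chunk counts
sh.status()                       // full chunk distribution per collection
```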
IV. Summary
Chunks are moved to other shards asynchronously. Choose the shard key carefully: in a write-intensive system, a poorly chosen key concentrates all writes on a single shard, and only afterwards are the chunks moved asynchronously to the other shards.
Note: pursuer.chen. Blog: http://www.cnblogs.com/chenmh. All posts on this site are original. Reprinting is welcome, but you must credit the source and include a clear link at the beginning of the article; otherwise the author reserves the right to pursue responsibility. Discussion is welcome.