MongoDB sharding cluster with replica set

Source: Internet
Author: User
Tags mongodb sharding

MongoDB 1.6 provides sharding and replica sets, which make MongoDB suitable for deployment in production environments.
Sharding

1. MongoDB cluster structure
A MongoDB cluster consists of the following services:
A. Shard servers: mongod instances, two or more, that store the actual data shards. In a production environment, each shard server can be replaced by a replica set composed of several servers, which avoids a single point of failure (SPOF) on one host.
B. Config servers: mongod instances, one or more, that store the metadata of the entire cluster.
C. Routing processes: mongos instances, one or more. Clients connect through this front-end router, so the entire cluster looks like a single database and client applications can use it transparently. The routing process stores no data itself; its data comes from the config server.
2. Sharding mechanism
A. Shard keys: MongoDB partitions data based on shard keys. A shard key can be composed of one or more fields of the document.
B. Chunks: a chunk is a set of data. In MongoDB, a chunk is represented as a (collection, minKey, maxKey) triple. A chunk covers the documents whose shard key value lies between minKey and maxKey, but the chunk itself does not store the actual data. MongoDB keeps the chunk metadata in the chunks collection of the config database.
C. Autosplit and movechunk: when a chunk's size reaches chunksize (200 MB by default, which can be set when mongos is started), MongoDB splits the chunk into two chunks at a median value between minKey and maxKey. After the split, MongoDB decides, based on the load of each shard, whether to move the new chunk to another shard (movechunk).
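The chunk triple and the split step described above can be pictured with a small model. This is only a sketch in plain JavaScript, not MongoDB's internal code: the function names findChunk and splitChunk, the numeric key values, and the shard field are all hypothetical, and ranges are treated as half-open (minKey inclusive, maxKey exclusive).

```javascript
// Illustrative model only: findChunk and splitChunk are hypothetical names,
// not MongoDB internals. A chunk is pure metadata: the collection namespace
// plus a key range [min, max) and the shard that currently owns it.
function findChunk(chunkList, ns, keyValue) {
  // Return the chunk whose range contains the shard key value, or null.
  for (const chunk of chunkList) {
    if (chunk.ns === ns && keyValue >= chunk.min && keyValue < chunk.max) {
      return chunk;
    }
  }
  return null;
}

function splitChunk(chunk, splitKey) {
  // Split one chunk into two adjacent chunks at splitKey.
  // Both halves start on the same shard; moving one is a separate step.
  if (!(splitKey > chunk.min && splitKey < chunk.max)) {
    throw new Error("split key must lie strictly inside the chunk range");
  }
  return [
    { ns: chunk.ns, min: chunk.min, max: splitKey, shard: chunk.shard },
    { ns: chunk.ns, min: splitKey, max: chunk.max, shard: chunk.shard },
  ];
}

// Two chunks covering the whole key space of one collection.
const chunkTable = [
  { ns: "dbname.colname", min: -Infinity, max: 100, shard: "S1" },
  { ns: "dbname.colname", min: 100, max: Infinity, shard: "S2" },
];

// Split the first chunk at an assumed median key of 50.
const [leftChunk, rightChunk] = splitChunk(chunkTable[0], 50);
```

Because the model never touches documents, it also shows why a split is cheap: only the metadata rows change, while moving a chunk to another shard is the step that actually copies data.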
3. Configuration
Take a single machine with two shards, one config server, and one mongos as an example.
A. Start the shard servers
Shard 1
./bin/mongod --shardsvr --port 27017 --dbpath ./data/shard1 --oplogSize 10000 --logpath ./data/shard1.log --logappend --fork
Shard 2
./bin/mongod --shardsvr --port 27018 --dbpath ./data/shard2 --oplogSize 10000 --logpath ./data/shard2.log --logappend --fork
shardsvr indicates that mongod is started as a shard.
port is the service port of mongod.
dbpath is the data storage location of mongod.
oplogSize is the size of mongod's oplog collection, in MB. MongoDB records each operation in the oplog. In replica set mode, this log is used for synchronization between servers, so increase it accordingly when the network is slow.
logpath is the location where mongod writes its logs.
logappend indicates that logs are written in append mode.
fork indicates that mongod runs in the background.
B. Start the config server
./bin/mongod --configsvr --dbpath ./data/config --port 20000 --logpath ./data/config.log --logappend --fork
The config server is itself a mongod database. Apart from the configsvr option, which marks it as a config server, the options are the same as for a shard server.
C. Start the routing process
./bin/mongos --configdb localhost:20000 --port 30000 --logpath ./data/mongos.log --logappend --fork
To start mongos, specify the location of the config server with configdb.
D. Configure the shards
./bin/mongo localhost:30000
Use the mongo client to connect to mongos.
> use admin;
Switch to the admin database.
> db.runCommand({addshard: "[IP]:27017", name: "S1"});
> db.runCommand({addshard: "[IP]:27018", name: "S2"});
Add the shards, where [IP] is the IP address of the shard server. (Using an IP address is recommended. If you use a hostname, an error may occur that causes the mongod process to fail.) name is the name of the shard and can be chosen freely.
> db.runCommand({enablesharding: "dbname"});
Enable sharding for the database.
> db.runCommand({shardcollection: "dbname.colname", key: {keyname: 1}});
Set the shard key of the collection.
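The order of the commands matters: shards first, then the database, then the collection. As a rough sketch, the command documents could be assembled by a small helper before being passed one by one to db.runCommand() on the admin database. The helper buildShardingCommands is hypothetical and does not contact a server; it only builds the same documents shown above.

```javascript
// Hypothetical helper: builds the command documents in the order they should
// be run. It does not talk to MongoDB; it only prepares plain objects.
function buildShardingCommands(shards, dbName, collName, keyName) {
  // 1. One addshard command per shard server.
  const commands = shards.map((s) => ({ addshard: s.host, name: s.name }));
  // 2. Allow the database to be sharded.
  commands.push({ enablesharding: dbName });
  // 3. Shard the collection on an ascending key.
  commands.push({
    shardcollection: dbName + "." + collName,
    key: { [keyName]: 1 },
  });
  return commands;
}

const shardingCommands = buildShardingCommands(
  [
    { host: "[IP]:27017", name: "S1" },
    { host: "[IP]:27018", name: "S2" },
  ],
  "dbname",
  "colname",
  "keyname"
);
```

In the shell, each element of shardingCommands would then be run in order while connected to mongos.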
E. Common commands
> use admin;
> db.printShardingStatus();
View the sharding status of the entire cluster.
> use dbname;
> db.colname.stats();
View the sharding details of a specific collection.

Replica set

1. Introduction to replica sets
Replica sets were introduced together with MongoDB's sharding technology. A replica set is the replication technology that MongoDB uses in a sharded environment. One replica set supports one to seven servers, and the data on every server in a replica set is kept fully consistent. With replica sets, MongoDB implements automatic failover and automatic recovery.
In a replica set, each server is in one of the following states:
A. Primary: the master node. A replica set has only one server in the primary state, and only the master node serves reads and writes to clients. If the master node fails, the replica set votes to promote a slave node to be the new master node.
B. Secondary: a slave node. Multiple secondary nodes are allowed. The data of each slave node is kept fully synchronized with that of the master node.
C. Recovering: when a server in the replica set fails or goes offline, its data can no longer be synchronized. After the server comes back, it copies data from the other members and stays in the recovering state while it does so. Once the data is synchronized, the node returns to the secondary state.
D. Arbiter: the arbitration node. This node is optional. When configured, it monitors the status of the other nodes in the replica set and votes in the election of a new master node during failure recovery. It does not store data. If there is no arbiter, the vote is conducted by all the other nodes.
E. Down: a node is in this state when it has failed or is offline.
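The vote for a new master node needs a majority of the voting members, which is why an arbiter is useful in small sets. The following is a simplified sketch of that arithmetic only; the function names are hypothetical, and real elections also take other factors into account, such as how up to date each member's data is.

```javascript
// Simplified model of the election arithmetic; names are hypothetical.
// A new primary needs votes from a strict majority of all voting members.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// True when the members that can still reach each other form a majority.
function canElectPrimary(votingMembers, reachableMembers) {
  return reachableMembers >= majority(votingMembers);
}

// Two data nodes, one fails: the survivor alone is not a majority.
const twoNodeSetSurvives = canElectPrimary(2, 1);
// Two data nodes plus an arbiter, one data node fails: 2 of 3 is a majority.
const withArbiterSurvives = canElectPrimary(3, 2);
```

Under this model, a two-node set cannot promote the surviving node on its own, while adding an arbiter as a third voter lets the set keep a master node after a single failure.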

2. Configuration
Take a replica set with two nodes configured on a single machine as an example.
A. Start mongod
./bin/mongod --replSet setname --port 27017 --dbpath ./data/shard1 --oplogSize 10000 --logpath ./data/set1.log --logappend --fork
./bin/mongod --replSet setname --port 27018 --dbpath ./data/shard2 --oplogSize 10000 --logpath ./data/set2.log --logappend --fork
replSet specifies the name of the replica set. Each mongod needs its own dbpath and port.
B. Initialize the replica set
./bin/mongo localhost:27017
Use mongo to connect to any machine in the replica set.
> use admin;
> config = {
_id: 'setname',
members: [
{_id: 0, host: '[IP]:27017'},
{_id: 1, host: '[IP]:27018'}
]
};
> rs.initiate(config);
The replica set is initialized from a config object.
_id: 'setname' is the name of the replica set, which must be the same as the name set at startup.
members lists the IP address and port of each server in the replica set.
The configuration is complete.
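For sets with more members, writing the config object by hand becomes repetitive. As a minimal sketch, it could be generated from a list of host strings. The helper buildReplSetConfig is hypothetical; the hosts reuse the ports 27017 and 27018 from the mongod commands above, with [IP] left as a placeholder.

```javascript
// Hypothetical helper: builds a replica set config object from a set name
// and a list of "host:port" strings, numbering the members from 0.
function buildReplSetConfig(setName, hosts) {
  return {
    _id: setName, // must match the name passed to mongod at startup
    members: hosts.map((host, index) => ({ _id: index, host: host })),
  };
}

const replConfig = buildReplSetConfig("setname", ["[IP]:27017", "[IP]:27018"]);
```

In the shell, replConfig could then be passed to rs.initiate() in place of the hand-written object.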
C. Common commands
> use admin;
> rs.status();
View the status of the replica set.
> rs.add("[IP]:27020");
Add a new member to the replica set.
D. Replica set configuration in the shard
You can make a shard a replica set by adding the --replSet setname option when starting its mongod instances.
For example:
./bin/mongod --shardsvr --replSet shard1 --port 27019 --dbpath ./data/shard3$hostid --oplogSize 10000 --logpath ./data/shard3$hostid.log --logappend --fork
After all the machines in a replica set are started, use mongo to connect to one of them and initialize the replica set. Note that for this step you connect to a mongod in the replica set, not to mongos.
Use the following command to add the shard.
> db.runCommand({addshard: "shard1/[IP1]:[port1],[IP2]:[port2]", name: "S1"});
Use the form <replica set name>/<IP>:<port>,<IP>:<port> to add the shard.
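The address format is easy to get wrong by hand, so a small helper can assemble it. This is only a sketch: the function replicaSetShardAddress is hypothetical, and the IP addresses in the example are made-up placeholders.

```javascript
// Hypothetical helper: joins a replica set name and its members into the
// "<setname>/<IP>:<port>,<IP>:<port>" form expected by addshard.
function replicaSetShardAddress(setName, members) {
  const hosts = members.map((m) => m.ip + ":" + m.port);
  return setName + "/" + hosts.join(",");
}

// Example with placeholder addresses (not from the original article).
const shardAddress = replicaSetShardAddress("shard1", [
  { ip: "10.0.0.1", port: 27019 },
  { ip: "10.0.0.2", port: 27019 },
]);
const addShardCommand = { addshard: shardAddress, name: "S1" };
```

The resulting addShardCommand document has the same shape as the one passed to db.runCommand() above.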
