MongoDB 3.4 Cluster Build: Shard + Replica set




MongoDB is the most widely used NoSQL database and has risen into the top six of the database popularity rankings. This article describes how to build a highly available MongoDB cluster (sharding + replica sets).



Before you build a cluster, you need to understand several concepts: routing, sharding, replica sets, and config servers.


Related concepts


[Architecture diagram: a sharded cluster consisting of mongos routers, config servers, and shards backed by replica sets]

As the diagram shows, there are four components: mongos, config server, shard, and replica set.



mongos is the entry point for all cluster requests. Every request is coordinated through mongos, so the application does not need its own routing logic; mongos acts as a request dispatch center that forwards each data request to the appropriate shard server. In production there are usually several mongos instances serving as entry points, so the cluster keeps accepting requests even if one of them goes down.
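For example, an application typically lists every mongos in its connection URI so the driver can fail over if one goes down (a minimal sketch using the ports planned below):

#Connect the shell to any one mongos
mongo --host 192.168.0.75 --port 20000
#In an application driver, list all mongos so the driver can fail over:
#mongodb://192.168.0.75:20000,192.168.0.84:20000,192.168.0.86:20000/testdb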



Config server: as the name implies, it stores the cluster's configuration, that is, all of the database metadata (routing and sharding information). mongos itself does not persist the shard locations or routing data; it only caches them in memory, while the config servers hold the authoritative copy. On first start or on restart, mongos loads its configuration from the config servers, and whenever the configuration changes the config servers notify every mongos to update its state so routing stays accurate. In production there are always multiple config servers, because the shard routing metadata they store must not be lost.
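Once the cluster is running, you can inspect this metadata yourself through mongos; the shards, databases, and chunks collections shown below are standard in MongoDB's config database (a quick sketch):

mongo --port 20000
#Switch to the config database
use config
#Registered shards
db.shards.find()
#Databases with sharding enabled
db.databases.find()
#Chunk-to-shard routing entries
db.chunks.find().limit(5)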



Shard: sharding refers to splitting a database and spreading it across different machines. By distributing the data you can store more of it and handle a larger load without needing a single powerful server. The basic idea is to cut a collection into small chunks and scatter those chunks across several shards, each of which is responsible for only a portion of the total data; a balancer then keeps the shards even by migrating chunks between them.
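The balancer's state can be checked from any mongos with standard shell helpers (illustrative; run this after the cluster is built):

mongo --port 20000
#Is the balancer enabled?
sh.getBalancerState()
#Is a chunk migration round currently in progress?
sh.isBalancerRunning()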



Replica set: a replica set is effectively the backup of a shard, preventing data loss if a shard node goes down. Replication provides redundant copies of the data on multiple servers, which improves availability and protects against losing data.
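On any data-bearing member you can check replication health with the standard shell helpers (illustrative; these work once the replica sets below are initialized):

mongo --port 27001
#State of each member: PRIMARY, SECONDARY, or ARBITER
rs.status()
#The current replica set configuration document
rs.conf()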



Arbiter: an arbiter is a mongod instance in a replica set that holds no data. It participates only in elections, uses minimal resources, and does not need dedicated hardware. Do not deploy the arbiter on the same node as a data-bearing member of the set; it can run on an application server, a monitoring server, or a separate virtual machine. Its purpose is to keep the number of voting members (including the primary) odd; without its vote, the set may not automatically elect a new primary when the current primary stops running.
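If a replica set was created without an arbiter, one can still be added afterwards from the primary using the standard helper (the host below is just an example):

#On the primary, add an arbiter member to the set
mongo --port 27001
rs.addArb("192.168.0.86:27001")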



To summarize: the application sends its reads, writes, updates, and deletes to mongos; the config servers store the database metadata and keep the mongos instances in sync; the data itself ends up on the shards, each backed by a replica set that stores a copy to prevent data loss; and the arbiter votes in elections so each replica set always has a primary to receive the data.


Environment preparation


System: CentOS 6.5



Three servers: 192.168.0.75/84/86



Installation package: mongodb-linux-x86_64-3.4.6.tgz



Server Planning


192.168.0.75                192.168.0.84                192.168.0.86
mongos                      mongos                      mongos
config server               config server               config server
Shard Server1 (primary)     Shard Server1 (secondary)   Shard Server1 (arbiter)
Shard Server2 (arbiter)     Shard Server2 (primary)     Shard Server2 (secondary)
Shard Server3 (secondary)   Shard Server3 (arbiter)     Shard Server3 (primary)


Port assignment:

mongos: 20000
config: 21000
shard1: 27001
shard2: 27002
shard3: 27003
Cluster setup

1. Install MongoDB
#Unzip
tar -xzvf mongodb-linux-x86_64-3.4.6.tgz -C /usr/local/
#Rename
cd /usr/local && mv mongodb-linux-x86_64-3.4.6 mongodb
Create six directories of conf, mongos, config, shard1, shard2, and shard3 on each machine. Because mongos does not store data, you only need to create a log file directory.

mkdir -p /usr/local/mongodb/conf
mkdir -p /usr/local/mongodb/mongos/log
mkdir -p /usr/local/mongodb/config/data
mkdir -p /usr/local/mongodb/config/log
mkdir -p /usr/local/mongodb/shard1/data
mkdir -p /usr/local/mongodb/shard1/log
mkdir -p /usr/local/mongodb/shard2/data
mkdir -p /usr/local/mongodb/shard2/log
mkdir -p /usr/local/mongodb/shard3/data
mkdir -p /usr/local/mongodb/shard3/log
Configure environment variables

vim /etc/profile
# Content
export MONGODB_HOME=/usr/local/mongodb
export PATH=$MONGODB_HOME/bin:$PATH
# Make it effective immediately
source /etc/profile
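A quick sanity check that the binaries are now on the PATH (both should report version 3.4.6):

mongod --version
mongo --version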
2. Config server
Since MongoDB 3.4, the config servers must be deployed as a replica set; otherwise the cluster setup will fail.

Add profile

vi /usr/local/mongodb/conf/config.conf

## Configuration file content
pidfilepath = /usr/local/mongodb/config/log/configsrv.pid
dbpath = /usr/local/mongodb/config/data
logpath = /usr/local/mongodb/config/log/configsrv.log
logappend = true
 
bind_ip = 0.0.0.0
port = 21000
fork = true
 
#declare this is a config db of a cluster;
configsvr = true

#Replica set name
replSet = configs
 
#Set the maximum number of connections
maxConns = 20000
Start the config server on all three servers

mongod -f /usr/local/mongodb/conf/config.conf
Log in to any configuration server and initialize the configuration replica set

#connection
mongo --port 21000
#config variables
config = {
    _id: "configs",
    members: [
        {_id: 0, host: "192.168.0.75:21000"},
        {_id: 1, host: "192.168.0.84:21000"},
        {_id: 2, host: "192.168.0.86:21000"}
    ]
}

#Initialize the replica set
rs.initiate(config)
Here "_id": "configs" must match the replSet name in the configuration file (replication.replSetName in YAML-style configs), and each "host" in "members" is the IP and port of one of the three nodes.

3. Configure the shard replica sets (three machines)

Set up the first shard replica set
Configuration file

vi /usr/local/mongodb/conf/shard1.conf

#Configuration file content
pidfilepath = /usr/local/mongodb/shard1/log/shard1.pid
dbpath = /usr/local/mongodb/shard1/data
logpath = /usr/local/mongodb/shard1/log/shard1.log
logappend = true

bind_ip = 0.0.0.0
port = 27001
fork = true
 
#Open web monitoring
httpinterface = true
rest = true
 
#Replica set name
replSet = shard1
 
#declare this is a shard db of a cluster;
shardsvr = true
 
#Set the maximum number of connections
maxConns = 20000
Start the shard1 server on all three servers

mongod -f /usr/local/mongodb/conf/shard1.conf
Log in to any server and initialize the replica set

mongo --port 27001
#Using the admin database
use admin
#Define the replica set configuration. "arbiterOnly": true marks the third node as the arbiter.
config = {
    _id: "shard1",
    members: [
        {_id: 0, host: "192.168.0.75:27001"},
        {_id: 1, host: "192.168.0.84:27001"},
        {_id: 2, host: "192.168.0.86:27001", arbiterOnly: true}
    ]
}
#Initialize the replica set configuration
rs.initiate(config);
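After initialization it is worth confirming that the three members reach the expected states; a short check (states settle within a few seconds of rs.initiate):

#Print each member's name and state: expect one PRIMARY, one SECONDARY, one ARBITER
rs.status().members.forEach(function (m) { print(m.name + " : " + m.stateStr); })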
Set up the second shard replica set
Configuration file

vi /usr/local/mongodb/conf/shard2.conf

#Configuration file content
pidfilepath = /usr/local/mongodb/shard2/log/shard2.pid
dbpath = /usr/local/mongodb/shard2/data
logpath = /usr/local/mongodb/shard2/log/shard2.log
logappend = true

bind_ip = 0.0.0.0
port = 27002
fork = true
 
#Open web monitoring
httpinterface = true
rest = true
 
#Replica set name
replSet = shard2
 
#declare this is a shard db of a cluster;
shardsvr = true
 
#Set the maximum number of connections
maxConns = 20000
Start the shard2 server on all three servers

mongod -f /usr/local/mongodb/conf/shard2.conf
Log in to any server and initialize the replica set

mongo --port 27002
#Using the admin database
use admin
#Define replica set configuration
config = {
    _id: "shard2",
    members: [
        {_id: 0, host: "192.168.0.75:27002", arbiterOnly: true},
        {_id: 1, host: "192.168.0.84:27002"},
        {_id: 2, host: "192.168.0.86:27002"}
    ]
}

#Initialize the replica set configuration
rs.initiate(config);
Set up the third shard replica set
Configuration file

vi /usr/local/mongodb/conf/shard3.conf

#Configuration file content
pidfilepath = /usr/local/mongodb/shard3/log/shard3.pid
dbpath = /usr/local/mongodb/shard3/data
logpath = /usr/local/mongodb/shard3/log/shard3.log
logappend = true

bind_ip = 0.0.0.0
port = 27003
fork = true
 
#Open web monitoring
httpinterface = true
rest = true
 
#Replica set name
replSet = shard3
 
#declare this is a shard db of a cluster;
shardsvr = true
 
#Set the maximum number of connections
maxConns = 20000
Start the shard3 server on all three servers

mongod -f /usr/local/mongodb/conf/shard3.conf
Log in to any server and initialize the replica set

mongo --port 27003
#Using the admin database
use admin
#Define replica set configuration
config = {
    _id: "shard3",
    members: [
        {_id: 0, host: "192.168.0.75:27003"},
        {_id: 1, host: "192.168.0.84:27003", arbiterOnly: true},
        {_id: 2, host: "192.168.0.86:27003"}
    ]
}

#Initialize the replica set configuration
rs.initiate(config);
4. Configure the routing server mongos
Start the config servers and the shard servers first, then start the routing instance on all three machines:

vi /usr/local/mongodb/conf/mongos.conf

#content
pidfilepath = /usr/local/mongodb/mongos/log/mongos.pid
logpath = /usr/local/mongodb/mongos/log/mongos.log
logappend = true

bind_ip = 0.0.0.0
port = 20000
fork = true

#The config servers mongos connects to; prefix the list with the config replica set name (configs). There can only be 1 or 3 config servers.
configdb = configs/192.168.0.75:21000,192.168.0.84:21000,192.168.0.86:21000
 
#Set the maximum number of connections
maxConns = 20000
Start the mongos server on all three servers. Note that the router is a separate binary, mongos, not mongod:

mongos -f /usr/local/mongodb/conf/mongos.conf
5. Enable sharding
At this point the config servers, routing servers, and shard servers are all running, but an application connecting to the mongos router cannot use the sharding mechanism yet. Sharding must be enabled explicitly for it to take effect.

Login to any mongos

mongo --port 20000
#Using the admin database
use admin
#Join the shard replica sets to the router
sh.addShard("shard1/192.168.0.75:27001,192.168.0.84:27001,192.168.0.86:27001")
sh.addShard("shard2/192.168.0.75:27002,192.168.0.84:27002,192.168.0.86:27002")
sh.addShard("shard3/192.168.0.75:27003,192.168.0.84:27003,192.168.0.86:27003")
#View cluster status
sh.status ()
6. Test
The config service, routing service, shard services, and replica sets are now all wired together, but our goal is for inserted data to be sharded automatically. Connect to mongos and enable sharding for the target database and collection.

#Enable sharding for the testdb database
db.runCommand({enablesharding: "testdb"});
#Specify the collection to shard and its shard key
db.runCommand({shardcollection: "testdb.table1", key: {id: 1}})
Here the table1 collection of testdb is configured to be sharded, distributed automatically across shard1, shard2, and shard3 by id. This step is explicit because not every MongoDB database and collection needs to be sharded.
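The same setup can also be written with the sh.* shell helpers, and since a monotonically increasing id concentrates new inserts on one chunk, a hashed shard key is a common alternative (a sketch, not what this article uses):

#Equivalent helpers, with a hashed key to spread sequential ids evenly
sh.enableSharding("testdb")
sh.shardCollection("testdb.table1", {id: "hashed"})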

Test shard configuration results

mongo 127.0.0.1:20000
#Use testdb
use testdb;
#Insert test data
for (var i = 1; i <= 100000; i++)
    db.table1.save({id: i, "test1": "testval1"});
#View the sharding result (irrelevant fields omitted below)
db.table1.stats();

{
        "sharded": true,
        "ns": "testdb.table1",
        "count": 100000,
        "numExtents": 13,
        "size": 5600000,
        "storageSize": 22372352,
        "totalIndexSize": 6213760,
        "indexSizes": {
                "_id_": 3335808,
                "id_1": 2877952
        },
        "avgObjSize": 56,
        "nindexes": 2,
        "nchunks": 3,
        "shards": {
                "shard1": {
                        "ns": "testdb.table1",
                        "count": 42183,
                        "size": 0,
                        ...
                        "ok": 1
                },
                "shard2": {
                        "ns": "testdb.table1",
                        "count": 38937,
                        "size": 2180472,
                        ...
                        "ok": 1
                },
                "shard3": {
                        "ns": "testdb.table1",
                        "count": 18880,
                        "size": 3419528,
                        ...
                        "ok": 1
                }
        },
        "ok": 1
}
You can see that the data is divided across the 3 shards, with counts of shard1 "count": 42183, shard2 "count": 38937, and shard3 "count": 18880. The sharding works!
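A more compact per-shard summary is also available via a standard shell helper:

#Prints data size, document count, and chunk count per shard
db.table1.getShardDistribution()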

Ongoing operation and maintenance
The startup order for the cluster is: start the config servers first, then the shards, and finally mongos.

mongod -f /usr/local/mongodb/conf/config.conf
mongod -f /usr/local/mongodb/conf/shard1.conf
mongod -f /usr/local/mongodb/conf/shard2.conf
mongod -f /usr/local/mongodb/conf/shard3.conf
mongos -f /usr/local/mongodb/conf/mongos.conf
To shut everything down, simply kill all of the processes:

killall mongod
killall mongos
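killall is abrupt; mongod also supports a clean shutdown flag that flushes data files first, which should work with the dbpath taken from each config file (a sketch, run per instance; mongos holds no data files, so killall is fine for it):

mongod -f /usr/local/mongodb/conf/shard1.conf --shutdown
mongod -f /usr/local/mongodb/conf/shard2.conf --shutdown
mongod -f /usr/local/mongodb/conf/shard3.conf --shutdown
mongod -f /usr/local/mongodb/conf/config.conf --shutdown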
