Three ways to build a MongoDB cluster


MongoDB is a popular NoSQL database that stores data as documents rather than key-value pairs. I won't go into its features here; see the official documentation: http://docs.mongodb.org/manual/

Today I mainly want to cover the three ways to build a MongoDB cluster: Replica Set, Sharding, and Master-Slave. What follows is the simplest way to set up each kind of cluster for a production environment; if you have more nodes, extend it in the same way or consult the official documentation. The OS is 64-bit Ubuntu, the client is the Java client, and the MongoDB release is mongodb-linux-x86_64-2.2.2.tgz.

Replica Set

The Chinese translation is "copy set", but I don't like translating English terms into Chinese; it always feels strange. Simply put, the cluster keeps multiple copies of the data, so that if the primary node goes down, a standby node can keep serving data; the precondition is that the standby's data stays consistent with the primary's. The deployment looks like this:

MongoDB (M) stands for the primary node, MongoDB (S) for the standby node, and MongoDB (A) for the arbiter (quorum) node. The primary and the standby store data; the arbiter does not. The client connects to the primary and the standby at the same time, but not to the arbiter.

By default the primary handles all reads and writes, and the standby provides no service at all. You can, however, configure the standby to serve queries, which takes load off the primary: when the client issues a read, the request is automatically routed to the standby. This setting is called the Read Preference Modes, and the Java client provides a simple way to configure it without touching the database directly.
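As a concrete illustration, here is a minimal sketch using the 2.x Java driver (the driver generation contemporary with MongoDB 2.2.2); the "test" database and "demo" collection are placeholder names, not part of this setup:

    // Minimal sketch: connect to the replica set and prefer the standby for reads.
    import com.mongodb.DB;
    import com.mongodb.DBCollection;
    import com.mongodb.MongoClient;
    import com.mongodb.ReadPreference;
    import com.mongodb.ServerAddress;

    import java.net.UnknownHostException;
    import java.util.Arrays;

    public class ReadPreferenceExample {
        public static void main(String[] args) throws UnknownHostException {
            // Seed list: the primary and the standby; the arbiter is never listed.
            MongoClient client = new MongoClient(Arrays.asList(
                    new ServerAddress("10.10.148.130", 27017),
                    new ServerAddress("10.10.148.131", 27017)));

            // Prefer the standby for reads, falling back to the primary if necessary.
            client.setReadPreference(ReadPreference.secondaryPreferred());

            DB db = client.getDB("test");                  // placeholder database name
            DBCollection coll = db.getCollection("demo");  // placeholder collection name
            System.out.println(coll.findOne());            // this read may be served by the standby
            client.close();
        }
    }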

The arbiter (quorum) node is a special node that stores no data itself; its main purpose is to vote on which standby node gets promoted to primary after the primary goes down, so the client never needs to connect to it. Even though there is only one standby node here, an arbiter is still needed to make the promotion possible: an election only succeeds when a majority of voting members are reachable, and with just two members a single failure leaves no majority. I didn't quite believe an arbiter was mandatory, but I tried running without one, and when the primary went down the standby just stayed a standby, so we do need it.

Now that the cluster layout has been introduced, let's start building it.

1. Create the data directories

Normally you would not put the data directories under the directory where MongoDB was unpacked, but for convenience we create them there here.

    mkdir -p /mongodb/data/master
    mkdir -p /mongodb/data/slaver
    mkdir -p /mongodb/data/arbiter
    # the three directories correspond to the primary, standby, and arbiter nodes

2. Create the configuration files

Since there are quite a few options, we put the configuration in files.

    # master.conf
    dbpath=/mongodb/data/master
    logpath=/mongodb/log/master.log
    pidfilepath=/mongodb/master.pid
    directoryperdb=true
    logappend=true
    replSet=testrs
    bind_ip=10.10.148.130
    port=27017
    oplogSize=10000
    fork=true
    noprealloc=true

    # slaver.conf
    dbpath=/mongodb/data/slaver
    logpath=/mongodb/log/slaver.log
    pidfilepath=/mongodb/slaver.pid
    directoryperdb=true
    logappend=true
    replSet=testrs
    bind_ip=10.10.148.131
    port=27017
    oplogSize=10000
    fork=true
    noprealloc=true

    # arbiter.conf
    dbpath=/mongodb/data/arbiter
    logpath=/mongodb/log/arbiter.log
    pidfilepath=/mongodb/arbiter.pid
    directoryperdb=true
    logappend=true
    replSet=testrs
    bind_ip=10.10.148.132
    port=27017
    oplogSize=10000
    fork=true
    noprealloc=true
Parameter explanation:

dbpath: data storage directory
logpath: log file path
pidfilepath: file where the process ID is written, which makes it easy to stop mongod
directoryperdb: create a separate folder for each database, named after the database
logappend: write the log in append mode
replSet: name of the replica set
bind_ip: IP address that mongod binds to
port: port used by the mongod process, 27017 by default
oplogSize: maximum size of the oplog (operation log) in MB; defaults to 5% of the remaining disk space
fork: run the process in the background (as a daemon)
noprealloc: disable preallocation of data files

3. Start MongoDB

Enter the bin directory of MongoDB on each node and run:

    ./mongod -f master.conf
    ./mongod -f slaver.conf
    ./mongod -f arbiter.conf

Note that the path to the configuration file must be correct; it can be either a relative or an absolute path.

4. Configure the primary, standby, and arbiter nodes

You can connect to MongoDB through a client, or pick any one of the three nodes and connect to it directly.

    ./mongo 10.10.148.130:27017   # the ip and port are the address of one of the nodes
    > use admin
    > cfg={ _id:"testrs", members:[ {_id:0,host:'10.10.148.130:27017',priority:2}, {_id:1,host:'10.10.148.131:27017',priority:1},
        {_id:2,host:'10.10.148.132:27017',arbiterOnly:true} ] };
    > rs.initiate(cfg)   # make the configuration take effect
cfg can be any name you like, though preferably not a MongoDB keyword; conf or config are fine too. The outermost _id is the name of the replica set, and members contains the address and priority of every node. The node with the highest priority becomes the primary, here 10.10.148.130:27017. Pay special attention to the arbiter node's special setting, arbiterOnly:true. It must not be left out, or the primary/standby mode will not take effect.

How long the configuration takes to come into effect depends on the machines: on well-configured machines it usually takes effect within ten-odd seconds, while some setups need one or two minutes. Once it has taken effect, running the rs.status() command shows information like the following:

    {
        "set" : "testrs",
        "date" : ISODate("2013-01-05T02:44:43Z"),
        "myState" : 1,
        "members" : [
            {
                "_id" : 0,
                "name" : "10.10.148.130:27017",
                "health" : 1,
                "state" : 1,
                "stateStr" : "PRIMARY",
                "uptime" : 200,
                "optime" : Timestamp(1357285565000, 1),
                "optimeDate" : ISODate("2013-01-04T07:46:05Z"),
                "self" : true
            },
            {
                "_id" : 1,
                "name" : "10.10.148.131:27017",
                "health" : 1,
                "state" : 2,
                "stateStr" : "SECONDARY",
                "uptime" : 200,
                "optime" : Timestamp(1357285565000, 1),
                "optimeDate" : ISODate("2013-01-04T07:46:05Z"),
                "lastHeartbeat" : ISODate("2013-01-05T02:44:42Z"),
                "pingMs" : 0
            },
            {
                "_id" : 2,
                "name" : "10.10.148.132:27017",
                "health" : 1,
                "state" : 7,
                "stateStr" : "ARBITER",
                "uptime" : 200,
                "lastHeartbeat" : ISODate("2013-01-05T02:44:42Z"),
                "pingMs" : 0
            }
        ],
        "ok" : 1
    }
If the configuration is still in the process of taking effect, the output will contain information like:

    "stateStr" : "RECOVERING"

You can also check the log on the corresponding node; you will find it is either waiting for the other nodes to take effect or still allocating data files.

At this point we have basically finished all the work of building the cluster. The testing I will leave to you. First, insert data on the primary node and check that the previously inserted data can be read from the standby node (querying the standby may run into an error; you can look up the solution online). Second, stop the primary node and check that the standby becomes the primary and keeps serving (a sketch for this test follows below). Third, bring the original primary back and check that the standby returns to its standby role rather than continuing to act as primary. For the second and third tests you can watch the cluster change in real time with the rs.status() command.
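A rough sketch of the second test with the 2.x Java driver is shown below; the database name, collection name, and loop are made up for illustration. Keep it running while you stop the primary: writes should fail for a few seconds and then resume once the standby has been promoted.

    // Failover sketch: insert once per second and watch writes recover after the
    // primary is stopped and the standby is elected.
    import com.mongodb.BasicDBObject;
    import com.mongodb.DBCollection;
    import com.mongodb.MongoClient;
    import com.mongodb.MongoException;
    import com.mongodb.ServerAddress;

    import java.net.UnknownHostException;
    import java.util.Arrays;

    public class FailoverTest {
        public static void main(String[] args) throws UnknownHostException, InterruptedException {
            MongoClient client = new MongoClient(Arrays.asList(
                    new ServerAddress("10.10.148.130", 27017),
                    new ServerAddress("10.10.148.131", 27017)));
            DBCollection coll = client.getDB("test").getCollection("failover_demo"); // made-up names

            for (int i = 0; i < 300; i++) {
                try {
                    coll.insert(new BasicDBObject("seq", i));
                    System.out.println("inserted " + i);
                } catch (MongoException e) {
                    // Expected for a few seconds while the standby is being elected primary.
                    System.out.println("write failed at " + i + ": " + e.getMessage());
                }
                Thread.sleep(1000);
            }
            client.close();
        }
    }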

Sharding

Like the Replica Set, an arbiter node is needed, but Sharding additionally requires config server nodes and routing nodes (mongos). This is the most complex of the three ways to build a cluster. The deployment diagram is as follows:

1. Start the data nodes

    ./mongod --fork --dbpath ../data/set1/ --logpath ./log/set1.log --replSet test #192.168.4.43
    ./mongod --fork --dbpath ../data/set2/ --logpath ./log/set2.log --replSet test #192.168.4.44
    ./mongod --fork --dbpath ../data/set3/ --logpath ./log/set3.log --replSet test #192.168.4.45 arbiter, does not store data

2. Start the config server nodes

    ./mongod --configsvr --dbpath ../config/set1/ --port 20001 --fork --logpath ./log/conf1.log #192.168.4.30
    ./mongod --configsvr --dbpath ../config/set2/ --port 20002 --fork --logpath ./log/conf2.log #192.168.4.31

3. Start the Routing node

    ./mongos --configdb 192.168.4.30:20001,192.168.4.31:20002 --port 27017 --fork --logpath ./log/root.log #192.168.4.29

Here we start the processes without configuration files; the meaning of the parameters should be clear from the earlier section. In general each data node has a corresponding config server, and the arbiter node does not need one. Note that when starting the routing node, the addresses of the config servers are written into the startup command.

4. Configure Replica Set

You may find it a bit strange that Sharding still requires a Replica Set to be configured. One way to think about it: the data on these nodes clearly belongs together, and if they were not a replica set, how would they be identified as one group? In practice each shard is usually backed by a replica set so that the shard itself stays available. This is also what MongoDB requires, so we follow it. Build a cfg document and initiate it just as before; note that the _id must match the replica set name given with --replSet ("test" here):

    ./mongo 192.168.4.43:27017   # the ip and port are the address of one of the nodes
    > use admin
    > cfg={ _id:"test", members:[ {_id:0,host:'192.168.4.43:27017',priority:2}, {_id:1,host:'192.168.4.44:27017',priority:1},
        {_id:2,host:'192.168.4.45:27017',arbiterOnly:true} ] };
    > rs.initiate(cfg)   # make the configuration take effect

5. Configure Sharding

    ./mongo 192.168.4.29:27017   # you must connect to the routing node here
    > sh.addShard("test/192.168.4.43:27017")   # "test" is the replica set name; adding the primary as a shard automatically picks up the standby and arbiter nodes in the set
    > use admin                                # the sharding commands below must be run against the admin database
    > db.runCommand({enablesharding:"diameter_test"})   # diameter_test is the database name
    > db.runCommand({shardcollection:"diameter_test.dcca_dccr_test", key:{"__avpsessionid":1}})
The first command is easy to understand. The second enables sharding for the database that needs it, and the third shards a specific collection, where dcca_dccr_test is the collection name. Then there is the key, which is the critical part: the choice of shard key has a large impact on query efficiency, so read the Shard Key overview in the official documentation.
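For completeness, here is a minimal sketch of how a client would write to the sharded collection; it talks only to the mongos routing node. It uses the 2.x Java driver, and everything except the database, collection, and shard key names is invented for illustration:

    // Sketch: insert into the sharded collection through the mongos routing node.
    import com.mongodb.BasicDBObject;
    import com.mongodb.DBCollection;
    import com.mongodb.MongoClient;
    import com.mongodb.ServerAddress;

    import java.net.UnknownHostException;

    public class ShardingWriteExample {
        public static void main(String[] args) throws UnknownHostException {
            // Connect to the routing node (mongos), never to the shards directly.
            MongoClient client = new MongoClient(new ServerAddress("192.168.4.29", 27017));
            DBCollection coll = client.getDB("diameter_test").getCollection("dcca_dccr_test");

            // Each document should carry the shard key field so mongos can route it.
            coll.insert(new BasicDBObject("__avpsessionid", "session-0001")   // made-up value
                    .append("payload", "example"));                           // made-up field
            client.close();
        }
    }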

That completes the Sharding setup. This is only the simplest possible build, and much of the configuration is left at its defaults. Bad settings can lead to poor efficiency, so I recommend reading the official documentation before changing the defaults.

Master-Slave

This is the simplest of the three setups, though strictly speaking it cannot really be called a cluster; it is just a primary with a backup. The official documentation no longer recommends this mode, so I only introduce it briefly. The setup itself is quite simple.

    ./mongod --master --dbpath /data/masterdb/   # primary node
    ./mongod --slave --source <masterip:masterport> --dbpath /data/slavedb/   # standby node
Basically, just run these two commands on the primary and standby nodes respectively and the Master-Slave setup is done. I haven't tested whether the standby can take over as primary after the primary goes down, but since this mode is no longer recommended, there is no real need to use it.

Of these three ways of building a cluster, Replica Set is the first choice. Sharding only shows its power when the data is truly large; after all, synchronizing data between nodes takes time. Sharding merges data from several shards at the routing node and then returns it to the client, but the efficiency of that is still relatively low.

I have tested this myself, though I don't remember the exact machine configuration. With a Replica Set, throughput was still around 1000 ips when the data reached about 14 million records, while with Sharding it had already dropped to about 500 ips at 3 million records; each record was roughly 10 KB. Do plenty of performance testing for your own application, since unlike Redis, MongoDB has no built-in benchmark tool.

MongoDB is widely used, but personally I feel it has too many configuration options... I spent many days searching online before I understood the cluster configuration and its pitfalls. Anyone who has used it knows about MongoDB's appetite for memory; the only workaround is to limit memory usage with ulimit, but if the limit is set badly, MongoDB will crash...

Later I will write an article about the specific business scenarios in which our project uses MongoDB; follow along if you are interested.
