Three ways to build a MongoDB cluster


MongoDB is a popular NoSQL database that stores data as documents rather than key-value pairs. I won't go into its features here; see the official documentation: http://docs.mongodb.org/manual/

Today I mainly want to cover the three ways to build a MongoDB cluster: Replica Set, Sharding, and Master-Slave. What follows is the simplest way to set up each kind of cluster for a production environment; if you have more nodes, extend it in the same way or consult the official documentation. The OS is 64-bit Ubuntu, the client is the Java client, and the MongoDB release is mongodb-linux-x86_64-2.2.2.tgz.

Replica Set

The Chinese translation is "copy set", but I don't like translating English terms into Chinese; it always feels strange. Simply put, the cluster keeps multiple copies of the data, so that if the primary node goes down, a standby node can keep serving data; the precondition is that the standby's data stays consistent with the primary's. The deployment looks like this:

MongoDB (M) stands for the primary node, MongoDB (S) for the standby node, and MongoDB (A) for the arbiter (quorum) node. The primary and the standby store data; the arbiter does not. The client connects to the primary and the standby at the same time, but not to the arbiter.

By default the primary handles all reads and writes, and the standby provides no service at all. You can, however, configure the standby to serve queries, which takes load off the primary: when the client issues a read, the request is automatically routed to the standby. This setting is called the Read Preference Modes, and the Java client provides a simple way to configure it without touching the database directly.
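As a concrete illustration, here is a minimal sketch using the 2.x Java driver (the driver generation contemporary with MongoDB 2.2.2); the "test" database and "demo" collection are placeholder names, not part of this setup:

    // Minimal sketch: connect to the replica set and prefer the standby for reads.
    import com.mongodb.DB;
    import com.mongodb.DBCollection;
    import com.mongodb.MongoClient;
    import com.mongodb.ReadPreference;
    import com.mongodb.ServerAddress;

    import java.net.UnknownHostException;
    import java.util.Arrays;

    public class ReadPreferenceExample {
        public static void main(String[] args) throws UnknownHostException {
            // Seed list: the primary and the standby; the arbiter is never listed.
            MongoClient client = new MongoClient(Arrays.asList(
                    new ServerAddress("10.10.148.130", 27017),
                    new ServerAddress("10.10.148.131", 27017)));

            // Prefer the standby for reads, falling back to the primary if necessary.
            client.setReadPreference(ReadPreference.secondaryPreferred());

            DB db = client.getDB("test");                  // placeholder database name
            DBCollection coll = db.getCollection("demo");  // placeholder collection name
            System.out.println(coll.findOne());            // this read may be served by the standby
            client.close();
        }
    }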

The arbiter (quorum) node is a special node that stores no data itself; its main purpose is to vote on which standby node gets promoted to primary after the primary goes down, so the client never needs to connect to it. Even though there is only one standby node here, an arbiter is still needed to make the promotion possible: an election only succeeds when a majority of voting members are reachable, and with just two members a single failure leaves no majority. I didn't quite believe an arbiter was mandatory, but I tried running without one, and when the primary went down the standby just stayed a standby, so we do need it.

Now that the cluster layout has been introduced, let's start building it.

1. Create the data directories

Normally you would not put the data directories under the directory where MongoDB was unpacked, but for convenience we create them there here.

    mkdir -p /mongodb/data/master
    mkdir -p /mongodb/data/slaver
    mkdir -p /mongodb/data/arbiter
    # the three directories correspond to the primary, standby, and arbiter nodes

2. Create the configuration files

Since there are quite a few options, we put the configuration in files.

    # master.conf
    dbpath=/mongodb/data/master
    logpath=/mongodb/log/master.log
    pidfilepath=/mongodb/master.pid
    directoryperdb=true
    logappend=true
    replSet=testrs
    bind_ip=10.10.148.130
    port=27017
    oplogSize=10000
    fork=true
    noprealloc=true

    # slaver.conf
    dbpath=/mongodb/data/slaver
    logpath=/mongodb/log/slaver.log
    pidfilepath=/mongodb/slaver.pid
    directoryperdb=true
    logappend=true
    replSet=testrs
    bind_ip=10.10.148.131
    port=27017
    oplogSize=10000
    fork=true
    noprealloc=true

    # arbiter.conf
    dbpath=/mongodb/data/arbiter
    logpath=/mongodb/log/arbiter.log
    pidfilepath=/mongodb/arbiter.pid
    directoryperdb=true
    logappend=true
    replSet=testrs
    bind_ip=10.10.148.132
    port=27017
    oplogSize=10000
    fork=true
    noprealloc=true
Parameter explanation:

dbpath: data storage directory
logpath: log file path
pidfilepath: file where the process ID is written, which makes it easy to stop mongod
directoryperdb: create a separate folder for each database, named after the database
logappend: write the log in append mode
replSet: name of the replica set
bind_ip: IP address that mongod binds to
port: port used by the mongod process, 27017 by default
oplogSize: maximum size of the oplog (operation log) in MB; defaults to 5% of the remaining disk space
fork: run the process in the background (as a daemon)
noprealloc: disable preallocation of data files

3. Start MongoDB

Enter the bin directory of MongoDB on each node and run:

    ./mongod -f master.conf
    ./mongod -f slaver.conf
    ./mongod -f arbiter.conf

Note that the path to the configuration file must be correct; it can be either a relative or an absolute path.

4. Configure the primary, standby, and arbiter nodes

You can connect to MongoDB through a client, or pick any one of the three nodes and connect to it directly.

    ./mongo 10.10.148.130:27017   # the ip and port are the address of one of the nodes
    > use admin
    > cfg={ _id:"testrs", members:[ {_id:0,host:'10.10.148.130:27017',priority:2}, {_id:1,host:'10.10.148.131:27017',priority:1},
        {_id:2,host:'10.10.148.132:27017',arbiterOnly:true} ] };
    > rs.initiate(cfg)   # make the configuration take effect
cfg can be any name you like, though preferably not a MongoDB keyword; conf or config are fine too. The outermost _id is the name of the replica set, and members contains the address and priority of every node. The node with the highest priority becomes the primary, here 10.10.148.130:27017. Pay special attention to the arbiter node's special setting, arbiterOnly:true. It must not be left out, or the primary/standby mode will not take effect.

How long the configuration takes to come into effect depends on the machines: on well-configured machines it usually takes effect within ten-odd seconds, while some setups need one or two minutes. Once it has taken effect, running the rs.status() command shows information like the following:

    {
        "set" : "testrs",
        "date" : ISODate("2013-01-05T02:44:43Z"),
        "myState" : 1,
        "members" : [
            {
                "_id" : 0,
                "name" : "10.10.148.130:27017",
                "health" : 1,
                "state" : 1,
                "stateStr" : "PRIMARY",
                "uptime" : 200,
                "optime" : Timestamp(1357285565000, 1),
                "optimeDate" : ISODate("2013-01-04T07:46:05Z"),
                "self" : true
            },
            {
                "_id" : 1,
                "name" : "10.10.148.131:27017",
                "health" : 1,
                "state" : 2,
                "stateStr" : "SECONDARY",
                "uptime" : 200,
                "optime" : Timestamp(1357285565000, 1),
                "optimeDate" : ISODate("2013-01-04T07:46:05Z"),
                "lastHeartbeat" : ISODate("2013-01-05T02:44:42Z"),
                "pingMs" : 0
            },
            {
                "_id" : 2,
                "name" : "10.10.148.132:27017",
                "health" : 1,
                "state" : 7,
                "stateStr" : "ARBITER",
                "uptime" : 200,
                "lastHeartbeat" : ISODate("2013-01-05T02:44:42Z"),
                "pingMs" : 0
            }
        ],
        "ok" : 1
    }
If the configuration is still in the process of taking effect, the output will contain information like:

    "stateStr" : "RECOVERING"

You can also check the log on the corresponding node; you will find it is either waiting for the other nodes to take effect or still allocating data files.

At this point we have basically finished all the work of building the cluster. The testing I will leave to you. First, insert data on the primary node and check that the previously inserted data can be read from the standby node (querying the standby may run into an error; you can look up the solution online). Second, stop the primary node and check that the standby becomes the primary and keeps serving (a sketch for this test follows below). Third, bring the original primary back and check that the standby returns to its standby role rather than continuing to act as primary. For the second and third tests you can watch the cluster change in real time with the rs.status() command.
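A rough sketch of the second test with the 2.x Java driver is shown below; the database name, collection name, and loop are made up for illustration. Keep it running while you stop the primary: writes should fail for a few seconds and then resume once the standby has been promoted.

    // Failover sketch: insert once per second and watch writes recover after the
    // primary is stopped and the standby is elected.
    import com.mongodb.BasicDBObject;
    import com.mongodb.DBCollection;
    import com.mongodb.MongoClient;
    import com.mongodb.MongoException;
    import com.mongodb.ServerAddress;

    import java.net.UnknownHostException;
    import java.util.Arrays;

    public class FailoverTest {
        public static void main(String[] args) throws UnknownHostException, InterruptedException {
            MongoClient client = new MongoClient(Arrays.asList(
                    new ServerAddress("10.10.148.130", 27017),
                    new ServerAddress("10.10.148.131", 27017)));
            DBCollection coll = client.getDB("test").getCollection("failover_demo"); // made-up names

            for (int i = 0; i < 300; i++) {
                try {
                    coll.insert(new BasicDBObject("seq", i));
                    System.out.println("inserted " + i);
                } catch (MongoException e) {
                    // Expected for a few seconds while the standby is being elected primary.
                    System.out.println("write failed at " + i + ": " + e.getMessage());
                }
                Thread.sleep(1000);
            }
            client.close();
        }
    }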

Sharding

Like the Replica Set, an arbiter node is needed, but Sharding additionally requires config server nodes and routing nodes (mongos). This is the most complex of the three ways to build a cluster. The deployment diagram is as follows:

1. Start the data nodes

    ./mongod --fork --dbpath ../data/set1/ --logpath ./log/set1.log --replSet test #192.168.4.43
    ./mongod --fork --dbpath ../data/set2/ --logpath ./log/set2.log --replSet test #192.168.4.44
    ./mongod --fork --dbpath ../data/set3/ --logpath ./log/set3.log --replSet test #192.168.4.45 arbiter, does not store data

2. Start the config server nodes

    ./mongod --configsvr --dbpath ../config/set1/ --port 20001 --fork --logpath ./log/conf1.log #192.168.4.30
    ./mongod --configsvr --dbpath ../config/set2/ --port 20002 --fork --logpath ./log/conf2.log #192.168.4.31

3. Start the Routing node

    ./mongos --configdb 192.168.4.30:20001,192.168.4.31:20002 --port 27017 --fork --logpath ./log/root.log #192.168.4.29

Here we start the processes without configuration files; the meaning of the parameters should be clear from the earlier section. In general each data node has a corresponding config server, and the arbiter node does not need one. Note that when starting the routing node, the addresses of the config servers are written into the startup command.

4. Configure Replica Set

You may find it a bit strange that Sharding still requires a Replica Set to be configured. One way to think about it: the data on these nodes clearly belongs together, and if they were not a replica set, how would they be identified as one group? In practice each shard is usually backed by a replica set so that the shard itself stays available. This is also what MongoDB requires, so we follow it. Build a cfg document and initiate it just as before; note that the _id must match the replica set name given with --replSet ("test" here):

    ./mongo 192.168.4.43:27017   # the ip and port are the address of one of the nodes
    > use admin
    > cfg={ _id:"test", members:[ {_id:0,host:'192.168.4.43:27017',priority:2}, {_id:1,host:'192.168.4.44:27017',priority:1},
        {_id:2,host:'192.168.4.45:27017',arbiterOnly:true} ] };
    > rs.initiate(cfg)   # make the configuration take effect

5. Configure Sharding

    ./mongo 192.168.4.29:27017   # you must connect to the routing node here
    > sh.addShard("test/192.168.4.43:27017")   # "test" is the replica set name; adding the primary as a shard automatically picks up the standby and arbiter nodes in the set
    > use admin                                # the sharding commands below must be run against the admin database
    > db.runCommand({enablesharding:"diameter_test"})   # diameter_test is the database name
    > db.runCommand({shardcollection:"diameter_test.dcca_dccr_test", key:{"__avpsessionid":1}})
The first command is easy to understand. The second enables sharding for the database that needs it, and the third shards a specific collection, where dcca_dccr_test is the collection name. Then there is the key, which is the critical part: the choice of shard key has a large impact on query efficiency, so read the Shard Key overview in the official documentation.
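For completeness, here is a minimal sketch of how a client would write to the sharded collection; it talks only to the mongos routing node. It uses the 2.x Java driver, and everything except the database, collection, and shard key names is invented for illustration:

    // Sketch: insert into the sharded collection through the mongos routing node.
    import com.mongodb.BasicDBObject;
    import com.mongodb.DBCollection;
    import com.mongodb.MongoClient;
    import com.mongodb.ServerAddress;

    import java.net.UnknownHostException;

    public class ShardingWriteExample {
        public static void main(String[] args) throws UnknownHostException {
            // Connect to the routing node (mongos), never to the shards directly.
            MongoClient client = new MongoClient(new ServerAddress("192.168.4.29", 27017));
            DBCollection coll = client.getDB("diameter_test").getCollection("dcca_dccr_test");

            // Each document should carry the shard key field so mongos can route it.
            coll.insert(new BasicDBObject("__avpsessionid", "session-0001")   // made-up value
                    .append("payload", "example"));                           // made-up field
            client.close();
        }
    }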

That completes the Sharding setup. This is only the simplest possible build, and much of the configuration is left at its defaults. Bad settings can lead to poor efficiency, so I recommend reading the official documentation before changing the defaults.

Master-Slave

This is the simplest of the three setups, though strictly speaking it cannot really be called a cluster; it is just a primary with a backup. The official documentation no longer recommends this mode, so I only introduce it briefly. The setup itself is quite simple.

    ./mongod --master --dbpath /data/masterdb/   # primary node
    ./mongod --slave --source <masterip:masterport> --dbpath /data/slavedb/   # standby node
Basically, just run these two commands on the primary and standby nodes respectively and the Master-Slave setup is done. I haven't tested whether the standby can take over as primary after the primary goes down, but since this mode is no longer recommended, there is no real need to use it.

Of these three ways of building a cluster, Replica Set is the first choice. Sharding only shows its power when the data is truly large; after all, synchronizing data between nodes takes time. Sharding merges data from several shards at the routing node and then returns it to the client, but the efficiency of that is still relatively low.

I have tested this myself, though I don't remember the exact machine configuration. With a Replica Set, throughput was still around 1000 ips when the data reached about 14 million records, while with Sharding it had already dropped to about 500 ips at 3 million records; each record was roughly 10 KB. Do plenty of performance testing for your own application, since unlike Redis, MongoDB has no built-in benchmark tool.

MongoDB is widely used, but personally I feel it has too many configuration options... I spent many days searching online before I understood the cluster configuration and its pitfalls. Anyone who has used it knows about MongoDB's appetite for memory; the only workaround is to limit memory usage with ulimit, but if the limit is set badly, MongoDB will crash...

Later I will write an article about the specific business scenarios in which our project uses MongoDB; follow along if you are interested.
