MongoDB Status Monitoring, Backup Replication, and Automatic Sharding

Source: Internet
Author: User
Tags: documentation, failover, mongodump, mongo shell

If MongoDB were only a document database, it would not stand out. Its greatest strengths are read scaling, hot backup, failover, and auto-sharding (write scaling). This final part of the series introduces these features.

Replication backs up the database and, at the same time, separates reads from writes and balances read load: one primary server takes the writes, while several slave servers hold backup copies and serve reads, and more slaves can be added to scale backups and reads. Replica sets also support failover: when the primary server goes down, the members vote for a slave server to take over the writes. Auto-sharding automatically splits a collection (table) into pieces stored on other servers. This distributed storage relieves the pressure of very large collections and also balances write load.

    1. Status monitoring
      • HTTP Console
      • Mongo Shell Diagnostic Commands
    2. Backup replication
      • Backup/restore
      • Master-slave
      • Replica Sets
    3. Automatic sharding
      • Auto sharding
1. Status Monitoring

First, create the following directories under the MongoDB directory: data, data\dbs (database file directory), data\dbs\master (master server directory), and data\dbs\slave (slave server directory).

Start a MongoDB service on the default port first.

As the last two lines of the startup output show, the service listens on port 27017 by default, while port 28017 (the listening port + 1000) serves the web admin interface; 28017 is the HTTP console monitoring port.

    • HTTP Console

Visit http://localhost:28017/ to see the console. To have something to show, first add a few records to the default database.

Run mongo.exe (it connects to the test database by default; here that database has a test collection foo).

Access the HTTP console (http://localhost:28017/) to view the monitoring results.

Note the places marked in red: the first mark is explained later in the backup replication section, and after it comes the operation log.

    • Mongo Shell Diagnostic Commands

Next, check the server status from the mongo shell.

These are two simple ways to monitor. For more, see the Monitoring and Diagnostics section of the official documentation.

2. Backup Replication

Every database has to deal with backup, replication, and failover. When some database servers carry heavy read and write traffic, load balancing across those servers also has to be considered. Let's look at how MongoDB implements these features.

    • Backup/restore

The directory given with --dbpath when starting a MongoDB service is where MongoDB keeps its database files, so the database can be backed up by copying that directory. Copying it while the server is running is not safe, though, so shut the server down before taking this cold backup. The command for shutting the server down cleanly was described in the first part of this series.

>use admin

>db.shutdownServer()

Alternatively, use fsync with a lock so that MongoDB flushes pending writes to disk and blocks further writes, and then copy the files.

>use admin

>db.runCommand({"fsync": 1, "lock": 1})

At this point I inserted a record f:6 into foo. Querying foo afterwards did not return it: while the lock is held, the record is not written to the database directly but is held back in a cache.

After the backup, remember to unlock; otherwise a power failure or other fault during this time could lose the data held in the cache.

>use admin

>db.$cmd.sys.unlock.findOne()

>db.currentOp() // if currentOp returns only {"inprog": []}, the unlock succeeded

After unlocking, the cached data is written to the database files, and querying foo again returns the record.
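The lock-and-unlock behavior observed above can be pictured with a toy in-memory model. This is only a sketch of the behavior as described in this article, not how mongod implements fsync; the class LockableStore and its methods are hypothetical names invented for the illustration.

```javascript
// Toy model (hypothetical): writes issued while locked are held back
// and only become visible to queries after unlock.
class LockableStore {
  constructor() {
    this.docs = [];     // documents visible to queries
    this.pending = [];  // writes held back while the lock is on
    this.locked = false;
  }
  lock() {
    this.locked = true;
  }
  insert(doc) {
    (this.locked ? this.pending : this.docs).push(doc);
  }
  unlock() {
    this.locked = false;
    this.docs.push(...this.pending); // apply the held-back writes
    this.pending = [];
  }
  find() {
    return [...this.docs];
  }
}

const foo = new LockableStore();
foo.insert({ e: 5 });
foo.lock();
foo.insert({ f: 6 });
const whileLocked = foo.find().length; // the f:6 record is not visible yet
foo.unlock();
const afterUnlock = foo.find().length; // now it is
console.log(whileLocked, afterUnlock);
```

The point of the model is the ordering: anything copied between lock() and unlock() sees a stable set of documents, which is why the file copy is taken inside that window.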

The above are cold backup methods. MongoDB also provides two tools, found in its bin directory, that back up and restore without stopping the service.

mongodump.exe works by querying the running server once to obtain a snapshot and writing that snapshot to disk. The backup is therefore not real-time, because the server keeps accepting writes after the snapshot is taken. To keep the backup safe, the fsync lock can again be used to hold server writes temporarily.

Backup command:

... bin>mongodump -d test -o backup // backup is the backup directory, created under the bin directory by default

Restore command (before restoring, you can insert a record g:7 into the foo collection to see the effect):

..... bin>mongorestore -d test --drop backup/test/

Look at the results of the operation:

The above is MongoDB's backup and restore process. MongoDB also provides a command to repair data files when the database files are damaged or corrupted.

Repair with --repair when starting the mongod service:

...... bin>mongod --dbpath "C:\Program files\mongodb\data\dbs\master" --repair

A running database can also be repaired from the mongo shell:

>use test

>db.repairDatabase()

For details, see the Backups section of the official website.

Next, we look at the other two backup mechanisms, which also scale reads.

    • Master-slave

Master-slave replication mode: one master server takes the writes and several slave servers hold backups. The slaves provide backup and read scaling: acting as query servers, they share the master's load during read-intensive periods. When the master fails, however, a slave can only be switched in to take over manually. This flexible approach makes it relatively easy to add backup or query servers. The number of query servers is not unlimited, though: the slaves periodically poll the master for updates, so too many slaves can overload the master.

We already have a server on port 27017 to use as the master; now we create a slave on port 27018.

Restart the 27017 server as the master with --master:

...... bin>mongod --dbpath "C:\Program files\mongodb\data\dbs\master" --master

Start the slave on port 27018 with --slave, and point it at the master with --source:

...... bin>mongod --port 27018 --dbpath "C:\Program files\mongodb\data\dbs\slave27018" --slave --source localhost:27017

The master can list its slaves through the slaves collection of its own local database.

A slave can view its master's information, or maintain several masters, through the sources collection of its own local database (one slave can serve multiple masters).

The status can also be viewed in the HTTP console.

For details, see the official website: Master Slave

    • Replica Sets

Replica set mode has all the features of master-slave mode, but a replica set has no fixed primary server: at initialization, the servers elect a primary among themselves. When the primary fails, the members vote again to elect a new primary, and the original primary rejoins as a slave once it recovers. This automatic switchover on failure keeps write availability very high.

Start several replica set nodes with --replSet (note that the option is case-sensitive; the official documentation recommends using IP addresses for the hosts):

..... bin>mongod --dbpath "C:\Program files\mongodb\data\dbs\replset27017" --port 27017 --replSet replset/

..... bin>mongod --dbpath "C:\Program files\mongodb\data\dbs\replset27018" --port 27018 --replSet replset/

..... bin>mongod --dbpath "C:\Program files\mongodb\data\dbs\replset27019" --port 27019 --replSet replset/

First, three nodes are created so that votes cannot tie; with an even number of servers, a primary may fail to be elected.

Second, the three nodes above do not each have to list all the others, because replica set members detect one another and connect to the other servers automatically.
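The point about odd member counts can be checked with a little arithmetic. The sketch below assumes a primary needs a strict majority of the voting members; the function names are hypothetical and this is not MongoDB's election code.

```javascript
// Votes needed for a strict majority of n voting members.
function majority(n) {
  return Math.floor(n / 2) + 1;
}

// Number of members that can fail while a primary can still be elected.
function faultTolerance(n) {
  return n - majority(n);
}

for (const n of [2, 3, 4, 5]) {
  console.log(
    `${n} members: majority ${majority(n)}, tolerates ${faultTolerance(n)} failure(s)`
  );
}
```

Under this assumption, a fourth member tolerates no more failures than three do, and an even split of the members leaves neither half with a majority.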

After completing the above, initialize the replica set by connecting to any one server and running the following command (priority ranges from 0 to 1 and affects which member is selected as the primary server):

>use admin

>db.runCommand(
{"replSetInitiate": {"_id": "replset", "members": [
{"_id": 1, "host": "", "priority": 1},
{"_id": 2, "host": "", "priority": 1},
{"_id": 3, "host": "", "priority": 1}]}})


The result shows that one of the nodes is automatically selected as primary; its shell prompt becomes replset:PRIMARY>

Next, add one more slave node:

..... bin>mongod --dbpath "C:\Program files\mongodb\data\dbs\replset27020" --port 27020 --replSet replset/

Add the new slave member to system.replset with the rs.add command:

rs.add(""); or rs.add({"_id": 4, "host": ""})

That concludes the introduction to replica sets; for details, see the official website: Replica Sets

Here is a brief look at how the Master Slave and Replica Sets backup mechanisms work: both synchronize all slave servers from the primary server's oplog.

The oplog records insert, update, and delete operations (queries are not recorded). It has a size limit, however: once the configured size is exceeded, the oldest records are discarded to make room for new ones.
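How large that limit should be depends on how much the primary writes and how long a slave may be out of touch. The following back-of-the-envelope sketch makes that relationship concrete; the function name, the input figures, and the safety factor are all hypothetical. mongod accepts an --oplogSize option (in megabytes), which is where a figure like this would be applied.

```javascript
// Rough oplog size: peak write volume per hour, times the longest time a
// slave may be offline, times a safety margin. All inputs are assumptions
// that would come from observing a real primary server.
function estimateOplogSizeMB(peakWriteMBPerHour, maxSlaveDowntimeHours, safetyFactor = 2) {
  return peakWriteMBPerHour * maxSlaveDowntimeHours * safetyFactor;
}

// Example: 50 MB of writes per hour, slaves may lag up to 24 hours.
console.log(estimateOplogSizeMB(50, 24)); // 2400
```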

In Master Slave mode, the master server generates the oplog.$main log collection.

In Replica Sets mode, every server generates a log collection.

Under both mechanisms, every slave server polls the primary server's oplog; when the primary's log is newer, the slave synchronizes the new operation records. There is an important problem here, though: a slave that is held up by network congestion, a crash, or the like may be unable to keep up with the primary's oplog. In one case, the primary's oplog is refreshed so fast that the slave can never catch up. In another, the primary's oplog exceeds its size limit and discards older entries that the slave has not yet read, so the slave's data may become inconsistent with the primary's. This second case is my own inference and is not confirmed.
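The second case can be illustrated with a small in-memory model of a size-limited log. This is a sketch under the assumptions stated in the paragraph above, not MongoDB's replication code; CappedOplog, its capacity measured in entries, and the integer timestamps are hypothetical simplifications.

```javascript
// Minimal in-memory model of a size-limited operation log (hypothetical).
class CappedOplog {
  constructor(capacity) {
    this.capacity = capacity; // maximum number of entries kept
    this.entries = [];        // each entry: { ts, op }
    this.nextTs = 1;          // increasing operation number
  }

  // Record a write; once the log is full, the oldest entry is discarded.
  append(op) {
    this.entries.push({ ts: this.nextTs++, op });
    if (this.entries.length > this.capacity) this.entries.shift();
  }

  // Entries a slave still has to apply, or null if entries it needs
  // have already been discarded.
  entriesAfter(lastAppliedTs) {
    const oldest = this.entries.length ? this.entries[0].ts : this.nextTs;
    if (lastAppliedTs + 1 < oldest) return null; // gap in the log
    return this.entries.filter((e) => e.ts > lastAppliedTs);
  }
}

const oplog = new CappedOplog(3);
for (const op of ["insert a", "insert b", "update a", "delete b", "insert c"]) {
  oplog.append(op);
}

console.log(oplog.entriesAfter(3)); // a slave at ts 3 can still catch up
console.log(oplog.entriesAfter(1)); // a slave at ts 1 has fallen off the log
```

In this model a slave that gets null back cannot be brought up to date from the log alone and would need a full copy of the data.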

Replica Sets backup has one more shortcoming. When the primary server fails, a slave server is voted in as the new primary. If that slave's oplog was behind the old primary's, then when the old primary recovers it rolls back its own extra oplog operations to stay consistent with the new primary's oplog. Because the switchover is automatic, some data can be lost without anyone noticing.
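The rollback described above comes down to finding the last operation both logs share. The sketch below is a simplified illustration with operation logs modeled as arrays of labels; opsToRollBack is a hypothetical helper, not part of MongoDB.

```javascript
// Operations the recovered old primary must undo: everything after the
// last entry it has in common with the new primary.
function opsToRollBack(oldPrimaryLog, newPrimaryLog) {
  let common = 0;
  while (
    common < oldPrimaryLog.length &&
    common < newPrimaryLog.length &&
    oldPrimaryLog[common] === newPrimaryLog[common]
  ) {
    common++;
  }
  return oldPrimaryLog.slice(common);
}

const oldPrimaryLog = ["op1", "op2", "op3", "op4", "op5"]; // op4, op5 never replicated
const newPrimaryLog = ["op1", "op2", "op3", "op6"];        // op6 written after failover
console.log(opsToRollBack(oldPrimaryLog, newPrimaryLog));
```

Here op4 and op5 were acknowledged by the old primary but never reached the slave, so they are exactly the writes that disappear.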

3. Automatic Sharding
    • Auto sharding

Automatic sharding: a collection in the database is divided into small pieces according to certain rules, and these shards are managed centrally by the mongos router. When a query or write request arrives, the router uses the shard key rules to find the corresponding shard and operate on it. Sharding addresses write-intensive workloads by spreading the load of a single write server. It also helps when the original storage space runs out: sharding lets subsequent data be written to other storage. The shards of a collection are similar to splitting a database table, and each shard accepts writes.

Because sharding distributes the data across different servers, a failure of one server can lose data; the mongos router can also fail, and so can the config servers that store the shard configuration. Of course, the Master Slave/Replica Sets mechanisms can be used to back up each shard, mongos, and config server. As the configuration diagram from the official website shows, even with servers backing each other up this takes a large number of servers, so sharding is very resource-intensive.

Configuration diagram from the official website

First, the router reads configuration information from the config servers. Shards added dynamically are written to the config servers through mongos, and when a client sends a request, the matching shard is found through mongos. The diagram shows the shards backed up as replica sets, with multiple mongos and config servers. (See the official website: Sharding.) What follows is a manual walk-through of setting up sharding. (The environment changes here: I am continuing what was left unfinished yesterday, and the previous two days were done on a laptop.)

1) Create the config server

..... bin>mongod --dbpath "E:\mongodb\data\configs" --port 23017

2) Create a mongos server and specify the config server it depends on (mongos relies on the config server; the shard information that mongos queries is stored there)

...... bin>mongos --port 25017 --configdb

3) Create multiple shard servers (responsible for data storage)

...... bin>mongod --port 27017 --dbpath "e:\mongodb\data\dbs\shard27017"

On the shard27017 shard server, create the foo collection in the test database and create an index on name for the foo collection.

4) Connect to the mongos server and add the shard27017 shard server to the config server

>use admin

>db.runCommand({addshard: "", allowLocal: true}) // add a shard server; allowLocal permits local deployment (by default, shards may not be deployed locally)

Once the shard is added, running show dbs on the mongos server lists the shard server's databases, and the shard server's data can be operated on through mongos. Next, enable sharding and set the shard key for the foo collection of the test database.

>db.runCommand({"enablesharding": "test"}) // enable sharding for the test database

Note: the shard key of a collection to be sharded must be an indexed key (the index for the foo collection can also be created through mongos).

>db.runCommand({"shardcollection": "", "key": {"name": 1}}) // the number is the sort order

Auto-sharding is now set up, and the shard information can be queried on the mongos or config server.
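To make the router's job concrete, here is a minimal sketch of how a lookup by the name shard key could work once the collection is split into ranges. The chunk table, its boundaries, and the shard names are hypothetical; real chunk metadata lives on the config server and is not laid out like this.

```javascript
// Hypothetical chunk table, as a router might read it from the config server.
// Each chunk covers shard-key values in [min, max); null means unbounded.
const chunks = [
  { min: null, max: "h", shard: "shardA" },
  { min: "h", max: "p", shard: "shardB" },
  { min: "p", max: null, shard: "shardC" },
];

// Return the shard whose chunk range contains the given shard-key value.
function routeByShardKey(name) {
  const chunk = chunks.find(
    (c) => (c.min === null || name >= c.min) && (c.max === null || name < c.max)
  );
  return chunk.shard;
}

console.log(routeByShardKey("alice")); // shardA
console.log(routeByShardKey("mongo")); // shardB
console.log(routeByShardKey("zoe"));   // shardC
```

Because every read and write passes through a lookup like this, the choice of shard key decides which server carries which part of the load.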

When a shard server runs out of storage, add more shard servers as in step 3; mongos manages these shards as one cluster.

To remove a shard, run the following command; the mongos router then moves the data on that shard server to the other shards.

>db.runCommand({"removeshard": ""})

A brief analysis of the shard key: when the workload is not write-intensive and the only problem is a lack of storage space, the shard key can be an unbounded, ever-growing key such as creation time, so that newly created records are written to the new shard server.
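Why an ever-growing key sends new records to the newest shard can be seen in a small sketch. The boundaries and shard names below are hypothetical, and the ranges are simplified to plain numbers standing in for creation times.

```javascript
// Hypothetical chunk boundaries on a creation-time shard key.
// Chunk i holds keys in [bounds[i], bounds[i + 1]); the last chunk is open-ended.
const bounds = [0, 1000, 2000];
const shards = ["shardOld1", "shardOld2", "shardNew"];

// Walk down from the highest boundary until the key fits.
function shardFor(createdAt) {
  let i = bounds.length - 1;
  while (i > 0 && createdAt < bounds[i]) i--;
  return shards[i];
}

// New documents always carry a creation time larger than existing ones,
// so they all fall into the last, open-ended range.
const newDocs = [2500, 2600, 2700];
console.log(newDocs.map(shardFor));
```

This suits the storage-only case, but it also means all new writes hit a single shard, which is why the key is a poor fit for write-intensive workloads.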

When data needs to be spread evenly across the shards, or the workload is write-intensive, it is better to choose a key whose values fall within a certain range. The range must not be too small, though: a key such as gender or a true/false flag yields only two shards. So be sure to choose a suitable shard key to get the desired effect.
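The limit imposed by a small value range is easy to quantify: a range of key values cannot be split below a single value, so the number of distinct values caps the number of pieces. The sketch below uses made-up documents and a hypothetical helper name.

```javascript
// Upper bound on the number of pieces a collection can be split into:
// the count of distinct values of the candidate shard key.
function maxChunks(docs, key) {
  return new Set(docs.map((d) => d[key])).size;
}

const docs = [
  { name: "ann", gender: "f" },
  { name: "bob", gender: "m" },
  { name: "cat", gender: "f" },
  { name: "dan", gender: "m" },
];

console.log(maxChunks(docs, "gender")); // 2
console.log(maxChunks(docs, "name"));   // 4
```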

MongoDB scales reads and writes well, and its configuration is flexible and easy. Each of the features described above has shortcomings, but they can be avoided with sensible design. Backups, for example, cost the primary server some performance, so they can be taken from a slave server instead. The oplog has a size limit, but a suitable size can be determined by observing the primary server's update operations over a continuous period (a week, month, or year), so that slaves do not lose operation records during synchronization. Alternatively, at some point each day, hold the primary server's writes (the fsync lock described earlier) so that the slaves can catch up with the primary.

This is the end of the introduction to MongoDB. To learn more, follow the official documentation and forums. The documentation is very detailed and its examples are concise; even when a command is mistyped, the error message usually points to a fix, so MongoDB is easy to get started with.
