Section 6: MongoDB status monitoring, backup and replication, and automatic sharding


If MongoDB were just a document-oriented database, it would have no particular bright spot. Its biggest advantages are read scaling, hot backup, failure recovery, and automatic sharding (write scaling). These features are introduced in this final part of the series.

1. Status Monitoring

2. Backup and Replication

3. Automatic sharding




Backup replication provides database backup, read/write splitting, and load balancing of read operations: one master server handles writes while multiple slave servers handle backup and reads, and the backup/read cluster can be expanded. The replica sets mode additionally supports failover: when the master server goes down, one of the slave servers is elected to take over write operations. The automatic sharding feature partitions a collection (table) across multiple servers for distributed storage, which reduces the amount of data in a single table and balances the load of write operations.




1. Status Monitoring
   HTTP Console
   Mongo shell diagnostic commands
2. Backup and Replication
   Backup/restore
   Master-slave
   Replica sets
3. Automatic sharding

1. Status Monitoring



First, create the directories data\dbs (database file directory), data\dbs\master (master server directory), and data\dbs\slave (slave server directory) under the MongoDB directory.






Create a MongoDB database service using the default port
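A minimal sketch of the startup command, using the master directory created above (the exact installation path is an assumption based on the commands later in this article):

...bin> mongod --dbpath "C:\Program Files\MongoDB\data\dbs\master"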












The last two lines of the startup output show that the MongoDB service listens on port 27017 by default, and that port 28017 (listening port + 1000) is the web admin interface listening port, i.e. the HTTP console monitoring port.




HTTP Console



Visit http://localhost:28017/ to see the effect. First, add several records to the foo collection of the default test database.






Run the mongo shell from cmd.exe (it connects to the test database by default, where we use a test collection foo).
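A minimal sketch of the inserts from the shell (the field values are illustrative assumptions, since the original screenshots are missing):

> db.foo.insert({a: 1})
> db.foo.insert({b: 2})
> db.foo.find()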












Access the HTTP console (http://localhost:28017/) to view the monitoring results.












In the console output, note two items in particular: the replication information (explained in the backup section below), followed by the operation log.




Mongo shell diagnostic commands



Then, use the Mongo shell script to query the server status.
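A sketch of common diagnostic commands (these helpers exist in the standard mongo shell):

> db.serverStatus()   // uptime, memory usage, connections, operation counters
> db.stats()          // statistics for the current database

The mongostat tool in the bin directory can also print a live summary of server activity.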












These are two simple monitoring methods. For more information, see the Monitoring and Diagnostics section of the official documentation.




2. Backup and Replication



For any database, data backup, replication, and failover must be considered. When the read/write ratio of the database servers is high, we also need load balancing across them. Let's look at how MongoDB implements these features.




Backup/restore



When starting the MongoDB service, --dbpath specifies the directory that stores the database files. We can copy these files to implement a cold backup of the database, but copying while the server is running is not safe; the server must be shut down first. Here is how to shut down the server smoothly.






> use admin
> db.shutdownServer()






Alternatively, you can use the fsync command with a lock, which flushes pending writes and holds new writes in the cache so that the files can be safely copied for backup.






> use admin
> db.runCommand({"fsync": 1, "lock": 1})






At this point, insert a record {f: 6} into test.foo and then run db.foo.find(); the record is not returned, which shows it was not written directly to the database files but was buffered in memory.
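A sketch of that experiment while the lock is held ({f: 6} is the record mentioned above):

> use test
> db.foo.insert({f: 6})
> db.foo.find()   // {f: 6} is not returned while the fsync lock is held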






Unlock after the backup completes (otherwise data held only in memory could be lost in a power outage or similar failure):






> use admin
> db.$cmd.sys.unlock.findOne()
> db.currentOp()   // if this returns only {"inprog": []}, the unlock succeeded






After unlocking, the data buffered in memory is written to the database files. Query foo again to verify the result.
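A sketch of the verification, continuing the example above:

> db.foo.find()   // {f: 6} now appears among the results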












That is the cold backup method. MongoDB also provides two tools, found in its bin directory, that implement backup and restore without stopping the service:






mongodump.exe / mongorestore.exe






mongodump.exe works by taking a snapshot of the current server state through a query and writing that snapshot to disk. The backup is therefore not real-time: the server keeps accepting writes after the snapshot is taken. To make the backup safe, we can again use the fsync lock to temporarily hold server writes in the cache.






Backup command:






...bin> mongodump -d test -o backup // backup is the backup directory, created in the bin directory by default






Restore command (you can first insert a record {g: 7} into the foo collection to verify that --drop removes data not present in the backup):






...bin> mongorestore -d test --drop backup/test/






Let's take a look at the running result:












That is the MongoDB backup and restore process. When database files are faulty or damaged, MongoDB also provides commands to repair them.






Repair the data files by starting the mongod service with --repair:






...bin> mongod --dbpath "C:\Program Files\MongoDB\data\dbs\master" --repair






In addition, we can repair a running database from the mongo shell:






> use test
> db.repairDatabase()






For details, refer to the Backups section of the official website.






Now let's look at the two replication mechanisms that provide read scaling and backup.




Master-slave



Master-slave replication mode: one master server handles writes while multiple slave servers hold backups. The slave servers provide backup and read scaling, sharing the master's read-intensive load by acting as query servers. However, when the master server fails, a slave can only be promoted to master manually. This simple model makes it easy to add more backup or query servers, but the query tier does not scale infinitely: the slave servers poll the master periodically for updates, so too many slaves will overload the master.






We use the previously created server on port 27017 as the master, then create a slave server on port 27018.






Restart 27017 as the master server (--master):






...bin> mongod --dbpath "C:\Program Files\MongoDB\data\dbs\master" --master






Create 27018 as the slave server; --source specifies the master server:






...bin> mongod --port 27018 --dbpath "C:\Program Files\MongoDB\data\dbs\slave27018" --slave --source localhost:27017






On the master server, the slaves collection of its local database lists the slave servers.






On a slave server, the sources collection of the local database shows the master server information; a single slave can even serve multiple masters by maintaining multiple entries there.
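A sketch of inspecting these collections from the mongo shell (collection names as described above):

> use local
> db.slaves.find()    // on the master: the registered slave servers
> db.sources.find()   // on a slave: its master server(s)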






Or we can view the status through the HTTP Console






For details, refer to the official website: master slave




Replica sets



Replica sets mode: this has all the features of master-slave mode, but a replica set has no fixed master server. During initialization the member servers elect a master; when that master fails, a new master is elected by vote, and the original master rejoins as a slave after it recovers. This automatic failover means replica sets can keep accepting writes even in extreme cases.






Create multiple replica set nodes with --replSet (note that the option is case sensitive; using an IP address for the seed node is recommended):






...bin> mongod --dbpath "C:\Program Files\MongoDB\data\dbs\replset27017" --port 27017 --replSet replset/127.0.0.1:27018

...bin> mongod --dbpath "C:\Program Files\MongoDB\data\dbs\replset27018" --port 27018 --replSet replset/127.0.0.1:27017

...bin> mongod --dbpath "C:\Program Files\MongoDB\data\dbs\replset27019" --port 27019 --replSet replset/127.0.0.1:27017






Three servers are created first so that the vote cannot deadlock; with an even number of servers, a master might not be elected properly.






Note that the three replSet nodes above are not all chained to one another: replica set members automatically discover and connect to the other servers in the set.






After completing the above, initialize the replica set by connecting to one of the servers and executing the following command (priority ranges from 0 to 1 and influences which member is elected master):






> use admin
> db.runCommand({
    "replSetInitiate": {
        "_id": "replset",
        "members": [
            { "_id": 1, "host": "127.0.0.1:27017", "priority": 1 },
            { "_id": 2, "host": "127.0.0.1:27018", "priority": 1 },
            { "_id": 3, "host": "127.0.0.1:27019", "priority": 1 }
        ]
    }
})






The result shows that 127.0.0.1:27017 was elected automatically; its shell prompt changes to replset:PRIMARY>.
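A sketch of verifying the election from the shell (rs.status() and db.isMaster() are standard shell helpers):

> rs.status()      // lists each member and its current state
> db.isMaster()    // shows whether the current connection points at the master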












Add a slave server node






...bin> mongod --dbpath "C:\Program Files\MongoDB\data\dbs\replset27020" --port 27020 --replSet replset/127.0.0.1:27017






Use the rs.add command on the master to add the new member to system.replset:






> rs.add("127.0.0.1:27020")    // or: rs.add({"_id": 4, "host": "127.0.0.1:27020"})






That concludes the introduction to replica sets. For details, refer to the Replica Sets section of the official website.




Here is a brief introduction to the backup mechanism behind both master-slave and replica sets. Both modes synchronize all slave servers from the master server's oplog.



The oplog records insert, delete, and update operations (not queries), and it has a size limit: when the limit is exceeded, the oplog discards its earliest records and keeps recording.






In master-slave mode, the master server records operations in the oplog.$main collection of its local database.






In replica sets mode, every member records operations in the oplog.rs collection.






Under both mechanisms, the slave servers poll the master's oplog and apply any operation records newer than their own. This raises an important issue: a slave may fail to keep up with the master's oplog because of network congestion or a crash. In one case, the master keeps writing to the oplog so fast that the slave never catches up. In another, the master's oplog exceeds its size limit and the earliest records are discarded, so the slave's data may become inconsistent with the master's. (The second case is my inference and has not been confirmed.)
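A sketch of inspecting the oplog from the shell (db.printReplicationInfo() is a standard shell helper; run it on the master):

> db.printReplicationInfo()   // reports the configured oplog size and the time window the oplog currently covers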






Note a drawback of replica sets backup: when the master fails, a slave is elected as the new master. If the failed master's oplog was ahead of the new master's, then after the old master recovers it rolls back its extra oplog operations to stay consistent with the new master's oplog. Because this switch happens automatically, that data loss is invisible.




3. Automatic sharding




Automatic sharding: a collection is split into several smaller parts according to certain rules, and these parts are managed centrally through mongos routing. When a query or write request arrives, the router uses the shard key rules to find the shard that should handle it. Sharding addresses write-intensive workloads by distributing the load of a single write server, and it also helps when storage space runs out, since new data can then be written to another shard's storage. Sharding a collection is thus similar to partitioning a database, and every shard accepts writes.

Because sharding spreads data across different servers, it introduces new failure points: a failed shard server can lose data, the mongos router can fail, and the config server that stores the shard metadata can fail. Of course, each shard, mongos, and config server can itself be backed up using the master-slave/replica sets mechanisms above. As the configuration diagram on the official website shows, even with cross-backup between servers this requires a large number of machines, so sharding is a resource-intensive undertaking. (See the configuration diagram on the official website.)












In that diagram, the router first reads its configuration from the config servers; shards added dynamically are also written to the config servers through mongos. When a client makes a request, mongos finds the corresponding shard. Each shard is backed by a replica set, and mongos and the config servers are deployed on multiple machines (for details, see the Sharding section of the official website). Below, we walk through the sharding process manually. (Note: I have switched environments and am continuing from the notebook setup of two days ago, hence the new paths.)






1) Create a config server:






...bin> mongod --dbpath "E:\MongoDB\data\configs" --port 23017






2) Create a mongos server and specify the config server it depends on (mongos stores the shard information it manages in the config server):






...bin> mongos --port 25017 --configdb 127.0.0.1:23017






3) Create one or more shard servers (responsible for the actual data storage):






...bin> mongod --port 27017 --dbpath "E:\MongoDB\data\dbs\shard27017"






On the shard27017 server, create the foo collection in the test database and build an index on its name field.
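A sketch of those steps from a shell connected to the shard27017 server (the sample record is an illustrative assumption):

> use test
> db.foo.insert({name: "A"})
> db.foo.ensureIndex({"name": 1})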












4) Connect to the mongos server and register the shard27017 server with the config server:






> use admin
> db.runCommand({addshard: "127.0.0.1:27017", allowLocal: true})   // add a shard server; allowLocal permits a shard on the local machine






Once the shard is added successfully, running show dbs on the mongos server lists the shard server's databases, and you can operate on the shard server's data through mongos. Next we enable sharding and set the shard key for the foo collection of the shard server's test database.






> DB. runcommand ({"enablesharding": "test"}) // enable the sharding function for the test database.






Note: the shard key of the collection to be sharded must be an indexed key. (You can also create the index for the foo collection through mongos.)






> DB. runcommand ({"shardcollection": "test. foo", "key": {"name": 1}) // numbers represent sorting






Automatic sharding is now set up. You can query the sharding information on the mongos server or in the config database.
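A sketch of checking the result from a shell connected to mongos (printShardingStatus is a standard shell helper; the config collections are those maintained by mongos):

> db.printShardingStatus()   // shards, sharded databases, and chunk distribution
> use config
> db.shards.find()           // the registered shard servers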












When a shard server's storage becomes insufficient, add another shard server as in step 3); mongos automatically incorporates it into the sharded cluster.






When you need to remove a shard, run the following command; the mongos router migrates the data on that shard server to the other shards.






> DB. runcommand ({"removeshard": "127.0.0.1: 27017 "})






A brief analysis of shard key choice: if the workload is not write-intensive and sharding is only needed because storage space is running out, we can choose an unbounded, ever-growing key such as the creation time. All newly created records are then written to the newest shard server.






When data must be distributed evenly across the shards, or when the collection is write-intensive, it is better to choose a key whose values fall in a fixed range. That range must not be too small, however: a key such as gender (true or false) would yield only two shards. Choose a shard key appropriate to the effect you want.
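A hedged sketch contrasting the two choices (test.logs and the created field are hypothetical names used only for illustration):

> db.runCommand({"shardcollection": "test.logs", "key": {"created": 1}})   // ever-growing key: new writes land on the newest shard
> db.runCommand({"shardcollection": "test.foo", "key": {"name": 1}})       // bounded, varied key: writes spread across shards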






MongoDB offers powerful read/write scalability and is easy to configure. Each of the features above has shortcomings, but reasonable design can avoid them. For example, backup consumes some of the master server's capacity, so back up from a slave server instead to avoid affecting the master. Likewise, although the oplog size is limited, observing the master's update volume over a period of time (weeks or months) lets you choose an oplog size large enough that the slave servers never miss operation records; alternatively, briefly holding the master's writes in the cache each day gives the slaves time to catch up. This concludes this introduction to MongoDB. For more detail, refer to the official documentation and forums: the documentation is detailed, the examples are concise, and even a failed command usually prints enough information to find a solution, so MongoDB is easy to pick up.



