MongoDB Backup and Recovery


This article introduces the ways to back up MongoDB and the methods for restoring the data.
MongoDB stores all of its data under the data directory, which defaults to C:\data\db on Windows; the location can be changed with the dbpath setting. For a simple backup it is enough to copy that folder, but the server has to be shut down first so that the copied files are not left out of sync.
MongoDB has three ways to back up without shutting down the server: mongodump, master-slave replication, and replica sets.

Mongodump

mongodump can back up a server while it is running. It queries the running MongoDB instance and writes every document it retrieves to disk. Because mongodump is an ordinary client, separate from the MongoDB server itself, the server can keep handling other requests while the dump runs.
However, mongodump uses the normal query mechanism, so the resulting backup is not necessarily a point-in-time snapshot of the server's data.

Let's look at how to use mongodump to back up a database.
First, run mongodump from the command line with these options:

-d specifies the database to back up
-o specifies the directory where the backup is written

When the dump finishes, the backup folder can be found at the specified location, with the dumped data files inside it.

Note that the queries mongodump issues during a backup can affect the performance of other clients.
For more options, run mongodump --help.
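
As a minimal sketch of the invocation described above, the following Python snippet assembles a mongodump command line from the -d and -o options. The database name test and the output directory D:\MongoDB\backup are placeholders chosen for illustration, not values from the original article, and the command is only printed, not executed.

```python
def mongodump_command(database, out_dir):
    """Return the argument list for dumping one database with mongodump."""
    # -d selects the database to dump, -o the directory the dump is written to.
    return ["mongodump", "-d", database, "-o", out_dir]


# Hypothetical database name and backup directory, for illustration only.
cmd = mongodump_command("test", r"D:\MongoDB\backup")
print(" ".join(cmd))
```

Passing the list to subprocess.run(cmd, check=True) would run the dump, provided mongodump is on the PATH and a server is reachable.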

Mongorestore

Here is how to use mongorestore to recover data.
mongorestore takes the output of mongodump and inserts the backed-up data into a running MongoDB instance. Like mongodump, mongorestore is a standalone client.

Using this method, the lf collection that was just backed up can be restored into a new database, testNew, with the following options:

-d specifies the database to restore into
--drop deletes each collection (if present) before restoring it; otherwise the restored data is merged into the existing collection. Note that mongorestore only inserts documents, so a document whose _id already exists in the collection is not overwritten.
To check the restored database from the shell:

show dbs
use testNew
db.lf.find()

If the documents are returned, the data was restored successfully.
Run mongorestore --help for more information.
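
The restore step can be sketched the same way. This is an illustration under assumptions, not the article's exact command: the dump directory D:\MongoDB\backup\test is a placeholder, and the function only builds the argument list without running it.

```python
def mongorestore_command(database, dump_dir, drop=False):
    """Return the argument list for restoring a dump into `database`."""
    args = ["mongorestore", "-d", database]
    if drop:
        # --drop removes each existing collection before restoring it.
        args.append("--drop")
    # The last argument is the directory mongodump wrote for the database.
    args.append(dump_dir)
    return args


# Hypothetical dump location; testNew is the target database from the text.
cmd = mongorestore_command("testNew", r"D:\MongoDB\backup\test", drop=True)
print(" ".join(cmd))
```

Leaving drop at its default of False produces a command that merges the dump into whatever the target collections already contain.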

Note that both commands work with BSON files. Once the data grows to hundreds of gigabytes this approach becomes almost unusable, because BSON takes up a great deal of space.

Fsync Lock

Any discussion of mongodump backups has to mention fsync and locking. As noted above, mongodump can back up without shutting down the server, but it gives up the ability to capture a point-in-time view of the data. MongoDB's fsync command makes it possible to copy the data directory while MongoDB is running without damaging the data.
It works by forcing the server to write all buffers to disk and, with the lock option, blocking further writes to the database until the lock is released. The write lock is the key to making fsync useful for backups.
In the shell, connected to the admin database, force an fsync and acquire the write lock:

db.runCommand({"fsync":1,"lock":1})

At this point the data directory is consistent and represents a point-in-time snapshot of the data. Because of the lock, the data directory can safely be copied as a backup. This is especially useful when the database runs on a file system or volume with snapshot support, such as LVM or EBS, because taking a snapshot of the database directory is very fast.
Once the backup is done, release the lock:

db.$cmd.sys.unlock.findOne()
db.currentOp()

Running db.currentOp() confirms that the lock has been released (the initial unlock request can take a moment to complete). Newer shells also provide db.fsyncUnlock() for releasing the lock.
With the fsync command you can take very flexible backups without stopping the server or giving up a point-in-time view of the data. The price is that write operations are blocked for a while.
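
The lock, copy, unlock sequence can be sketched as follows. This is a simplified illustration rather than the article's code: lock and unlock are hypothetical callables standing in for the fsync lock command and the unlock step, and the demonstration uses temporary folders instead of a real MongoDB data directory. The try/finally block makes sure the lock is released even if the copy fails.

```python
import os
import shutil
import tempfile


def backup_with_lock(lock, unlock, data_dir, backup_dir):
    """Copy data_dir to backup_dir while writes are blocked."""
    lock()  # stand-in for sending {"fsync": 1, "lock": 1} to the server
    try:
        shutil.copytree(data_dir, backup_dir)
    finally:
        unlock()  # always release the write lock, even if the copy fails


# Demonstration with stand-ins: "lock" and "unlock" only record that they
# ran, and a temporary folder plays the role of the data directory.
events = []
with tempfile.TemporaryDirectory() as root:
    data_dir = os.path.join(root, "db")
    os.mkdir(data_dir)
    with open(os.path.join(data_dir, "example.0"), "w") as f:
        f.write("placeholder data file")
    backup_dir = os.path.join(root, "backup")
    backup_with_lock(lambda: events.append("lock"),
                     lambda: events.append("unlock"),
                     data_dir, backup_dir)
    copied = sorted(os.listdir(backup_dir))
print(events, copied)
```

Keeping the copy as short as possible matters here, since every write to the server waits until unlock runs.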

Master-slave replication

The backup methods above are flexible, but none of them is as good as backing up from a slave server. When MongoDB runs with replication, the backup techniques described earlier can be used on a slave as well as on the master. Master-slave replication was the most common form of replication in older MongoDB releases (it was removed in MongoDB 4.0 in favor of replica sets), and it is flexible enough to be used for backup, failover, and read scaling.
The most basic setup is one master node and one or more slave nodes, with each slave knowing the address of the master.
Start the master with mongod --master, and start each slave with mongod --slave --source master_address, where master_address is the address of the master node.
Here we test using different ports on a single machine.

First, start the master node, pointing it at its data directory:

mongod --dbpath "D:\MongoDB\data\dbs" --port 10000 --master


Once the master has started successfully, start the slave node:

mongod --dbpath "D:\MongoDB\data\dbs\slave" --port 10001 --slave --source 127.0.0.1:10000

Once started, the slave automatically connects to the master at the given address and begins syncing from it.

Connect to the master database from the shell:

db = connect("127.0.0.1:10000/master")

Next, add some data on the master node, then connect to the slave node and query the same data there.

Note: if the query on the slave fails with an error, call rs.slaveOk() first.

If the data on the slave matches the master, replication is working and the slave can serve as a backup. All slave nodes are kept in sync with the master node; a cluster should preferably have no more than 12 slave nodes.

Options

Master-slave replication has several configuration options:
--only
Used on a slave node to replicate only the specified database (by default, all databases are replicated).
--slavedelay
Used on a slave node to add a delay (in seconds) before the slave applies operations from the master. This makes it easy to set up a delayed slave, which offers protection when a user accidentally deletes important documents or inserts garbage data. Such bad operations are replicated to every slave, so delaying their application leaves a window of time in which to recover.
--fastsync
Starts a slave from a data snapshot of the master. If the slave's data directory begins as a snapshot of the master's data, starting the slave with this option is much faster than a full synchronization.
--autoresync
Automatically resynchronizes the slave if it falls out of sync with the master.
--oplogSize
The size of the master's oplog, in megabytes.
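
To make the recovery window of --slavedelay concrete, here is a small sketch of the timing arithmetic. It is a simplified model rather than MongoDB's implementation: it assumes an operation performed on the master at time t is applied on the delayed slave no earlier than t plus the configured delay. The function name and the timestamps are made up for illustration.

```python
def seconds_left_to_recover(op_time, now, slavedelay):
    """Seconds remaining before a delayed slave applies an operation.

    op_time and now are timestamps in seconds; slavedelay is the value
    passed to --slavedelay. Returns 0 once the window has closed.
    """
    apply_time = op_time + slavedelay
    return max(0, apply_time - now)


# A bad delete at t=1000 with a one-hour delay, noticed 10 minutes later.
print(seconds_left_to_recover(op_time=1000, now=1600, slavedelay=3600))  # 3000
```

Within that window the delayed slave still holds the data as it was before the bad operation, so it can be copied off before the operation arrives.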

Replica set

In short, a replica set is a master-slave cluster with automatic failover. The main difference between a replica set and a master-slave cluster is that a replica set has no fixed master node.
The cluster as a whole elects an active node, and when that node stops working, one of the backup nodes is made active. In normal operation a replica set therefore has one active node (the primary) and one or more backup nodes (secondaries).

Initialize

Setting up a replica set is a little more complicated than setting up a master-slave cluster because it requires initialization. The replica set also needs a name that distinguishes it from other replica sets; here the name is lf.
First, start one node:

mongod --dbpath "D:\MongoDB\Server\3.0\bin\master" --port 10000 --replSet lf/127.0.0.1:10001

The --replSet option tells the server that it belongs to the lf replica set and that another member can be found at 127.0.0.1:10001.

Next, start the second node in the same way:

mongod --dbpath "D:\MongoDB\Server\3.0\bin\slave" --port 10001 --replSet lf/127.0.0.1:10000

To add a third node, there are two ways:

mongod --dbpath " D:\MongoDB\Server\3.0\bin\slave

or

mongod --dbpath " D:\MongoDB\Server\3.0\bin\slave

Either way works, because replica sets have automatic discovery: when a single server is specified, MongoDB automatically finds and connects to the remaining nodes.

Then comes the initialization, which is run from the shell:

use admin
db.runCommand({
  "replSetInitiate": {
    "_id": "lf",                      // name of the replica set
    "members": [                      // list of servers in the replica set
      {
        "_id": 1,                     // unique ID for each server
        "host": "127.0.0.1:10000"     // the server's host
      },
      {
        "_id": 2,
        "host": "127.0.0.1:10001"
      }
    ]
  }
})
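
The shape of the initialization document above can also be generated programmatically. The sketch below is an illustration only: replica_set_config is a hypothetical helper, and it builds the document without contacting a server.

```python
def replica_set_config(name, hosts):
    """Build the configuration document passed to replSetInitiate.

    Each host gets a unique _id, numbered from 1 as in the example above.
    """
    members = [{"_id": i, "host": host}
               for i, host in enumerate(hosts, start=1)]
    return {"_id": name, "members": members}


config = replica_set_config("lf", ["127.0.0.1:10000", "127.0.0.1:10001"])
command = {"replSetInitiate": config}
print(command)
```

Sending the command to the admin database of one of the nodes, for example through a driver, is left out here because it needs a running replica set member.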

After initialization succeeds, one node becomes the active node, and rs.status() shows the status of each node.

Next, add some data on the active node as a test, then connect to the backup node and query the data.

If the query returns the data, replication is working.
If querying the backup node fails with the error
error: { "$err" : "not master and slaveOk=false", "code" : 13435 }
run the following statement and query again:
db.getMongo().setSlaveOk()

Node types

A replica set can do more than provide a backup. Several different types of nodes can exist in a replica set:
1. Standard node
A regular node. It stores a full copy of the data, votes in elections, and can become the active node.
2. Passive node
Stores a full copy of the data and votes in elections, but cannot become the active node.
3. Arbiter
Only votes in elections. It does not receive replicated data and cannot become the active node.

We can set the priority key in a node's configuration to rank the nodes:

>members.push({"_id":3,"host":"127.0.0.1:10002","priority":40})

The default priority is 1, and the value can range from 0 to 1000.

We can also designate an arbiter with the arbiterOnly key:

>members.push({"_id":4,"host":"127.0.0.1:10003","arbiterOnly":true})

When the active node fails, the remaining nodes elect a new active node, and any non-active node can initiate the election. Arbiters are added to break deadlocked votes; they never stand as candidates themselves. The new active node is the node with the highest priority, and among nodes with the same priority, the one with the most recent data wins.
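
The selection rule just described can be sketched in a few lines. This is a deliberately simplified model, not MongoDB's election protocol: it ignores voting and network conditions, the last_op field (the timestamp of the newest operation a node holds) is a hypothetical name, and treating priority 0 as a passive node is an assumption not stated in the text above.

```python
def pick_new_active(nodes):
    """Pick the next active node from the surviving members.

    Each node is a dict with "host", "priority", "last_op" and an
    optional "arbiterOnly" flag.
    """
    candidates = [n for n in nodes
                  if not n.get("arbiterOnly") and n["priority"] > 0]
    if not candidates:
        return None
    # Highest priority wins; ties go to the node with the newest data.
    return max(candidates, key=lambda n: (n["priority"], n["last_op"]))


nodes = [
    {"host": "127.0.0.1:10001", "priority": 1, "last_op": 105},
    {"host": "127.0.0.1:10002", "priority": 40, "last_op": 100},
    {"host": "127.0.0.1:10003", "priority": 1, "last_op": 0,
     "arbiterOnly": True},
]
winner = pick_new_active(nodes)
print(winner["host"])  # 127.0.0.1:10002
```

The node on port 10002 wins on priority even though the node on port 10001 holds newer data; the data timestamp only decides between nodes of equal priority.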

Copyright notice: This is an original article by the blog author and may not be reproduced without the author's permission.
