MongoDB conventional backup policy

MongoDB backup is really a basic operation, but people have been asking about it recently, so it seems many are still unfamiliar with it. To avoid explaining it over and over, I am summarizing it in this post for future reference; corrections are welcome.

1. Built-in methods

1.1 Copying the database files

This needs little explanation: it works for almost any database and is simple and crude. However, as with most databases, it must be done while the mongod instance is stopped, to guarantee that the files are in a consistent state; otherwise in-flight writes may leave the copied database corrupt and unusable. Because the database has to be stopped, availability suffers and the use cases are limited.

1.2 mongodump/mongorestore

I do not want to go into how to use these two commands: there are already plenty of articles about them on the Internet, and you can easily find instructions elsewhere or in the official documentation. The other reason is that I hope readers develop the ability to solve problems independently rather than relying on "experts"; when you hit a problem, thinking it through on your own will be one of your essential skills in the future. What follows are a few points that are rarely mentioned by others.

1.2.1 How do mongodump/mongorestore differ from mongoexport/mongoimport?

mongoexport/mongoimport work with JSON, while mongodump/mongorestore work with BSON. JSON is human-readable but bulky; BSON is binary, compact, and essentially unreadable to humans. The BSON format can change between MongoDB versions, so mongodump/mongorestore may not work across versions, depending on version compatibility. When BSON cannot be used for a cross-version migration, the JSON route, i.e. mongoexport/mongoimport, is an option. Cross-version mongodump/mongorestore is not recommended without first checking the documentation to confirm that the two versions are compatible (most of the time they are). Although JSON is portable across versions, it preserves only the data: indexes, users, and other metadata are lost, so keep that in mind. In short, each pair of tools has its strengths in practice and should be chosen according to the scenario. Strictly speaking, mongoexport/mongoimport are data import/export tools rather than real backup tools, so they are not discussed further here; a minimal command sketch of both pairs follows for reference.
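For concreteness, here is a minimal command sketch contrasting the two pairs of tools; the host, database, collection, and file names are placeholders chosen for illustration, not values from the original post.

    # BSON route: full-fidelity dump and restore of a database
    # (indexes are rebuilt from the dumped metadata).
    mongodump    -h 127.0.0.1 -d test -o dump/
    mongorestore -h 127.0.0.1 -d test dump/test/

    # JSON route: data only (no indexes, users, or other metadata),
    # but human-readable and more tolerant of version differences.
    mongoexport -h 127.0.0.1 -d test -c foo -o foo.json
    mongoimport -h 127.0.0.1 -d test -c foo --file foo.json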
1.2.2 mongodump has one option worth dwelling on: --oplog

Note that this applies only to replica sets or master/slave deployments (running MongoDB standalone is not recommended anyway). The option is described as:

    --oplog    use oplog for taking a point-in-time snapshot

"Point-in-time snapshot": the first time I read this I took it to mean the database could be brought back to any point in time within some range, which would have been wonderful, but that is not what it does. What it actually does is write an extra file, oplog.bson, alongside the dump, containing every oplog entry recorded between the moment the dump starts and the moment it finishes. What exactly is that good for? Let me keep the suspense for a moment; picture the coverage of oplog.bson as the slice of the oplog spanning exactly the duration of the dump (the original post illustrated this with a diagram). To go further we first need to explain what the oplog is and its related concepts; the official documentation (available in Chinese and English) covers it thoroughly.

To put it simply, in a replica set the oplog is a capped collection whose default size is about 5% of free disk space (it can be changed with the oplogSizeMB setting); it lives in the local database as the collection oplog.rs, and you can look inside it if you are curious. It records every change (insert, update, and delete) made to any database on the mongod instance over a period of time; when the allotted space is used up, new entries automatically overwrite the oldest ones. Seen on a timeline, the oplog therefore covers a sliding range of recent history, and that range is called the oplog time window. Note that because the oplog is a capped collection, how much time the window covers varies with the number of writes per unit time. To see an estimate of the current oplog time window, run:

    test:PRIMARY> rs.printReplicationInfo()
    configured oplog size:   1561.5615234375MB                        <-- configured size
    log length start to end: 423849secs (117.74hrs)                   <-- estimated time window
    oplog first event time:  Wed Sep 09 2015 17:39:50 GMT+0800 (CST)
    oplog last event time:   Mon Sep 14 2015 15:23:59 GMT+0800 (CST)
    now:                     Mon Sep 14 2015 16:37:30 GMT+0800 (CST)

(A small shell sketch just before the worked example below shows how to read these values directly from local.oplog.rs.)

The oplog has one important property: idempotence. Replaying the operations recorded in the oplog against a data set gives the same result no matter how many times you replay them. For example, if the oplog records an insert, replaying that entry twice does not leave two identical documents in the database. This property is important and is the basis of everything that follows.

Back to the topic: what is oplog.bson for? The first thing to understand is that data can be interdependent. Say collection A stores orders and collection B stores order details; an order is only in a correct state when its details are complete. Assume the data in A and B is fully consistent and meaningful at every instant (this is not easy for a non-relational database, and such a data model is arguably not a good fit for MongoDB, but assume the condition holds). If the backup captures A as of time x but captures B as of some later time y, the combined data may well be inconsistent and meaningless. Now consider what mongodump does: it does not lock the database to freeze it at a single point in time, which the business usually could not tolerate anyway. So in the final dump, collection A may be in the state of one moment and collection B in the state of another; even if such a backup restores cleanly, its usefulness may be limited.

This is where oplog.bson earns its keep. If, on top of the dumped data, you replay all the operations recorded in it, the data then represents the state of the database at the point in time at which the dump finished. The key precondition is idempotence: replaying oplog entries for data that already exists does not duplicate it, while data that is missing is brought back in by the replay. So once the oplog has been replayed up to a given point in time, the database is in the state of that point in time.

Now look at mongorestore. Its oplog-related options are --oplogReplay and --oplogLimit. The first, as the name suggests, replays the contents of oplog.bson; the second will be introduced later.
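If you would rather have the window as a number you can script against than read rs.printReplicationInfo() by eye, a small sketch like the following is one option; it assumes the legacy mongo shell, where a BSON Timestamp exposes its seconds part as the t property.

    // Minimal sketch: estimate the oplog time window in seconds.
    var local = db.getSiblingDB("local");
    var first = local.oplog.rs.find().sort({$natural:  1}).limit(1).next().ts;  // oldest entry
    var last  = local.oplog.rs.find().sort({$natural: -1}).limit(1).next().ts;  // newest entry
    print("oplog window (secs): " + (last.t - first.t));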
Let's walk through an example. First, simulate continuous writes into a collection foo:

    use test
    for (var i = 0; i < 100000; i++) { db.foo.insert({a: i}); }

While the inserts are still running, perform a mongodump and specify --oplog:

    mongodump -h 127.0.0.1 --oplog

Note that --oplog is only valid for a full-instance dump, so the -d option cannot be specified: the change operations for the whole instance are concentrated in the oplog.rs collection of the local database. As described above, everything written to the oplog from the moment the dump starts is captured into oplog.bson, so we end up with these files:

    yaoxing ~$ ll dump/
    total 440
    -rw-r--r-- 1 yaoxing 442470 Sep 14 oplog.bson
    drwxr-xr-x 2 yaoxing   4096 Sep 14 test

Here test is the database we were writing to, and oplog.bson contains all the operations performed during the dump. If you are curious about what is inside oplog.bson, you can inspect it with the bsondump tool. Its first and last entries look like this:

    {"h":{"$numberLong":"2279811375157953332"},"ns":"test.foo","o":{"_id":{"$oid":"55f834ae6b530b5854f9d6ee"},"a":7784.0},"op":"i","ts":{"$timestamp":{"t":1442329774,"i":3248}},"v":2}
    ...
    {"h":{"$numberLong":"-1177358680665374097"},"ns":"test.foo","o":{"_id":{"$oid":"55f834b26b530b5854f9fa5e"},"a":16856.0},"op":"i","ts":{"$timestamp":{"t":1442329778,"i":1361}},"v":2}

From the two values of a (7784.0 and 16856.0) you can see that when mongodump started the loop had reached i = 7784, and by the time the whole operation finished it had reached i = 16856. Now look at the last document in test/foo.bson:

    {"_id":{"$oid":"55f834ae6b530b5854f9d73d"},"a":7863.0}

So the data that was actually dumped reflects neither the initial state nor the final state but some intermediate state, precisely because the collection kept changing during the dump. Restore it with mongorestore:

    yaoxing ~$ mongorestore -h 127.0.0.1 --oplogReplay dump
    2015-09-19T01:22:20.095+0800 building a list of dbs and collections to restore from dump dir
    2015-09-19T01:22:20.095+0800 reading metadata for test.foo from
    2015-09-19T01:22:20.096+0800 restoring test.foo from
    2015-09-19T01:22:20.248+0800 restoring indexes for collection test.foo from metadata
    2015-09-19T01:22:20.248+0800 finished restoring test.foo (7864 documents)
    2015-09-19T01:22:20.248+0800 replaying oplog
    2015-09-19T01:22:20.463+0800 done

Note two lines in particular: the first says that 7864 documents were restored into test.foo; the second says that the operations in the oplog were then replayed on top. In theory foo should therefore end up with 16857 documents (7864 from foo.bson and the rest from oplog.bson). Verify:

    test:PRIMARY> db.foo.count()
    16857

That is what mongodump with --oplog gives you.
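Tying the walkthrough together, the whole cycle fits in two commands; in this sketch the backup directory layout is an assumption for illustration, everything else is just the commands used above.

    # Dump the whole instance plus the oplog slice covering the dump itself.
    BACKUP_DIR=/backup/mongo/$(date +%Y%m%d%H%M)   # assumed layout
    mongodump -h 127.0.0.1 --oplog -o "$BACKUP_DIR"

    # Restore the dump, then replay the bundled oplog.bson on top of it,
    # bringing the data to its state at the moment the dump finished.
    mongorestore -h 127.0.0.1 --oplogReplay "$BACKUP_DIR"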
1.2.3 The oplog can also come from somewhere else

As you may already have guessed: since a dump plus its oplog can restore the database to a certain state, then given a data backup taken at some point in time plus the oplog recorded after that backup, and provided the oplog reaches back far enough, shouldn't it be possible to restore the database to any state after the backup? Yes! In fact, replica sets rely on exactly this oplog-replay mechanism: when a secondary first joins the replica set it performs an initial sync, which is the equivalent of a mongodump, and after that it only needs to keep pulling and replaying the oplog from oplog.rs to stay in sync with the primary.

Since the oplog is always there in oplog.rs, do we even need to specify --oplog during mongodump? Could we just fetch it from oplog.rs when needed? The answer is yes: you can dump only the data, without --oplog, and retrieve the oplog from oplog.rs as needed, provided the oplog time window (flip back if you have forgotten the concept) still covers the start time of that dump. With this understanding, in theory, as long as our dumps are frequent enough, we can guarantee that the database can be restored to any point in time in the past. The backup service of MMS (now called Cloud Manager) works on exactly this principle. Suppose the oplog time window is 24 hours: in theory, as long as I complete a dump every 24 hours, point-in-time recovery for any moment after a dump is guaranteed; whenever the window is about to slide past the previous dump, completing the next dump in time buys another 24-hour safety period.

Let's test this. Simulate a period of writes with the same method as before:

    use test
    for (var i = 0; i < 100000; i++) { db.foo.insert({a: i}); }

and at the same time perform a mongodump, this time without --oplog:

    yaoxing ~/dump$ mongodump -h 127.0.0.1
    2015-09-24T00:06:11.929+0800 writing test.system.indexes to dump/test/system.indexes.bson
    2015-09-24T00:06:11.929+0800 done dumping test.system.indexes (1 document)
    2015-09-24T00:06:11.929+0800 writing test.foo to dump/test/foo.bson
    2015-09-24T00:06:11.963+0800 done dumping test.foo (11162 documents)

which shows the dump captured the state at i = 11162. After the inserts finish, foo should in theory contain 100000 documents:

    > use test
    switched to db test
    > db.foo.count()
    100000

Now suppose I make a disastrous mistake:

    > db.foo.remove({})
    WriteResult({ "nRemoved": 100000 })

and, even worse, then insert one more document into foo:

    > db.foo.insert({a: 100001})
    WriteResult({ "nInserted": 1 })

How do we turn back time and return to the state before the disaster? We have a dump taken before the disaster, so all that is left is to rescue the oplog before its time window slides past the moment that dump started:

    yaoxing ~$ mongodump -h 127.0.0.1 -d local -c oplog.rs
    2015-09-24T00:09:41.040+0800 writing local.oplog.rs to dump/local/oplog.rs.bson
    2015-09-24T00:09:41.525+0800 done dumping local.oplog.rs (200003 documents)

Why are there 200003 documents? Use the bsondump tool to see what they are, if you are curious. As mentioned above, a dump plus an oplog.bson (mind where the file goes) is enough to restore the database, and dump/local/oplog.rs.bson is in fact the oplog.bson we need. Rename it and put it in the right place (a short sketch of this staging step follows the listing below), and a simulated recovery environment is ready:

    yaoxing ~/dump$ ll
    total 18464
    -rw-r--r-- 1 yaoxing 18900271 Sep 24 oplog.bson
    drwxr-xr-x 2 yaoxing     4096 Sep 24 test
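For reference, the staging just described can be sketched as two commands; the directory names simply mirror the example above and are otherwise assumptions.

    # Rescue the oplog while its window still covers the start of the last dump.
    mongodump -h 127.0.0.1 -d local -c oplog.rs -o oplogdump/

    # Stage it as oplog.bson next to the earlier dump, which is where
    # mongorestore --oplogReplay expects to find it.
    cp oplogdump/local/oplog.rs.bson dump/oplog.bson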
However, this oplog.bson also contains the disaster operations. If the whole thing is replayed, it is like turning back time only to watch the tragedy happen all over again; heart-breaking. This is where the new friend mentioned earlier, --oplogLimit, comes in: used together with --oplogReplay, it limits how far the replay goes. The important question, then, is how to find the point in time of the disaster, and the answer is once again bsondump. If you are comfortable with Linux commands you can work directly in a pipeline; if not, dump the output to a file first and search it in a text editor. We are looking for entries with "op": "d", which mark delete operations. You will find 100000 of them in oplog.bson: the documents were deleted one by one, which is also why remove({}) is so slow; dropping the collection with drop() would have been much faster, as you can verify yourself.

    yaoxing ~/dump$ bsondump oplog.bson | grep "\"op\":\"d\"" | head
    {"b":true,"h":{"$numberLong":"5331406427126597755"},"ns":"test.foo","o":{"_id":{"$oid":"5602cdf1befd4a4bfb4d149b"}},"op":"d","ts":{"$timestamp":{"t":1443024507,"i":1}},"v":2}
    ...

What we need from this first delete record is the $timestamp: it records when the operation happened, and it is exactly the value --oplogLimit expects, just in a slightly different format:

    yaoxing ~$ mongorestore -h 127.0.0.1 --oplogReplay --oplogLimit "1443024507:1" dump/
    2015-09-24T00:34:09.533+0800 building a list of dbs and collections to restore from dump dir
    2015-09-24T00:34:09.534+0800 reading metadata for test.foo from
    2015-09-24T00:34:09.534+0800 restoring test.foo from
    2015-09-24T00:34:09.659+0800 restoring indexes for collection test.foo from metadata
    2015-09-24T00:34:09.659+0800 finished restoring test.foo (11162 documents)
    2015-09-24T00:34:09.659+0800 replaying oplog
    2015-09-24T00:34:11.548+0800 done

Here 1443024507 is the "t" in $timestamp and 1 is the "i". The oplog is replayed only up to that point, so the first delete and everything after it are skipped, and the database is back in its pre-disaster state. Verify:

    rs0:PRIMARY> db.foo.count()
    100000

1.3 Summary

Combining all of the above, we can state some MongoDB backup principles (valid only for replica sets or master/slave deployments). As long as they are met, MongoDB can be restored to any point in time:

The interval between two consecutive data backups (from the start of the first to the end of the second) must not exceed the oplog time window.
On top of the data backups, also back up the oplog regularly, and complete each oplog backup before the time window slides past the finish time of the last data backup.
When choosing the oplog backup interval, allow for the time an oplog backup itself takes and weigh it against the disk space you can spare.

Points to note in practice: the oplog time window is not a fixed value, so keep an eye on how long it currently is; leave yourself enough time to dump out the needed portion of oplog.rs well before the window slides too far, rather than cutting it to the last moment; and when a disaster happens, the very first thing to do is stop writes to the database, to keep the oplog from sliding out of the time window. An operation like the remove({}) above is especially dangerous in this respect, because it inserts a huge number of "d" entries in an instant and makes the window shrink rapidly. (A minimal cron-style sketch of these principles follows.)
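To make the summary concrete, here is a minimal cron-style sketch of the idea "back up more often than the oplog window slides"; the host, paths, and schedule are assumptions for illustration rather than recommendations from the original post.

    #!/bin/bash
    # Run from cron at an interval comfortably shorter than the observed
    # oplog time window (e.g. daily when the window is several days long).
    set -e
    HOST=127.0.0.1                 # placeholder
    ROOT=/backup/mongo             # placeholder
    DEST="$ROOT/$(date +%Y%m%d%H%M)"

    # Full-instance dump plus the oplog slice covering the dump itself.
    mongodump -h "$HOST" --oplog -o "$DEST"

    # Also keep a copy of the current oplog, so point-in-time recovery
    # between dumps does not depend on the live window alone.
    mongodump -h "$HOST" -d local -c oplog.rs -o "$DEST/oplog-archive"

Whether the extra oplog copy is worth the space depends on your write volume; the essential point is only that some copy of the oplog must always reach back to the previous data backup.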
2. External methods

2.1 MMS/Cloud Manager

This service was previously called MMS: monitoring was free and backup was paid. A couple of months ago it was renamed Cloud Manager, started charging for monitoring as well, and gained an ops-automation agent, which is rather slick; play with it if you are interested. Since it is billed, though, I suspect that given the realities of the domestic market few companies will be willing to use it, so I will not dwell on it here. Just know that it can provide recovery to any point in time; it is very powerful and popular with those who can afford it.

2.2 Disk snapshots

Ordinary disk snapshots, such as LVM snapshots, can also be used to back up MongoDB, but journaling must be enabled and the journal must be on the same volume as the data files. Most people run with journaling on anyway, right? Without the journal, a power failure can lose up to 60 seconds of data. A snapshot is actually a faster way to take a backup than mongodump/mongorestore and is quite practical, but on its own it can only restore data as of the moment the snapshot was taken; see the official documentation if you are interested. In theory a snapshot could also be combined with oplog replay to achieve point-in-time recovery, but I have not experimented with it myself; if you do, feel free to share. (A minimal LVM sketch follows.)
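For reference, an LVM snapshot backup can be sketched roughly as follows, along the lines of the official documentation; the volume group, volume names, and sizes are placeholders, and it assumes the journal sits on the same volume as the data files, as required above.

    # Create a snapshot of the volume holding the data files and journal.
    lvcreate --size 1G --snapshot --name mdb-snap01 /dev/vg0/mongodb

    # Archive the snapshot as a compressed block-level image, then drop it.
    dd if=/dev/vg0/mdb-snap01 | gzip > mdb-snap01.gz
    lvremove -f /dev/vg0/mdb-snap01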
