Analysis of the MongoDB Data Loss Problem

Source: Internet
Author: User

There are many rumors that MongoDB loses data. In particular, InfoQ recently translated a filler piece by Sven (why call it filler? Because it has no original content of its own; it just reheats a few old blog posts found online), and it too brings up data loss. As everyone knows, durability is about the most basic requirement of a database. If MongoDB really had data safety problems that severe, it would have been ruthlessly eliminated long ago from today's crowded field of technology choices. So what is the truth?

To be fair, MongoDB did handle some durability issues poorly during its development, especially in its choice of certain default values. Most users ran with the defaults until they hit a problem, and only then discovered that they should have configured things properly from the start. However, all of these problems, default settings included, have been properly resolved as of MongoDB 2.6. I can responsibly tell you that the data safety issues you read about are essentially problems with version 2.4 or earlier, or problems with user configuration. Let's take a closer look at MongoDB's data safety mechanisms to understand why data can be lost and how to configure MongoDB correctly to keep data safe.

MongoDB data security consists of the following concepts:

    • Recovery log (journal)
    • Write concern

Recovery log (journal)

Relational databases such as MySQL, PostgreSQL, and Oracle use a write-ahead log (WAL, also called a redo log) to guard against losing in-memory data when the system loses power or crashes. MongoDB's journal is a write-ahead log that serves the same purpose. Before MongoDB 2.0, journaling was either unsupported or not enabled by default. Without the journal, MongoDB persists a write as follows:

Simply put, the write returns to the application as soon as the data has been written to memory, and flushing the data to disk is left to the operating system in the background. MongoDB forces a flush to disk every 60 seconds. As you can imagine, if the system crashes or loses power in between, any data not yet flushed is lost for good. If the blog post you are reading dates from around 2011, it most likely ran into this situation.
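The behavior described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not MongoDB's actual storage code: the class `NoJournalStore` and its methods are hypothetical names, and the only number taken from the text is the 60-second forced flush.

```python
class NoJournalStore:
    """Toy model of a store with no journal (hypothetical, for illustration)."""

    FLUSH_INTERVAL_S = 60.0  # forced flush period mentioned in the text

    def __init__(self):
        self.memory = {}       # what the application reads and writes
        self.disk = {}         # what survives a crash
        self.last_flush = 0.0  # time of the last flush, in seconds

    def write(self, key, value):
        # The write is acknowledged as soon as it is in memory.
        self.memory[key] = value
        return True

    def tick(self, now):
        # Background flush: copy memory to disk once the interval has elapsed.
        if now - self.last_flush >= self.FLUSH_INTERVAL_S:
            self.disk = dict(self.memory)
            self.last_flush = now

    def crash_and_restart(self):
        # After a crash only the flushed data is still there.
        self.memory = dict(self.disk)


def simulate_crash_without_journal():
    store = NoJournalStore()
    store.write("a", 1)        # written at t = 10 s
    store.tick(60.0)           # periodic flush: "a" reaches the disk
    store.write("b", 2)        # written at t = 75 s, acknowledged, memory only
    store.crash_and_restart()  # power loss before the next flush
    return store.memory


print(simulate_crash_without_journal())
```

In this model the write of "b" was acknowledged to the application and still disappears after the restart, which is the kind of loss the older blog posts describe.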

Since version 2.0, MongoDB has enabled journaling by default.

With journaling enabled, MongoDB first writes the update to the journal buffer, then updates the in-memory data, and only then returns to the application. The journal is flushed to disk in batches every 100 ms. Even if power is lost before the data has been saved to the data files, the journal file is still there, and on restart MongoDB automatically replays the operation history in the journal to bring the data files up to date.

An attentive reader may notice that the journal is flushed to disk only once every 100 ms. What if the power fails 50 ms after the last journal flush? To answer that, we turn to the next concept in MongoDB durability: write concern.
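As a rough illustration of the write path and recovery just described, here is a toy write-ahead journal. It is a sketch, not MongoDB's implementation: `JournaledStore` and its method names are hypothetical, and the journal flush is triggered by hand rather than by a 100 ms timer.

```python
class JournaledStore:
    """Toy write-ahead journal (hypothetical names, for illustration only)."""

    def __init__(self):
        self.journal_buffer = []   # in memory, lost on a crash
        self.journal_on_disk = []  # survives a crash
        self.memory = {}           # in-memory view of the data
        self.data_files = {}       # on-disk data files, flushed rarely

    def write(self, key, value):
        # 1. record the update in the journal buffer, 2. update memory,
        # 3. return to the application.
        self.journal_buffer.append((key, value))
        self.memory[key] = value
        return True

    def flush_journal(self):
        # Stands in for the periodic batch flush of the journal.
        self.journal_on_disk.extend(self.journal_buffer)
        self.journal_buffer.clear()

    def crash_and_recover(self):
        # The buffer and memory are gone; rebuild from the data files
        # and replay every journal entry that reached the disk.
        self.journal_buffer = []
        self.memory = dict(self.data_files)
        for key, value in self.journal_on_disk:
            self.memory[key] = value


def simulate_crash_with_journal():
    store = JournaledStore()
    store.write("a", 1)
    store.flush_journal()      # "a" is now safe in the journal file
    store.write("b", 2)        # still only in the journal buffer
    store.crash_and_recover()  # power loss before the next journal flush
    return store.memory


print(simulate_crash_with_journal())
```

Here "a" is recovered even though the data files were never flushed, while "b", which was still sitting in the journal buffer, is not.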

Write concern

Write concern (sometimes translated as write safety) is a feature distinctive to MongoDB. It lets you flexibly specify the durability required of each write operation, trading off performance against reliability. There are several levels of write concern:

{w:0} Unacknowledged

Unacknowledged means that MongoDB does not return a status for each write operation. This level gives the best write performance and the least safety. For example, if you try to insert a document that violates a uniqueness constraint (a duplicate ID), MongoDB refuses the write and raises an error. But because the driver ignores that error, the application happily assumes everything is fine, and the next time you look for that data it appears to have been lost.

Sometimes MongoDB is used to store monitoring data and application logs, where losing one or two records has no real impact on the application. Based on such considerations, made while MongoDB was still immature, the default before MongoDB 2.2 was {w:0}. MongoDB came to regret this choice deeply, because it is the root cause of many people's belief that MongoDB data is unsafe.

In MongoDB 2.4, the default was changed to {w:1}, described next.
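The duplicate-ID example can be mimicked with a few lines of plain Python. This is a toy stand-in, not driver code: `ToyCollection`, `DuplicateKeyError`, and `insert_with_concern` are hypothetical names defined here, and the only behavior modeled is whether the server-side error reaches the caller.

```python
class DuplicateKeyError(Exception):
    """Raised by the toy collection when an _id already exists."""


class ToyCollection:
    """Stand-in for a collection with a unique _id (illustration only)."""

    def __init__(self):
        self.docs = {}

    def insert(self, doc):
        if doc["_id"] in self.docs:
            raise DuplicateKeyError(doc["_id"])
        self.docs[doc["_id"]] = doc


def insert_with_concern(coll, doc, w):
    """Insert doc; with w=0 the outcome is never reported to the caller."""
    try:
        coll.insert(doc)
    except DuplicateKeyError:
        if w == 0:
            return  # fire and forget: the error is silently dropped
        raise       # acknowledged write: the caller sees the failure


coll = ToyCollection()
insert_with_concern(coll, {"_id": 1, "v": "first"}, w=0)
insert_with_concern(coll, {"_id": 1, "v": "second"}, w=0)  # rejected, no error seen
print(coll.docs[1]["v"])
```

The second insert is rejected, yet the caller gets no signal, so the "second" document looks lost when it was in fact never written.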

{w:1} Acknowledged

Acknowledged means that MongoDB confirms the outcome of every write operation, whether it succeeded or failed. This acknowledgment is, of course, based only on the in-memory write on the primary node. Even so, this level is enough to detect errors such as duplicate primary keys, network errors, system failures, or invalid data.

Since the 2.4 release, MongoDB's default write concern has been {w:1} acknowledged. At this level, the only data lost to a system failure is what we described earlier: journal entries that had not yet been flushed to disk. If you cannot accept losing up to 100 ms of data in a system crash, you can choose the next level: {j:1} journaled.

{j:1} Journaled

At this level, a write operation does not return until MongoDB has actually flushed the journal to disk. This does not mean that every write costs one IO. Rather than flushing for each operation, MongoDB waits up to 30 ms, groups the writes that arrive within that window, and writes them to disk together as a sequential append. During those 30 ms the client thread waits. This increases the response time of an individual operation, but in high-concurrency scenarios the effect on overall average throughput and response time is small. In particular, if you place the journal on dedicated storage with IO bandwidth optimized for sequential writes, the performance impact can be minimized.

Is {j:1} 100% safe? On a standalone server it is basically guaranteed (unless the hard disk is damaged). In a replica set, however, we also need to consider a higher level: {w: "majority"}.
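To make the batching argument concrete, the sketch below counts flushes and client wait times for a handful of writes. It assumes, purely for illustration, that flushes happen at fixed multiples of the 30 ms window and that each write is acknowledged at the first flush at or after its arrival; the function `group_commit` is a hypothetical helper, not part of MongoDB.

```python
def group_commit(arrivals_ms, window_ms=30):
    """Return (flush_times, waits) for writes arriving at the given times.

    Toy model: a flush happens at each multiple of window_ms that has at
    least one pending write, and a write waits until that flush.
    """
    flush_times = []
    waits = []
    for t in sorted(arrivals_ms):
        # Round the arrival time up to the next window boundary.
        flush_at = -(-t // window_ms) * window_ms
        if not flush_times or flush_times[-1] != flush_at:
            flush_times.append(flush_at)
        waits.append(flush_at - t)
    return flush_times, waits


flushes, waits = group_commit([1, 5, 12, 29, 31, 58])
print(len(flushes), "flushes for", len(waits), "writes")
print("waits in ms:", waits)
```

Six writes share two sequential flushes in this example, and no write waits longer than one window, which is the trade the paragraph above describes: a little latency per operation in exchange for far fewer disk writes.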

{w: "majority"} Written to a majority of nodes

The standard MongoDB deployment is a replica set of at least 3 nodes. Replica sets have many benefits; the main one is high availability, and another is data durability. With a replica set, even if an entire host dies, memory and hard drive included, your data is still intact on the second or nth node. But as a distributed architecture, a replica set also poses new challenges for data consistency. Taking the {w:1} write concern above as an example, let's analyze a more complex scenario, in which A is the primary node and B and C are the secondaries.

    • 01:00:00 Network failure: the connection between the primary and the secondaries is cut.
    • 01:00:01 The application writes a document: {ts: "01:00:01"}. This document cannot be replicated to B and C. The primary has not yet determined that the network has failed, so it continues to accept and acknowledge writes under the {w:1} rule.
    • 01:00:02 Primary A realizes that it cannot communicate with B and C, steps down to secondary, and stops accepting writes.
    • 01:00:05 B and C hold a successful election and B is promoted to primary. B begins accepting writes, for example {ts: "01:00:06"}.
    • 01:00:08 The network recovers and A rejoins the cluster. A's oplog and B's oplog are now inconsistent. A rolls back the writes that do not exist on B and saves them to a rollback file.

In this case, if you go back and query for the {ts: "01:00:01"} document, MongoDB will say that it does not exist!

What should we do? {w: "majority"} is the answer. "Majority" means most of the nodes. With this write concern, MongoDB returns an acknowledgment to the client only after the data has been replicated to a majority of the nodes.

Let's see how the scenario above changes with {w: "majority"}:
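The timeline above can be replayed with a toy model of three oplogs. This is a sketch under simplifying assumptions, not the real replication protocol: `Node` and `run_w1_scenario` are hypothetical names, oplog entries are just timestamp strings, and rollback is modeled as "drop whatever the new primary does not have".

```python
class Node:
    """A replica set member reduced to a name and an oplog (illustration only)."""

    def __init__(self, name):
        self.name = name
        self.oplog = []


def run_w1_scenario():
    a, b, c = Node("A"), Node("B"), Node("C")
    acknowledged = []  # writes the application believes succeeded

    # 01:00:00 partition: A can no longer reach B or C.
    # 01:00:01 A still thinks it is primary and acknowledges under {w:1}.
    a.oplog.append("01:00:01")
    acknowledged.append("01:00:01")

    # 01:00:02 A steps down. 01:00:05 B is elected primary.
    # A later write goes to B and is replicated to C.
    b.oplog.append("01:00:06")
    c.oplog.append("01:00:06")
    acknowledged.append("01:00:06")

    # 01:00:08 the network heals: A rolls back entries B does not have
    # and then follows B's oplog.
    rollback_file = [op for op in a.oplog if op not in b.oplog]
    a.oplog = list(b.oplog)

    # Acknowledged writes that no longer exist on the primary.
    lost = [op for op in acknowledged if op not in b.oplog]
    return a.oplog, rollback_file, lost


oplog_a, rollback_file, lost = run_w1_scenario()
print("A's oplog after rejoining:", oplog_a)
print("rollback file:", rollback_file)
print("acknowledged but missing:", lost)
```

The write at 01:00:01 ends up in the rollback file and in the "acknowledged but missing" list at the same time, which is exactly the surprise the application runs into.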

    • 01:00:00 Network failure: the connection between the primary and the secondaries is cut.
    • 01:00:01 The application asks to write a document: {ts: "01:00:01"}. The document is first written successfully to the primary, but it cannot be replicated to B and C because the network is down. Since the {w: "majority"} requirement is not met, the write does not succeed from the application's point of view.
    • 01:00:02 Primary A realizes that it cannot communicate with B and C, steps down to secondary, and stops accepting writes.
    • 01:00:05 B and C hold a successful election and B is promoted to primary. B begins accepting writes, for example {ts: "01:00:06"}.
    • 01:00:08 The network recovers and A rejoins the cluster. A performs a rollback and removes the {ts: "01:00:01"} document. The cluster's data is now consistent and correct.
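The only thing that differs between the two timelines is the rule for acknowledging a write. A minimal sketch of that rule follows; `majority` and `is_acknowledged` are hypothetical helper names, and the model ignores timeouts and retries that a real deployment would have.

```python
def majority(n_nodes):
    """Smallest number of nodes that is more than half of n_nodes."""
    return n_nodes // 2 + 1


def is_acknowledged(replicated_to, n_nodes, w):
    """True if a write held by `replicated_to` nodes satisfies write concern w.

    w is either an integer node count or the string "majority".
    """
    needed = majority(n_nodes) if w == "majority" else int(w)
    return replicated_to >= needed


# 01:00:01: the document exists only on A (1 of 3 nodes).
print(is_acknowledged(1, 3, 1))           # True: {w:1} acknowledges it
print(is_acknowledged(1, 3, "majority"))  # False: the application sees a failure

# 01:00:06: the document is on B and C (2 of 3 nodes).
print(is_acknowledged(2, 3, "majority"))  # True
```

Under {w: "majority"} the write that A later rolls back was never reported as successful, so nothing the application relied on disappears.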

At this point, if you use {w: "majority", j:1}, MongoDB can meet data durability requirements at every level. It is worth noting that in May 2013 Kyle Kingsbury published a blog post in his "Call Me Maybe" series in which he reported some bugs related to {w: "majority"}; these bugs were resolved in 2.6. Of course, grandstanders like Sven do not bother to check whether 3.0 actually has a problem; they just Google up things from years ago and make a fuss about them.


In general, MongoDB recommends using {w: "majority"} in a cluster. When the cluster is deployed robustly (for example, with enough network bandwidth and machines that are not running at full load), this satisfies most data safety requirements, because MongoDB replication normally completes within milliseconds, and a write has often been copied to the secondaries before the journal is flushed to disk. If you pursue perfection, you can additionally use {j:1} and combine the two.

The legend that MongoDB loses data has indeed become just that: a legend.

Postscript: While I was writing this article, someone in the community reported a data loss problem, saying that one or two out of every 10,000 records were being lost. My first reaction in such cases is: check your code, because the problem is very often in the application. Sure enough, after careful examination, the cause turned out to be a bug in the code.
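One common place to set these levels is the connection string, whose `w`, `journal`, and `wtimeoutMS` options carry the write concern. The helper below only assembles such a string with the standard library; `build_uri` is a hypothetical function, the host names and the replica set name `rs0` are made up, and the 5000 ms timeout is an assumption added for illustration rather than a value from this article.

```python
from urllib.parse import urlencode


def build_uri(hosts, replica_set, w="majority", journal=True, wtimeout_ms=5000):
    """Assemble a mongodb:// URI that requests the given write concern."""
    options = {
        "replicaSet": replica_set,
        "w": w,                             # e.g. "majority" or a node count
        "journal": str(journal).lower(),    # "true" corresponds to {j:1}
        "wtimeoutMS": wtimeout_ms,          # assumed value, tune for your setup
    }
    return "mongodb://" + ",".join(hosts) + "/?" + urlencode(options)


# Hypothetical three-node replica set.
uri = build_uri(
    ["a.example:27017", "b.example:27017", "c.example:27017"],
    replica_set="rs0",
)
print(uri)
```

A timeout is worth considering with {w: "majority"} because, as in the partition scenario earlier, a write that cannot reach a majority would otherwise keep the client waiting.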
