MongoDB's journaling feature is not the same thing as its ordinary system log. MongoDB does have a system log, but it only records information such as server startup details, slow queries, database exceptions, and client connections and disconnections. Journaling is a very important feature in MongoDB: it protects the integrity of the data on the database server in the event of an unexpected power outage, natural disaster, and so on. Although MongoDB also offers other safeguards such as replica sets (analyzed later), journaling is indispensable in a production environment; at the cost of a small amount of CPU and memory, it provides durability and stability for the database.
1. Two important storage views
The journaling feature relies on two important memory views: the private view and the shared view. Both are implemented with mmap (memory-mapped files). Modifications to memory mapped in the private view do not affect the disk, whereas changes to data in the shared view do affect the files on disk, and the system periodically flushes the data in the shared view to disk.
(1) Shared view: during MongoDB startup, the operating system maps the data files on disk to the shared view in memory. The operating system only sets up the mapping and does not load the data into memory immediately; MongoDB loads data into the shared view as needed.
(2) Private view: this memory view is where data is kept for read operations, and it is the first place MongoDB stores a new write operation.
(3) Journal file: the journal file on disk is where write operations are persisted. MongoDB reads this file when an instance starts.
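The difference between the two views can be reproduced with an ordinary memory-mapped file. The following is a minimal sketch, not MongoDB's own code: it assumes Python's mmap module, a temporary file standing in for a data file, and a hypothetical helper name demo_views. ACCESS_WRITE plays the role of the shared view and ACCESS_COPY (copy-on-write) plays the role of the private view.

```python
import mmap
import os
import tempfile


def demo_views():
    # A small temporary file stands in for a data file (hypothetical example).
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"A" * 4096)
        os.fsync(fd)

        # "Shared view": changes are carried through to the underlying file.
        shared = mmap.mmap(fd, 4096, access=mmap.ACCESS_WRITE)
        # "Private view": copy-on-write, changes stay in memory only.
        private = mmap.mmap(fd, 4096, access=mmap.ACCESS_COPY)

        private[0:1] = b"P"  # never reaches the file
        shared[1:2] = b"S"   # reaches the file
        shared.flush()

        with open(path, "rb") as f:
            on_disk = f.read(2)
        private_first = private[0:1]

        private.close()
        shared.close()
        return on_disk, private_first
    finally:
        os.close(fd)
        os.remove(path)


on_disk, private_first = demo_views()
print(on_disk, private_first)
```

Only the write made through the shared mapping shows up in the file; the write made through the private mapping is visible in memory but never reaches the disk, which is why a journal is needed to make such writes durable.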
2. How Journaling Works
When mongod starts, the data file is first mapped to the shared view. Suppose the data file is 4,000 bytes in size; it is then mapped into memory at addresses 1000000-1004000. If we read the memory at address 1000060 directly, we get the contents of the 60th byte of the data file. Note that this only sets up the memory mapping of the data file; it does not load the whole file into memory. The corresponding file content is loaded only when an address is actually read, which amounts to loading on demand, as shown in the following figure:
Mongod Memory Mapping at startup
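The address arithmetic in this example can be sketched in a few lines. This is only an illustration under stated assumptions: BASE_ADDRESS and read_at_address are hypothetical names, the base address 1000000 is taken from the example above rather than from a real process, and a temporary 4,000-byte file stands in for the data file.

```python
import mmap
import tempfile

BASE_ADDRESS = 1_000_000  # hypothetical base address, taken from the example


def read_at_address(view, address):
    # An address inside the mapping corresponds to a byte offset in the file.
    offset = address - BASE_ADDRESS
    return view[offset]


with tempfile.TemporaryFile() as f:
    # Fill the stand-in data file so that the byte at offset n holds n % 256.
    f.write(bytes(i % 256 for i in range(4000)))
    f.flush()

    # Creating the mapping does not read the whole file; the operating system
    # pages content in when an address is actually touched.
    view = mmap.mmap(f.fileno(), 4000, access=mmap.ACCESS_READ)
    value = read_at_address(view, 1_000_060)
    view.close()

print(value)
```

Reading address 1000060 returns the byte stored at offset 60 of the file, without the program ever issuing an explicit read call for it.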
When a write or update occurs, the process first modifies the data in memory, at which point the file data on disk is inconsistent with the data in memory. If journaling is not enabled when mongod starts, the operating system flushes the changed data in the shared view to disk every 60 seconds. If journaling is enabled, mongod creates an additional private view, and MongoDB synchronizes the private view with the shared view, as shown in the following figure:
Shared view is synchronized with private view
When a write operation occurs, MongoDB first writes the data to the private view in memory. Note that the private view is not directly connected to the files on disk, so the operating system does not flush these changes to disk, as shown in the following figure:
MongoDB writes data to private view
MongoDB then copies the write operations in batches to the journal, which stores them in a file on disk so that they persist. Each entry in the journal file describes which bytes of the data file the write operation changes, as shown in the following figure:
Writes changes to the data file into the journal log file
Because the changes to the data file (for example, which bytes change and where) are persisted to the journal file, the write operation is safe even if the MongoDB server crashes: when the database restarts, it first reads the journal file and reapplies the changes made by the write operations to the data file.
Once the steps above are complete, MongoDB uses the data file changes recorded in the journal to update the data in the shared view, as shown in the following figure:
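The idea of a journal entry that describes which bytes of the data file change, and of replaying those entries after a crash, can be sketched as follows. This is a simplified illustration and not MongoDB's on-disk journal format: the JSON-lines layout, the file names, and the helpers append_journal and replay_journal are all assumptions made for the example.

```python
import json
import os
import tempfile


def append_journal(journal_path, offset, data):
    # Each entry records which bytes of the data file a write changes.
    entry = {"offset": offset, "hex": data.hex()}
    with open(journal_path, "a", encoding="utf-8") as j:
        j.write(json.dumps(entry) + "\n")
        j.flush()
        os.fsync(j.fileno())  # the entry is durable once this returns


def replay_journal(journal_path, data_path):
    # On restart, reapply every journaled change to the data file.
    with open(journal_path, encoding="utf-8") as j, open(data_path, "r+b") as d:
        for line in j:
            entry = json.loads(line)
            d.seek(entry["offset"])
            d.write(bytes.fromhex(entry["hex"]))


with tempfile.TemporaryDirectory() as workdir:
    data_path = os.path.join(workdir, "data.0")      # hypothetical file names
    journal_path = os.path.join(workdir, "journal.0")
    with open(data_path, "wb") as d:
        d.write(b"\x00" * 16)

    # The write reaches the journal, then the "server" crashes before the
    # data file itself is updated.
    append_journal(journal_path, 4, b"mongo")

    # Recovery at startup: read the journal and reapply the change.
    replay_journal(journal_path, data_path)
    with open(data_path, "rb") as d:
        recovered = d.read()

print(recovered)
```

After the replay, the data file contains the journaled bytes at offset 4 even though the original write never reached it directly.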
Refresh shared View
Once all the changes have been applied to the shared view, MongoDB remaps the private view from the shared view. This prevents the private view from becoming too "dirty" and returns the memory it occupies to its initial value of about 0. At this point the data in the shared view in memory is inconsistent with the data on disk. MongoDB periodically (every 60 seconds by default) asks the operating system to flush the data in the shared view to disk, bringing the data on disk back in line with the data in memory, as shown in the following figure:
Resynchronize private view and flush to disk
After the in-memory changes have been flushed to disk, MongoDB removes from the journal the write operations that the flush has made durable (those recorded before that point), similar to a checkpoint in a relational database. Finally, MongoDB resynchronizes the shared view with the private view to keep them consistent.
MongoDB's journaling feature has been enabled by default since version 2.0, and it can be controlled with a startup option when the mongod instance is started. One of the steps described above is to write the write operations to the journal file periodically; the length of this interval is controlled by the optional startup parameter journalCommitInterval, whose default value is 100 ms. MongoDB flushes changed in-memory data to disk on a 60-second cycle, which is controlled by the optional startup parameter syncdelay. These defaults suit most situations and should not be changed lightly. As the analysis above shows, the database server still risks losing up to 100 ms of data: the journal is written to disk every 100 ms, so if a batch of writes is still in memory and has not yet been flushed to the journal file on disk when the server suddenly fails, those in-memory writes are lost.
When MongoDB starts, it initializes a thread that loops continuously, at regular intervals taking the data to be persisted from the defer queue and writing it to the journal (log) and the MongoFile (data) on disk. Because nothing is written to disk at the moment a user adds a record, MongoDB's developers say this does not cause a performance loss. Reading the code shows that when CUD (create, update, delete) operations are performed, the records (Record type) are placed in the defer queue to be written later in a deferred batch (group commit).
In any case, MongoDB implements these features with memory-mapping techniques; understanding them in depth requires programming knowledge of memory mapping (mmap), file I/O, and related topics from UNIX environment programming.
Journaling is a very important feature in MongoDB, similar to the transaction log in a relational database. It enables a database to recover quickly after a failure caused by an unexpected event.
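The deferred group commit described here can be sketched with a queue and a background thread. This is a rough model, not MongoDB's implementation: the class GroupCommitter, its methods, and the in-memory list standing in for the journal file are all hypothetical, and the 0.1-second interval simply mirrors the 100 ms default mentioned above.

```python
import queue
import threading


class GroupCommitter:
    """Hypothetical model of a deferred, batched (group commit) writer."""

    def __init__(self, interval_s=0.1):
        self.interval_s = interval_s
        self.deferred = queue.Queue()  # stands in for the defer queue
        self.journal = []              # stands in for the journal file
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def write(self, record):
        # Returns immediately: the caller never waits for the disk.
        self.deferred.put(record)

    def _drain(self):
        # Collect everything queued so far and commit it as one batch.
        batch = []
        while True:
            try:
                batch.append(self.deferred.get_nowait())
            except queue.Empty:
                break
        if batch:
            self.journal.append(batch)

    def _run(self):
        # Wake up once per interval until asked to stop.
        while not self._stop.wait(self.interval_s):
            self._drain()
        self._drain()  # final commit on a clean shutdown

    def stop(self):
        self._stop.set()
        self._thread.join()


committer = GroupCommitter(interval_s=0.1)
committer.start()
for i in range(5):
    committer.write({"op": "insert", "_id": i})
committer.stop()

committed = [rec["_id"] for batch in committer.journal for rec in batch]
print(committed)
```

In this model, any record still sitting in the queue when the process dies without a clean shutdown is lost, which corresponds to the window of up to one commit interval discussed above.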
MongoDB Combat Guide (iv): MongoDB's journaling log function