MongoDB Source Code Overview: Log


This article introduces the MongoDB log module and the code that implements the persistent storage (durability) module. You may wonder why the log module and the persistent storage module are summarized in one article. In other systems the two modules may not be closely linked, but in MongoDB they cannot be separated. What is going on? Read on...

Broadly speaking, MongoDB has three log-related modules:

  • Log
  • Journal
  • Oplog

Log: located in log.h. It implements the user log file, which is no different from the logging system of any common application: its job is to record important events in the system and persist them to a log file. The destination of this log file can be set with the startup parameter --logpath.

Journal: located in dur.h. This module is enabled by the startup parameter --dur. It exists to prevent the loss of data that is still in memory and has not yet been written to disk when the system goes down. (Why is data held in memory rather than written directly to external files? That is tied to MongoDB's storage mechanism and will be discussed later.) The mechanism is to periodically record operation logs (only operations that change the database; queries are outside the scope of recording) to a folder named journal under dbpath, so that when the system restarts, the lost data can be recovered from this folder.

Oplog: when deploying a robust production server, you need synchronized backups of it, and MongoDB's replica set mode solves this problem. The role of the oplog is to record all changes made to the data on the write server (only one server in a replica set accepts writes, while multiple backup servers can serve reads); queries and other operations that do not change the database are not recorded. The other read-extension servers in the replica set (that is, the machines used for backup and for spreading read load) can then synchronize the differences by pulling the oplog.

 

This article mainly covers the log module and persistent storage, and the relationship between them, so it does not say much about the oplog; I will certainly write about it when the replica set module comes up in a later article. The main focus here is analyzing the Journal module and the implementation of durability, so the Log module is only summarized briefly.

 

Log Module:

When MongoDB starts, the call path into the Log module is as follows:

   

main(...) -> addWindowsOptions(...) -> initLogging(...) -> loggingManager.start(...);

Later, the following code is called to redirect the stdout output target:

FILE* tmp = freopen(_path.c_str(), (_append ? "a" : "w"), stdout);

Because the static logfile pointer points to stdout:

FILE* Logstream::logfile = stdout;

the data buffered in Logstream is ultimately flushed to stdout, which, after the redirection above, refers to the destination we specified at startup.
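As an aside, the freopen idiom can be seen in isolation. The following is a minimal, self-contained sketch (not MongoDB code; the path is made up) showing how redirecting stdout sends all later writes to a file:

#include <cstdio>

int main() {
    // "a" appends, "w" truncates, mirroring the (_append ? "a" : "w") check above
    FILE* tmp = freopen("/tmp/example.log", "a", stdout);
    if (!tmp) {
        perror("freopen");
        return 1;
    }
    printf("this line goes to /tmp/example.log, not the console\n");
    fflush(stdout);
    return 0;
}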

 

In log.h, the log entry points are defined as follows:

   

enum LogLevel { LL_DEBUG, LL_INFO, LL_NOTICE, LL_WARNING, LL_ERROR, LL_SEVERE };

inline Nullstream& log( int level ) {
    if ( level > logLevel )
        return nullstream;
    return Logstream::get().prolog();
}

inline Nullstream& log() {
    return Logstream::get().prolog();
}
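As a usage sketch (the messages are made up for illustration): any message logged above the current verbosity is swallowed by the returned nullstream, so callers can write level-gated messages without explicit if checks; the -v/-vv startup flags raise logLevel:

log()  << "always recorded at the default level" << endl;
log(1) << "recorded only when logLevel >= 1 (mongod -v)" << endl;
log(2) << "recorded only when logLevel >= 2 (mongod -vv)" << endl;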

 

Because Logstream overloads the basic stream operators:

Logstream& operator<<(const char *x)       { ss << x; return *this; }

Logstream& operator<<(const string& x)     { ss << x; return *this; }

Logstream& operator<<(const StringData& x) { ss << x.data(); return *this; }

Logstream& operator<<(char *x)             { ss << x; return *this; }

...

Logstream& operator<<(ostream& (*_endl)(ostream&)) {
    ss << '\n';
    flush(0);
    return *this;
}

Logstream& operator<<(ios_base& (*_hex)(ios_base&)) {
    ss << _hex;
    return *this;
}

 

Therefore, we can easily record our logs with an expression like:

log() << "warning: alloc() failed after allocating new extent. lenWHdr: " << lenWHdr << endl;

I don't know whether you like this style of analysis; I like it a lot: fast-paced, straightforward, and to the point!

A brief explanation: the overloaded operator<< methods place the output into a stringstream; writing << endl then triggers the Logstream& operator<<(ostream& (*_endl)(ostream&)) overload listed above, which indirectly calls flush(0).

The void Logstream::flush(Tee *t) method persists all log text buffered in the stringstream to logfile. Because the flush code is also very concise and easy to understand, it is not reproduced here. At this point, the user log has been written to external storage and the basic function is complete.

Within flush, the Tee-related code is also worth noting:

if ( t ) t->write( logLevel, out );
if ( globalTees ) {
    for ( unsigned i = 0; i < globalTees->size(); i++ )
        (*globalTees)[i]->write( logLevel, out );
}

Tee definition:

class Tee {
public:
    virtual ~Tee() {}
    virtual void write( LogLevel level, const string& str ) = 0;
};

So it is clear that Tee is responsible for subscribing to log messages (the observer design pattern): any class derived from Tee can record an additional copy of the log anywhere else without affecting the existing log. Examples include remote logging, or, when running many servers, collecting each server's user log into a database for the administrator to inspect.
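For illustration, here is a hypothetical Tee subclass that mirrors every flushed log line to a second file. Only the Tee interface and the globalTees broadcast come from the source shown above; FileTee and the registration call are my own sketch (the source of this era exposes a Logstream::addGlobalTee helper that appends to globalTees, if I read it correctly):

// hypothetical example; assumes mongo's log.h context (Tee, LogLevel, std names in scope)
class FileTee : public Tee {
    FILE* _f;
public:
    FileTee(const char* path) : _f(fopen(path, "a")) {}
    virtual ~FileTee() { if (_f) fclose(_f); }
    virtual void write(LogLevel level, const string& str) {
        if (_f) {
            fwrite(str.data(), 1, str.size(), _f);  // mirror the flushed line
            fflush(_f);
        }
    }
};

// registration sketch, assuming the addGlobalTee helper:
// Logstream::addGlobalTee(new FileTee("/var/log/mongo-copy.log"));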

 

Journal Module:

In fact, journal/durability is a large module in MongoDB that involves a great deal. Its original design goal was to use logging to improve the reliability of single-server data; it first appeared on the development branch of version 1.7. Its responsibility can be summarized in one sentence:

Periodically record operation logs (only operations that change the database; queries are outside the scope of recording) to the folder named journal under dbpath, so that when the system restarts, the lost data can be recovered from this folder.

Based on its function, the implementation can be broken down into the following questions:

  • When it is called
  • How user operations are recorded
  • How user operations are serialized and persisted
  • How data is recovered from existing journal logs

Next we will analyze the important steps based on the source code:

 

1. When it is called

Whenever we change the database, we need to record the user's operation and the data changed by it; the recorded data will serve as the data source for recovery. For example, when we insert a record into the database, both the operation and the operated-on data must be recorded. I have excerpted that part of the code below:

r = (Record*) getDur().writingPtr(r, lenWHdr);  // record the inserted Record so it can be persisted

...

if ( obuf )
    memcpy(r->data, obuf, len);                 // copy the data directly into the record's data field

Let's take a look at several important methods in DurableImpl:

// tell the system I am writing data at location x (called for both changes and inserts)
void* writingPtr(void *x, unsigned len);

// tell the system I have created a file
void createdFile(string filename, unsigned long long len);

In fact, the subtext of calling these two functions is: "Here is what I have just done; record it faithfully for me. If my changes have not yet reached the disk when something goes wrong, I need your records to reconstruct the original state so recovery can run and the data stays safe!"
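In other words, the calling convention is always "declare the intent, then write". A schematic sketch of the idiom (the Counter type and bump function are made up for illustration; only writingPtr comes from the source, and mongo's dur.h context is assumed):

struct Counter { int n; };

void bump(Counter* c) {
    // declare the write intent first: "I am about to write sizeof(Counter) bytes at c"
    c = (Counter*) getDur().writingPtr(c, sizeof(Counter));
    // then perform the actual change; the durability layer has already recorded the range
    c->n++;
}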

 

2. How user operations are recorded

In this module, user operations fall into two categories: basic write operations and non-basic write operations. Inserting and modifying data count as basic write operations, while operations like FileCreatedOp and DropDbOp are non-basic write operations; the latter kind is modeled as a DurOp. The two kinds are recorded in memory by CommitJob::note() and CommitJob::noteOp() respectively.

A basic write operation is encapsulated by the struct D. Let's look at its structure:

  

struct D {
    void *p;        // start address of the data changed by the user
    unsigned len;   // length of the data modified by the user
    static void go(const D& d);
};

 

Basic write records are stored in _wi, the Writes instance held by the CommitJob class, and inside it in _deferred, the TaskQueue<D> instance of the Writes class.

Non-basic write records are stored in the Writes class member vector< shared_ptr<DurOp> > _ops;
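Putting the two containers side by side, the shape of the Writes class can be sketched as follows (simplified to the two members mentioned above; the real class carries more bookkeeping, and TaskQueue, DurOp, and shared_ptr come from the surrounding mongo headers):

class Writes {
public:
    TaskQueue<D> _deferred;            // basic writes, queued as D entries
    vector< shared_ptr<DurOp> > _ops;  // non-basic write operations
};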

Many classes are involved in this process; the following two sequence diagrams summarize it.

[Sequence diagram: call path of getDur().writingPtr]

[Sequence diagram: call path of getDur().createdFile]

At this point, the user's operation log has been recorded in memory. Next we would talk about how the records in memory are serialized and persisted to disk.

 

That is all for today; the following two points will be covered together in a later article:

  • How user operations are serialized and persisted
  • How data is recovered from existing journal logs

To be continued...
