A deep dive into MongoDB (1): the mongod thread model and network framework

Source: Internet
Author: User

I recently started studying MongoDB and plan to read its source code to gain a solid understanding of the mongod and mongos service architectures, the sharding and replica set strategies, data synchronization and disaster tolerance, and the indexing mechanisms. The code base is about 200,000 lines (I am reading version 2.0.6). This article starts with the startup process of mongod. Because mongod is a multi-threaded program, the article describes how many threads it has and what each one does. While reading, focus on the peripheral framework of mongod: the article does not cover the organization of data files or of the index B-tree, only the network framework and the thread model.

The benefit of working this out is obvious: when you later want to know exactly how a module of mongod is implemented, you can jump straight to the corresponding class, read its source code, and solve the actual problems you meet in production. I think this is a good starting point for studying such a large code base.

Before explaining mongod, note that a large part of the MongoDB code is built on the boost library. So we first take a brief look at how threads are created with boost.

1. How to create a thread with the boost library

boost::thread is the cross-platform multi-threading library in boost. MongoDB uses it to create threads in most cases (in a few cases it calls pthread_create directly), in one of the following two ways:

(1) Pass a function directly as the thread body

For example, the durThread thread:

    void durThread() {
        while ( !inShutdown() ) { ... }
    }

    boost::thread t( durThread );

(2) Define a static run method in a class and create the thread with boost::bind.

    class FileAllocator : boost::noncopyable {
    public:
        void start();
    private:
        static void run( FileAllocator * fa );  // thread body
    };

    void FileAllocator::start() {
        // bind the static method to this object and run it in a new thread
        boost::thread t( boost::bind( &FileAllocator::run , this ) );
    }
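The two styles above can be sketched with std::thread from the C++11 standard library, used here as a stand-in for boost::thread so that the example builds without boost. The names shuttingDown, workerLoop, Allocator, and demoBothWays are hypothetical and are not taken from the MongoDB source.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>

// Hypothetical stand-ins for mongod's shutdown flag and for some work done.
std::atomic<bool> shuttingDown{false};
std::atomic<int> loopIterations{0};

// Way 1: a free function used directly as the thread body.
void workerLoop() {
    while (!shuttingDown.load()) {
        ++loopIterations;
        std::this_thread::yield();
    }
}

// Way 2: a static member function bound to an object pointer.
class Allocator {
public:
    Allocator() = default;
    Allocator(const Allocator&) = delete;             // same intent as boost::noncopyable
    Allocator& operator=(const Allocator&) = delete;

    static void run(Allocator* self) {
        assert(self != nullptr);
        self->ran = true;
    }

    std::thread start() {
        // std::bind passes `this` as the argument of the static method
        return std::thread(std::bind(&Allocator::run, this));
    }

    std::atomic<bool> ran{false};
};

// Starts one thread of each kind, stops them, and reports whether both ran.
bool demoBothWays() {
    std::thread t1(workerLoop);
    Allocator alloc;
    std::thread t2 = alloc.start();
    t2.join();
    while (loopIterations.load() == 0) std::this_thread::yield();
    shuttingDown = true;
    t1.join();
    return alloc.ran.load() && loopIterations.load() > 0;
}
```

One difference to keep in mind: a std::thread must be joined or detached before it is destroyed, so this sketch joins both threads explicitly instead of letting a local thread object go out of scope as the excerpts above do.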

2. Entrance to mongod

The main function, the entry point of mongod, is in the db.cpp file under src/mongo. I drew a simple activity diagram to outline the startup process:

As the diagram shows, there are 12 fixed threads, not counting the threads that mongod creates on demand to process requests once it is running:

- interruptThread

- DataFileSync::run

- FileAllocator::run

- durThread

- SnapshotThread::run

- ClientCursorMonitor::run

- PeriodicTask::Runner::run

- TTLMonitor::run

- replSlaveThread

- replMasterThread

- webServerThread

- Main thread for processing database requests

If mongod does not belong to any replica set, there are at least 10 fixed threads (the list above without replSlaveThread and replMasterThread).

Next we first discuss these 10 fixed threads, then how the low-performance thread that listens for web requests handles them, and finally how the better-performing main service thread handles database requests.

3. Five worker threads based on the BackgroundJob class

The five threads are DataFileSync, SnapshotThread, ClientCursorMonitor, TTLMonitor, and PeriodicTask. The class diagram is as follows:

These five classes also create their threads with boost::thread. They inherit from the BackgroundJob class: the go method starts a thread that runs jobBody, and jobBody in turn calls the run method, as follows:

    BackgroundJob& BackgroundJob::go() {
        boost::thread t( boost::bind( &BackgroundJob::jobBody , this, _status ) );
        return *this;
    }

    void BackgroundJob::jobBody( boost::shared_ptr<JobStatus> status ) {
        ...
        run();
        ...
    }
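The go/jobBody/run pattern can be illustrated with a minimal sketch. It assumes std::thread and std::shared_ptr in place of the boost types, and a much simpler JobStatus than the real one; BackgroundJobSketch, CountingJob, and runCountingJob are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <thread>

// Hypothetical, simplified job state.
struct JobStatus {
    std::atomic<bool> done{false};
};

class BackgroundJobSketch {
public:
    virtual ~BackgroundJobSketch() { wait(); }

    // Start the worker thread; it runs jobBody(), which calls run().
    BackgroundJobSketch& go() {
        assert(!_thread.joinable());  // in this sketch go() may be called once
        _thread = std::thread(&BackgroundJobSketch::jobBody, this, _status);
        return *this;
    }

    // Block until the worker thread has finished.
    void wait() {
        if (_thread.joinable()) _thread.join();
    }

    bool done() const { return _status->done.load(); }

protected:
    virtual void run() = 0;  // subclasses put their work here

private:
    void jobBody(std::shared_ptr<JobStatus> status) {
        run();
        status->done = true;
    }

    std::shared_ptr<JobStatus> _status = std::make_shared<JobStatus>();
    std::thread _thread;
};

// A trivial job that only counts to five.
class CountingJob : public BackgroundJobSketch {
public:
    ~CountingJob() override { wait(); }  // join before our members are destroyed
    int counter = 0;

protected:
    void run() override {
        for (int i = 0; i < 5; ++i) ++counter;
    }
};

// Runs the job to completion and returns its counter, or -1 if it did not finish.
int runCountingJob() {
    CountingJob job;
    job.go().wait();
    return job.done() ? job.counter : -1;
}
```

The subclass only overrides run(); thread creation and status tracking stay in the base class, which is the point of the design described above.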

The meanings of these threads are as follows:

DataFileSync mainly calls the MemoryMappedFile::flush method to flush in-memory data to disk. MongoDB uses mmap to map the data files on disk into memory, so it needs a mechanism that regularly flushes data to disk to ensure durability. The flush interval is controlled by the syncdelay parameter.

SnapshotThread periodically takes snapshots of server usage statistics (for example lock time and per-collection activity) rather than a snapshot of the data files.

ClientCursorMonitor manages client cursors: it calls the idleTimeReport() method every four seconds and the sayMemoryStatus() method every minute.

TTLMonitor manages TTL (time-to-live) expiry and checks every database by calling the doTTLForDB() method.

PeriodicTask takes the periodic tasks to execute from the dynamic array std::vector<PeriodicTask*> _tasks.
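As a rough illustration of how a runner can execute tasks kept in such a vector, here is a minimal sketch. PeriodicTaskSketch, RunnerSketch, TickCounter, and demoRunner are hypothetical names, and the sleep that a real runner thread would insert between passes is left out so the example stays deterministic.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical task interface: one callback per pass.
class PeriodicTaskSketch {
public:
    virtual ~PeriodicTaskSketch() = default;
    virtual void taskDoWork() = 0;
};

class RunnerSketch {
public:
    // Register a task; the runner does not take ownership.
    void add(PeriodicTaskSketch* task) {
        assert(task != nullptr);
        std::lock_guard<std::mutex> lock(_mutex);
        _tasks.push_back(task);
    }

    // One pass over the registered tasks; returns how many were run.
    std::size_t runOnce() {
        std::vector<PeriodicTaskSketch*> snapshot;
        {
            // copy under the lock so tasks can be added while a pass is running
            std::lock_guard<std::mutex> lock(_mutex);
            snapshot = _tasks;
        }
        for (PeriodicTaskSketch* t : snapshot) t->taskDoWork();
        return snapshot.size();
    }

private:
    std::mutex _mutex;
    std::vector<PeriodicTaskSketch*> _tasks;
};

// A task that only counts how often it was called.
struct TickCounter : PeriodicTaskSketch {
    int ticks = 0;
    void taskDoWork() override { ++ticks; }
};

// Two tasks, two passes: returns the total number of calls.
int demoRunner() {
    TickCounter a, b;
    RunnerSketch runner;
    runner.add(&a);
    runner.add(&b);
    runner.runOnce();
    runner.runOnce();
    return a.ticks + b.ticks;
}
```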

4. Five threads that run a global function directly

FileAllocator allocates new data files and determines their size, for example by doubling the size of the previous file.

interruptThread only handles signals.

durThread handles batch commits and rollback.

replSlaveThread is the synchronization thread used when the current node acts as a secondary.

replMasterThread is the synchronization thread used when the current node acts as the master.

5. Web listening thread

How does mongod process web requests? Through Listener, the core class of the network framework. The class diagram is as follows:

How should this class diagram be read?

First, look at the Listener class, which is responsible for listening and for creating new connections. It works as follows:

A. Create a socket, bind the port, and listen

B. Call select to detect new connection events

C. Call accept to create a new connection for each detected event

D. Call the void Listener::acceptedMP( MessagingPort * mp ) method to process the new connection. The subclass that overrides acceptedMP decides how the connection is handled.

The Listener class is used to handle both web requests and ordinary database requests.
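Steps A to D can be sketched with plain POSIX sockets. This is a simplified illustration, not the MongoDB implementation: ListenerSketch, CountingListener, and demoListener are hypothetical names, the virtual accepted(int) hook stands in for acceptedMP, and the demo listens only on the loopback address with a port chosen by the kernel.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

class ListenerSketch {
public:
    virtual ~ListenerSketch() { if (_fd >= 0) ::close(_fd); }

    // Step A: create the socket, bind to 127.0.0.1, and listen.
    // Port 0 asks the kernel for a free port. Returns the bound port or -1.
    int setup(unsigned short port = 0) {
        _fd = ::socket(AF_INET, SOCK_STREAM, 0);
        if (_fd < 0) return -1;
        sockaddr_in addr{};
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
        addr.sin_port = htons(port);
        if (::bind(_fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0) return -1;
        if (::listen(_fd, 16) < 0) return -1;
        socklen_t len = sizeof(addr);
        if (::getsockname(_fd, reinterpret_cast<sockaddr*>(&addr), &len) < 0) return -1;
        return ntohs(addr.sin_port);
    }

    // Steps B to D: wait up to timeoutMs for one connection and hand it over.
    bool pollOnce(int timeoutMs) {
        assert(_fd >= 0);
        fd_set readable;
        FD_ZERO(&readable);
        FD_SET(_fd, &readable);
        timeval tv;
        tv.tv_sec = timeoutMs / 1000;
        tv.tv_usec = (timeoutMs % 1000) * 1000;
        if (::select(_fd + 1, &readable, nullptr, nullptr, &tv) <= 0) return false;  // B
        int conn = ::accept(_fd, nullptr, nullptr);                                   // C
        if (conn < 0) return false;
        accepted(conn);                                                               // D
        return true;
    }

protected:
    // The subclass decides what happens with a new connection.
    virtual void accepted(int connFd) = 0;

private:
    int _fd = -1;
};

// A listener that only counts connections and closes them again.
struct CountingListener : ListenerSketch {
    int connections = 0;
    void accepted(int connFd) override { ++connections; ::close(connFd); }
};

// Connects once over loopback; returns the number of accepted connections,
// or -1 if sockets are not available in the current environment.
int demoListener() {
    CountingListener listener;
    int port = listener.setup();
    if (port < 0) return -1;
    int client = ::socket(AF_INET, SOCK_STREAM, 0);
    if (client < 0) return -1;
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(static_cast<unsigned short>(port));
    if (::connect(client, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0) {
        ::close(client);
        return -1;
    }
    listener.pollOnce(1000);
    ::close(client);
    return listener.connections;
}
```

Because the hook is virtual, a web server and a database server can share the same accept loop and differ only in what they do with the accepted connection.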

Now let's see how a web request is handled. The MiniWebServer class inherits from Listener and overrides acceptedMP: it receives the TCP stream, parses the HTTP protocol, assembles the HTTP response, and sends it back to the client over TCP. Which class actually handles the HTTP request? DbWebServer, which inherits from MiniWebServer and overrides the doRequest method; doRequest is called once a complete HTTP request has been received. How the HTTP request itself is processed is outside the scope of this article. What matters here is that this thread handles requests synchronously and blocks while doing so, so it can process only one web request at a time and its concurrency is very weak. Fortunately, web requests are only a sideline for mongod, used to query its status.

6. Main listening thread and data request processing threads

The PortMessageServer class, which handles database requests, runs in the main thread.

Let's first look at how PortMessageServer implements the acceptedMP method:

    virtual void acceptedMP(MessagingPort * p) {
        if ( ! connTicketHolder.tryAcquire() ) {
            sleepmillis(2); // otherwise we'll hard loop
            return;
        }
        …
        int failed = pthread_create(&thread, &attrs, (void*(*)(void*)) &pms::threadRun, p);
        …
    }

Clearly, it starts a separate thread to serve each new connection. This approach still performs poorly, because a large number of threads means a large number of context switches, but it is much better than the web request handling. High concurrency is not mongod's strong point.
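The combination of a connection ticket and one thread per connection can be sketched as follows. This is a simplified illustration with std::thread instead of pthread_create; TicketHolderSketch, TicketReleaser, and demoAdmission are hypothetical names, and the worker threads only hold their ticket until they are told to finish.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical counter that limits the number of simultaneous connections.
class TicketHolderSketch {
public:
    explicit TicketHolderSketch(int total) : _available(total) {}

    bool tryAcquire() {
        std::lock_guard<std::mutex> lock(_mutex);
        if (_available <= 0) return false;
        --_available;
        return true;
    }

    void release() {
        std::lock_guard<std::mutex> lock(_mutex);
        ++_available;
    }

    int available() {
        std::lock_guard<std::mutex> lock(_mutex);
        return _available;
    }

private:
    std::mutex _mutex;
    int _available;
};

// Returns the ticket when the connection thread leaves its scope.
class TicketReleaser {
public:
    explicit TicketReleaser(TicketHolderSketch* holder) : _holder(holder) {
        assert(holder != nullptr);
    }
    ~TicketReleaser() { _holder->release(); }
    TicketReleaser(const TicketReleaser&) = delete;
    TicketReleaser& operator=(const TicketReleaser&) = delete;

private:
    TicketHolderSketch* _holder;
};

// Offers `incoming` connections to a server with `limit` tickets.
// Returns how many were accepted, or -1 if a ticket was not returned.
int demoAdmission(int limit, int incoming) {
    TicketHolderSketch tickets(limit);
    std::atomic<bool> finish{false};
    std::vector<std::thread> workers;
    for (int i = 0; i < incoming; ++i) {
        if (!tickets.tryAcquire()) continue;   // over the limit: connection refused
        workers.emplace_back([&tickets, &finish] {
            TicketReleaser releaser(&tickets); // ticket is returned when the thread exits
            while (!finish.load()) std::this_thread::yield();
        });
    }
    int accepted = static_cast<int>(workers.size());
    finish = true;
    for (std::thread& w : workers) w.join();
    return tickets.available() == limit ? accepted : -1;
}
```

The ticket check happens in the accepting thread before any thread is created, so a server at its connection limit does not pay the cost of starting a thread only to reject the client.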

Each new connection is encapsulated in objects of the following classes:

The pms::threadRun method then processes the MessagingPort object.

Let's take a look at what pms::threadRun does:

    void threadRun( MessagingPort * inPort) {
        TicketHolderReleaser connTicketReleaser( &connTicketHolder );
        // p is a smart pointer that owns inPort; its declaration is not shown in this excerpt
        Message m;
        try {
            LastError * le = new LastError();
            lastError.reset( le ); // lastError now has ownership
            handler->connected( p.get() );
            while ( ! inShutdown() ) {
                if ( ! p->recv(m) ) {
                    p->shutdown();
                    break;
                }
                handler->process( m , p.get() , le );
            }
        }
        // the catch handlers are not shown in this excerpt
        handler->disconnected( p.get() );
    }

As you can see, the thread receives each complete request on this connection and then calls the handler's process method. What is this handler? See the following class diagram:

Ordinary database requests are therefore handled by the process method of MyMessageHandler. That method is only a wrapper: the actual work is done by the global function assembleResponse.
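The receive loop of pms::threadRun and its handler callbacks can be illustrated with a small, self-contained sketch. FakePort is a hypothetical in-memory stand-in for MessagingPort that yields queued strings instead of reading from a socket; HandlerSketch, Recorder, connectionLoop, and demoLoop are likewise hypothetical names.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory port: recv() returns false once the peer has "closed".
class FakePort {
public:
    explicit FakePort(std::deque<std::string> pending) : _pending(std::move(pending)) {}

    bool recv(std::string& out) {
        if (_pending.empty()) return false;
        out = _pending.front();
        _pending.pop_front();
        return true;
    }

private:
    std::deque<std::string> _pending;
};

// Hypothetical handler interface with the three callbacks used by the loop.
class HandlerSketch {
public:
    virtual ~HandlerSketch() = default;
    virtual void connected(FakePort* port) = 0;
    virtual void process(const std::string& msg, FakePort* port) = 0;
    virtual void disconnected(FakePort* port) = 0;
};

// Per-connection loop: one process() call per complete message.
void connectionLoop(FakePort* port, HandlerSketch* handler) {
    assert(port != nullptr && handler != nullptr);
    handler->connected(port);
    std::string m;
    while (port->recv(m)) {
        handler->process(m, port);
    }
    handler->disconnected(port);
}

// A handler that records the order of the callbacks.
struct Recorder : HandlerSketch {
    std::vector<std::string> events;
    void connected(FakePort*) override { events.push_back("connected"); }
    void process(const std::string& msg, FakePort*) override { events.push_back("process:" + msg); }
    void disconnected(FakePort*) override { events.push_back("disconnected"); }
};

// Feeds two messages through the loop and returns the recorded events.
std::vector<std::string> demoLoop() {
    std::deque<std::string> pending;
    pending.push_back("insert");
    pending.push_back("query");
    FakePort port(pending);
    Recorder recorder;
    connectionLoop(&port, &recorder);
    return recorder.events;
}
```

The loop itself knows nothing about the database: everything specific to a request lives behind the handler interface.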

Depending on which of the eight operation types the request carries, assembleResponse calls the methods in DataFileMgr that work on the actual data files. The operation types are:

    enum Operations {
        opReply = 1,     /* reply. responseTo is set. */
        dbMsg = 1000,    /* generic msg command followed by a string */
        dbUpdate = 2001, /* update object */
        dbInsert = 2002,
        //dbGetByOID = 2003,
        dbQuery = 2004,
        dbGetMore = 2005,
        dbDelete = 2006,
        dbKillCursors = 2007
    };

The function contains code similar to the following, which dispatches each operation to the code that actually handles it:

    else if ( op == dbInsert ) {
        receivedInsert(m, currentOp);
    }
    else if ( op == dbUpdate ) {
        receivedUpdate(m, currentOp);
    }
    else if ( op == dbDelete ) {
        receivedDelete(m, currentOp);
    }
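The dispatch on the operation code can be sketched as a switch over the enum values listed above. This is only an illustration: handlerFor is a hypothetical helper that returns the name of the handler instead of calling it, and only the three handlers that appear in the excerpt are mapped.

```cpp
#include <string>

// Operation codes as listed in the article.
enum Operations {
    opReply = 1,
    dbMsg = 1000,
    dbUpdate = 2001,
    dbInsert = 2002,
    dbQuery = 2004,
    dbGetMore = 2005,
    dbDelete = 2006,
    dbKillCursors = 2007
};

// Returns the name of the handler for an operation code.
// Codes whose handlers are not shown in the excerpt map to "not shown".
std::string handlerFor(int op) {
    switch (op) {
        case dbInsert: return "receivedInsert";
        case dbUpdate: return "receivedUpdate";
        case dbDelete: return "receivedDelete";
        case opReply:
        case dbMsg:
        case dbQuery:
        case dbGetMore:
        case dbKillCursors:
            return "not shown";
        default:
            return "unknown operation";
    }
}
```

A switch makes it easier than an else-if chain to see at a glance which operation codes are covered and which fall through to the default case.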

That, of course, is beyond the scope of this article. The next article will discuss operations on indexes and data files.

 
