MongoDB provides auto-sharding. To do this it uses mongos, the automatic sharding module used to build large, scalable database clusters to which machines can be added dynamically. mongos automatically sets up a horizontally scalable database cluster in which the pieces of each sharded collection are stored on the individual shard nodes.
A MongoDB cluster consists of several shards (each made up of mongod processes), mongos routing processes, and one or more config servers.
The following terms are used:
Shards: each shard consists of one or more servers, each running a mongod process that stores data (mongod is the core MongoDB database process). Typically each shard runs several servers to improve availability; these mongod processes form a replica set within the shard.
Chunks: a chunk is a range of data from a particular collection. The triple (collection, minKey, maxKey) describes a chunk, which covers the data between minKey and maxKey. Chunks have a maximum size, for example 100 MB. If a chunk reaches or exceeds this size, it is split into two new chunks. When a shard holds an excessive amount of data, chunks are migrated to other shards.
Config Servers: the config servers store the cluster metadata, including basic information about each server and shard, and the chunk information. Chunk information is the main data they hold, and each config server keeps a complete copy of it.
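The chunk description above can be made concrete with a small sketch. This is not MongoDB's implementation: the ChunkRange struct, the integer keys, the half-open [minKey, maxKey) interval and the midpoint split are all simplifying assumptions chosen for illustration (real shard keys are BSON values), and the 100 MB limit is simply the example figure from the text.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical, simplified chunk descriptor: (collection, minKey, maxKey).
// Integer keys are an assumption made for brevity.
struct ChunkRange {
    std::string collection;
    int minKey;             // lower bound of the range (treated as inclusive here)
    int maxKey;             // upper bound of the range (treated as exclusive here)
    std::size_t sizeBytes;  // amount of data currently stored in the range
};

// Example limit taken from the text: 100 MB.
const std::size_t kMaxChunkBytes = static_cast<std::size_t>(100) * 1024 * 1024;

// A chunk must be split once it reaches or exceeds the maximum size.
bool needsSplit(const ChunkRange& c) {
    return c.sizeBytes >= kMaxChunkBytes;
}

// Splits a chunk at the midpoint of its key interval into two new chunks.
// Assumes the data is spread evenly, so the bytes are divided in half.
std::pair<ChunkRange, ChunkRange> splitChunk(const ChunkRange& c) {
    int mid = c.minKey + (c.maxKey - c.minKey) / 2;
    std::size_t half = c.sizeBytes / 2;
    ChunkRange left = {c.collection, c.minKey, mid, half};
    ChunkRange right = {c.collection, mid, c.maxKey, c.sizeBytes - half};
    return std::make_pair(left, right);
}
```

After a split, the two new chunks cover the original range with no gap and no overlap, which is what allows a router to map any key to exactly one chunk.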
The source code covered today is mainly the execution flow of the mongos main entry function. First, open the mongos project (you can open the solution file db\db_10.sln to load all projects).
Note: to debug mongos, you must first start a mongod process and a config server, as follows:
D:\mongodb\bin> mongod --dbpath d:\mongodb\db --port 27012
D:\mongodb\bin> mongod --configsvr --dbpath d:\mongodb\db --port 27022
Then configure the corresponding Boost path and the startup arguments in VS2010.
Now for the main topic. First open the server.cpp file in the mongos project and find the following function:
int main(int argc, char* argv[]) {
    try {
        return _main(argc, argv);
    }
    catch (DBException& e) {
        cout << "uncaught exception in mongos main:" << endl;
        cout << e.toString() << endl;
    }
    catch (std::exception& e) {
        cout << "uncaught exception in mongos main:" << endl;
        cout << e.what() << endl;
    }
    catch (...) {
        cout << "uncaught exception in mongos main" << endl;
    }
    return 20;
}
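The same top-level guard can be tried outside the MongoDB code base. The sketch below is a standalone approximation, not the mongos source: guardedMain, okMain and failingMain are hypothetical names, and only std::exception is handled because DBException is a MongoDB type that is not available here.

```cpp
#include <cassert>
#include <exception>
#include <iostream>
#include <stdexcept>

// Hypothetical stand-ins for _main: one returns normally, one throws.
int okMain(int, char**) { return 0; }
int failingMain(int, char**) { throw std::runtime_error("boom"); }

// Runs realMain and turns any escaping exception into exit code 20,
// following the shape of the main() listing above.
int guardedMain(int (*realMain)(int, char**), int argc, char** argv) {
    try {
        return realMain(argc, argv);
    }
    catch (const std::exception& e) {
        std::cout << "uncaught exception in main: " << e.what() << std::endl;
    }
    catch (...) {
        std::cout << "uncaught exception in main" << std::endl;
    }
    return 20;
}
```

Catching by const reference and ordering the handlers from most to least specific keeps the catch-all as a last resort, so the process always exits with a defined status instead of terminating abnormally.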
The mongos main function is very simple: it just runs _main inside a try block. Below is the execution flow of _main:
int _main(int argc, char* argv[]) {
    static StaticObserver staticObserver;
    mongosCommand = argv[0];
    // Declare the option description objects
    po::options_description options("General options");
    po::options_description sharding_options("Sharding options");
    po::options_description hidden("Hidden options");
    po::positional_options_description positional;
    CmdLine::addGlobalOptions(options, hidden);
    // Add the sharding option descriptions
    sharding_options.add_options()
    ("configdb", po::value<string>(), "1 or 3 comma separated config servers")
    ("test", "just run unit tests")
    ("upgrade", "upgrade meta data version")
    ("chunkSize", po::value<int>(), "maximum amount of data per chunk")
    ("ipv6", "enable IPv6 support (disabled by default)")
    ("jsonp", "allow JSONP access via http (has security implications)")
    ;
    options.add(sharding_options);
.....
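Two of the options registered above carry constraints in their help text: configdb takes "1 or 3 comma separated config servers", and chunkSize is a "maximum amount of data per chunk". The sketch below shows one way such values could be checked using only the standard library. It is illustrative rather than the mongos code: splitByComma, hasValidConfigDbCount and chunkSizeToBytes are hypothetical helpers, and treating chunkSize as a number of megabytes is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits a comma-separated list such as "a:1,b:2,c:3".
// An empty input yields a single empty element.
std::vector<std::string> splitByComma(const std::string& s) {
    std::vector<std::string> parts;
    std::string::size_type start = 0;
    while (true) {
        std::string::size_type pos = s.find(',', start);
        if (pos == std::string::npos) {
            parts.push_back(s.substr(start));
            break;
        }
        parts.push_back(s.substr(start, pos - start));
        start = pos + 1;
    }
    return parts;
}

// Accepts only lists naming exactly 1 or exactly 3 config servers.
bool hasValidConfigDbCount(const std::string& configdb) {
    std::size_t n = splitByComma(configdb).size();
    return n == 1 || n == 3;
}

// Converts a chunk size in megabytes (an assumption) to bytes.
long long chunkSizeToBytes(int megabytes) {
    return static_cast<long long>(megabytes) * 1024 * 1024;
}
```

Widening to long long before multiplying avoids overflow for large chunk sizes. The count check only looks at how many entries there are and does not verify that each entry is a well-formed address.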
Once the option descriptions have been initialized, the startup command-line arguments are parsed and acted on:
.....
    // Parse options
    po::variables_map params;
    // argc and argv are parsed and stored in params for use below
    if (!CmdLine::store(argc, argv, options, hidden, positional, params))
        return 0;
    // The default value may vary depending on compile options, but for mongos
    // we want durability to be disabled.
    cmdLine.dur = false;
    // Print help
    if (params.count("help")) {
        cout << options << endl;
        return 0;
    }
    // Print version information
    if (params.count("version")) {
        printShardingVersionInfo();
        return 0;
    }
    // Set the chunk size
    if (params.count("chunkSize")) {
        Chunk::MaxChunkSize = params["chunkSize"].as<int>() * 1024 * 1024;
    }
......
    // Required: the configdb information must be supplied
    if (!params.count("configdb")) {
        out() << "error: no args for --configdb" << endl;
        return 4;
    }
    vector<string> configdbs;
    // Split the configdb parameter on commas
    splitStringDelim(params["configdb"].as<string>(), &configdbs, ',');
    // The number of config servers must be 1 or 3; the reason is unclear
    if (configdbs.size() != 1 && configdbs.size() != 3) {
        out() << "need either 1 or 3 configdbs" << endl;
        return 5;
    }
    // We either have a setting where all processes are on localhost or none are
    for (vector<string>::const_iterator it = configdbs.begin(); it != configdbs.end(); ++it) {
        try {
            // Construct a HostAndPort from the address; an exception is thrown if the address is invalid
            HostAndPort configAddr(*it);
            if (it == configdbs.begin()) {
                grid.setAllowLocalHost(configAddr.isLocalHost());