Building MongoDB Shards

Source: Internet
Author: User
Tags: mongodb, sharding

http://gong1208.iteye.com/blog/1622078

Sharding Concepts

Sharding is MongoDB's mechanism for scaling large data sets horizontally: the documents of a collection are spread across the nodes (shards) of the cluster, and a distributed MongoDB cluster can be built with a small amount of configuration.

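MongoDB splits data into units called chunks. Each chunk is a contiguous range of records in a collection, ordered by shard key. A chunk has a maximum size (64 MB by default in the MongoDB 2.0 release used below; older releases used 200 MB), and once a chunk grows past that size it is split into new chunks.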
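To make the splitting idea concrete, here is a minimal sketch in plain JavaScript (run with Node, no MongoDB needed). The chunk object, the `splitIfTooLarge` function and the split-at-the-median rule are my own simplifications for illustration, not MongoDB's actual implementation, and the sketch assumes distinct key values.

```javascript
// Hypothetical in-memory model of a chunk: a contiguous shard-key range
// [min, max) plus the documents that fall inside it.
function splitIfTooLarge(chunk, maxBytes) {
  const size = chunk.docs.reduce((sum, d) => sum + d.bytes, 0);
  if (size <= maxBytes || chunk.docs.length < 2) {
    return [chunk]; // still fits, nothing to do
  }
  // Sort by key and cut at the median key so both halves stay contiguous.
  // Assumes key values are distinct (a simplification).
  const sorted = [...chunk.docs].sort((a, b) => a.key - b.key);
  const mid = Math.floor(sorted.length / 2);
  const splitKey = sorted[mid].key;
  return [
    { min: chunk.min, max: splitKey, docs: sorted.slice(0, mid) },
    { min: splitKey, max: chunk.max, docs: sorted.slice(mid) },
  ];
}

// Example: four 400 KB documents in one chunk with a 1 MB limit.
const KB = 1024;
const chunk = {
  min: -Infinity,
  max: Infinity,
  docs: [10, 20, 30, 40].map((key) => ({ key, bytes: 400 * KB })),
};
const parts = splitIfTooLarge(chunk, 1024 * KB);
console.log(parts.map((p) => `[${p.min}, ${p.max}) -> ${p.docs.length} docs`));
```

The one oversized chunk becomes two chunks that meet at key 30, and each half is now small enough that calling the function again leaves it alone.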

To build a MongoDB sharded cluster, you need three roles:

Shard Server

Shard servers store the actual data. Each shard can be a single mongod instance or a replica set made up of several mongod instances. To get automatic failover inside each shard, MongoDB officially recommends running every shard as a replica set. For how to install and build a replica set, see my other article: http://gong1208.iteye.com/blog/1558355

Config Server

To store a collection across multiple shards, you must specify a shard key for the collection, for example {age: 1}. The shard key determines which chunk a record belongs to. Config servers store the cluster metadata: the configuration of every shard node, the shard key range of each chunk, the distribution of chunks across the shards, and the sharding configuration of every database and collection in the cluster.
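As a rough picture of that metadata, the sketch below models a chunk table for one collection and looks up the shard that owns a given key value. The table layout, the ranges and the `shardFor` helper are made up for illustration; real config servers keep this in their own collections, and -Infinity/Infinity here only stand in for MongoDB's minimum and maximum key markers.

```javascript
// Hypothetical snapshot of the metadata a config server keeps for one
// sharded collection: each chunk covers [min, max) of the shard key
// and is assigned to exactly one shard. Names and ranges are made up.
const chunkTable = {
  ns: "test.users",
  key: { age: 1 },
  chunks: [
    { min: -Infinity, max: 18, shard: "shard0000" },
    { min: 18, max: 40, shard: "shard0001" },
    { min: 40, max: Infinity, shard: "shard0000" },
  ],
};

// Return the shard owning the chunk whose range contains the key value.
function shardFor(table, doc) {
  const field = Object.keys(table.key)[0];
  const value = doc[field];
  const chunk = table.chunks.find((c) => value >= c.min && value < c.max);
  if (!chunk) throw new Error(`no chunk covers ${field}=${value}`);
  return chunk.shard;
}

console.log(shardFor(chunkTable, { name: "ann", age: 17 }));
console.log(shardFor(chunkTable, { name: "bob", age: 18 }));
console.log(shardFor(chunkTable, { name: "cy", age: 65 }));
```

Note that one shard can own several non-adjacent chunks, which is why the lookup goes through the chunk ranges rather than through a fixed range per shard.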

Route Process

This is the front-end router (mongos) that clients connect to. It asks the config servers which shard a record should be read from or written to, connects to that shard to perform the operation, and returns the result to the client. The client simply sends the queries and updates it would otherwise send to mongod to the route process, without caring which shard holds the record.
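The point that the client never needs to know where a record lives can be shown with a toy router. Everything below (the `ToyRouter` class, its methods, the two ranges) is hypothetical and runs in memory under Node; it only imitates the idea and is not how mongos is implemented.

```javascript
// Toy stand-ins: each "shard" is an array; the "config" is a list of
// shard-key ranges. All names here are hypothetical.
class ToyRouter {
  constructor(ranges) {
    this.ranges = ranges; // [{ min, max, shard }]
    this.shards = {};
    for (const r of ranges) this.shards[r.shard] = this.shards[r.shard] || [];
  }
  // Look up the owning shard for a shard-key value.
  target(id) {
    const r = this.ranges.find((c) => id >= c.min && id < c.max);
    if (!r) throw new Error(`no range covers id=${id}`);
    return r.shard;
  }
  // The client calls insert/findById as it would on a single server.
  insert(doc) {
    this.shards[this.target(doc.id)].push(doc);
  }
  findById(id) {
    const hit = this.shards[this.target(id)].find((d) => d.id === id);
    return hit || null;
  }
}

const router = new ToyRouter([
  { min: -Infinity, max: 5000, shard: "shard0000" },
  { min: 5000, max: Infinity, shard: "shard0001" },
]);
router.insert({ id: 42, name: "ann" });
router.insert({ id: 7001, name: "bob" });
console.log(router.findById(7001));
console.log(router.shards.shard0000.length, router.shards.shard0001.length);
```

The calling code only ever touches `router`; which array a document ends up in is decided entirely by the range lookup.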

Below we build a simple sharded cluster on a single physical machine.

The layout is as follows:

    • Shard Server 1: 27017
    • Shard Server 2: 27018
    • Config Server: 27027
    • Route Process: 40000

Implementation steps

Step One:

Start Shard Server

mkdir -p /opt/mongodb/data/shard/s0     # create the data directories

mkdir -p /opt/mongodb/data/shard/s1

mkdir -p /opt/mongodb/data/shard/log     # create the log directory

/opt/mongodb/bin/mongod --port 27017 --dbpath /opt/mongodb/data/shard/s0 --fork --logpath /opt/mongodb/data/shard/log/s0.log     # start shard server instance 1

/opt/mongodb/bin/mongod --port 27018 --dbpath /opt/mongodb/data/shard/s1 --fork --logpath /opt/mongodb/data/shard/log/s1.log     # start shard server instance 2

Step Two:

Start Config Server

mkdir -p /opt/mongodb/data/shard/config     # create the data directory

/opt/mongodb/bin/mongod --port 27027 --dbpath /opt/mongodb/data/shard/config --fork --logpath /opt/mongodb/data/shard/log/config.log     # start the config server instance

(Note that we start these instances like ordinary mongod services, without the --shardsvr and --configsvr options. In the MongoDB version used here, those two options mainly change the default port, and we specify the ports ourselves.)

Step Three:

Start Route Process

/opt/mongodb/bin/mongos --port 40000 --configdb localhost:27027 --fork --logpath /opt/mongodb/data/shard/log/route.log --chunkSize 1     # start a route server instance

Among the mongos startup options, chunkSize sets the maximum chunk size in MB. The default is 64 MB in the 2.0 release used here (older releases used 200 MB); to make the effect of sharding easy to observe in a test, we set it to 1 MB. Chunks are then split once they grow past 1 MB, so chunk migration between the shards can be observed with only a small amount of inserted data.
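A quick back-of-the-envelope calculation shows why a small chunk size helps in a test. The function name and the 20 MB test data size below are my own assumptions; the estimate is only a lower bound, since real splits happen around the limit rather than exactly at it.

```javascript
// Rough estimate: number of chunks needed to hold `dataMB` of data when
// each chunk may grow to at most `chunkSizeMB`. Treat it as a lower bound.
function estimateChunks(dataMB, chunkSizeMB) {
  return Math.max(1, Math.ceil(dataMB / chunkSizeMB));
}

const testDataMB = 20; // hypothetical size of a small test data set
for (const sizeMB of [1, 64]) {
  const n = estimateChunks(testDataMB, sizeMB);
  console.log(`chunk size ${sizeMB} MB -> at least ${n} chunk(s)`);
}
```

With a 1 MB chunk size the same small data set yields many chunks that can be spread over the shards, while with a much larger chunk size it would still sit in a single chunk on a single shard.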

Step Four:

Configure Sharding

Next, we use the MongoDB shell to log in to mongos and add the shard nodes.

/opt/mongodb/bin/mongo admin --port 40000     # this operation requires a connection to the admin database

MongoDB shell version: 2.0.1

connecting to: 127.0.0.1:40000/admin

> db.runCommand({ addshard: "localhost:27017" })     // add a shard server

{ "shardAdded" : "shard0000", "ok" : 1 }

> db.runCommand({ addshard: "localhost:27018" })

{ "shardAdded" : "shard0001", "ok" : 1 }

> db.runCommand({ enablesharding: "test" })     // enable sharding for the database

{ "ok" : 1 }

> db.runCommand({ shardcollection: "test.users", key: { id: 1 } })     // shard the collection; a shard key is required, and the system creates an index on it automatically

{ "collectionsharded" : "test.users", "ok" : 1 }

A note on choosing the shard key: the key must be chosen according to the shape of your business data and must not be picked arbitrarily. In particular, do not use an auto-increment _id as the shard key unless you are very clear about why you are doing so. I will not analyze the specific reasons here; based on experience, a more reasonable choice is "auto-increment field + query field" (yes, the shard key can be a combination of multiple fields).
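One commonly cited reason for the warning can be sketched as follows: with range-based chunks, every new value of an ever-increasing key falls into the last chunk, so all inserts hit the same shard. The chunk layout, the helper name and the generated key values below are hypothetical and only illustrate that effect.

```javascript
// Count how many inserts land in each chunk. Chunk ranges are [min, max).
function countInserts(chunks, keys) {
  const counts = chunks.map(() => 0);
  for (const k of keys) {
    const i = chunks.findIndex((c) => k >= c.min && k < c.max);
    counts[i] += 1;
  }
  return counts;
}

// Hypothetical layout after ids 1..10000 were split into three chunks.
const chunks = [
  { min: -Infinity, max: 3000 },
  { min: 3000, max: 7000 },
  { min: 7000, max: Infinity },
];

// 1000 new documents whose auto-increment id continues after 10000.
const increasing = Array.from({ length: 1000 }, (_, n) => 10001 + n);
// 1000 documents whose key values are spread over the existing range.
const spread = Array.from({ length: 1000 }, (_, n) => (n * 7919) % 10000);

console.log("auto-increment:", countInserts(chunks, increasing));
console.log("spread keys:   ", countInserts(chunks, spread));
```

In the first case the last chunk receives every insert while the other two stay idle; in the second case all three chunks take part of the write load.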

The basic mechanism of sharding also needs explaining: MongoDB always tries to distribute the existing data evenly across all shards. For example, with two shards and id as the shard key, where id is auto-incremented from 1 to 10000, the data ends up split roughly in half: ids 1 to 5000 on shard A and 5001 to 10000 on shard B. The split is not necessarily that exact, but MongoDB keeps it as even as possible. Likewise, with three shards the data is divided evenly among the three.
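The "as even as possible" behaviour can be imitated with a toy balancer that keeps moving one chunk from the fullest shard to the emptiest one. The `balance` function and its threshold of 2 chunks are assumptions for illustration, not MongoDB's real balancing policy.

```javascript
// Toy balancer: move one chunk at a time from the shard with the most
// chunks to the shard with the fewest, until the gap drops below
// `threshold`. The threshold value is an assumption for illustration.
function balance(chunkCounts, threshold = 2) {
  if (threshold < 2) throw new Error("threshold must be at least 2");
  const counts = { ...chunkCounts };
  let moves = 0;
  for (;;) {
    const names = Object.keys(counts);
    const most = names.reduce((a, b) => (counts[b] > counts[a] ? b : a));
    const least = names.reduce((a, b) => (counts[b] < counts[a] ? b : a));
    if (counts[most] - counts[least] < threshold) break;
    counts[most] -= 1;
    counts[least] += 1;
    moves += 1;
  }
  return { counts, moves };
}

console.log(balance({ shard0000: 10, shard0001: 0 }));
console.log(balance({ shard0000: 9, shard0001: 0, shard0002: 0 }));
```

Starting from all chunks on one shard, two shards end up with five chunks each and three shards with three chunks each, which matches the even split described above.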

Also note that when data is first inserted, it all goes to a single shard. Only after the inserts does MongoDB start moving data between the shards. This may not happen immediately: MongoDB decides whether to move the data right away or later depending on the current load.

To verify this, run db.users.stats() twice, once immediately after inserting the data and once a little later, and compare the results.

OK, a simple sharded cluster is now set up. Connect to mongos and start inserting data to verify it.
