MongoDB auto-sharding (Automatic sharding) Introduction

Last Update:2015-08-20 Source: Internet

Author: User

Tags failover mongodb sharding mongo shell


MongoDB is a document-oriented NoSQL database developed by the 10gen team. Over the past year or so, a growing number of large websites have put MongoDB into production; well-known examples include Foursquare, bit.ly, SourceForge, and boxed. MongoDB provides auto-sharding, which lets users build a distributed MongoDB cluster with only a small amount of configuration.

MongoDB's auto-sharding provides:

· Automatic rebalancing when load and data distribution become uneven across shards

· Simple addition and removal of nodes

· Automatic failover

· Scalability to thousands of nodes

A MongoDB sharded cluster consists of three parts:

1. Shards

A shard stores the actual data. Each shard can be a single mongod instance or a replica set made up of several mongod instances. To get automatic failover inside each shard, MongoDB officially recommends that every shard be a replica set.

2. Config Servers

To split a collection into multiple chunks stored across multiple shards, you need to specify a shard key for the collection, for example {name:1}, {_id:1}, or {lastname:1, firstname:1}. The shard key determines which chunk a record belongs to. For example, if the range 1 < shard key < 100 forms one chunk, every record whose shard key falls in that range belongs to that chunk, which is saved on shard1. The config servers store the configuration of every shard node, the shard key range of each chunk, the distribution of chunks across the shards, and the sharding configuration of every database and collection in the cluster.
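The mapping from shard key to chunk to shard can be illustrated with a small lookup table. This is only a sketch in plain JavaScript: the `chunks` table, the `findChunk` helper, and the half-open range convention `[min, max)` are assumptions made for illustration and do not reflect the actual layout of the config server metadata.

```javascript
// Hypothetical chunk table, similar in spirit to what the config servers track.
// Each chunk covers shard key values in the half-open range [min, max).
const chunks = [
  { min: -Infinity, max: 1, shard: "shard0000" },
  { min: 1, max: 100, shard: "shard0000" },
  { min: 100, max: Infinity, shard: "shard0001" },
];

// Return the chunk whose range contains the given shard key value.
function findChunk(chunkTable, keyValue) {
  const chunk = chunkTable.find((c) => keyValue >= c.min && keyValue < c.max);
  if (!chunk) {
    throw new Error("no chunk covers shard key value " + keyValue);
  }
  return chunk;
}

console.log(findChunk(chunks, 42).shard);  // shard0000
console.log(findChunk(chunks, 100).shard); // shard0001
```

Because the ranges cover the whole key space without overlapping, every key value maps to exactly one chunk, and therefore to exactly one shard.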

3. Routing Process

The MongoDB binary package includes a program called mongos, which acts as the routing process for the cluster. It works like a transparent proxy: it receives a query or update request from the client, asks the config servers which shard should be queried or should store the record, connects to that shard to perform the operation, and finally returns the result to the client. The client simply sends mongos the same query or update requests it would otherwise send to mongod, without having to know which shard a record is stored on.
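The routing decision described above can be sketched as a function that picks the shards a request must be sent to. This is a simplified illustration, not the mongos implementation: `chunkTable` and `targetShards` are hypothetical names, and only equality matches on the shard key are handled. It also shows one consequence of the design: a query that does not include the shard key cannot be narrowed down and has to go to every shard.

```javascript
// Hypothetical metadata a router would obtain from the config servers.
const chunkTable = [
  { min: -Infinity, max: 100, shard: "shard0000" },
  { min: 100, max: Infinity, shard: "shard0001" },
];

// Decide which shards must receive a query.
// shardKey is the name of the field the collection is sharded on.
function targetShards(query, shardKey, table) {
  if (Object.prototype.hasOwnProperty.call(query, shardKey)) {
    // The query names the shard key: only the owning shard is contacted.
    const value = query[shardKey];
    const chunk = table.find((c) => value >= c.min && value < c.max);
    return [chunk.shard];
  }
  // No shard key in the query: every shard has to be asked.
  return [...new Set(table.map((c) => c.shard))];
}

console.log(targetShards({ _id: 250 }, "_id", chunkTable));
console.log(targetShards({ name: "alice" }, "_id", chunkTable));
```

The first call targets a single shard, while the second fans out to both, which is why the choice of shard key matters for query performance.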

Next I'll show you how to build a simple MongoDB cluster to test MongoDB's auto-sharding functionality.

This MongoDB cluster will contain two shards, one config server, and one routing process. We will use MongoDB 1.6.5 for this test; it can be downloaded from: http://www.mongodb.org/downloads

First, we create data directories for the two shards and the config server:

sudo mkdir -p /data0/mongo/shard1 /data0/mongo/shard2 /data0/mongo/config

Then we start, in order, two mongod processes as the shards, one mongod process as the config server, and one mongos process as the routing process:

sudo mongod --port 27017 --fork --logpath /var/log/mongo_shard1.log --dbpath /data0/mongo/shard1 --shardsvr

sudo mongod --port 27018 --fork --logpath /var/log/mongo_shard2.log --dbpath /data0/mongo/shard2 --shardsvr

sudo mongod --port 27217 --fork --logpath /var/log/mongo_config.log --dbpath /data0/mongo/config --configsvr

sudo mongos --port 27417 --fork --logpath /var/log/mongos.log --configdb 127.0.0.1:27217 --chunkSize 1

Among the mongos startup parameters, chunkSize specifies the size of a chunk in MB. The default is 200 MB; to make the effect of sharding easier to observe in this test, we set it to 1 MB.
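A quick back-of-the-envelope calculation shows why the smaller chunk size matters for a test of this scale. The helper below, `estimateChunkCount`, is a hypothetical function written for this illustration, and it assumes that chunks fill up to exactly the configured size, which a real cluster does not guarantee.

```javascript
// Rough estimate of how many chunks a collection would be split into.
// Assumes every chunk fills to exactly chunkSizeMB (a simplification).
function estimateChunkCount(dataSizeMB, chunkSizeMB) {
  if (chunkSizeMB <= 0) {
    throw new RangeError("chunkSizeMB must be positive");
  }
  return Math.max(1, Math.ceil(dataSizeMB / chunkSizeMB));
}

console.log(estimateChunkCount(200, 200)); // 1
console.log(estimateChunkCount(200, 1));   // 200
```

With the default size, roughly 200 MB of test data would fit in about one chunk, leaving nothing to distribute. With 1 MB chunks, the same data yields on the order of 200 chunks that can be spread across the two shards.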

Next, we use the mongo shell to connect to mongos and add the shard nodes:

mongo --port 27417

MongoDB shell version: 1.6.5

connecting to: 127.0.0.1:27417/test

> use admin

switched to db admin

> db.runCommand({addshard: "127.0.0.1:27017"})

{ "shardAdded" : "shard0000", "ok" : 1 }

> db.runCommand({addshard: "127.0.0.1:27018"})

{ "shardAdded" : "shard0001", "ok" : 1 }

Here we enable sharding for the database "foo" and set the shard key of the collection "col" to {_id:1} to test the sharding function:

> db.runCommand({enablesharding: "foo"})

{ "ok" : 1 }

> db.runCommand({shardcollection: "foo.col", key: {_id: 1}})

{ "collectionsharded" : "foo.col", "ok" : 1 }

To test how sharding balances data, I inserted about 200 MB of data in several batches and used db.stats() to check the data distribution during the insertion. While the data set was still small, all chunks were stored on shard0000. As the inserts continued, the data began to spread across both shards, and mongos rebalanced the chunks between them. Right after the last insert, with 200 MB loaded, shard0000 held about 135 MB and shard0001 about 65 MB. A while later, the data on shard0000 had dropped to about 115 MB and the data on shard0001 had grown to about 85 MB.
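The rebalancing observed above can be modeled with a very simple loop that moves one chunk at a time from the fullest shard to the emptiest one. This is a sketch of the general idea only, not MongoDB's actual balancer algorithm: the `rebalance` function and its `threshold` parameter are hypothetical, and the starting counts assume that the 135 MB and 65 MB measured above correspond to roughly 135 and 65 chunks of 1 MB each.

```javascript
// Move chunks one at a time from the fullest shard to the emptiest one
// until their chunk counts differ by less than `threshold`.
function rebalance(chunkCounts, threshold) {
  if (threshold < 2) {
    // With a threshold below 2, an odd difference would bounce back and forth forever.
    throw new RangeError("threshold must be at least 2");
  }
  const counts = { ...chunkCounts };
  const names = Object.keys(counts);
  if (names.length === 0) {
    throw new RangeError("at least one shard is required");
  }
  const migrations = [];
  while (true) {
    const from = names.reduce((a, b) => (counts[a] >= counts[b] ? a : b));
    const to = names.reduce((a, b) => (counts[a] <= counts[b] ? a : b));
    if (counts[from] - counts[to] < threshold) break;
    counts[from] -= 1;
    counts[to] += 1;
    migrations.push({ from, to });
  }
  return { counts, migrations };
}

const result = rebalance({ shard0000: 135, shard0001: 65 }, 2);
console.log(result.counts);            // { shard0000: 100, shard0001: 100 }
console.log(result.migrations.length); // 35
```

In this idealized model the shards end up perfectly even after 35 migrations. The 115 MB and 85 MB figures measured in the test would correspond to an intermediate state of such a process, since each migration takes time on a real cluster.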

MongoDB's auto-sharding has been considered production-ready since version 1.6, which was released a little over half a year ago. Most companies are still watching from the sidelines and hesitate to use it in production, so there is not yet much material about it available online. We will continue to share our experience with MongoDB in the future.


