MongoDB data sharding implementation


Reprinted from http://www.cnblogs.com/spnt/

Replica sets provide safe backups and seamless failover, but they do not by themselves enable massive data storage: every member holds a full copy of the data, and physical hardware has limits. At that point a distributed deployment is required, with part of the data stored on other machines. MongoDB's sharding feature meets this requirement.

Understanding MongoDB's sharding architecture

What is sharding? Put simply, it is a cluster architecture for scaling massive data horizontally: a collection's data is partitioned, and the partitions are stored on separate shard nodes.

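MongoDB divides the data of a sharded collection into chunks. Each chunk is a contiguous range of records in the collection, ordered by shard key. The chunk size limit is generally 200 MB (the default in early releases; later releases use smaller defaults such as 64 MB). When a chunk grows beyond the limit, it is split and a new chunk is created.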
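As a rough illustration of the splitting behaviour described above, the following Python sketch models a chunk as a sorted run of shard-key values and splits it at the median key once it grows past a limit. The Chunk class, the max_docs limit (a document count standing in for a size in MB) and the median split point are all assumptions made for illustration, not MongoDB's actual implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Chunk:
    """A contiguous range of shard-key values (hypothetical model)."""
    keys: List[int] = field(default_factory=list)


def insert_key(chunks: List[Chunk], key: int, max_docs: int) -> None:
    """Insert key into the chunk that covers it; split the chunk if it grows too large."""
    # Chunks are kept in key order: pick the first chunk whose last key is >= key,
    # falling back to the final chunk for keys beyond every existing range.
    target = len(chunks) - 1
    for i, chunk in enumerate(chunks):
        if chunk.keys and key <= chunk.keys[-1]:
            target = i
            break
    chunk = chunks[target]
    chunk.keys.append(key)
    chunk.keys.sort()
    if len(chunk.keys) > max_docs:
        mid = len(chunk.keys) // 2
        # Split at the median: the lower half stays, the upper half becomes a new chunk.
        chunks[target:target + 1] = [Chunk(chunk.keys[:mid]), Chunk(chunk.keys[mid:])]


chunks = [Chunk()]
for k in range(10):
    insert_key(chunks, k, max_docs=4)

for c in chunks:
    print(c.keys)
```

In this model, monotonically increasing keys always land in the last chunk, so only that chunk keeps splitting.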

Three roles are required to build a sharded cluster:

Shard server: a shard server stores the actual data. Each shard can be a single mongod instance or a group of mongod instances. To achieve automatic failover within each shard, MongoDB officially recommends that each shard be a replica set.

Config server: to distribute a collection across multiple shards, you specify a shard key for the collection, and the shard key determines which chunk each record belongs to. The config server stores the following information:

1. Configuration information of all shard nodes

2. Shard key range of each chunk

3. Distribution of chunks in each shard

4. Sharding configuration information for all databases and collections in the cluster

Route process (mongos): the front-end router that clients connect to. It first asks the config server which shard a record should be read from or written to, then connects to that shard to perform the operation, and finally returns the result to the client. The client simply sends the route process the same query or update requests it would otherwise send to mongod, without needing to know which shard holds the records.
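The lookup the route process performs can be sketched in a few lines of Python. The chunk table below is hypothetical: the key ranges are invented, and a real cluster keeps this metadata on the config server rather than in a Python list. Only the shard names follow the naming that appears later in this article.

```python
import bisect
from typing import List, Tuple

# Hypothetical chunk table, as a config server might hold it:
# each entry is (exclusive upper bound of the chunk's key range, shard name).
CHUNK_TABLE: List[Tuple[float, str]] = [
    (1000, "shard0000"),
    (5000, "shard0001"),
    (float("inf"), "shard0000"),
]


def route(shard_key: int) -> str:
    """Return the shard whose chunk range contains shard_key."""
    upper_bounds = [bound for bound, _ in CHUNK_TABLE]
    # bisect_right finds the first chunk whose upper bound is strictly greater than the key.
    index = bisect.bisect_right(upper_bounds, shard_key)
    return CHUNK_TABLE[index][1]


for key in (0, 999, 1000, 5000):
    print(key, "->", route(key))
```

The client never sees this table; it only talks to the router, which is why the sharding stays transparent to application code.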

 

Building the sharded cluster

From the above analysis, we can conclude that building a sharded cluster requires at least four MongoDB processes: two shard servers, one config server, and one route process. They are arranged as follows:

Process           Port     File directory
Shard Server 1    2000     mongodb5
Shard Server 2    2001     mongodb6
Config server     30000    mongodb7
Route process     40000    mongodb8

Configure now:

1. Start the shard servers

Compared with starting an ordinary mongod, there is only one additional option: shardsvr. With this option, the process runs as a shard server.

2. Start the config server

The configsvr option starts the process as a config server.

3. Start the route process

The chunk size is set to 1 MB here so that the effect of sharding is easy to observe in a test.
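The original startup commands are not reproduced in this article, so the following Python sketch only assembles plausible command lines from the ports and options mentioned above. The host 127.0.0.1 and the use of the listed directories as --dbpath values are assumptions. The flags are those of legacy MongoDB releases; newer releases differ (for example, config servers must be replica sets).

```python
from typing import Dict, List

HOST = "127.0.0.1"  # assumption: every process runs on the local machine


def mongod_args(role_flag: str, port: int, dbpath: str) -> List[str]:
    """Build the argument list for a mongod started in a sharding role."""
    return ["mongod", role_flag, "--port", str(port), "--dbpath", dbpath]


def mongos_args(port: int, config_port: int, chunk_size_mb: int) -> List[str]:
    """Build the argument list for the route process (legacy mongos flags)."""
    return [
        "mongos",
        "--port", str(port),
        "--configdb", f"{HOST}:{config_port}",
        "--chunkSize", str(chunk_size_mb),  # only accepted by older mongos releases
    ]


COMMANDS: Dict[str, List[str]] = {
    # dbpath values are assumed from the directory column of the table above
    "shard1": mongod_args("--shardsvr", 2000, "mongodb5"),
    "shard2": mongod_args("--shardsvr", 2001, "mongodb6"),
    "config": mongod_args("--configsvr", 30000, "mongodb7"),
    "router": mongos_args(40000, 30000, 1),
}

for name, args in COMMANDS.items():
    print(name, " ".join(args))
```

Each list could be handed to subprocess.Popen to launch the process. Building the lists without running them keeps the sketch safe to execute anywhere.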

4. Configure sharding

After all the processes are started, what remains is to connect them together.

Open a new command prompt and connect to the route process. Use addshard to register each of the two shard servers with the router.

With these two operations the whole architecture is wired together, but it is not finished yet: the cluster still does not know which database to shard or which shard key to use.

Enable sharding on the friends database, and then shard the frienduserattach collection on its _id field.
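The configuration steps above boil down to four admin commands. The Python sketch below only builds the command documents. The host 127.0.0.1 is an assumption, and the lower-case command names follow the addshard spelling used in this article (newer releases spell them addShard, enableSharding and shardCollection).

```python
from typing import Any, Dict, List

HOST = "127.0.0.1"  # assumption: shard servers run on the local machine


def sharding_setup_commands() -> List[Dict[str, Any]]:
    """Return the admin commands, in order, that wire up the cluster."""
    return [
        # Register each shard server with the router.
        {"addshard": f"{HOST}:2000"},
        {"addshard": f"{HOST}:2001"},
        # Allow collections in the friends database to be sharded.
        {"enablesharding": "friends"},
        # Shard frienduserattach on _id (ascending, i.e. range-based).
        {"shardcollection": "friends.frienduserattach", "key": {"_id": 1}},
    ]


for command in sharding_setup_commands():
    print(command)
```

With a driver such as PyMongo, each document could be passed to the admin database's command method while connected to the route process on port 40000. The order matters: the shards must be added before the collection is sharded.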

At this point, the entire system is configured.

Verify the sharding. Because this collection is one I actually use, inserting the data by hand in the command prompt would be too tedious, so I insert 10,000 documents from a program using the client driver.
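The author's insertion program is not shown, so the following is only a sketch of how such a bulk insert might look in Python. The document schema, the batch size and the FakeCollection stand-in are hypothetical. The only driver call assumed is insert_many, which PyMongo collections provide.

```python
from typing import Any, Dict, Iterator, List


def make_documents(total: int) -> Iterator[Dict[str, Any]]:
    """Yield test documents with sequential _id values (hypothetical schema)."""
    for i in range(total):
        yield {"_id": i, "note": f"test record {i}"}


def insert_in_batches(collection: Any, total: int, batch_size: int = 1000) -> int:
    """Insert `total` documents through collection.insert_many, batch by batch."""
    inserted = 0
    batch: List[Dict[str, Any]] = []
    for doc in make_documents(total):
        batch.append(doc)
        if len(batch) == batch_size:
            collection.insert_many(batch)
            inserted += len(batch)
            batch = []
    if batch:  # flush the final partial batch
        collection.insert_many(batch)
        inserted += len(batch)
    return inserted


class FakeCollection:
    """Stand-in for a driver collection so the sketch runs without a server."""

    def __init__(self) -> None:
        self.docs: List[Dict[str, Any]] = []

    def insert_many(self, docs: List[Dict[str, Any]]) -> None:
        self.docs.extend(docs)


fake = FakeCollection()
count = insert_in_batches(fake, 10000)
print(count)
```

Against a real cluster, a PyMongo collection obtained through the route process, for example MongoClient("127.0.0.1", 40000)["friends"]["frienduserattach"], could take the place of FakeCollection.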

 

Switch to the friends database with the use command, and then use stats to view the current status.

Field description: sharded is true, indicating that the collection is sharded.

The shards section lists the two shard servers, "shard0000" and "shard0001". The count field of "shard0000" is 1016, meaning that 1016 records are stored on that shard server. The size field is the size of the data stored on that shard server, in bytes.
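To see how evenly the data is spread, the per-shard counts can be turned into percentages. In the Python sketch below, only the sharded flag and the shard0000 count of 1016 come from the output described above. The shard0001 count and both size values are invented for illustration, on the assumption that all 10,000 inserted records went into an empty collection.

```python
from typing import Any, Dict

# Hypothetical stats document: only "sharded" and the shard0000 count of 1016
# come from the article; every other number is invented for illustration.
sample_stats: Dict[str, Any] = {
    "sharded": True,
    "shards": {
        "shard0000": {"count": 1016, "size": 101600},
        "shard0001": {"count": 8984, "size": 898400},
    },
}


def shard_distribution(stats: Dict[str, Any]) -> Dict[str, float]:
    """Return each shard's share of the documents, as a percentage."""
    if not stats.get("sharded"):
        raise ValueError("collection is not sharded")
    counts = {name: info["count"] for name, info in stats["shards"].items()}
    total = sum(counts.values())
    return {name: round(100.0 * n / total, 2) for name, n in counts.items()}


distribution = shard_distribution(sample_stats)
print(distribution)
```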
