MongoDB: Sharding (Introduction & Auto-Sharding & Shard Key)

Last Update:2017-02-09 Source: Internet

Author: User

Sharding (adding servers, that is, horizontal scaling) is how MongoDB scales out. With sharding, you can add more machines to cope with growing load and data volume without affecting the application.

Introduction

Sharding is the process of splitting data and distributing it across different machines. In a relational database, when a table grows too large (hundreds of millions of rows or more), a similar approach of splitting it into sub-tables is used; a shard here is a similar concept.

Manual sharding: when the database becomes a bottleneck for the application and we are using a relational database, we usually shard manually. That is, the application-layer code maintains connections to several database systems, and each connection is independent. The application-layer code hides the multiple underlying database instances and directs each query to a specific instance. The drawback of this approach is that it is troublesome to maintain. Adding or removing nodes in the underlying database cluster, or adjusting the data distribution and load patterns, all pose a considerable challenge for that application-layer code.
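The maintenance burden described above can be sketched in a few lines. This is only an illustration under stated assumptions: the instance labels (db-0, db-1, ...), the hash function, and the helper names pickInstance and countMoved are all hypothetical, and a simple hash-modulo rule stands in for whatever placement rule a real application would use.

```javascript
// Hypothetical connection labels; in a real application these would be
// separate database connections that the application must manage itself.
const instances = ["db-0", "db-1", "db-2"];

// Simple deterministic string hash (illustrative only, not a library function).
function hashKey(key) {
  let h = 0;
  for (const ch of key) {
    h = (h * 31 + ch.codePointAt(0)) >>> 0; // keep it an unsigned 32-bit integer
  }
  return h;
}

// The application itself decides which instance owns a given user.
function pickInstance(userName, nodes) {
  return nodes[hashKey(userName) % nodes.length];
}

// Count how many keys would map to a different instance after a topology change.
function countMoved(keys, before, after) {
  return keys.filter((k) => pickInstance(k, before) !== pickInstance(k, after)).length;
}

const names = ["alice", "bob", "carol", "dave", "erin", "frank", "grace", "heidi"];
const moved = countMoved(names, instances, [...instances, "db-3"]);
console.log(`${moved} of ${names.length} keys change instance after adding db-3`);
```

Every key counted by countMoved is data the application would have to migrate by hand, which is the kind of work automatic sharding is meant to take over.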

MongoDB was designed with scale-out in mind and supports automatic sharding. We can easily add machines to or remove machines from a MongoDB cluster, and the cluster automatically splits the data and balances the load.

"Auto-Sharding"

A shard is either a standalone MongoDB server (a single mongod process, used in development and test environments) or a replica set (used in production). The idea of sharding is to split a large collection into smaller parts and place them on different shards, so that each shard holds only part of the total data. Automatic sharding means that the application layer does not know the data has been split, nor which data lives on which shard. MongoDB provides a routing process, mongos, which must be running before sharding is set up and which knows the mapping between data and shards. The application talks to the router; the router forwards each request to the appropriate shard, collects the response data, and returns it to the application. The following two images show the path of a user request without sharding and with sharding:
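The request path through a router can be sketched with plain in-memory objects. This is a toy model, not how mongos is implemented: the shard names, the router object, and its placement rule (names sorting before "N" go to shard0) are assumptions made up for the illustration.

```javascript
// Each "shard" is modeled as an in-memory array; names are hypothetical.
const shards = { shard0: [], shard1: [] };

// Hypothetical router standing in for mongos. Only the router knows the
// placement rule; the application never sees it.
const router = {
  locate(name) {
    // Illustrative rule: names that sort before "N" live on shard0.
    return name.toUpperCase() < "N" ? "shard0" : "shard1";
  },
  insert(doc) {
    shards[this.locate(doc.name)].push(doc);
  },
  findByName(name) {
    // Forward to the one shard that can hold the name, then return its answer.
    return shards[this.locate(name)].filter((doc) => doc.name === name);
  },
  findAll() {
    // No usable key: ask every shard and merge the partial responses.
    return Object.values(shards).flat();
  },
};

// Application code: it only talks to the router.
["Alice", "Jimmy", "Quinn", "Zoe"].forEach((name) => router.insert({ name }));
console.log(router.findByName("Quinn")); // one document: { name: 'Quinn' }
console.log(router.findAll().length);    // 4
```

The application code at the bottom never mentions shard0 or shard1, which is the point of automatic sharding: placement is the router's concern.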

Before sharding:

After sharding:

So when should we upgrade the old, unsharded system to a sharded one? Usually when one of the following applies:

1. The machine is running out of disk space because the amount of data is too large.

2. A single mongod can no longer meet the write performance requirements. (As a reminder, if the goal is to improve read performance, a better solution is to set up a master-slave structure and let the slave nodes answer queries.)

3. We want to keep a large amount of data in memory to improve performance, but the memory of a single machine is always limited. (This is the difference between vertical scaling and horizontal scaling.)

"Shard Key"

When you set up sharding, you choose a key from the collection to serve as the basis for splitting the data. This key is called the shard key. A simple example: suppose we want to shard a collection that stores a shop's users, and we choose the person's name, name, as the shard key. The resulting distribution might be: the first shard stores names beginning with A-F, the second shard names beginning with G-P, and the third shard names beginning with Q-Z. When the user submits the query db.users.find({"name": "Jimmy"}), the request is routed to the second shard. When the user submits db.users.find({"name": {"$lt": "J"}}), the request is routed to the first and second shards. When the query does not include the shard key, it is sent to all shards. For an insert, the router sends the request to a specific shard based on the value of the name key in the inserted document. This is the role of the shard key.
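The three routing cases in this example (equality on the shard key, a range on the shard key, and no shard key at all) can be sketched as a small targeting function. This is a minimal sketch under assumptions: the shard names, the ranges table, and targetShards are hypothetical, names are assumed to start with an uppercase letter A-Z, and only the $lt operator is modeled.

```javascript
// Ranges from the example above, as [min, max) on the "name" shard key.
// Shard names are hypothetical. Names are assumed to start with A-Z.
const ranges = [
  { shard: "shard1", min: "A", max: "G" }, // A-F
  { shard: "shard2", min: "G", max: "Q" }, // G-P
  { shard: "shard3", min: "Q", max: "[" }, // Q-Z ("[" is the character after "Z")
];

// Return the shards that a query must visit.
function targetShards(query) {
  const cond = query.name;
  if (cond === undefined) {
    // No shard key in the query: every shard must be asked.
    return ranges.map((r) => r.shard);
  }
  if (typeof cond === "string") {
    // Equality on the shard key: exactly one range contains the value.
    return ranges.filter((r) => r.min <= cond && cond < r.max).map((r) => r.shard);
  }
  if (typeof cond.$lt === "string") {
    // Range query: every range that starts below the bound may hold matches.
    return ranges.filter((r) => r.min < cond.$lt).map((r) => r.shard);
  }
  // Other operators are not modeled in this sketch; fall back to all shards.
  return ranges.map((r) => r.shard);
}

console.log(targetShards({ name: "Jimmy" }));      // shard2
console.log(targetShards({ name: { $lt: "J" } })); // shard1, shard2
console.log(targetShards({ age: 30 }));            // shard1, shard2, shard3
```

An insert would follow the same path as the equality case: the value of name in the new document selects exactly one range.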

As data is added or removed, one shard may end up heavily loaded while another is lightly loaded. In that case MongoDB automatically rebalances the data and load, so that the traffic to each shard ends up roughly the same.
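The idea of rebalancing can be illustrated with a toy loop that moves one unit of data at a time from the fullest shard to the emptiest. The article does not describe MongoDB's actual balancing policy, so the function balance, its threshold parameter, and the shard names below are assumptions made purely for illustration.

```javascript
// Move one unit of data at a time from the fullest shard to the emptiest one
// until their sizes differ by less than `threshold`. Illustrative rule only;
// the real balancer's policy is not described in this article.
function balance(sizes, threshold = 2) {
  const counts = { ...sizes }; // work on a copy, leave the input untouched
  let migrations = 0;
  for (;;) {
    const names = Object.keys(counts);
    const fullest = names.reduce((a, b) => (counts[a] >= counts[b] ? a : b));
    const emptiest = names.reduce((a, b) => (counts[a] <= counts[b] ? a : b));
    if (counts[fullest] - counts[emptiest] < threshold) break;
    counts[fullest] -= 1;
    counts[emptiest] += 1;
    migrations += 1;
  }
  return { counts, migrations };
}

const result = balance({ shard1: 10, shard2: 2, shard3: 3 });
console.log(result.counts);     // every shard ends up with 5
console.log(result.migrations); // 5
```

Moving data in small steps rather than all at once keeps each migration cheap, which is one plausible reason to structure a balancer this way.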

Which key should be chosen as the shard key? One principle is that the shard key should have many distinct values. If the shard key were gender, with only the two values "male" and "female", the collection could be split into at most two pieces, and if the collection is very large, such sharding will not ultimately solve the efficiency problem. The principle for choosing a shard key is similar to the principle for choosing the key of an index, and in practice the shard key is usually a key that an index is built on.
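The "many distinct values" principle can be checked on sample data by counting distinct values per candidate key. A minimal sketch, assuming a hypothetical helper distinctCount and made-up sample documents:

```javascript
// Count distinct values of a field; this is an upper bound on how many
// pieces a collection can be split into when that field is the shard key.
function distinctCount(docs, field) {
  return new Set(docs.map((doc) => doc[field])).size;
}

// Hypothetical sample documents.
const users = [
  { name: "Alice", gender: "female" },
  { name: "Bob", gender: "male" },
  { name: "Carol", gender: "female" },
  { name: "Dave", gender: "male" },
  { name: "Erin", gender: "female" },
];

console.log(distinctCount(users, "gender")); // 2
console.log(distinctCount(users, "name"));   // 5
```

With gender as the key, the count stays at 2 no matter how large the collection grows, whereas the count for name grows with the data.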

This concludes the introduction to sharding and its principles. Next, we will set up our first shard.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.
