MongoDB sharding principles, learning and trial (6): manual chunk splitting

Last Update:2018-12-07 Source: Internet

Author: User

Tags mongodb sharding


1. Manual chunk splitting mainly involves two shell functions: sh.splitAt(fullName, middle) and sh.splitFind(fullName, find). fullName is the full namespace of the collection, in the form database.collection. Both middle and find are query conditions that identify the chunk you want to split manually. Note that the condition must contain the shard key; otherwise an error is reported.
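Both helpers send the `split` admin command, with either a `middle` or a `find` field. The following is a minimal sketch of the command documents involved, not the shell's actual source. The namespace `test.users`, the shard key `_id`, the value 99, and the helper `buildSplitCommand` with its client-side key check are all assumptions made for illustration; the server performs its own validation.

```javascript
// Hypothetical helper: builds the document for MongoDB's `split` admin command.
// kind is "middle" (what sh.splitAt sends) or "find" (what sh.splitFind sends).
function buildSplitCommand(namespace, kind, condition, shardKeyFields) {
  if (kind !== "middle" && kind !== "find") {
    throw new Error("kind must be 'middle' or 'find'");
  }
  // Illustrative client-side check: the condition has to contain the shard key.
  var hasKey = shardKeyFields.every(function (field) {
    return Object.prototype.hasOwnProperty.call(condition, field);
  });
  if (!hasKey) {
    throw new Error("condition must contain the shard key");
  }
  var cmd = { split: namespace };
  cmd[kind] = condition;
  return cmd;
}

// Assumed namespace, shard key and value, for illustration only.
var splitAtCmd = buildSplitCommand("test.users", "middle", { _id: 99 }, ["_id"]);
var splitFindCmd = buildSplitCommand("test.users", "find", { _id: 99 }, ["_id"]);

// In a mongo shell connected to a mongos, the documents could be sent with:
//   db.adminCommand(splitAtCmd)
//   db.adminCommand(splitFindCmd)
```

Building the document separately makes it easy to see that the two functions differ only in which field carries the condition.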

The two functions differ as follows:

1.1 splitAt uses the condition middle to find the corresponding chunk, then splits that chunk into two at the first document matched by the condition.

(1) Before the manual split there are three chunks.

(2) Execute the split command.

(3) Check the chunk distribution again.

After the command runs, the second chunk has been split into two parts at the value {_id: ObjectId("50dc0790525e4314024b79d0")}.
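The effect on the chunk list can be modelled without a cluster. The sketch below treats chunks as half-open ranges over the shard key and cuts the one that contains the split point. Only the ObjectId value comes from the text; the chunk boundaries are invented, ObjectIds are represented by their hex strings, and the all-zero and all-f strings stand in for MinKey and MaxKey.

```javascript
// Chunks are modelled as half-open ranges [min, max) over the shard key.
// Returns a new chunk list in which the chunk containing splitPoint is cut in two.
function splitChunkAt(chunks, splitPoint) {
  var result = [];
  chunks.forEach(function (chunk) {
    var inside = chunk.min < splitPoint && splitPoint < chunk.max;
    if (inside) {
      result.push({ min: chunk.min, max: splitPoint });
      result.push({ min: splitPoint, max: chunk.max });
    } else {
      result.push(chunk);
    }
  });
  return result;
}

// Hypothetical boundaries; equal-length hex strings compare in the same order as the ids.
var chunksBefore = [
  { min: "000000000000000000000000", max: "50dc0700000000000000aaaa" },
  { min: "50dc0700000000000000aaaa", max: "50dc0800000000000000bbbb" },
  { min: "50dc0800000000000000bbbb", max: "ffffffffffffffffffffffff" }
];

var chunksAfter = splitChunkAt(chunksBefore, "50dc0790525e4314024b79d0");
console.log(chunksAfter.length); // three chunks become four
```

In this model the first and third chunks are untouched, and the two halves of the second chunk share the split value as their common boundary.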

1.2 According to the official documentation, splitFind splits the chunk found by the condition into two chunks of equal size, but in my tests (version 2.2.2) this was not the case.

(1) Prepare the data. Insert a large batch of regular rows (on the order of a million) into a new collection. The value of the name field ends with an auto-incrementing number.
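A data set of this shape could be generated roughly as follows. This is a sketch: the prefix "habc" is taken from the name value quoted later in the text, while the starting number 1, the function name `makeDocs`, and the collection name `test_coll` in the comment are assumptions.

```javascript
// Builds `count` documents whose name is a fixed prefix plus an increasing number.
function makeDocs(prefix, count) {
  var docs = [];
  for (var i = 1; i <= count; i++) {
    docs.push({ name: prefix + i });
  }
  return docs;
}

// A small sample; the real test used far more rows.
var sampleDocs = makeDocs("habc", 5);
console.log(sampleDocs.map(function (d) { return d.name; }).join(", "));

// In a mongo shell, the documents could be inserted one by one, for example:
//   makeDocs("habc", N).forEach(function (d) { db.test_coll.insert(d); });
```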

(2) After the data is inserted, the collection has been split into chunks.

(3) Let's see what the last document in the first chunk is.

(4) The name value of that last document is "habc780335", so the first chunk holds on the order of a million documents. Now split the first chunk into two with the splitFind() function.

(5) The name value used in the condition must fall in the first chunk. According to the official documentation, the first chunk should be split into two chunks of equal size. But what actually happens?

(6) The first chunk has indeed been split into two. The upper bound of the new second chunk is the ID value that used to end the first chunk. But are chunks 1 and 2 equal in size?

(7) The last document of the first chunk turns out to be its first document as well, which means the first chunk now holds only one document. Obviously the two chunks are not equal in size. What is really going on?
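The size comparison in steps (6) and (7) amounts to counting the documents that fall into each chunk range. The sketch below does this in memory with made-up numeric keys 1 to 10. It contrasts a split that leaves one document in the first chunk, as observed, with a split at the median, which is what "equal size" would mean. The helper names are hypothetical; against a real collection the count for one range would come from a range query such as find({_id: {$gte: min, $lt: max}}).count().

```javascript
// Counts how many keys fall into each half-open chunk range [min, max).
function countPerChunk(keys, chunks) {
  return chunks.map(function (chunk) {
    return keys.filter(function (k) {
      return k >= chunk.min && k < chunk.max;
    }).length;
  });
}

// The key at which a split would give two halves of (nearly) equal size.
function medianSplitPoint(sortedKeys) {
  return sortedKeys[Math.floor(sortedKeys.length / 2)];
}

var keys = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]; // made-up shard key values

// Split right after the first document, mirroring the observed result.
var observedCounts = countPerChunk(keys, [{ min: 1, max: 2 }, { min: 2, max: 11 }]);

// Split at the median, as the documentation describes.
var mid = medianSplitPoint(keys);
var evenCounts = countPerChunk(keys, [{ min: 1, max: mid }, { min: mid, max: 11 }]);

console.log(observedCounts, evenCounts);
```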

2. Today I ran into a problem when re-adding a shard that had been removed. Here is a record of it.

Problem: After removing a shard, I did not stop its process. Later I added it back for testing with db.runCommand({addshard: "hostname:port"}). The command reported success, and data migration started. After the migration completed, I ran a query and it failed with the error message "gotshardname different than what I had before".

The error means that the shard was named shard0001 when it was first added, but was named shardworkflow when it was added again. At query time the two names differ, so an error is reported. What confused me was how it knew the shard's previous name. I searched every collection in the config database and found no place that stored the previous name. Only the shards collection holds related data, and it contains only the data of the current shards. I searched Baidu and Google and found nothing. With no other option, I tried removing the shard and adding it again. That did not work either.

Although the remove operation reported success, after a long time I found that no data had been migrated at all. I checked the logs and found an error message.

The error was still the mismatch between the old and new name values. I restarted the shard, and the problem was solved.

Analysis: The name of each added shard is stored not only in the config database but also on the shard itself. The difference is that the config database stores it persistently, while each shard only caches it. If a removed shard is not restarted, the cached name stays around. One thing I still don't understand: since the old name exists and differs from the new one, why can the shard be added successfully and receive migrated data, with an error reported only on queries? Do data migration and queries read the name from different places? Note that although queries fail, writes still succeed.

How to avoid it: compare db.runCommand({addshard: "hostname:port", name: "xxx"}) with db.runCommand({addshard: "hostname:port"}). When addshard is called without a name, the system assigns default names that count up from shard0000. Therefore, you should specify the name explicitly in addshard.
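One way to make the explicit name hard to forget is to build the command through a small wrapper that refuses to omit it. This is only a sketch: `buildAddShardCommand` is a hypothetical helper, "hostname:port" is the placeholder used above, and reusing the earlier name shard0001 is an illustration, since the text only says that a name should be given explicitly.

```javascript
// Hypothetical helper: builds an addshard command document and insists on a name,
// so the server never falls back to a default such as shard0000.
function buildAddShardCommand(host, name) {
  if (!name) {
    throw new Error("specify the shard name explicitly");
  }
  return { addshard: host, name: name };
}

// Placeholder host; the name is the one the shard had before it was removed.
var addShardCmd = buildAddShardCommand("hostname:port", "shard0001");
console.log(JSON.stringify(addShardCmd));

// In a mongo shell connected to a mongos, the document could be sent with:
//   db.getSiblingDB("admin").runCommand(addShardCmd)
```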

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.
