MongoDB sharding principles: learning and trial (6), manual chunk splitting

Source: Internet
Author: User
Tags: mongodb sharding

1. Manual chunk splitting mainly involves two functions: splitAt(fullname, middle) and splitFind(fullname, find). fullname is the full namespace of the collection, in the form database.collection. Both middle and find are query conditions that identify the chunk you want to split manually. Note that the condition must contain the shard key; otherwise an error is reported.
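To make the two call shapes concrete, here is a minimal sketch that builds the admin command documents the shell helpers send: sh.splitAt sends {split: fullname, middle: ...} and sh.splitFind sends {split: fullname, find: ...}. The helper buildSplitCommand, the namespace "test.users", and the shard key field "name" are assumptions made for illustration only. The shard key check simply mirrors the rule described above; it is not how the server validates the condition.

```javascript
// Hypothetical helper: builds the document behind splitAt / splitFind.
// mode "middle" corresponds to splitAt, mode "find" to splitFind.
function buildSplitCommand(fullname, mode, condition, shardKeyFields) {
  if (mode !== "middle" && mode !== "find") {
    throw new Error("mode must be 'middle' (splitAt) or 'find' (splitFind)");
  }
  // The condition must contain the shard key, otherwise the split is rejected.
  const missing = shardKeyFields.filter((field) => !(field in condition));
  if (missing.length > 0) {
    throw new Error("condition is missing shard key field(s): " + missing.join(", "));
  }
  return { split: fullname, [mode]: condition };
}

// Assumed namespace and shard key, for illustration only.
const atCmd = buildSplitCommand("test.users", "middle", { name: "habc500000" }, ["name"]);
const findCmd = buildSplitCommand("test.users", "find", { name: "habc500000" }, ["name"]);

console.log(JSON.stringify(atCmd));
console.log(JSON.stringify(findCmd));
```

In a mongo shell connected to mongos, documents of this shape would be passed to db.adminCommand(...), which is what the splitAt and splitFind helpers do on your behalf.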

  

The two functions differ as follows:

1.1 splitAt uses the condition middle to locate the chunk that contains it, then splits that chunk into two at the shard key value the condition specifies.

(1) Before the manual split there are three chunks.

  

(2) Execute the split command.

  

After the command runs, the chunk distribution shows that the second chunk has been split into two at the value {_id: ObjectId("50dc0790525e4314024b79d0")}.
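The effect of splitAt on the chunk ranges can be sketched with a toy in-memory model. This is not MongoDB code: splitAtValue is a hypothetical function, and plain numbers stand in for the ObjectId values of the real test. It relies only on the fact that a chunk covers the range from its min (inclusive) to its max (exclusive).

```javascript
// Toy model: each chunk covers [min, max). Splitting at `middle` replaces the
// chunk that contains it with two chunks that meet at `middle`.
function splitAtValue(chunks, middle) {
  const i = chunks.findIndex((c) => c.min <= middle && middle < c.max);
  if (i === -1) throw new Error("no chunk contains " + middle);
  const target = chunks[i];
  if (target.min === middle) {
    throw new Error("split point equals the chunk's lower bound");
  }
  const left = { min: target.min, max: middle };
  const right = { min: middle, max: target.max };
  return [...chunks.slice(0, i), left, right, ...chunks.slice(i + 1)];
}

// Three chunks before the split, as in step (1). The numbers are placeholders.
const before = [
  { min: -Infinity, max: 100 },
  { min: 100, max: 200 },
  { min: 200, max: Infinity },
];

const after = splitAtValue(before, 150);
console.log(after.length); // 4
console.log(after[1], after[2]);
```

The second chunk [100, 200) becomes [100, 150) and [150, 200), while the first and third chunks are untouched, which matches what the test above showed.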

1.2 According to the official documentation, splitFind splits the chunk located by the find condition into two chunks of equal size, but in my tests (version 2.2.2) this was not the case.

(1) Prepare the data. Insert regular data, on the order of a million rows, into a new collection. The value of the name field ends in an auto-incrementing number.

  

(2) After the data is inserted, the collection has been split into chunks.

  

(3) Look at the last document in the first chunk.

  

(4) The name value of that last document is "habc780335", so the first chunk holds on the order of a million documents. Now split the first chunk into two with the splitFind() function.

  

(5) The name value used in the condition must fall within the first chunk. According to the official documentation, the first chunk should then be divided into two chunks of equal size. But what actually happens?

  

(6) The first chunk has indeed been divided into two. The upper bound of the new second chunk is the upper bound that the first chunk had before the split. But are chunks 1 and 2 equal in size?

  

(7) The last document in the first chunk turns out to be its first document, which means the first chunk now holds only one document. Clearly the two chunks are not equal in size. What is really going on?
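The comparison in steps (5) to (7) can be sketched as a toy in-memory model. It contrasts a split at the median key, which is what the documentation led me to expect, with a split just after the first document, which is what I observed. countPerChunk and the ten sample keys are hypothetical. On a real cluster the chunk bounds would come from the chunks collection in the config database, and the counts from range queries on the shard key.

```javascript
// Count how many keys fall into each chunk. `bounds` holds the ascending split
// points, so chunk i covers [bounds[i - 1], bounds[i]).
function countPerChunk(keys, bounds) {
  const counts = new Array(bounds.length + 1).fill(0);
  for (const key of keys) {
    let i = 0;
    while (i < bounds.length && key >= bounds[i]) i++;
    counts[i]++;
  }
  return counts;
}

// Ten sample names with an auto-incrementing, zero-padded suffix (assumed).
const keys = [];
for (let n = 0; n < 10; n++) {
  keys.push("habc" + String(n).padStart(6, "0"));
}
keys.sort();

const medianPoint = keys[Math.floor(keys.length / 2)]; // expected split point
const observedPoint = keys[1]; // split right after the first document

console.log(countPerChunk(keys, [medianPoint])); // [ 5, 5 ]
console.log(countPerChunk(keys, [observedPoint])); // [ 1, 9 ]
```

A median split leaves both halves the same size, while the split I observed leaves a single document in the first chunk. The sketch only shows how to check the sizes; it does not explain why 2.2.2 behaved this way.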

2. Today I ran into a problem when re-adding a shard that had been removed. I am recording it here.

Problem: After removing a shard, I did not stop its process. Later I added it back for testing with db.runCommand({addshard: "hostname:port"}); the command reported success and data migration started. After the migration completed, I ran a query and it failed with the error message "gotShardName different than what i had before".

The error means that when the shard was first added it was assigned the name shard0001, and after being added again it was assigned the name shardworkflow. The query sees two different names, so an error is reported. What puzzled me was how it knew the shard's previous name. I searched every collection in the config database and found nowhere that stored the old name. Only the shards collection holds related data, and at that point it held only the entries for the current shards. Searching Baidu and Google turned up nothing. As a last resort I tried removing the shard and adding it again, but that did not work.

  

Although the command reported success, after a long wait I found that no data had been migrated at all. The logs showed an error message.

  

The error was still that the name differed from the previous one. I then restarted the shard, and the problem was solved.

Analysis: The name assigned to each shard is stored not only in the config database but also on the shard itself. The difference is that the config database persists the value, whereas each shard only caches it. If a removed shard is not restarted, it keeps the cached name. What I still do not understand is this: given that the old name is still cached and the new name is different, why does the add succeed and the data migrate, with the error appearing only on queries? Do data migration and queries read the name from different places? Note that although queries failed, writes still succeeded.

How to avoid it: compare db.runCommand({addshard: "hostname:port", name: "xxx"}) with db.runCommand({addshard: "hostname:port"}). If no name is given to addshard, the system assigns a default name, counting up from shard0000. It is therefore best to always specify the name explicitly in addshard, so that a shard gets the same name each time it is added.
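One way to enforce this habit is a small wrapper that refuses to build the addshard command without an explicit name. This is only a sketch: buildAddShardCommand and the port 27018 are hypothetical, and only the addshard and name fields come from the commands shown above.

```javascript
// Hypothetical guard: never rely on the default shardNNNN naming.
function buildAddShardCommand(hostAndPort, name) {
  if (typeof name !== "string" || name.length === 0) {
    throw new Error("refusing to build addshard without an explicit name");
  }
  return { addshard: hostAndPort, name: name };
}

// Host and port are placeholders; the name is the one from the incident above.
const addCmd = buildAddShardCommand("hostname:27018", "shardworkflow");
console.log(JSON.stringify(addCmd));
```

In a mongo shell connected to mongos, the resulting document would be passed to db.runCommand(...) against the admin database. Combined with restarting a removed shard before adding it back, this should keep the cached name and the configured name in agreement.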

 

 

 

 

  

  

  
