How orphan documents are generated (MongoDB orphaned document)


Developers who use MongoDB have probably heard of orphaned documents, a topic that has caused plenty of grief. This article, based on MongoDB 3.0, looks at how an orphaned document can be produced. It requires MongoDB to run as a sharded cluster; if you are not familiar with sharded clusters, you can refer to the article on building one listed in the references.

In MongoDB's official documentation, the description of orphaned documents is brief:

In a sharded cluster, orphaned documents are those documents on a shard that also exist in chunks on other shards as a result of failed migrations or incomplete migration cleanup due to abnormal shutdown. Delete orphaned documents using cleanupOrphaned to reclaim disk space and reduce confusion.

As you can see, an orphaned document is a document that, in a sharded cluster, exists on more than one shard at the same time. We know that in a MongoDB sharded cluster the subsets of data held by different shards are disjoint; in theory a document can live on only one shard, and the mapping between chunks (and therefore documents) and shards is maintained on the config servers. The documentation points out one possible cause of orphaned documents: during a chunk migration the mongod instance goes down unexpectedly, so the migration fails or its cleanup is left incomplete. It also notes that cleanupOrphaned can be used to delete orphaned documents.

News reports about disasters and accidents usually follow an unwritten rule: the shorter the report, the more serious the matter. I do not know whether MongoDB adopted the same convention for orphaned documents: the description of the possible causes is not exhaustive, and no method is given for detecting whether orphaned documents exist. As for cleanupOrphaned, it takes a certain amount of courage to run it in a production environment.

Why orphaned documents are generated

As an ordinary application developer who has not read the MongoDB source code, my off-the-top-of-my-head guess was that a chunk migration involves three steps: copy the data from the source shard to the target shard, update the metadata on the config servers, and delete the data from the source shard. Of course, the steps are not necessarily in that order. If these three steps could be made atomic, there would in theory be no problem. Even so, it was still not clear to me how exactly an orphaned document comes about.

A few days ago, while browsing the official documentation, I found a description of the migration process (chunk-migration-procedure). Roughly translated, the procedure is as follows (a sketch of triggering such a migration by hand follows the list):

    1. The balancer sends the moveChunk command to the source shard.
    2. The source shard executes the moveChunk command internally and ensures that, during the migration, newly inserted documents are still written to the source shard.
    3. The target shard builds the indexes it needs, if any are missing.
    4. The target shard requests documents from the source shard; note that this is a copy operation, not a move.
    5. After receiving the last document of the chunk, the target shard starts a synchronization process to make sure it also receives the changes made to those documents on the source shard during the migration.
    6. When synchronization is complete, the target shard reports the new metadata (the chunk's new location) to the config server.
    7. After the previous step finishes, the source shard starts deleting the old documents.
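
For reference, the migration that the balancer initiates in step 1 can also be requested by hand with the moveChunk admin command against mongos. Below is a minimal sketch, assuming pymongo, the test_db.sharded_col namespace used later in this article, and the shard names rs1/rs2 from my cluster; the _id value is just an example taken from the log shown later.

    # A minimal sketch (not something the balancer itself runs): ask mongos to
    # move the chunk containing a given _id to the shard named "rs2".
    from bson import ObjectId
    from pymongo import MongoClient

    mongos = MongoClient("127.0.0.1", 27017)
    mongos.admin.command(
        "moveChunk", "test_db.sharded_col",
        find={"_id": ObjectId("595e3e74d71ffd5c7be8c8b7")},  # any _id inside the chunk to move
        to="rs2")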

If the above operations were atomic, no step could cause a problem; without that guarantee, a machine going down during step 4, 5, 6 or 7 could cause trouble. As for how the problem arises, the official documentation (chunk-migration-queuing) explains it this way:

The balancer does not wait for the current migration's delete phase to complete before starting the next chunk migration

This queuing behavior allows shards to unload chunks more quickly in cases of heavily imbalanced cluster, such as when performing initial data loads without pre-splitting and when adding new shards. If multiple delete phases are queued but not yet complete, a crash of the replica set's primary can orphan data from multiple migrations.

In short, to speed up chunk migration (for example, when a new shard joins and many chunks need to be migrated), the delete phase (step 7) is not executed immediately; it is queued and executed asynchronously. If a crash happens at that point, orphaned documents can be produced.

Producing an orphaned document

Based on the official documentation, how do I produce an orphaned document? My idea was simple: monitor the MongoDB log, and kill the shard's primary as soon as a log line marking a particular point in the migration process appears.

Prerequisites

In the article "Learn MongoDB by creating sharded cluster step by step" I described in detail how to build a sharded cluster. In my case there are two shards, each of which is a replica set consisting of a primary, a secondary and an arbiter. In addition, a database test_db with sharding enabled was created, and the collection sharded_col in it is sharded on _id; this article builds on that sharded cluster. Note, however, that in that article I disabled the journal of the mongod instances to save disk space (--nojournal in the startup options), whereas in this article, to stay closer to a realistic setup, the mongod instances are started with --journal to enable journaling.
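
For reference, here is a minimal sketch of the sharding setup just described (enabling sharding on test_db and sharding sharded_col on _id). It assumes pymongo and a mongos listening on 127.0.0.1:27017; it is not the original setup script.

    # A minimal sketch of the sharding setup assumed in this article: enable
    # sharding on test_db and shard sharded_col on _id (ranged sharding, the
    # default).
    from pymongo import MongoClient

    mongos = MongoClient("127.0.0.1", 27017)
    mongos.admin.command("enableSharding", "test_db")
    mongos.admin.command("shardCollection", "test_db.sharded_col", key={"_id": 1})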

In addition, two more points. The first is the condition for chunk migration: a migration is triggered only when the difference in the number of chunks between shards reaches a certain threshold:

Number of chunks       Migration threshold
Fewer than 20          2
20-79                  4
80 and greater         8

The second is that if a document does not contain an _id field, MongoDB automatically adds one whose value is an ObjectId. An ObjectId consists of the following parts:

    • A 4-byte value representing the seconds since the Unix epoch,
    • A 3-byte machine identifier,
    • A 2-byte process ID, and
    • A 3-byte counter, starting with a random value.
Thus, with a ranged shard key (the default, as opposed to a hashed shard key), a large number of documents inserted without an _id field within a short period of time all land on the same shard, which helps bring about chunk splits and migrations.
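
To make this concrete, here is a tiny illustration (assuming the bson package that ships with pymongo): ObjectIds generated back-to-back sort in generation order, so with a ranged shard key on _id new documents keep landing in the chunk whose upper bound is MaxKey.

    # ObjectIds generated in quick succession share the leading timestamp bytes
    # and differ mainly in the trailing counter, so in practice they are
    # monotonically increasing.
    from bson import ObjectId

    ids = [ObjectId() for _ in range(5)]
    print(ids)
    print(sorted(ids) == ids)   # True: generation order matches sort order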

Get ready

First of all, we need to know what the log looks like when a chunk is migrated, so I inserted a batch of records with a Python script (a sketch of such a script follows), confirmed with sh.status() that chunks had been split and migrated, and then checked the MongoDB log. In the log of the primary of rs1 (the primary shard for the sharded_col collection), i.e. rs1_1.log, I found the output shown after the script.
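
The sketch below shows the kind of insert script I mean (the original script is not reproduced here verbatim). It assumes pymongo and a mongos listening on 127.0.0.1:27017.

    # A minimal sketch of an insert script of this kind (not the original one).
    # Documents are inserted without an explicit _id, so an ObjectId is generated
    # automatically (by the driver or the server); with ranged sharding on _id,
    # these pile up in the top chunk and eventually trigger splits and migrations.
    import random
    import string

    from pymongo import MongoClient

    mongos = MongoClient("127.0.0.1", 27017)
    col = mongos.test_db.sharded_col

    def random_doc():
        # roughly 1 KB of payload per document so chunks grow at a reasonable pace
        payload = "".join(random.choice(string.ascii_letters) for _ in range(1024))
        return {"payload": payload}

    for _ in range(1000):
        col.insert_many([random_doc() for _ in range(100)])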

    2017-07-06T21:43:21.629+0800 I NETWORK  [conn6] Starting new replica set monitor for replica set rs2 with seeds 127.0.0.1:27021,127.0.0.1:27022
    2017-07-06T21:43:23.685+0800 I SHARDING [conn6] moveChunk data transfer progress: { active: true, ns: "test_db.sharded_col", from: "rs1/127.0.0.1:27018,127.0.0.1:27019", min: { _id: ObjectId('595e3e74d71ffd5c7be8c8b7') }, max: { _id: MaxKey }, shardKeyPattern: { _id: 1.0 }, state: "steady", counts: { cloned: 1, clonedBytes: 83944, catchup: 0, steady: 0 }, ok: 1.0, $gleStats: { lastOpTime: Timestamp 0|0, electionId: ObjectId('595e3b0ff70a0e5c3d75d6...') } } my mem used: 0
    2017-07-06T21:43:23.977+0800 I SHARDING [conn6] moveChunk migrate commit accepted by TO-shard: { active: false, ns: "test_db.sharded_col", from: "rs1/127.0.0.1:27018,127.0.0.1:27019", min: { _id: ObjectId('595e3e74d71ffd5c7be8c8b7') }, max: { _id: MaxKey }, shardKeyPattern: { _id: 1.0 }, state: "done", counts: { cloned: 1, clonedBytes: 83944, catchup: 0, steady: 0 }, ok: 1.0, $gleStats: { lastOpTime: Timestamp 0|0, electionId: ObjectId('595e3b0ff70a0e5c3d75d684') } }
    2017-07-06T21:43:23.977+0800 I SHARDING [conn6] moveChunk updating self version to: 3|1||590a8d4cd2575f23f5d0c9f3 through { _id: ObjectId('5937e11f48e2c04f793b1242') } -> { _id: ObjectId('595b829fd71ffd546f9e5b05') } for collection 'test_db.sharded_col'
    2017-07-06T21:43:23.977+0800 I NETWORK  [conn6] SyncClusterConnection connecting to [127.0.0.1:40000]
    2017-07-06T21:43:23.978+0800 I NETWORK  [conn6] SyncClusterConnection connecting to [127.0.0.1:40001]
    2017-07-06T21:43:23.978+0800 I NETWORK  [conn6] SyncClusterConnection connecting to [127.0.0.1:40002]
    2017-07-06T21:43:24.413+0800 I SHARDING [conn6] about to log metadata event: { _id: "xxx-2017-07-06T13:43:24-595e3e7c0db0d72b7244e620", server: "xxx", clientAddr: "127.0.0.1:52312", time: new Date(1499348604413), what: "moveChunk.commit", ns: "test_db.sharded_col", details: { min: { _id: ObjectId('595e3e74d71ffd5c7be8c8b7') }, max: { _id: MaxKey }, from: "rs1", to: "rs2", cloned: 1, clonedBytes: 83944, catchup: 0, steady: 0 } }
    2017-07-06T21:43:24.417+0800 I SHARDING [conn6] MigrateFromStatus::done About to acquire global lock to exit critical section
    2017-07-06T21:43:24.417+0800 I SHARDING [conn6] forking for cleanup of chunk data
    2017-07-06T21:43:24.417+0800 I SHARDING [conn6] MigrateFromStatus::done Global lock acquired to exit critical section
    2017-07-06T21:43:24.417+0800 I SHARDING [RangeDeleter] Deleter starting delete for: test_db.sharded_col from { _id: ObjectId('595e3e74d71ffd5c7be8c8b7') } -> { _id: MaxKey }, with opId: 6
    2017-07-06T21:43:24.417+0800 I SHARDING [RangeDeleter] rangeDeleter deleted 1 documents for test_db.sharded_col from { _id: ObjectId('595e3e74d71ffd5c7be8c8b7') } -> { _id: MaxKey }

The line "forking for cleanup of chunk data" near the end of the log indicates that the source shard is about to delete the old chunk data.

So I wrote a shell script: as soon as "forking for cleanup of chunk data" appears in rs1_1.log (or rs2_1.log), kill the corresponding mongod process. The script is as follows:
    # poll the shard primaries' logs and kill the mongod whose log shows the cleanup line
    check_loop() {
        echo 'checking'
        ret=`grep -c 'forking for cleanup of chunk data' /home/mongo_db/log/rs1_1.log`
        if [ $ret -gt 0 ]; then
            echo "will kill rs1 primary"
            kill -s 9 `ps aux | grep rs1_1 | awk '{print $2}'`
            exit 0
        fi
        ret=`grep -c 'forking for cleanup of chunk data' /home/mongo_db/log/rs2_1.log`
        if [ $ret -gt 0 ]; then
            echo "will kill rs2 primary"
            kill -s 9 `ps aux | grep rs2_1 | awk '{print $2}'`
            exit 0
        fi
        sleep 0.1
        check_loop
    }
    check_loop

First attempt

The first attempt is to use the above script.

First run the shell script above, then start inserting data from another terminal. After the shell script killed the process, I immediately logged in to rs1 and rs2 to check, and found that there were no orphaned documents (how to check for them is described in the second attempt).

Looking back at the log above, "rangeDeleter deleted 1 documents for test_db.sharded_col from ..." appears at almost the same moment as "forking for cleanup of chunk data", which means the data had already been deleted. The shell script only checks every 0.1 s, so the kill signal was most likely not sent until the migration, including the delete, had already completed. The fix is to kill earlier, so the shell script was changed to check for "moveChunk migrate commit accepted" (the corresponding line in the log above) instead.

The change to the shell script is simple: just replace the string passed to grep:

    # same as before, but kill as soon as the commit is accepted by the TO-shard
    check_loop() {
        echo 'checking'
        ret=`grep -c 'moveChunk migrate commit accepted' /home/mongo_db/log/rs1_1.log`
        if [ $ret -gt 0 ]; then
            echo "will kill rs1 primary"
            kill -s 9 `ps aux | grep rs1_1 | awk '{print $2}'`
            exit 0
        fi
        ret=`grep -c 'moveChunk migrate commit accepted' /home/mongo_db/log/rs2_1.log`
        if [ $ret -gt 0 ]; then
            echo "will kill rs2 primary"
            kill -s 9 `ps aux | grep rs2_1 | awk '{print $2}'`
            exit 0
        fi
        sleep 0.1
        check_loop
    }
    check_loop

Second attempt

Before the second attempt, the records in sharded_col were emptied so that chunk splits and migrations would be triggered sooner.

The steps are the same as before: start the shell script, then insert data, and wait for the shell script to kill the mongod process and exit.

Soon the shell script terminated, and ps aux | grep mongo confirmed that rs1_1 had been killed. I then logged in to mongos (mongo --port 27017):

    mongos> db.sharded_col.find().count()
    4

Then log in to rs1's primary (because rs1's original primary was killed, the new primary is rs1_2, on port 27019):

    rs1:PRIMARY> db.sharded_col.find({}, {'_id': 1})
    { "_id" : ObjectId("595ef413d71ffd4a82dea30d") }
    { "_id" : ObjectId("595ef413d71ffd4a82dea30e") }
    { "_id" : ObjectId("595ef413d71ffd4a82dea30f") }

Then log in to rs2's primary:

    rs2:PRIMARY> db.sharded_col.find({}, {'_id': 1})
    { "_id" : ObjectId("595ef413d71ffd4a82dea30f") }

Clearly, the record with ObjectId("595ef413d71ffd4a82dea30f") exists on both shards, so one of the two copies is an orphaned document. So on which shard does the MongoDB sharded cluster think this record should live? The simple way is to check sh.status() directly, but I skipped that here. Another way is to add a new field to the record through mongos and then query it on both shards:

    mongos> db.sharded_col.update({'_id': ObjectId("595ef413d71ffd4a82dea30f")}, {$set: {'newattr': 10}})
    WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })

    rs1:PRIMARY> db.sharded_col.find({'_id': ObjectId("595ef413d71ffd4a82dea30f")}, {'newattr': 1})
    { "_id" : ObjectId("595ef413d71ffd4a82dea30f"), "newattr" : 10 }

    rs2:PRIMARY> db.sharded_col.find({'_id': ObjectId("595ef413d71ffd4a82dea30f")}, {'newattr': 1})
    { "_id" : ObjectId("595ef413d71ffd4a82dea30f") }

This confirms that, as far as the cluster is concerned, the record lives on rs1; the copy on rs2 is the orphan.
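
This manual comparison can also be scripted. Below is a small sketch (not something I ran in the original experiment) that connects directly to each shard's current primary and reports any _id that shows up on more than one shard; the ports are the ones used in this cluster and would need adjusting to whichever members are primary at the time.

    # A small sketch that automates the check above: collect _id values directly
    # from each shard's primary and report _ids present on more than one shard.
    from collections import Counter
    from pymongo import MongoClient

    shard_primaries = {
        "rs1": MongoClient("127.0.0.1", 27019),  # rs1's primary after rs1_1 was killed
        "rs2": MongoClient("127.0.0.1", 27021),
    }

    counts = Counter()
    for name, client in shard_primaries.items():
        for doc in client.test_db.sharded_col.find({}, {"_id": 1}):
            counts[doc["_id"]] += 1

    orphan_candidates = [_id for _id, n in counts.items() if n > 1]
    print("documents present on more than one shard:", orphan_candidates)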

At this point I restarted rs1_1. rs.status() showed that the rs1 shard had returned to normal, and sh.status() gave the same result as before. From this I infer that there was no journal information with which to resume the interrupted migration.

Therefore, in this experiment, the record with ObjectId("595ef413d71ffd4a82dea30f") was in the middle of being moved from rs1 to rs2; because we killed rs1's primary, only part of the migration completed, producing an orphaned document. Looking back at the migration procedure described earlier, the kill of rs1's primary in this experiment happened before step 6 (before the target shard updated the metadata on the config server).

Using cleanupOrphaned

The effect of orphaned documents is that some queries return extra records (the orphaned documents). In the count above there are really only 3 records, but 4 are returned. A query that does not use an exact match on the shard key (_id here) can also return extra records; moreover, even a query on the shard key can be wrong if it uses $in or a range. For example:

    mongos> db.sharded_col.find({'_id': {$in: [ObjectId("595ef413d71ffd4a82dea30f")]}}).count()
    1
    mongos> db.sharded_col.find({'_id': {$in: [ObjectId("595ef413d71ffd4a82dea30f"), ObjectId("595ef413d71ffd4a82dea30d")]}}).count()
    3

The second query above uses $in and should in theory return 2; because of the orphaned document, it returns 3.

Essentially, whenever an operation has to be routed to more than one shard, it can produce wrong results in the presence of orphaned documents. This makes it impossible for application developers to guard against the anomalies caused by orphaned documents in application logic.

MongoDB offers an ultimate weapon for this problem, cleanupOrphaned, so let's try to delete all orphaned documents following the official documentation (Remove-all-orphaned-documents-from-a-shard). Note that cleanupOrphaned must be run on the primary of the shard:

Run cleanupOrphaned in the admin database directly on the mongod instance that is the primary replica set member of the shard. Do not run cleanupOrphaned on a mongos instance.

But the documentation does not say clearly on which shard the command should be executed: the "right" shard (the one that owns the chunk) or the "wrong" shard (the one holding the orphan). I figured that running it on both shards should not be a problem.
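
For reference, here is a sketch of one way to drive the command, modeled on the loop in the official remove-all-orphaned-documents-from-a-shard example but through pymongo; it is not the exact invocation I used. It connects directly to a shard primary, never to mongos.

    # A sketch of driving cleanupOrphaned from Python, modeled on the official
    # example loop. Connect directly to the shard's primary (27019 is rs1's
    # current primary in this experiment), not to mongos. pymongo raises
    # OperationFailure when the command returns ok: 0, which is what happened here.
    from pymongo import MongoClient
    from pymongo.errors import OperationFailure

    primary = MongoClient("127.0.0.1", 27019)

    next_key = {}
    try:
        while next_key is not None:
            result = primary.admin.command(
                "cleanupOrphaned", "test_db.sharded_col", startingFromKey=next_key)
            next_key = result.get("stoppedAtKey")
    except OperationFailure as exc:
        print("cleanupOrphaned failed:", exc.details)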

The result is the same regardless of whether it is executed on rs1 or rs2:

{"OK": 0,"errmsg": "Server is not part of a sharded cluster or the sharding metadata are not yet initialized."}


Checking with sh.status():

Shards:{"_id": "Rs1", "host": "rs1/127.0.0.1:27018,127.0.0.1:27019"}{"_id": "Rs2", "host": "rs2/127.0.0.1:27021,127.0.0.1:27022"}

Obviously, rs1 and rs2 are both part of the sharded cluster, so presumably the problem is that the "sharding metadata is not yet initialized".

The same error is reported in this discussion: https://groups.google.com/forum/#!msg/mongodb-user/cRr7SbE1xlU/VytVnX-ffp8J, but there seems to be no complete solution. At least, I tried the methods mentioned in that thread and none of them worked.

Summary

This article only tried to reproduce orphaned documents; I am sure there are other ways to produce them. Also, judging from the log above, killing the shard (replica set) primary at different moments, that is, interrupting the migration at different stages, should have different effects. Finally, the experiments show that the cleanupOrphaned command did not work as expected; in some cases the orphaned documents that are produced may not be cleanable at all. Of course, it is possible that I am simply using it incorrectly; readers are welcome to experiment for themselves.

References

Orphaned document

cleanupOrphaned

Learn MongoDB by creating sharded cluster step by step

Chunk-migration-procedure

Chunk-migration-queuing

Remove-all-orphaned-documents-from-a-shard
