Cassandra 3.0 Data Repair Mechanism

Source: Internet
Author: User
Tags: cassandra

Reference
https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsRepairNodesTOC.html

Premise:
Each piece of data has N replicas, the write consistency level is W, and the read consistency level is R.

Hinted Handoff: Write-Time Repair

A write sends N write requests, one per replica, but the coordinator waits for only W acknowledgements. For the remaining N - W replicas, a hint is recorded if the write fails.

A hint contains:
Target ID: the target node
Hint ID: the timestamp of the data
Message ID: the Cassandra version
Data: the data itself, stored as a blob
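The write path described above can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not Cassandra's implementation: the `Hint` fields mirror the hint contents listed above, while `coordinate_write`, `Unavailable`, the replica names, and the `up` set are hypothetical names introduced only for this example.

```python
import uuid
from dataclasses import dataclass, field


class Unavailable(Exception):
    """Hypothetical stand-in for Cassandra's UnavailableException."""


@dataclass
class Hint:
    """Illustrative hint record; fields follow the hint contents above."""
    target_id: str   # node that missed the write
    message_id: str  # Cassandra version that wrote the hint
    data: bytes      # the write itself, stored as a blob
    hint_id: uuid.UUID = field(default_factory=uuid.uuid1)  # time-based UUID


def coordinate_write(replicas, up, w, data, version="3.0"):
    """Send a write to all N replicas and record hints for the ones that are down.

    `up` is the set of replicas that accept the write (a stand-in for real
    network calls). Raises Unavailable when fewer than W replicas are up.
    Returns the list of hints the coordinator would store.
    """
    live = [node for node in replicas if node in up]
    if len(live) < w:
        raise Unavailable(f"need {w} replicas, only {len(live)} are up")
    # Every replica that is down gets a hint for later replay.
    return [Hint(target_id=node, message_id=version, data=data)
            for node in replicas if node not in up]


hints = coordinate_write(["n1", "n2", "n3"], up={"n1", "n2"}, w=2, data=b"k=v")
print([h.target_id for h in hints])
```

With N = 3 and W = 2, the write succeeds on `n1` and `n2`, and a single hint is kept for `n3`.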

Write-time repair covers five scenarios:

The consistency level cannot be met
An UnavailableException is thrown when the user-specified consistency level cannot be met, or when the coordinator is down.

Consistency level ANY
Writing the hint on the coordinator node also counts as a successful write.

The failure detector has already marked the node as down

When hinted handoff is enabled in the Cassandra configuration, the missed write is stored on the coordinator node as a hint for a period of time. Hints are saved in the coordinator's local hints directory and flushed every 10 seconds. When the node recovers, each stored hint is written to the recovered node. If the node has not recovered within max_hint_window_in_ms (3 hours), the coordinator stops writing new hints for it.
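The hint window and replay step can be sketched as follows. This is a simplified model, assuming millisecond timestamps; `should_store_hint`, `replay_hints`, and the `deliver` callback are hypothetical names, and only the 3-hour window comes from the text above.

```python
MAX_HINT_WINDOW_IN_MS = 3 * 60 * 60 * 1000  # 3 hours, the value quoted above


def should_store_hint(down_since_ms, now_ms, window_ms=MAX_HINT_WINDOW_IN_MS):
    """Return True while the target has been down no longer than the window."""
    return (now_ms - down_since_ms) <= window_ms


def replay_hints(hints, deliver):
    """Replay stored hints to a recovered node, oldest first.

    `hints` is a list of (timestamp_ms, payload) tuples and `deliver` is a
    hypothetical callback standing in for the network write.
    Returns the number of hints delivered.
    """
    delivered = 0
    for _, payload in sorted(hints, key=lambda h: h[0]):
        deliver(payload)
        delivered += 1
    return delivered


received = []
replay_hints([(20, b"second"), (10, b"first")], received.append)
print(received)
```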

The node has not yet been marked as down

When the node has not yet been marked as down and a write takes longer than write_request_timeout_in_ms (given as 10 seconds in the referenced documentation, although the stock cassandra.yaml sets it to 2000 ms), the coordinator node returns a TimeoutException, the write fails, and a hint is stored. When too many nodes have failed, the coordinator tracks the number of hints it is writing and, once that number exceeds a threshold, rejects writes and throws an OverloadedException.

The target node is removed from the cluster
The hints for this node are deleted.

Read Repair: Read-Time Repair

For tables that use DateTieredCompactionStrategy, set read_repair_chance to 0 so that no read repair is performed. For other compaction strategies, the read repair probability is generally set to 20%.

Repair based on consistency level R

A read contacts R replicas, compares the R copies it receives, and repairs any stale data. If R = 1, there is nothing to compare and no repair is performed.
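A minimal sketch of this comparison, assuming each replica is modelled as a dict that maps a key to a `(timestamp, value)` pair and that the newest timestamp wins. The function `read_with_repair` and this data model are illustrative assumptions, not Cassandra's API.

```python
def read_with_repair(replicas, key, r):
    """Read `key` from the first R replicas and repair stale copies.

    Each replica is a dict mapping key -> (timestamp, value). The copy with
    the highest timestamp is written back to any contacted replica that
    holds an older or missing copy. Returns the newest value, or None.
    """
    if r < 1 or r > len(replicas):
        raise ValueError("r must be between 1 and the number of replicas")
    contacted = replicas[:r]
    responses = [replica.get(key) for replica in contacted]
    present = [resp for resp in responses if resp is not None]
    if not present:
        return None
    newest = max(present, key=lambda tv: tv[0])
    for replica, resp in zip(contacted, responses):
        if resp != newest:
            replica[key] = newest  # repair the stale or missing copy
    return newest[1]


a = {"k": (1, "old")}
b = {"k": (2, "new")}
c = {"k": (1, "old")}
print(read_with_repair([a, b, c], "k", r=2), a["k"], c["k"])
```

With R = 2 only `a` and `b` are contacted, so `a` is repaired while `c` stays stale until a later read or repair reaches it. With R = 1 a single copy is read and nothing is compared.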

Random read repair

The digests of all N replicas are read, either directly or in the background, and compared, and stale data is repaired.

Manual Repair: Anti-Entropy Repair

Merkle Tree

Hash the data to form the leaf nodes, then build a parent node for each pair of nodes, repeating until a single root node remains.
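The construction can be sketched in a few lines. This is a generic Merkle tree, not Cassandra's internal one: the choice of SHA-256, the duplication of the last hash on odd-sized levels, and the names `build_merkle_tree` and `differing_leaves` are assumptions made for the example.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_merkle_tree(rows):
    """Return the tree as a list of levels, leaves first and root last."""
    if not rows:
        raise ValueError("need at least one row")
    level = [_h(row) for row in rows]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]  # duplicate the last hash to pair it
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def differing_leaves(tree_a, tree_b):
    """Indices of leaves whose hashes differ (equal leaf counts assumed)."""
    if tree_a[-1] == tree_b[-1]:
        return []  # equal roots: the replicas agree, nothing to repair
    return [i for i, (x, y) in enumerate(zip(tree_a[0], tree_b[0])) if x != y]


replica_a = build_merkle_tree([b"row1", b"row2", b"row3", b"row4"])
replica_b = build_merkle_tree([b"row1", b"row2", b"stale", b"row4"])
print(differing_leaves(replica_a, replica_b))
```

Comparing the roots first means two replicas that already agree exchange a single hash; only when the roots differ do the lower levels need to be examined.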

nodetool repair

The data this node is responsible for is divided into segments by token range, and all replica nodes that hold each segment take part in the repair. Every replica of every segment involved is repaired.

nodetool repair -pr

Repairs only the replicas of the segment that this node is directly responsible for (its primary range).

For example, if this node holds range1, range2, and range3, and range3 is the range closest to this node and has three replicas, then after the repair those three replicas of range3 are up to date.
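The difference in scope between the two commands can be illustrated with a toy ring. This sketch assumes a simple placement in which each range is named after its primary owner and is replicated to the next nodes clockwise; `ranges_held` and `repair_scope` are hypothetical helpers, not part of nodetool.

```python
def ranges_held(nodes, node, rf):
    """Ranges stored on `node`, each named by its primary owner.

    `nodes` lists the ring in token order. A node holds its own primary
    range plus the primary ranges of the rf - 1 nodes before it.
    """
    i = nodes.index(node)
    n = len(nodes)
    return [nodes[(i - k) % n] for k in range(min(rf, n))]


def repair_scope(nodes, node, rf, primary_only=False):
    """Map each range to repair (by primary owner) to its replica nodes."""
    n = len(nodes)
    owners = [node] if primary_only else ranges_held(nodes, node, rf)
    scope = {}
    for owner in owners:
        j = nodes.index(owner)
        scope[owner] = [nodes[(j + k) % n] for k in range(min(rf, n))]
    return scope


ring = ["A", "B", "C", "D", "E"]
print(repair_scope(ring, "C", rf=3))                     # like: nodetool repair
print(repair_scope(ring, "C", rf=3, primary_only=True))  # like: nodetool repair -pr
```

In this model a full repair on node C touches three ranges, while the primary-range variant touches only the range that C owns, along with its three replicas.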

nodetool repair -inc

Incremental repair. The repair leader sends a repair request to the other relevant nodes. Each node checks the repairedAt field in an SSTable's metadata to determine whether that SSTable has already been repaired, and builds Merkle trees only for SSTables that have not been repaired. The trees are then sent to the leader, which compares them and repairs the differences. After the repair, the leader sends an anticompaction command so that repaired and unrepaired ranges are stored in separate SSTables. A repaired SSTable is marked with the repairedAt field, whose value is the time of the repair.
Repair and compaction are not mutually exclusive, so an SSTable may be compacted away before it is marked as repaired, in which case its data is repaired again. This affects efficiency but not correctness.
Anticompaction is handled in two ways, depending on the compaction strategy: size-tiered and leveled.
When an SSTable is fully covered by the repaired range, no anticompaction is performed; only the repairedAt field is updated.
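The anticompaction step can be modelled roughly as below. This is a toy model under stated assumptions: an SSTable is reduced to a list of integer tokens, a `repaired_at` of 0 is taken to mean "not repaired", and the `SSTable` class and `anticompact` function are hypothetical names, not Cassandra classes.

```python
from dataclasses import dataclass


@dataclass
class SSTable:
    tokens: list          # tokens of the partitions stored in this SSTable
    repaired_at: int = 0  # assumed: 0 means not repaired, else repair time


def anticompact(sstable, lo, hi, repair_time):
    """Separate repaired from unrepaired data for the token range (lo, hi].

    Returns the resulting SSTables. A fully covered SSTable is not
    rewritten; only its repaired_at field is updated.
    """
    inside = [t for t in sstable.tokens if lo < t <= hi]
    outside = [t for t in sstable.tokens if not (lo < t <= hi)]
    if not outside:
        sstable.repaired_at = repair_time  # fully covered: metadata only
        return [sstable]
    if not inside:
        return [sstable]  # nothing in the repaired range: leave untouched
    return [SSTable(inside, repair_time), SSTable(outside, 0)]


print(anticompact(SSTable([1, 5, 20]), lo=0, hi=10, repair_time=100))
```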
