HDFS source code analysis: UnderReplicatedBlocks (I)

Source: Internet
Author: User

UnderReplicatedBlocks is an important data structure for block replication in HDFS. In a high-performance, highly fault-tolerant system such as HDFS, blocks are replicated for several reasons, for example load balancing for performance and restoring the replica count for fault tolerance. Any kind of work involves priorities, and block replication is no exception: it cannot simply follow first-in-first-out or some other simple strategy. Restoring the replica count for fault tolerance, especially for a block with only one remaining replica, must take priority over load balancing for performance. Block replication therefore needs a notion of priority. How is that priority determined, and how is it stored? The answers are all in UnderReplicatedBlocks, which this article begins to analyze.

UnderReplicatedBlocks stores the blocks that need to be replicated, organized by priority. What the replication priorities are can be seen from several static member variables of the UnderReplicatedBlocks class and their comments:

  /** The queue with the highest priority: {@value} */
  // Priority value of the highest-priority queue: 0
  static final int QUEUE_HIGHEST_PRIORITY = 0;

  /** The queue for blocks that are way below their expected value: {@value} */
  // Priority value of the second-priority queue: 1.
  // For blocks with far fewer replicas than required.
  static final int QUEUE_VERY_UNDER_REPLICATED = 1;

  /** The queue for "normally" under-replicated blocks: {@value} */
  // Priority value of the third-priority queue: 2.
  // For blocks with fewer replicas than required, i.e. the general case.
  static final int QUEUE_UNDER_REPLICATED = 2;

  /** The queue for blocks that have the right number of replicas,
   * but which the block manager felt were badly distributed: {@value}
   */
  // Priority value of the fourth-priority queue: 3.
  // For blocks with enough replicas that BlockManager considers badly distributed.
  static final int QUEUE_REPLICAS_BADLY_DISTRIBUTED = 3;

  /** The queue for corrupt blocks: {@value} */
  // Priority value of the fifth-priority queue: 4. For corrupt blocks.
  static final int QUEUE_WITH_CORRUPT_BLOCKS = 4;

Block replication priority is divided into five levels, from highest to lowest:

1. QUEUE_HIGHEST_PRIORITY = 0: highest priority

For blocks whose replica shortage is most severe: the current number of replicas is below the expected number and is either exactly 1, or 0 with at least one decommissioned replica remaining. This is the most dangerous situation and the data is the most likely to be lost, so the replication priority is the highest;

2. QUEUE_VERY_UNDER_REPLICATED = 1: second priority

For blocks that are badly under-replicated, though less so than above: the current number of replicas is below the expected number but greater than 1, and the current number of replicas curReplicas multiplied by 3 is less than the expected number of replicas expectedReplicas. This situation is also fairly dangerous and the data is also easily lost, so the replication priority is also high;

3. QUEUE_UNDER_REPLICATED = 2: third priority

For blocks with fewer replicas than expected, where the shortage is neither severe nor critical;

4. QUEUE_REPLICAS_BADLY_DISTRIBUTED = 3: fourth priority

For blocks that have enough replicas but are not spread across enough racks; replication here serves the load-balancing strategy;

5. QUEUE_WITH_CORRUPT_BLOCKS = 4: fifth priority

For corrupt blocks: the replica count is 0 and there are no decommissioned replicas, so the priority is the lowest. Does such a block still need to be replicated at all? We leave that question open for now.

From the descriptions above, we can briefly summarize as follows:

When the current number of replicas is lower than expected: if it is 1, or if it is 0 but decommissioned replicas exist, the replication priority is the highest; if it is 0 and there are no decommissioned replicas, the priority is the lowest; if it is greater than 1 but three times that number is still less than the expected number, it is the second priority; otherwise it is the third priority. When the current number of replicas equals or exceeds the expected number, the replicas may not be spread across enough racks; the priority is then the fourth, just above the lowest.

UnderReplicatedBlocks also provides the getPriority() method, which computes the replication priority from a block and its replica counts. The code is as follows:

  /** Return the priority of a block
   * (computes the replication priority of the given block)
   * @param block a under replicated block
   * @param curReplicas current number of replicas of the block
   * @param expectedReplicas expected number of replicas of the block
   * @return the priority for the blocks, between 0 and ({@link #LEVEL}-1)
   */
  private int getPriority(Block block,
                          int curReplicas,
                          int decommissionedReplicas,
                          int expectedReplicas) {
    // Parameter check: curReplicas must be greater than or equal to 0
    assert curReplicas >= 0 : "Negative replicas!";
    if (curReplicas >= expectedReplicas) {
      // Block has enough copies, but not enough racks:
      // return the fourth-priority value 3
      return QUEUE_REPLICAS_BADLY_DISTRIBUTED;
    } else if (curReplicas == 0) {
      // If there are zero non-decommissioned replicas but there are
      // some decommissioned replicas, then assign them highest priority
      if (decommissionedReplicas > 0) {
        return QUEUE_HIGHEST_PRIORITY;
      }
      // All we have are corrupt blocks: no live replica and no decommissioned
      // replica, so the lowest priority value 4 is returned
      return QUEUE_WITH_CORRUPT_BLOCKS;
    } else if (curReplicas == 1) {
      // Only one replica - risk of loss: highest priority 0
      return QUEUE_HIGHEST_PRIORITY;
    } else if ((curReplicas * 3) < expectedReplicas) {
      // There is less than a third as many blocks as requested;
      // this is considered very under-replicated: second-priority value 1
      return QUEUE_VERY_UNDER_REPLICATED;
    } else {
      // Add to the normal queue for under replicated blocks:
      // third-priority value 2
      return QUEUE_UNDER_REPLICATED;
    }
  }

The idea matches the introduction above. The general logic is as follows:

1. If the current number of replicas curReplicas is greater than or equal to the expected number of replicas, return the fourth-priority value 3 (QUEUE_REPLICAS_BADLY_DISTRIBUTED);

2. If curReplicas is 0: when decommissionedReplicas is greater than 0, return the highest-priority value 0; when there is neither a live replica nor a decommissioned one, the block is considered corrupt, its replication priority is the lowest, and the fifth-priority value 4 is returned;

3. If curReplicas is 1, only one replica exists and there is a risk of loss, so return the highest-priority value 0;

4. If curReplicas multiplied by 3 is less than the expected number of replicas expectedReplicas, return the second-priority value 1;

5. Otherwise the block is under-replicated in the ordinary sense; return the third-priority value 2.
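
The decision order above can be tried out in isolation. The following is a minimal sketch, not the HDFS class itself: PrioritySketch is a hypothetical name, the Block parameter is dropped because the rules do not use it, and the assert is replaced by an exception so the check always runs.

```java
// Hypothetical stand-alone class that mirrors the decision order described above.
class PrioritySketch {
    static final int QUEUE_HIGHEST_PRIORITY = 0;
    static final int QUEUE_VERY_UNDER_REPLICATED = 1;
    static final int QUEUE_UNDER_REPLICATED = 2;
    static final int QUEUE_REPLICAS_BADLY_DISTRIBUTED = 3;
    static final int QUEUE_WITH_CORRUPT_BLOCKS = 4;

    static int getPriority(int curReplicas, int decommissionedReplicas, int expectedReplicas) {
        if (curReplicas < 0) {
            throw new IllegalArgumentException("Negative replicas!");
        }
        if (curReplicas >= expectedReplicas) {
            return QUEUE_REPLICAS_BADLY_DISTRIBUTED;   // enough copies, maybe not enough racks
        }
        if (curReplicas == 0) {
            // Only decommissioned replicas left: rescue first. Nothing left: corrupt.
            return decommissionedReplicas > 0 ? QUEUE_HIGHEST_PRIORITY : QUEUE_WITH_CORRUPT_BLOCKS;
        }
        if (curReplicas == 1) {
            return QUEUE_HIGHEST_PRIORITY;             // a single replica is at risk of loss
        }
        if (curReplicas * 3 < expectedReplicas) {
            return QUEUE_VERY_UNDER_REPLICATED;        // fewer than a third of the expected copies
        }
        return QUEUE_UNDER_REPLICATED;
    }

    public static void main(String[] args) {
        System.out.println(getPriority(3, 0, 3));   // 3: 3 >= 3
        System.out.println(getPriority(0, 1, 3));   // 0: only a decommissioned replica remains
        System.out.println(getPriority(0, 0, 3));   // 4: no replica of any kind
        System.out.println(getPriority(1, 0, 3));   // 0: single replica
        System.out.println(getPriority(2, 0, 10));  // 1: 2 * 3 = 6 < 10
        System.out.println(getPriority(2, 0, 3));   // 2: 2 * 3 = 6 >= 3
    }
}
```

Note that the order of the checks matters: the comparison against expectedReplicas comes first, so a block with 0 current and 0 expected replicas lands in queue 3 rather than queue 4.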


UnderReplicatedBlocks also defines the member variables that hold the replication priority queues, as follows:

  /** The total number of queues: {@value} */
  // Number of queues
  static final int LEVEL = 5;

  /** the queues themselves */
  // The collection of queues
  private final List<LightWeightLinkedSet<Block>> priorityQueues
      = new ArrayList<LightWeightLinkedSet<Block>>();

  /** Stores the replication index for each priority */
  // Stores the replication index corresponding to each priority
  private Map<Integer, Integer> priorityToReplIdx = new HashMap<Integer, Integer>(LEVEL);

The total number of queues is 5. The blocks awaiting replication are held in the priorityQueues list, which contains one LightWeightLinkedSet of Block per priority level. In addition, priorityToReplIdx maps each numeric priority to a replication index, that is, an index into the LightWeightLinkedSet for that priority.

The UnderReplicatedBlocks constructor is as follows:

  /** Create an object. */
  // Constructor
  UnderReplicatedBlocks() {
    // Create 5 LightWeightLinkedSet collections and store them in the
    // priorityQueues list, and store the priority-to-replication-index
    // mapping in priorityToReplIdx
    for (int i = 0; i < LEVEL; i++) {
      priorityQueues.add(new LightWeightLinkedSet<Block>());
      priorityToReplIdx.put(i, 0);
    }
  }

The constructor creates 5 LightWeightLinkedSet collections, adds them to the priorityQueues list in order from highest to lowest priority, and initializes the replication index of every priority to 0.
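
The resulting layout can be pictured with plain JDK collections. This is a minimal sketch under stated assumptions: QueueLayoutSketch is a hypothetical class, String block IDs stand in for Block, and java.util.LinkedHashSet stands in for the HDFS-specific LightWeightLinkedSet (both keep insertion order and reject duplicates, but they are not the same class).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical class for illustration; not the HDFS implementation.
class QueueLayoutSketch {
    static final int LEVEL = 5;

    // One insertion-ordered set per priority level; index 0 is the most urgent.
    final List<Set<String>> priorityQueues = new ArrayList<>();
    // Per-priority replication index, all starting at 0.
    final Map<Integer, Integer> priorityToReplIdx = new HashMap<>(LEVEL);

    QueueLayoutSketch() {
        for (int i = 0; i < LEVEL; i++) {
            priorityQueues.add(new LinkedHashSet<String>());
            priorityToReplIdx.put(i, 0);
        }
    }

    // Total number of queued blocks across all priority levels.
    int size() {
        int total = 0;
        for (Set<String> queue : priorityQueues) {
            total += queue.size();
        }
        return total;
    }

    public static void main(String[] args) {
        QueueLayoutSketch s = new QueueLayoutSketch();
        s.priorityQueues.get(0).add("blk_1");
        s.priorityQueues.get(2).add("blk_2");
        System.out.println(s.priorityQueues); // [[blk_1], [], [blk_2], [], []]
        System.out.println(s.size());         // 2
    }
}
```

Because the list index is the priority value itself, looking up the queue for a priority is a single get(priLevel) call.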

UnderReplicatedBlocks also provides methods for adding a block, removing a block, and updating a block's priority, as follows:

1. Adding a block: add()

  /** add a block to a under replication queue according to its priority
   * @param block a under replication block
   * @param curReplicas current number of replicas of the block
   * @param decomissionedReplicas the number of decommissioned replicas
   * @param expectedReplicas expected number of replicas of the block
   * @return true if the block was added to a queue.
   */
  synchronized boolean add(Block block,
                           int curReplicas,
                           int decomissionedReplicas,
                           int expectedReplicas) {
    assert curReplicas >= 0 : "Negative replicas!";
    // Compute the replication priority priLevel from the block and its replica counts
    int priLevel = getPriority(block, curReplicas, decomissionedReplicas,
                               expectedReplicas);
    // If priLevel is a valid priority (not LEVEL) and the block is successfully
    // added to the set fetched from priorityQueues for priLevel, return true
    if (priLevel != LEVEL && priorityQueues.get(priLevel).add(block)) {
      if (NameNode.blockStateChangeLog.isDebugEnabled()) {
        NameNode.blockStateChangeLog.debug(
          "BLOCK* NameSystem.UnderReplicationBlock.add:"
          + block
          + " has only " + curReplicas
          + " replicas and need " + expectedReplicas
          + " replicas so is added to neededReplications"
          + " at priority level " + priLevel);
      }
      return true;
    }
    // Otherwise return false
    return false;
  }

The add() method is fairly simple. It first calls getPriority() with the block and its replica counts to compute the replication priority priLevel. Then, if priLevel is a valid priority (not equal to LEVEL, i.e. 5) and the block is successfully added to the set fetched from priorityQueues for priLevel, it returns true to indicate that the block was added; otherwise it returns false to indicate that the add failed.

2. Removing a block: remove()
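
The guard in add() can fail for two different reasons, which a small sketch makes visible. The assumptions here: AddSketch is a hypothetical class, String IDs stand in for Block, LinkedHashSet stands in for LightWeightLinkedSet, and the priority level is passed in directly rather than computed by getPriority().

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical class for illustration; the priority is supplied by the caller
// instead of being computed from replica counts.
class AddSketch {
    static final int LEVEL = 5;
    private final List<Set<String>> priorityQueues = new ArrayList<>();

    AddSketch() {
        for (int i = 0; i < LEVEL; i++) {
            priorityQueues.add(new LinkedHashSet<String>());
        }
    }

    // Returns true only when the level is valid AND the set did not already
    // contain the block (Set.add returns false for a duplicate).
    synchronized boolean add(String block, int priLevel) {
        return priLevel != LEVEL && priorityQueues.get(priLevel).add(block);
    }

    public static void main(String[] args) {
        AddSketch q = new AddSketch();
        System.out.println(q.add("blk_1", 0));     // true: newly queued
        System.out.println(q.add("blk_1", 0));     // false: already in queue 0
        System.out.println(q.add("blk_2", LEVEL)); // false: LEVEL is not a valid queue index
    }
}
```

The short-circuit && matters: when priLevel equals LEVEL the list is never indexed, so no IndexOutOfBoundsException can occur.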

  /** remove a block from a under replication queue */
  synchronized boolean remove(Block block,
                              int oldReplicas,
                              int decommissionedReplicas,
                              int oldExpectedReplicas) {
    // Compute the replication priority priLevel by calling getPriority()
    // with the block and its replica counts
    int priLevel = getPriority(block, oldReplicas,
                               decommissionedReplicas,
                               oldExpectedReplicas);
    // Call the two-parameter remove() method to remove the block
    return remove(block, priLevel);
  }

  /**
   * Remove a block from the under replication queues.
   *
   * The priLevel parameter is a hint of which queue to query
   * first: if negative or >= {@link #LEVEL} this shortcutting
   * is not attempted.
   *
   * If the block is not found in the nominated queue, an attempt is made to
   * remove it from all queues.
   *
   * <i>Warning:</i> This is not a synchronized method.
   * @param block block to remove
   * @param priLevel expected privilege level
   * @return true if the block was found and removed from one of the queues
   */
  boolean remove(Block block, int priLevel) {
    // If priLevel is valid and the block is removed from the set fetched
    // from priorityQueues for priLevel, return true: removal succeeded
    if (priLevel >= 0 && priLevel < LEVEL
        && priorityQueues.get(priLevel).remove(block)) {
      if (NameNode.blockStateChangeLog.isDebugEnabled()) {
        NameNode.blockStateChangeLog.debug(
          "BLOCK* NameSystem.UnderReplicationBlock.remove: "
          + "Removing block " + block
          + " from priority queue " + priLevel);
      }
      return true;
    } else {
      // Try to remove the block from all queues if the block was
      // not found in the queue for the given priority level.
      // As soon as any removal succeeds, return true.
      for (int i = 0; i < LEVEL; i++) {
        if (priorityQueues.get(i).remove(block)) {
          if (NameNode.blockStateChangeLog.isDebugEnabled()) {
            NameNode.blockStateChangeLog.debug(
              "BLOCK* NameSystem.UnderReplicationBlock.remove: "
              + "Removing block " + block
              + " from priority queue " + i);
          }
          return true;
        }
      }
    }
    // Finally, if the block was not found anywhere, return false: removal failed
    return false;
  }

First, the four-parameter remove() method calls getPriority() with the block and its replica counts to compute the replication priority priLevel, and then calls the two-parameter remove() method to remove the block.

Second, in the two-parameter remove() method, if priLevel is valid and the block is successfully removed from the set fetched from priorityQueues for priLevel, the method returns true to indicate that the removal succeeded. Otherwise, that is, when the block could not be removed from the set for the given priority, the method tries to remove the block from the queue of every priority in turn; as soon as one removal succeeds it returns true. If the block is not found in any queue, it returns false to indicate that the removal failed.

3. Updating the priority: update()
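
The hint-then-scan behaviour of the two-parameter remove() can be sketched as follows. The assumptions: RemoveSketch and its put() helper are hypothetical, String IDs stand in for Block, LinkedHashSet stands in for LightWeightLinkedSet, and the debug logging is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical class for illustration; logging omitted.
class RemoveSketch {
    static final int LEVEL = 5;
    private final List<Set<String>> priorityQueues = new ArrayList<>();

    RemoveSketch() {
        for (int i = 0; i < LEVEL; i++) {
            priorityQueues.add(new LinkedHashSet<String>());
        }
    }

    // Hypothetical helper that places a block directly into one queue.
    void put(String block, int priLevel) {
        priorityQueues.get(priLevel).add(block);
    }

    // priLevel is only a hint: try that queue first, then scan every queue.
    boolean remove(String block, int priLevel) {
        if (priLevel >= 0 && priLevel < LEVEL
                && priorityQueues.get(priLevel).remove(block)) {
            return true;
        }
        for (Set<String> queue : priorityQueues) {
            if (queue.remove(block)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        RemoveSketch q = new RemoveSketch();
        q.put("blk_1", 2);
        System.out.println(q.remove("blk_1", 0));  // true: hint was wrong, found by the scan
        System.out.println(q.remove("blk_1", 2));  // false: already removed
        System.out.println(q.remove("blk_9", -1)); // false: invalid hint, never queued
    }
}
```

The fallback scan makes removal robust when the caller's replica counts are stale and the computed priority no longer matches the queue the block actually sits in.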

  /**
   * Recalculate and potentially update the priority level of a block.
   *
   * If the block priority has changed from before an attempt is made to
   * remove it from the block queue. Regardless of whether or not the block
   * is in the block queue of (recalculated) priority, an attempt is made
   * to add it to that queue. This ensures that the block will be
   * in its expected priority queue (and only that queue) by the end of the
   * method call.
   * @param block a under replicated block
   * @param curReplicas current number of replicas of the block
   * @param decommissionedReplicas the number of decommissioned replicas
   * @param curExpectedReplicas expected number of replicas of the block
   * @param curReplicasDelta the change in the replicate count from before
   * @param expectedReplicasDelta the change in the expected replica count from before
   */
  synchronized void update(Block block, int curReplicas,
                           int decommissionedReplicas,
                           int curExpectedReplicas,
                           int curReplicasDelta, int expectedReplicasDelta) {
    // curReplicas is the current replica count; curReplicasDelta is its change from before.
    // curExpectedReplicas is the current expected replica count;
    // expectedReplicasDelta is its change from before.
    // Compute the previous replica count and the previous expected replica count
    int oldReplicas = curReplicas - curReplicasDelta;
    int oldExpectedReplicas = curExpectedReplicas - expectedReplicasDelta;
    // Compute the current replication priority curPri
    int curPri = getPriority(block, curReplicas, decommissionedReplicas, curExpectedReplicas);
    // Compute the previous replication priority oldPri
    int oldPri = getPriority(block, oldReplicas, decommissionedReplicas, oldExpectedReplicas);
    if (NameNode.stateChangeLog.isDebugEnabled()) {
      NameNode.stateChangeLog.debug("UnderReplicationBlocks.update "
        + block
        + " curReplicas " + curReplicas
        + " curExpectedReplicas " + curExpectedReplicas
        + " oldReplicas " + oldReplicas
        + " oldExpectedReplicas " + oldExpectedReplicas
        + " curPri " + curPri
        + " oldPri " + oldPri);
    }
    // If the previous priority oldPri is valid and differs from curPri,
    // call remove() to remove the block
    if (oldPri != LEVEL && oldPri != curPri) {
      remove(block, oldPri);
    }
    // If the current priority curPri is valid, fetch the corresponding set
    // from priorityQueues and add the block to it
    if (curPri != LEVEL && priorityQueues.get(curPri).add(block)) {
      if (NameNode.blockStateChangeLog.isDebugEnabled()) {
        NameNode.blockStateChangeLog.debug(
          "BLOCK* NameSystem.UnderReplicationBlock.update:"
          + block
          + " has only " + curReplicas
          + " replicas and needs " + curExpectedReplicas
          + " replicas so is added to neededReplications"
          + " at priority level " + curPri);
      }
    }
  }

The update() method adjusts a block's replication priority, and with it the block's position within UnderReplicatedBlocks, when the block's replica count or expected replica count changes. The general logic is as follows:

1. First, understand the parameters: curReplicas is the current number of replicas, curReplicasDelta is the change in the replica count from before, curExpectedReplicas is the current expected number of replicas, and expectedReplicasDelta is the change in the expected replica count from before;

2. Compute the previous replica count oldReplicas and the previous expected replica count oldExpectedReplicas by subtracting each delta from the corresponding current value;

3. Compute the current replication priority curPri;

4. Compute the previous replication priority oldPri;

5. If the previous priority oldPri is valid and differs from the current priority curPri, call the remove() method to remove the block;

6. If the current priority curPri is valid, fetch the corresponding set from the priorityQueues list by curPri and add the block to it.
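
The six steps can be followed end to end in a small sketch. The assumptions: UpdateSketch and its levelOf() helper are hypothetical, String IDs stand in for Block, LinkedHashSet stands in for LightWeightLinkedSet, logging is omitted, and the removal in step 5 is simplified to the old-priority queue only, without the scan over all queues that the real remove() performs.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical class for illustration; logging and the remove() fallback scan are omitted.
class UpdateSketch {
    static final int LEVEL = 5;
    private final List<Set<String>> priorityQueues = new ArrayList<>();

    UpdateSketch() {
        for (int i = 0; i < LEVEL; i++) {
            priorityQueues.add(new LinkedHashSet<String>());
        }
    }

    // Compact form of the priority rules (0 = most urgent, 4 = corrupt).
    static int getPriority(int cur, int decommissioned, int expected) {
        if (cur >= expected) return 3;
        if (cur == 0) return decommissioned > 0 ? 0 : 4;
        if (cur == 1) return 0;
        return (cur * 3 < expected) ? 1 : 2;
    }

    synchronized void update(String block, int curReplicas, int decommissionedReplicas,
                             int curExpectedReplicas, int curReplicasDelta,
                             int expectedReplicasDelta) {
        // Step 2: reconstruct the previous counts from the deltas.
        int oldReplicas = curReplicas - curReplicasDelta;
        int oldExpectedReplicas = curExpectedReplicas - expectedReplicasDelta;
        // Steps 3 and 4: priorities now and before.
        int curPri = getPriority(curReplicas, decommissionedReplicas, curExpectedReplicas);
        int oldPri = getPriority(oldReplicas, decommissionedReplicas, oldExpectedReplicas);
        // Step 5: leave the old queue if the priority changed (simplified removal).
        if (oldPri != LEVEL && oldPri != curPri) {
            priorityQueues.get(oldPri).remove(block);
        }
        // Step 6: make sure the block is in the queue for its current priority.
        if (curPri != LEVEL) {
            priorityQueues.get(curPri).add(block);
        }
    }

    // Hypothetical helper: the queue index holding the block, or -1 if none.
    int levelOf(String block) {
        for (int i = 0; i < LEVEL; i++) {
            if (priorityQueues.get(i).contains(block)) return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        UpdateSketch q = new UpdateSketch();
        q.update("blk_1", 1, 0, 3, 0, 0);         // 1 of 3 replicas, nothing changed
        System.out.println(q.levelOf("blk_1"));   // 0
        q.update("blk_1", 2, 0, 3, 1, 0);         // one replica gained: 1 -> 2
        System.out.println(q.levelOf("blk_1"));   // 2
    }
}
```

In the second call oldReplicas is 2 - 1 = 1, so oldPri is 0 while curPri is 2; the block leaves queue 0 and enters queue 2, ending up in exactly one queue.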




