Process Analysis of recoverLease and recoverBlock in HDFS

Source: Internet
Author: User

Recently I needed to analyze the recoverLease and recoverBlock processes in HDFS. These are my notes.

I. recoverLease

recoverLease is used to recover a lease. As I understand it, it releases the lease previously held on a file, closes the file, and reports the result to the NameNode.

recoverLease can be reached through two call paths:

1. DistributedFileSystem.create -> DFSClient.create -> NameNode.create -> FSNamesystem.startFile -> FSNamesystem.startFileInternal -> recoverLeaseInternal(myFile, src, holder, clientMachine, false)

This path is taken when a client creates a file. The last argument is false here, so the file does not need to be closed.

2. DistributedFileSystem.recoverLease -> DFSClient.recoverLease -> NameNode.recoverLease(src, clientName) -> FSNamesystem.recoverLease(src, holder, clientMachine) -> recoverLeaseInternal(inode, src, holder, clientMachine, true)

This path is taken when the client explicitly calls recoverLease on a path. The last argument is true here, so the file does need to be closed (see HDFS-1554).

Both paths eventually call FSNamesystem.recoverLeaseInternal. recoverLeaseInternal mainly does the following:

1. Get pendingFile and look up the lease of the current holder (the caller).
2. If that lease is not null and force is false, throw AlreadyBeingCreatedException.
3. Look up the lease of the client currently recorded on the file. If that lease is null, throw AlreadyBeingCreatedException.
4. If force is true, call internalReleaseLeaseOne directly. Otherwise, call internalReleaseLease if the lease's soft limit has expired; an AlreadyBeingCreatedException is still thrown in the non-force case.
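The four steps above can be sketched as a small toy model. This is only an illustration under stated assumptions, not the FSNamesystem code: the class LeaseRecoverySketch, the unchecked AlreadyBeingCreated exception, the currentWriter parameter, and the released list (standing in for the call to internalReleaseLeaseOne / internalReleaseLease) are all hypothetical names introduced here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the four steps above; not the real FSNamesystem code. */
final class LeaseRecoverySketch {
    /** Hypothetical unchecked stand-in for AlreadyBeingCreatedException. */
    static final class AlreadyBeingCreated extends RuntimeException {
        AlreadyBeingCreated(String msg) { super(msg); }
    }

    /** Lease holder name -> time of last renewal (assumed bookkeeping). */
    private final Map<String, Long> lastRenewalByHolder = new HashMap<>();
    private final long softLimitMillis;
    /** Paths for which lease release (and so block recovery) was started. */
    final List<String> released = new ArrayList<>();

    LeaseRecoverySketch(long softLimitMillis) {
        this.softLimitMillis = softLimitMillis;
    }

    void addLease(String holder, long renewedAtMillis) {
        lastRenewalByHolder.put(holder, renewedAtMillis);
    }

    void recoverLeaseInternal(String src, String holder, String currentWriter,
                              boolean force, long nowMillis) {
        // Steps 1-2: the caller already holds a lease and is not forcing.
        if (!force && lastRenewalByHolder.containsKey(holder)) {
            throw new AlreadyBeingCreated(holder + " already holds a lease");
        }
        // Step 3: the client recorded as writing the file must have a lease.
        Long writerRenewedAt = lastRenewalByHolder.get(currentWriter);
        if (writerRenewedAt == null) {
            throw new AlreadyBeingCreated("no lease found for " + src);
        }
        // Step 4a: forced recovery releases the lease on this file right away.
        if (force) {
            released.add(src);
            return;
        }
        // Step 4b: otherwise release only if the soft limit has passed,
        // and reject the caller either way.
        if (nowMillis - writerRenewedAt > softLimitMillis) {
            released.add(src);
        }
        throw new AlreadyBeingCreated(src + " is being written by " + currentWriter);
    }
}
```

In this model a forced call returns normally after starting the release, while a create-style call (force = false) always ends in the exception, whether or not the soft limit had expired.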

Either way, internalReleaseLeaseOne is what ends up triggering recoverBlock.

recoverLeaseInternal -> internalReleaseLease -> internalReleaseLeaseOne

internalReleaseLeaseOne process:

1. Find the targets of the last block of the file.
2. pendingFile.assignPrimaryDatanode() -> DatanodeDescriptor.addBlockToBeRecovered() -> recoverBlocks.offer

   The blocks queued in recoverBlocks are returned to the DataNode as a recoverBlock command the next time that DataNode sends a heartbeat to the NameNode.
3. reassignLease, which reassigns the lease to the NN_Recovery holder.

II. recoverBlock Process

recoverBlock is executed on the DataNode and has two paths: one initiated by the NameNode and the other initiated by DFSClient.

1. recoverBlock initiated by the NameNode

A) Process on the NameNode:

NameNode.sendHeartbeat (called by the DataNode to send a heartbeat to the NameNode; the NameNode returns the cmds) -> FSNamesystem.handleHeartbeat -> DatanodeDescriptor.getLeaseRecoveryCommand -> recoverBlocks.poll()

As the recoverLease analysis above shows, if there was a recoverLease request, the blocks it queued in recoverBlocks (a BlockQueue) are polled here and handed back to the DataNode.
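The hand-off between the lease-release path (offer) and the heartbeat path (poll) can be illustrated with a minimal sketch. It assumes a plain in-memory queue of block names; RecoveryQueueSketch, its string block IDs, and the maxBlocks parameter are hypothetical and only mirror the method names mentioned above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Toy model of the recoverBlocks queue; not the real DatanodeDescriptor. */
final class RecoveryQueueSketch {
    private final Queue<String> recoverBlocks = new ArrayDeque<>();

    /** Lease-release side: queue a block whose last replica set needs recovery. */
    void addBlockToBeRecovered(String blockId) {
        recoverBlocks.offer(blockId);
    }

    /**
     * Heartbeat side: drain up to maxBlocks queued blocks into one command.
     * An empty list means there is nothing to recover on this heartbeat.
     */
    List<String> getLeaseRecoveryCommand(int maxBlocks) {
        List<String> batch = new ArrayList<>();
        String blockId;
        while (batch.size() < maxBlocks && (blockId = recoverBlocks.poll()) != null) {
            batch.add(blockId);
        }
        return batch;
    }
}
```

The point of the queue is that the NameNode never calls the DataNode directly: the work waits until the DataNode's next heartbeat picks it up.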

B) Process on the DataNode:

DataNode.run -> offerService() (runs continuously, interacting with the NameNode) -> processCommand(cmds[]) -> processCommand(cmd) -> DNA_RECOVERBLOCK -> recoverBlocks(bcmd.getBlocks(), bcmd.getTargets()) -> recoverBlock(blocks[i], false, targets[i], true) [closeFile = true]

That is, the DataNode has a heartbeat thread that reports to the NameNode every 3 seconds by default and receives the cmds to execute on the DataNode. If the NameNode sends back a recoverBlock request, the recoverBlock method is called on the DataNode with closeFile = true. Because this request was initiated by recoverLease, the file is closed and the file's block information is made consistent with what is stored on the DataNodes.

2. recoverBlock initiated by DFSClient

DFSClient.processDatanodeError -> DataNode.recoverBlock(Block block, boolean keepLength, DatanodeInfo[] targets) -> DataNode.recoverBlock(block, keepLength, targets, false) [closeFile = false]

DFSClient sends the block's packet data through DataStreamer. If an exception occurs during this process, processDatanodeError performs recovery: it identifies the failed DataNode in the pipeline, selects one of the two remaining DataNodes to initiate recoverBlock, and rebuilds the pipeline from that DataNode.

In this case, the closeFile argument passed to the DataNode's recoverBlock is false, because the goal is to recover from an error that occurred while DFSClient was writing the block, not to close the file.
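The client-side recovery decision can be sketched as follows. This is a simplified illustration, not DFSClient itself: PipelineRecoverySketch and Plan are hypothetical, DataNodes are represented by plain names, and choosing the smallest remaining name as the node that runs recoverBlock is an assumption made here so that the choice is deterministic.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Toy model of picking a recovery node after a pipeline error. */
final class PipelineRecoverySketch {
    /** Result of the planning step: who is left, who recovers, and closeFile. */
    static final class Plan {
        final List<String> remaining;
        final String primary;
        final boolean closeFile;
        Plan(List<String> remaining, String primary, boolean closeFile) {
            this.remaining = remaining;
            this.primary = primary;
            this.closeFile = closeFile;
        }
    }

    static Plan plan(List<String> pipeline, int errorIndex) {
        if (errorIndex < 0 || errorIndex >= pipeline.size()) {
            throw new IllegalArgumentException("errorIndex out of range: " + errorIndex);
        }
        // Drop the failed datanode from the pipeline (remove by index).
        List<String> remaining = new ArrayList<>(pipeline);
        remaining.remove(errorIndex);
        if (remaining.isEmpty()) {
            throw new IllegalStateException("no datanode left in the pipeline");
        }
        // Assumption: pick the smallest remaining name as the recovery node.
        String primary = Collections.min(remaining);
        // Client-initiated recovery keeps the file open, so closeFile is false.
        return new Plan(remaining, primary, false);
    }
}
```

For example, with a pipeline of dn3, dn1, dn2 and an error at index 1, the plan keeps dn3 and dn2 and asks dn2 to run recoverBlock with closeFile = false.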

recoverBlock(Block block, boolean keepLength, DatanodeInfo[] targets, boolean closeFile)

1. Check whether the block is already in ongoingRecovery, i.e. is already being recovered. If it is, throw an IOException saying that the block is already being recovered and that this request to recover it is ignored. If it is not, add it to ongoingRecovery.
2. Build a syncList from targets, i.e. determine which nodes and which blocks need to be synchronized.
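The ongoingRecovery check in step 1 is essentially a guard against running two recoveries of the same block at once. A minimal sketch of that idea, assuming blocks are identified by a long ID: OngoingRecoverySketch and isRecovering are hypothetical names, and an unchecked IllegalStateException is used here in place of the IOException described above to keep the example short.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of the ongoingRecovery guard; not the real DataNode code. */
final class OngoingRecoverySketch {
    private final Set<Long> ongoingRecovery = new HashSet<>();

    /** Runs the recovery work unless the same block is already being recovered. */
    void recover(long blockId, Runnable work) {
        synchronized (ongoingRecovery) {
            // Set.add returns false if the block was already registered.
            if (!ongoingRecovery.add(blockId)) {
                throw new IllegalStateException(
                    "Block " + blockId + " is already being recovered, ignoring this request");
            }
        }
        try {
            work.run();
        } finally {
            // Always unregister, so a failed recovery can be retried later.
            synchronized (ongoingRecovery) {
                ongoingRecovery.remove(blockId);
            }
        }
    }

    boolean isRecovering(long blockId) {
        synchronized (ongoingRecovery) {
            return ongoingRecovery.contains(blockId);
        }
    }
}
```

Only the request that loses the race is rejected; the recovery that is already running is not affected.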

---> syncBlock
1. First obtain a new generationStamp from the NameNode for this syncBlock.
2. Create newBlock with the new generationStamp.
3. Call updateBlock on each DataNode in syncList to update the old block to newBlock.
4. If the update succeeds, report commitBlockSynchronization to the NameNode, including the new block and its generationStamp.
5. Return a LocatedBlock.
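The syncBlock steps above can be sketched with in-memory stand-ins. Everything here is an assumption for illustration: SyncBlockSketch, Block, Replica, FakeNameNode, and the replica helper are hypothetical types, failures are modeled as RuntimeException, and the method returns the list of DataNode names that were updated rather than a real LocatedBlock.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of syncBlock; not the real DataNode code. */
final class SyncBlockSketch {
    static final class Block {
        final long id;
        final long generationStamp;
        Block(long id, long generationStamp) {
            this.id = id;
            this.generationStamp = generationStamp;
        }
    }

    /** Hypothetical stand-in for one datanode holding a replica. */
    interface Replica {
        String name();
        void updateBlock(Block oldBlock, Block newBlock);
    }

    /** Hypothetical stand-in for the namenode side. */
    static final class FakeNameNode {
        private long stamp;
        Block committed;
        FakeNameNode(long initialStamp) { this.stamp = initialStamp; }
        long nextGenerationStamp() { return ++stamp; }
        void commitBlockSynchronization(Block newBlock) { committed = newBlock; }
    }

    /** Test helper: a replica that either accepts or rejects the update. */
    static Replica replica(String name, boolean fails) {
        return new Replica() {
            public String name() { return name; }
            public void updateBlock(Block oldBlock, Block newBlock) {
                if (fails) throw new RuntimeException(name + " failed to update");
            }
        };
    }

    static List<String> syncBlock(Block block, List<Replica> syncList, FakeNameNode nn) {
        // Steps 1-2: get a new generation stamp and build the new block.
        Block newBlock = new Block(block.id, nn.nextGenerationStamp());
        // Step 3: update every replica, remembering which ones succeeded.
        List<String> succeeded = new ArrayList<>();
        for (Replica r : syncList) {
            try {
                r.updateBlock(block, newBlock);
                succeeded.add(r.name());
            } catch (RuntimeException e) {
                // A failed replica is left out of the new pipeline.
            }
        }
        if (succeeded.isEmpty()) {
            throw new IllegalStateException("no replica could be updated");
        }
        // Step 4: tell the namenode about the new block and its stamp.
        nn.commitBlockSynchronization(newBlock);
        // Step 5: return where the new block lives.
        return succeeded;
    }
}
```

The new generation stamp is what lets stale replicas be told apart later: any replica that missed the update still carries the old stamp.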

---> updateBlock
1. FSDataset.updateBlock
2. If finalize is true, call FSDataset.finalizeBlockIfNeeded and notify the NameNode that the block has been finalized.

That is all; recorded here for future reference.
