Recently I needed to analyze the recoverLease process, and with it the recoverBlock process.
I. recoverLease
recoverLease is used to recover a lease. As I understand it, this means releasing the lease previously held on a file, closing the file, and reporting the result to the NameNode.
recoverLease can be reached through two call paths:
1. DistributedFileSystem.create -> DFSClient.create -> NameNode.create -> FSNamesystem.startFile -> FSNamesystem.startFileInternal -> recoverLeaseInternal(myFile, src, holder, clientMachine, false)
This path is taken when a client creates a file. In this case the file does not need to be closed.
2. DistributedFileSystem.recoverLease -> DFSClient.recoverLease -> NameNode.recoverLease(src, clientName) -> FSNamesystem.recoverLease(src, holder, clientMachine) -> recoverLeaseInternal(inode, src, holder, clientMachine, true)
This path is taken when the client explicitly calls recoverLease on a path. In this case the file does need to be closed. See HDFS-1554.
Both paths end up in FSNamesystem.recoverLeaseInternal. Let's look at what recoverLeaseInternal mainly does:
1. Get the pendingFile and look up the lease of the current holder (the caller).
2. If that lease is not null and force is false, throw AlreadyBeingCreatedException (the current lease holder is trying to re-create the file).
3. Look up the lease of the client that currently has the file open. If that lease is null, throw AlreadyBeingCreatedException.
4. If force is true, call internalReleaseLeaseOne immediately. Otherwise, call internalReleaseLease only if the lease's soft limit has expired; in this non-force case AlreadyBeingCreatedException is thrown either way.
In both branches it is internalReleaseLeaseOne that triggers recoverBlock:
recoverLeaseInternal -> internalReleaseLease -> internalReleaseLeaseOne
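The four steps above can be sketched as a small decision function. This is only an illustration in plain Java, not the Hadoop source: the Lease class, the Outcome enum, the decide method, and the 60-second soft limit used in main are all hypothetical, and step 2 is simplified to "the requester is the file's current writer".

```java
import java.util.HashMap;
import java.util.Map;

class LeaseDecisionSketch {
    /** Simplified outcomes; the real code throws or calls release methods instead. */
    enum Outcome { REJECT, RELEASE_NOW, RELEASE_EXPIRED_THEN_REJECT }

    /** Hypothetical stand-in for a lease: only the last renewal time is kept. */
    static final class Lease {
        final long lastRenewedMs;
        Lease(long lastRenewedMs) { this.lastRenewedMs = lastRenewedMs; }
        boolean expiredSoftLimit(long nowMs, long softLimitMs) {
            return nowMs - lastRenewedMs > softLimitMs;
        }
    }

    static Outcome decide(Map<String, Lease> leases, String requester, String writer,
                          boolean force, long nowMs, long softLimitMs) {
        // Step 2: without force, the current writer may not re-create its own file.
        Lease requesterLease = leases.get(requester);
        if (!force && requesterLease != null && requester.equals(writer)) {
            return Outcome.REJECT;
        }
        // Step 3: the writer recorded on the pending file must still hold a lease.
        Lease writerLease = leases.get(writer);
        if (writerLease == null) {
            return Outcome.REJECT;
        }
        // Step 4: force releases at once; otherwise release only after the soft
        // limit has expired, and reject the caller in either non-force case.
        if (force) {
            return Outcome.RELEASE_NOW;
        }
        return writerLease.expiredSoftLimit(nowMs, softLimitMs)
                ? Outcome.RELEASE_EXPIRED_THEN_REJECT
                : Outcome.REJECT;
    }

    public static void main(String[] args) {
        Map<String, Lease> leases = new HashMap<>();
        leases.put("writer", new Lease(0L));
        long softLimitMs = 60_000L; // illustrative value
        System.out.println(decide(leases, "other", "writer", true, 1_000L, softLimitMs));
        System.out.println(decide(leases, "other", "writer", false, 1_000L, softLimitMs));
        System.out.println(decide(leases, "other", "writer", false, 61_000L, softLimitMs));
    }
}
```

The point of the sketch is the asymmetry between the two paths: only the explicit recoverLease path (force = true) can succeed from the caller's point of view, while the create path at most starts recovery and still rejects the caller.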
internalReleaseLeaseOne process:
1. Find the targets (DataNode locations) of the file's last block.
2. pendingFile.assignPrimaryDatanode() -> DatanodeDescriptor.addBlockToBeRecovered() -> recoverBlocks.offer
When the DataNode next sends a heartbeat to the NameNode, the queued entries in recoverBlocks are returned to it as a recoverBlock command.
3. reassignLease, which reassigns the lease to the NN_Recovery holder.
II. recoverBlock process
recoverBlock runs on a DataNode and can be reached in two ways: one initiated by the NameNode, the other by DFSClient.
1. recoverBlock initiated by the NameNode
A) Process on the NameNode:
NameNode.sendHeartbeat (called by the DataNode to send its heartbeat; the NameNode returns commands in the reply) -> FSNamesystem.handleHeartbeat ->
DatanodeDescriptor.getLeaseRecoveryCommand -> recoverBlocks.poll()
As the recoverLease analysis above shows, if a recoverLease request has been made, the recoverBlocks queue (of type BlockQueue) has entries waiting to be processed here.
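The hand-off between the two sides is a plain producer/consumer queue: the lease-release path offers a block, and the heartbeat handler polls it. A minimal sketch, assuming a hypothetical DatanodeState class that keeps block names as strings and returns a list rather than a real command object (the method names are borrowed from the call chains above, the signatures are not):

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Queue;

class RecoveryQueueSketch {
    /** Hypothetical per-DataNode state kept on the NameNode side. */
    static final class DatanodeState {
        private final Queue<String> recoverBlocks = new LinkedList<>();

        // Lease-release path: queue a block whose recovery this DataNode should lead.
        synchronized void addBlockToBeRecovered(String blockName) {
            recoverBlocks.offer(blockName);
        }

        // Heartbeat path: drain at most maxBlocks queued entries into one reply.
        synchronized List<String> getLeaseRecoveryCommand(int maxBlocks) {
            List<String> batch = new ArrayList<>();
            while (batch.size() < maxBlocks) {
                String next = recoverBlocks.poll();
                if (next == null) {
                    break; // queue is empty
                }
                batch.add(next);
            }
            return batch;
        }
    }

    public static void main(String[] args) {
        DatanodeState dn = new DatanodeState();
        dn.addBlockToBeRecovered("blk_1");
        dn.addBlockToBeRecovered("blk_2");
        // Simulates the reply to the next heartbeat from this DataNode.
        System.out.println(dn.getLeaseRecoveryCommand(10));
    }
}
```

Because the NameNode only answers heartbeats and never calls the DataNode directly, a recovery request waits in the queue until the next heartbeat arrives.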
B) Process on the DataNode:
DataNode.run -> offerService() (runs continuously, talking to the NameNode) -> processCommand(cmds[]) -> processCommand(cmd) ->
DNA_RECOVERBLOCK -> recoverBlocks(bcmd.getBlocks(), bcmd.getTargets()) -> recoverBlock(blocks[i], false, targets[i], true)
[closeFile = true]
That is, the DataNode has a heartbeat thread that reports to the NameNode every 3 seconds by default and receives the commands the DataNode should execute. If the NameNode sends back a recoverBlock request, the DataNode calls its recoverBlock method. Here closeFile = true because the recovery was initiated by recoverLease: the file is closed and the file's block information is made consistent with what is on the DataNodes.
2. recoverBlock initiated by DFSClient
DFSClient.processDatanodeError -> DataNode.recoverBlock(Block block, boolean keepLength, DatanodeInfo[] targets) -> DataNode.recoverBlock(Block block, boolean keepLength, DatanodeInfo[] targets, false) [closeFile = false]
DFSClient sends the block's packet data through the DataStreamer. If an exception occurs during this process, processDatanodeError handles the recovery: it identifies the failed DataNode in the pipeline, picks one of the remaining two DataNodes to initiate recoverBlock, and rebuilds the pipeline from that DataNode.
In this case the closeFile argument passed to the DataNode's recoverBlock is false, because the recovery is needed after a write error on the block and the client is not closing the file.
recoverBlock(Block block, boolean keepLength, DatanodeInfo[] targets, boolean closeFile):
1. Check whether the block is already in ongoingRecovery. If it is, throw an IOException saying the block is already being recovered and this request is ignored. If not, add it to ongoingRecovery.
2. Build a syncList from targets, that is, determine which nodes need to be synced and with which block.
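Step 1 is a guard against two recoveries of the same block running at once. A minimal sketch of that guard, assuming a hypothetical class that tracks block names as strings; removing the block from the set in a finally clause is my assumption about cleanup, not something stated above:

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

class OngoingRecoverySketch {
    private final Set<String> ongoingRecovery = new HashSet<>();

    /** Runs recoveryBody for blockName unless that block is already being recovered. */
    void recoverBlock(String blockName, Runnable recoveryBody) throws IOException {
        synchronized (ongoingRecovery) {
            // Set.add returns false when the element was already present.
            if (!ongoingRecovery.add(blockName)) {
                throw new IOException("Block " + blockName
                        + " is already being recovered, ignoring this request");
            }
        }
        try {
            recoveryBody.run();
        } finally {
            // Assumed cleanup: allow a later recovery of the same block.
            synchronized (ongoingRecovery) {
                ongoingRecovery.remove(blockName);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        OngoingRecoverySketch dn = new OngoingRecoverySketch();
        dn.recoverBlock("blk_1", () -> System.out.println("recovering blk_1"));
    }
}
```

This matters because both the NameNode path and the DFSClient path can ask for recovery of the same block at about the same time; the second request fails fast instead of racing the first.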
---> syncBlock
1. First obtain a new generation stamp for the block from the NameNode.
2. Create newBlock with the new generation stamp.
3. Call updateBlock on each DataNode in syncList to update the old block to newBlock.
4. If this succeeds, call commitBlockSynchronization on the NameNode, reporting the new block and its generation stamp.
5. Return a LocatedBlock.
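The syncBlock steps above can be sketched with in-memory stand-ins. Everything here is hypothetical (Block, NameNodeStub, ReplicaStub, and the starting stamp value), block lengths are ignored, and treating "at least one replica updated" as success is an assumption of this sketch:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SyncBlockSketch {
    /** Hypothetical block identity: id plus generation stamp. */
    static final class Block {
        final long id;
        final long generationStamp;
        Block(long id, long generationStamp) {
            this.id = id;
            this.generationStamp = generationStamp;
        }
    }

    /** Hypothetical NameNode stand-in: hands out stamps and records the commit. */
    static final class NameNodeStub {
        private long lastStamp = 1000L;
        Block committed;
        long nextGenerationStamp() { return ++lastStamp; }
        void commitBlockSynchronization(Block newBlock) { committed = newBlock; }
    }

    /** Hypothetical replica on one DataNode. */
    static final class ReplicaStub {
        Block current;
        final boolean healthy;
        ReplicaStub(Block current, boolean healthy) {
            this.current = current;
            this.healthy = healthy;
        }
        boolean updateBlock(Block oldBlock, Block newBlock) {
            if (!healthy || current.id != oldBlock.id) {
                return false;
            }
            current = newBlock;
            return true;
        }
    }

    /** Returns the replicas that were updated (stands in for the LocatedBlock). */
    static List<ReplicaStub> syncBlock(NameNodeStub nn, Block block, List<ReplicaStub> syncList) {
        long stamp = nn.nextGenerationStamp();           // step 1
        Block newBlock = new Block(block.id, stamp);     // step 2
        List<ReplicaStub> updated = new ArrayList<>();
        for (ReplicaStub replica : syncList) {           // step 3
            if (replica.updateBlock(block, newBlock)) {
                updated.add(replica);
            }
        }
        if (!updated.isEmpty()) {                        // step 4
            nn.commitBlockSynchronization(newBlock);
        }
        return updated;                                  // step 5
    }

    public static void main(String[] args) {
        NameNodeStub nn = new NameNodeStub();
        Block old = new Block(7L, 1000L);
        List<ReplicaStub> syncList =
                Arrays.asList(new ReplicaStub(old, true), new ReplicaStub(old, false));
        System.out.println(syncBlock(nn, old, syncList).size() + " replica(s) updated");
    }
}
```

Bumping the generation stamp is what separates recovered replicas from stale ones: a replica that missed the update keeps the old stamp and no longer matches the block the NameNode has committed.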
---> updateBlock
1. FSDataset.updateBlock
2. If finalize is true, call FSDataset.finalizeBlockIfNeeded, then notify the NameNode that the block has been completed.
The above is recorded here as a note for future reference.