In the article "HDFs Source code Analysis Editlog Get the edit log input stream," we learned more about how to get the edit log input stream editloginputstream. After we get the edit log input stream, is it time to get the data from the input stream to handle it? The answer is obvious! In the "HDFs Source code Analysis Editlogtailer" article, we are talking about editing log tracking synchronization, also mentioned the following two consecutive processing processes:
4. Obtain from the edit log editLog the collection streams of edit log input streams, starting from the latest transaction ID plus 1;
5. Call loadEdits() on the file system image FSImage instance image, using the collection streams of edit log input streams, to load the edit log into the target namesystem's file system image FSImage, and obtain the number of edits loaded, editsLoaded.
As we can see, after obtaining the collection streams of edit log input streams EditLogInputStream, we need to call FSImage's loadEdits() method with that collection to load the edit log into the file system image FSImage in the target namesystem. So how does HDFS actually read data from an edit log input stream? That is what this article explores in detail.
First, in FSEditLogLoader, the main class responsible for loading the edit log, the core method loadEditRecords() contains the following code:
while (true) {
  try {
    FSEditLogOp op;
    try {
      // Read an operation op from the edit log input stream in
      op = in.readOp();
      // If the operation op is null, break out of the loop and return
      if (op == null) {
        break;
      }
    } catch (Throwable e) {
      // ... some code omitted
    }
    // ... some code omitted
    try {
      // ... some code omitted
      long inodeId = applyEditLogOp(op, fsDir, startOpt,
          in.getVersion(true), lastInodeId);
      if (lastInodeId < inodeId) {
        lastInodeId = inodeId;
      }
    } catch (RollingUpgradeOp.RollbackException e) {
      // ... some code omitted
    } catch (Throwable e) {
      // ... some code omitted
    }
    // ... some code omitted
  } catch (RollingUpgradeOp.RollbackException e) {
    // ... some code omitted
  } catch (MetaRecoveryContext.RequestStopException e) {
    // ... some code omitted
  }
}
It reads an operation op from the edit log input stream in, and then calls the applyEditLogOp() method to apply it to the in-memory metadata FSNamesystem (a simplified sketch of this read-and-apply pattern follows). So the question becomes: how is this operation read and parsed from the data stream?
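To make the control flow concrete, here is a minimal, self-contained sketch of the read-and-apply pattern. EditOp and EditOpSource are hypothetical stand-ins for FSEditLogOp and EditLogInputStream, not the real HDFS classes; the real loadEditRecords() additionally handles recovery, expected transaction IDs, and progress reporting.

import java.io.IOException;

// Hypothetical stand-in for FSEditLogOp: a single operation with a transaction ID.
interface EditOp {
  long getTransactionId();
}

// Hypothetical stand-in for EditLogInputStream: yields operations until null.
interface EditOpSource {
  EditOp readOp() throws IOException;
}

public class ReplayLoop {
  // Replays every operation from the source and returns how many were applied.
  static long loadEdits(EditOpSource in) throws IOException {
    long numApplied = 0;
    EditOp op;
    while ((op = in.readOp()) != null) {
      apply(op);      // in HDFS this is applyEditLogOp(), acting on FSNamesystem
      numApplied++;
    }
    return numApplied;
  }

  private static void apply(EditOp op) {
    System.out.println("applying txid " + op.getTransactionId());
  }

  public static void main(String[] args) throws IOException {
    // A tiny in-memory source with two operations, for demonstration only.
    EditOpSource source = new EditOpSource() {
      private long next = 1;
      public EditOp readOp() {
        if (next > 2) {
          return null;            // end of stream
        }
        final long txid = next++;
        return () -> txid;        // EditOp is a functional interface here
      }
    };
    System.out.println("applied " + loadEdits(source) + " ops");
  }
}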
Next, let's look at how an operation is read from the edit log input stream EditLogInputStream, starting with its readOp() method. The code is as follows:
/**
 * Read an operation from the stream
 * @return an operation from the stream or null if at end of stream
 * @throws IOException if there is an error reading from the stream
 */
public FSEditLogOp readOp() throws IOException {
  FSEditLogOp ret;
  // If the cached cachedOp is not null, return it and clear the cache
  if (cachedOp != null) {
    ret = cachedOp;
    cachedOp = null;
    return ret;
  }
  // Otherwise call nextOp() to read the next operation
  return nextOp();
}
Very simple: if the cached cachedOp is not null, it is returned and the cache is cleared; if cachedOp is null, nextOp() is called to do the work. In EditLogInputStream, nextOp() is an abstract method, so we need to look at its subclass implementations. Taking EditLogFileInputStream as an example, its nextOp() method is:
@Override
protected FSEditLogOp nextOp() throws IOException {
  return nextOpImpl(false);
}
Following into the nextOpImpl() method, the code is as follows:
private FSEditLogOp nextOpImpl(boolean skipBrokenEdits) throws IOException {
  FSEditLogOp op = null;
  // Branch on the state of the edit log file input stream
  switch (state) {
  case UNINIT:
    // Uninitialized: call the init() method to initialize the stream
    try {
      init(true);
    } catch (Throwable e) {
      LOG.error("caught exception initializing " + this, e);
      if (skipBrokenEdits) {
        return null;
      }
      Throwables.propagateIfPossible(e, IOException.class);
    }
    // Check the stream state again; at this point it should not be UNINIT
    Preconditions.checkState(state != State.UNINIT);
    // Call the nextOpImpl() method again
    return nextOpImpl(skipBrokenEdits);
  case OPEN:
    // Open: call FSEditLogOp.Reader's readOp() method to read the operation op
    op = reader.readOp(skipBrokenEdits);
    if ((op != null) && (op.hasTransactionId())) {
      long txId = op.getTransactionId();
      if ((txId >= lastTxId) &&
          (lastTxId != HdfsConstants.INVALID_TXID)) {
        //
        // Sometimes, the NameNode crashes while it's writing to the
        // edit log.  In that case, you can end up with an unfinalized edit log
        // which has some garbage at the end.
        // JournalManager#recoverUnfinalizedSegments will finalize these
        // unfinished edit logs, giving them a defined final transaction
        // ID.  Then they will be renamed, so that any subsequent
        // readers will have this information.
        //
        // Since there may be garbage at the end of these "cleaned up"
        // logs, we want to be sure to skip it here if we've read everything
        // we were supposed to read out of the stream.
        // So we force an EOF on all subsequent reads.
        //
        long skipAmt = log.length() - tracker.getPos();
        if (skipAmt > 0) {
          if (LOG.isDebugEnabled()) {
            LOG.debug("skipping " + skipAmt + " bytes at the end " +
                "of edit log  '" + getName() + "': reached txid " + txId +
                " out of " + lastTxId);
          }
          tracker.clearLimit();
          IOUtils.skipFully(tracker, skipAmt);
        }
      }
    }
    break;
  case CLOSED:
    // Closed: return null directly
    break; // return null
  }
  return op;
}
The overall processing logic of the nextOpImpl() method is as follows.
It branches on the state of the edit log file input stream:
1. If the state is UNINIT (uninitialized), call the init() method to initialize the stream, then check that the state is no longer UNINIT, and finally call nextOpImpl() again;
2. If the state is OPEN, call FSEditLogOp.Reader's readOp() method to read the operation op;
3. If the state is CLOSED, return null directly.
A minimal state-machine sketch of this dispatch is shown after this list.
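The dispatch above is essentially a small state machine. The following is a minimal, hypothetical sketch that mirrors only that control flow (UNINIT lazily initializes and retries, OPEN delegates to a reader, CLOSED yields null); it is not the real EditLogFileInputStream.

import java.io.IOException;

public class StreamStateDemo {
  enum State { UNINIT, OPEN, CLOSED }

  private State state = State.UNINIT;
  private int remaining = 3;   // pretend the stream holds three operations

  String nextOp() throws IOException {
    switch (state) {
      case UNINIT:
        state = State.OPEN;    // stands in for init()
        return nextOp();       // retry now that the stream is open
      case OPEN:
        if (remaining == 0) {
          state = State.CLOSED;
          return null;
        }
        remaining--;
        return "op";           // stands in for reader.readOp(...)
      case CLOSED:
      default:
        return null;           // closed streams yield nothing
    }
  }

  public static void main(String[] args) throws IOException {
    StreamStateDemo s = new StreamStateDemo();
    String op;
    while ((op = s.nextOp()) != null) {
      System.out.println("read " + op);
    }
  }
}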
Let's focus on FSEditLogOp.Reader's readOp() method. The code is as follows:
/**
 * Read an operation from the input stream.
 *
 * Note that the objects returned from this method may be re-used by future
 * calls to the same method.
 *
 * @param skipBrokenEdits    If true, attempt to skip over damaged parts of
 * the input stream, rather than throwing an IOException
 * @return the operation read from the stream, or null at the end of the
 *         file
 * @throws IOException on error.  This function should only throw an
 *         exception when skipBrokenEdits is false.
 */
public FSEditLogOp readOp(boolean skipBrokenEdits) throws IOException {
  while (true) {
    try {
      // Call the decodeOp() method
      return decodeOp();
    } catch (IOException e) {
      in.reset();
      if (!skipBrokenEdits) {
        throw e;
      }
    } catch (RuntimeException e) {
      // FSEditLogOp#decodeOp is not supposed to throw RuntimeException.
      // However, we handle it here for recovery mode, just to be more
      // robust.
      in.reset();
      if (!skipBrokenEdits) {
        throw e;
      }
    } catch (Throwable e) {
      in.reset();
      if (!skipBrokenEdits) {
        throw new IOException("got unexpected exception " +
            e.getMessage(), e);
      }
    }
    // Move ahead one byte and re-try the decode process.
    if (in.skip(1) < 1) {
      return null;
    }
  }
}
Continuing into the decodeOp() method, the code is as follows:
/**
 * Read an opcode from the input stream.
 *
 * @return   the opcode, or null on EOF.
 *
 * If an exception is thrown, the stream's mark will be set to the first
 * problematic byte.  This usually means the beginning of the opcode.
 */
private FSEditLogOp decodeOp() throws IOException {
  limiter.setLimit(maxOpSize);
  in.mark(maxOpSize);

  if (checksum != null) {
    checksum.reset();
  }

  byte opCodeByte;
  try {
    // Read one byte from the input stream in, i.e. opCodeByte
    opCodeByte = in.readByte();
  } catch (EOFException eof) {
    // EOF at an opcode boundary is expected.
    return null;
  }

  // Convert the byte opCodeByte into an FSEditLogOpCodes object opCode
  FSEditLogOpCodes opCode = FSEditLogOpCodes.fromByte(opCodeByte);
  if (opCode == OP_INVALID) {
    verifyTerminator();
    return null;
  }

  // Obtain the FSEditLogOp object op from the cache according to opCode
  FSEditLogOp op = cache.get(opCode);
  if (op == null) {
    throw new IOException("Read invalid opcode " + opCode);
  }

  // If the edit log length field is supported, read an int from the input stream
  if (supportEditLogLength) {
    in.readInt();
  }

  if (NameNodeLayoutVersion.supports(
      LayoutVersion.Feature.STORED_TXIDS, logVersion)) {
    // Read the txid: if stored transaction IDs are supported, read a long as
    // the transaction ID and set it on the FSEditLogOp instance op
    op.setTransactionId(in.readLong());
  } else {
    // Otherwise set the transaction ID on op to the invalid value
    op.setTransactionId(HdfsConstants.INVALID_TXID);
  }

  // Read the remaining fields from the input stream in and set them on op
  op.readFields(in, logVersion);

  validateChecksum(in, checksum, op.txid);
  return op;
}
The logic of the decodeOp() method is straightforward:
1. Read one byte from the input stream in, namely opCodeByte, which determines the type of the operation;
2. Convert the byte opCodeByte into an FSEditLogOpCodes object opCode;
3. Obtain the FSEditLogOp object op from the cache according to the FSEditLogOpCodes object opCode; this gives us the operation object;
4. If the edit log length field is supported, read an int from the input stream;
5. If stored transaction IDs are supported, read a long as the transaction ID and set it on the FSEditLogOp instance op; otherwise set the transaction ID on op to INVALID_TXID (-12345);
6. Call the readFields() method of the operation object op to read the remaining fields from the input stream in and set them on the FSEditLogOp instance op.
A toy decoding example for this leading record layout is shown after this list.
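The following is a toy, self-contained sketch (not HDFS code) that mimics only the leading layout decodeOp() reads above: a one-byte opcode, a four-byte length, and an eight-byte transaction ID. The opcode value 0x09, the length 12, and the txid 42 are made up for the demonstration; the real FSEditLogOp body and checksum handling are version dependent and much richer.

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class ToyOpDecode {
  public static void main(String[] args) throws IOException {
    // Build a fake record: opcode 0x09 (hypothetical), length 12, txid 42.
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(buf);
    out.writeByte(0x09);
    out.writeInt(12);
    out.writeLong(42L);

    // Decode it the same way: byte -> opcode, int -> length, long -> txid.
    DataInputStream in = new DataInputStream(
        new ByteArrayInputStream(buf.toByteArray()));
    byte opCode = in.readByte();
    int length = in.readInt();
    long txId = in.readLong();
    System.out.printf("opcode=%d length=%d txid=%d%n", opCode, length, txId);
  }
}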
Next, let's look at the readFields() method of the operation object. Different kinds of operations naturally contain different properties, so their readFields() methods differ as well. Below, we take the operation AddCloseOp as an example and analyze its readFields() method, which is as follows:
@Override
void readFields(DataInputStream in, int logVersion)
    throws IOException {
  // Read length: if the separate length field is present (older layout),
  // read an int from the input stream in and assign it to length
  if (!NameNodeLayoutVersion.supports(
      LayoutVersion.Feature.EDITLOG_OP_OPTIMIZATION, logVersion)) {
    this.length = in.readInt();
  }
  // Read inode ID: if inode IDs are supported, read a long from the input
  // stream in and assign it to inodeId; otherwise use a default value
  if (NameNodeLayoutVersion.supports(
      LayoutVersion.Feature.ADD_INODE_ID, logVersion)) {
    this.inodeId = in.readLong();
  } else {
    // The inodeId should be updated when this editLogOp is applied
    this.inodeId = INodeId.GRANDFATHER_INODE_ID;
  }
  // Version compatibility check
  if ((-17 < logVersion && length != 4) ||
      (logVersion <= -17 && length != 5 && !NameNodeLayoutVersion.supports(
          LayoutVersion.Feature.EDITLOG_OP_OPTIMIZATION, logVersion))) {
    throw new IOException("Incorrect data format."  +
                          " logVersion is " + logVersion +
                          " but writables.length is " +
                          length + ". ");
  }
  // Read path: read a string from the input stream in and assign it to path
  this.path = FSImageSerialization.readString(in);
  // Read the replication factor and modification time: read a short and a
  // long from the input stream, assigned to replication and mtime respectively
  if (NameNodeLayoutVersion.supports(
      LayoutVersion.Feature.EDITLOG_OP_OPTIMIZATION, logVersion)) {
    this.replication = FSImageSerialization.readShort(in);
    this.mtime = FSImageSerialization.readLong(in);
  } else {
    this.replication = readShort(in);
    this.mtime = readLong(in);
  }
  // Read access time: if file access time is supported, read a long from the
  // input stream and assign it to atime; otherwise atime defaults to 0
  if (NameNodeLayoutVersion.supports(
      LayoutVersion.Feature.FILE_ACCESS_TIME, logVersion)) {
    if (NameNodeLayoutVersion.supports(
        LayoutVersion.Feature.EDITLOG_OP_OPTIMIZATION, logVersion)) {
      this.atime = FSImageSerialization.readLong(in);
    } else {
      this.atime = readLong(in);
    }
  } else {
    this.atime = 0;
  }
  // Read block size: read a long from the input stream and assign it to blockSize
  if (NameNodeLayoutVersion.supports(
      LayoutVersion.Feature.EDITLOG_OP_OPTIMIZATION, logVersion)) {
    this.blockSize = FSImageSerialization.readLong(in);
  } else {
    this.blockSize = readLong(in);
  }
  // Call the readBlocks() method to read the data blocks and assign them to
  // the block array blocks
  this.blocks = readBlocks(in, logVersion);
  // Read the permissions from the input stream and assign them to permissions
  this.permissions = PermissionStatus.read(in);

  // If this is an ADD operation, additionally read the client name clientName,
  // client machine clientMachine, overwrite flag overwrite, and so on
  if (this.opCode == OP_ADD) {
    aclEntries = AclEditLogUtil.read(in, logVersion);
    this.xAttrs = readXAttrsFromEditLog(in, logVersion);
    this.clientName = FSImageSerialization.readString(in);
    this.clientMachine = FSImageSerialization.readString(in);
    if (NameNodeLayoutVersion.supports(
        NameNodeLayoutVersion.Feature.CREATE_OVERWRITE, logVersion)) {
      this.overwrite = FSImageSerialization.readBoolean(in);
    } else {
      this.overwrite = false;
    }
    if (NameNodeLayoutVersion.supports(
        NameNodeLayoutVersion.Feature.BLOCK_STORAGE_POLICY, logVersion)) {
      this.storagePolicyId = FSImageSerialization.readByte(in);
    } else {
      this.storagePolicyId = BlockStoragePolicySuite.ID_UNSPECIFIED;
    }
    // Read clientId and callId
    readRpcIds(in, logVersion);
  } else {
    this.clientName = "";
    this.clientMachine = "";
  }
}
There is nothing particularly special here: the fields required by the operation are simply read in the order in which they appear in the input stream.
However, we should still take a closer look at the readBlocks() method, which reads the data blocks. The code is as follows:
private static Block[] readBlocks(
    DataInputStream in,
    int logVersion) throws IOException {
  // Read the number of blocks numBlocks, which occupies an int
  int numBlocks = in.readInt();
  // Validate numBlocks: it should be at least 0 and no greater than MAX_BLOCKS
  if (numBlocks < 0) {
    throw new IOException("invalid negative number of blocks");
  } else if (numBlocks > MAX_BLOCKS) {
    throw new IOException("invalid number of blocks: " + numBlocks +
        ".  The maximum number of blocks per file is " + MAX_BLOCKS);
  }
  // Construct the block array blocks, of size numBlocks
  Block[] blocks = new Block[numBlocks];
  // Read numBlocks data blocks from the input stream
  for (int i = 0; i < numBlocks; i++) {
    // Construct a Block instance blk
    Block blk = new Block();
    // Call Block's readFields() method to read the block from the input stream
    blk.readFields(in);
    // Put the data block blk into the block array blocks
    blocks[i] = blk;
  }
  // Return the block array blocks
  return blocks;
}
Very simple: first read the block count numBlocks from the input stream, which determines how many data blocks to read; then construct the block array blocks of size numBlocks; finally read numBlocks data blocks from the input stream, each time constructing a Block instance blk and calling Block's readFields() method to read it, then placing blk into the block array blocks. After all blocks have been read, the block array blocks is returned. A self-contained sketch of this length-prefixed-array pattern follows.
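The shape of readBlocks() is the classic length-prefixed-array pattern: an int count validated against an upper bound, followed by that many fixed-layout records. The following self-contained sketch illustrates the same pattern with a made-up bound (MAX_ITEMS) and records of a single long each; it is not the HDFS implementation.

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class LengthPrefixedArray {
  static final int MAX_ITEMS = 1024 * 1024;   // illustrative upper bound

  static long[] readLongs(DataInputStream in) throws IOException {
    // Read and validate the count, as readBlocks() does for numBlocks
    int n = in.readInt();
    if (n < 0) {
      throw new IOException("Invalid negative number of items");
    } else if (n > MAX_ITEMS) {
      throw new IOException("Invalid number of items: " + n);
    }
    // Then read exactly n fixed-layout records
    long[] items = new long[n];
    for (int i = 0; i < n; i++) {
      items[i] = in.readLong();
    }
    return items;
  }

  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(buf);
    out.writeInt(2);          // count
    out.writeLong(100L);      // first record
    out.writeLong(200L);      // second record
    long[] read = readLongs(new DataInputStream(
        new ByteArrayInputStream(buf.toByteArray())));
    System.out.println(read.length + " items, first=" + read[0]);
  }
}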
Now let's look at Block's readFields() method, which is as follows:
@Override // Writable
public void readFields(DataInput in) throws IOException {
  readHelper(in);
}
Continuing into the readHelper() method:
final void readHelper(DataInput in) throws IOException {
  // Read a long from the input stream as the block ID blockId
  this.blockId = in.readLong();
  // Read a long from the input stream as the block size numBytes
  this.numBytes = in.readLong();
  // Read a long from the input stream as the block's generation stamp generationStamp
  this.generationStamp = in.readLong();
  // Sanity check: the block size numBytes should be greater than or equal to 0
  if (numBytes < 0) {
    throw new IOException("Unexpected block size: " + numBytes);
  }
}
It reads, in order, the block ID blockId, the block size numBytes, and the block's generation stamp generationStamp from the input stream; all three are of type long. A small demonstration of parsing this 24-byte record is shown below.
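For illustration, the following self-contained snippet builds and then parses the same 24-byte block record layout (blockId, numBytes, generationStamp, each a long) using plain DataOutputStream/DataInputStream; the values are made up for the demonstration.

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class BlockRecordDemo {
  public static void main(String[] args) throws IOException {
    // Write three longs, mirroring the block record layout read by readHelper()
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(buf);
    out.writeLong(1073741825L);      // blockId (made-up value)
    out.writeLong(134217728L);       // numBytes (block size in bytes)
    out.writeLong(1001L);            // generationStamp

    // Read them back in the same order
    DataInputStream in = new DataInputStream(
        new ByteArrayInputStream(buf.toByteArray()));
    long blockId = in.readLong();
    long numBytes = in.readLong();
    long generationStamp = in.readLong();
    if (numBytes < 0) {
      throw new IOException("Unexpected block size: " + numBytes);
    }
    System.out.printf("blockId=%d numBytes=%d genStamp=%d%n",
        blockId, numBytes, generationStamp);
  }
}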
Summary
This article has traced how HDFS reads operations from the edit log: FSEditLogLoader.loadEditRecords() repeatedly calls readOp() on the edit log input stream; readOp(), nextOp(), and nextOpImpl() hand off to FSEditLogOp.Reader's readOp(), whose decodeOp() reads the opcode, transaction ID, and remaining fields; and operation-specific readFields() methods, such as AddCloseOp's, deserialize their attributes, including data blocks via readBlocks() and Block's readFields().