DFSClient Technical Insider (Writing Data: The Data Write Process)

Source: Internet
Author: User
What follows is the result of my own reading of the source code. This article is dedicated to my friends and me; corrections for any shortcomings are welcome. To Shedoug and the others.
Note: the Hadoop version is 0.20.2. Some readers get dizzy looking at code, so this article explains things mostly in plain words, and I have deliberately colored the fonts for you ^o^
In a previous article we discussed establishing the data pipeline and creating the OutputStream. Below we discuss the data write process in detail, which is somewhat more complex than what came before.
=====================================================================
An important point first:
Because DFSOutputStream does not override write(), it reuses the implementation in its abstract parent class FSOutputSummer, whose write() method calls write1() in a loop.
write1() writes out as much data as it can through writeChecksumChunk(). Like FSInputChecker.read1(), it is written mainly for efficiency: when the internal buffer is empty and the caller supplies at least one full chunk, the data is checksummed directly from the caller's array and no copy is made. FSOutputSummer.writeChecksumChunk() writes out one checksum chunk, that is, the raw data plus its checksum, by invoking the writeChunk() method implemented by DFSOutputStream.
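To make the buffering rule easier to follow, here is a minimal, self-contained sketch of the same idea. It is not Hadoop's FSOutputSummer: the class name ChunkSummerSketch, the emitChunk() helper and the two result lists are made up for illustration, and CRC32 is only assumed as the checksum. It records the length and checksum of each chunk instead of sending it anywhere.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;

/** Toy stand-in for a checksumming stream (hypothetical, not Hadoop code). */
class ChunkSummerSketch {
    private final byte[] buf;               // holds at most one chunk of user data
    private int count = 0;                  // number of valid bytes in buf
    private final CRC32 sum = new CRC32();  // running checksum of the current chunk
    final List<Integer> chunkLengths = new ArrayList<>();
    final List<Long> chunkChecksums = new ArrayList<>();

    ChunkSummerSketch(int bytesPerChecksum) {
        buf = new byte[bytesPerChecksum];
    }

    void write(byte[] b, int off, int len) {
        if (off < 0 || len < 0 || off > b.length - len) {
            throw new ArrayIndexOutOfBoundsException();
        }
        // write1 returns how many bytes it consumed; loop until all are taken.
        for (int n = 0; n < len; n += write1(b, off + n, len - n)) {
        }
    }

    private int write1(byte[] b, int off, int len) {
        if (count == 0 && len >= buf.length) {
            // Fast path: buffer is empty and a full chunk is available,
            // so checksum it straight from the caller's array (no copy).
            sum.update(b, off, buf.length);
            emitChunk(buf.length);
            return buf.length;
        }
        // Slow path: copy what fits into the local buffer.
        int bytesToCopy = Math.min(len, buf.length - count);
        sum.update(b, off, bytesToCopy);
        System.arraycopy(b, off, buf, count, bytesToCopy);
        count += bytesToCopy;
        if (count == buf.length) {
            flushBuffer();
        }
        return bytesToCopy;
    }

    /** Emits whatever is buffered as one (possibly short) chunk. */
    void flushBuffer() {
        if (count > 0) {
            emitChunk(count);
            count = 0;
        }
    }

    private void emitChunk(int len) {
        chunkLengths.add(len);
        chunkChecksums.add(sum.getValue());
        sum.reset();
    }
}
```

With a 4-byte chunk size, writing 3 bytes and then 7 bytes should go through the slow path (filling the buffer), then the fast path, then the slow path again for the 2-byte tail.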
--------------------------------------------------------------------------------------------------------------- ------------- 

// Entry point for writing data
public synchronized void write(byte b[], int off, int len) throws IOException {
  if (off < 0 || len < 0 || off > b.length - len) {
    throw new ArrayIndexOutOfBoundsException();
  }
  for (int n = 0; n < len; n += write1(b, off + n, len - n)) {
  }
}
--------------------------------------------------------------------------------------------------------------- ---------------------------------
/**
 * Writes up to len bytes of b, starting at position off, to the file
 * and returns the number of bytes actually consumed.
 */
private int write1(byte b[], int off, int len) throws IOException {
  if (count == 0 && len >= buf.length) {
    // The buffer buf is empty and the data to write is at least as long as
    // the buffer: call writeChecksumChunk directly and avoid an extra copy.
    final int length = buf.length;
    sum.update(b, off, length);
    writeChecksumChunk(b, off, length, false);
    return length;
  }

  // Otherwise copy as much as fits into the remaining buffer space.
  int bytesToCopy = buf.length - count;
  bytesToCopy = (len < bytesToCopy) ? len : bytesToCopy;
  sum.update(b, off, bytesToCopy);
  // Copy data from b to buf
  System.arraycopy(b, off, buf, count, bytesToCopy);
  count += bytesToCopy;
  if (count == buf.length) {
    // When the buffer is full, call flushBuffer()
    flushBuffer();
  }
  return bytesToCopy;
}
--------------------------------------------------------------------------------------------------------------- ---------------------------------
/**
 * First generates the checksum for the data chunk, then uses writeChunk()
 * to write the chunk and its checksum to the output stream.
 */
private void writeChecksumChunk(byte b[], int off, int len, boolean keep) throws IOException {
  int tempChecksum = (int) sum.getValue();
  if (!keep) {
    sum.reset();
  }
  int2byte(tempChecksum, checksum);
  writeChunk(b, off, len, checksum);
}
--------------------------------------------------------------------------------------------------------------------------------------------------
@Override
protected synchronized void writeChunk(byte[] b, int offset, int len, byte[] checksum)
    throws IOException {
  // First check that the client and the output stream to the DataNode are still open
  checkOpen();
  isClosed();

  // Length of the checksum to be written into the checksum area
  int cklen = checksum.length;

  // Number of data bytes covered by one checksum
  int bytesPerChecksum = this.checksum.getBytesPerChecksum();

  // A chunk may not hold more data than one checksum covers
  if (len > bytesPerChecksum) {
    throw new IOException("writeChunk() buffer size is " + len +
                          " is larger than supported  bytesPerChecksum " +
                          bytesPerChecksum);
  }

  // The checksum passed in must have exactly the size the system expects
  if (checksum.length != this.checksum.getChecksumSize()) {
    throw new IOException("writeChunk() checksum size is supposed to be " +
                          this.checksum.getChecksumSize() +
                          " but found to be " + checksum.length);
  }

  synchronized (dataQueue) {
    // If the data queue plus the ack queue already hold more packets than
    // allowed, the write has to wait until space is available
    while (!closed && dataQueue.size() + ackQueue.size() > maxPackets) {
      try {
        dataQueue.wait();
      } catch (InterruptedException e) {
      }
    }
    // If the stream was closed or lastException is set, throw it
    isClosed();

    // If there is no packet currently being filled, create a new one
    if (currentPacket == null) {
      // Key step: several chunks are packed into one packet
      // Initialize a new packet
      currentPacket = new Packet(packetSize, chunksPerPacket, bytesCurBlock);
      if (LOG.isDebugEnabled()) {
        LOG.debug("DFSClient writeChunk allocating new packet seqno=" +
                  currentPacket.seqno +
                  " src=" + src +
                  " packetSize=" + packetSize +
                  " chunksPerPacket=" + chunksPerPacket +
                  ", bytesCurBlock=" + bytesCurBlock);
      }
    }

    // Write the checksum and the corresponding data into the packet
    currentPacket.writeChecksum(checksum, 0, cklen);
    currentPacket.writeData(b, offset, len);
    // Record the number of chunks in the packet and the bytes written to the current block
    currentPacket.numChunks++;
    bytesCurBlock += len;

    // If the packet is full (or the block is full), add it to the data queue
    if (currentPacket.numChunks == currentPacket.maxChunks ||
        bytesCurBlock == blockSize) {
      if (LOG.isDebugEnabled()) {
        LOG.debug("DFSClient writeChunk packet full seqno=" +
                  currentPacket.seqno +
                  ", src=" + src +
                  ", bytesCurBlock=" + bytesCurBlock +
                  ", blockSize=" + blockSize +
                  ", appendChunk=" + appendChunk);
      }
      // If the current block is full, mark this packet as the last packet
      // in the block and reset the block's counters
      if (bytesCurBlock == blockSize) {
        currentPacket.lastPacketInBlock = true;
        bytesCurBlock = 0;
        lastFlushOffset = -1;
      }

      // Add the finished packet to the data queue, wake up the waiting
      // threads, and set the packet being written to null
      dataQueue.addLast(currentPacket);
      dataQueue.notifyAll();
      currentPacket = null;

      // If data was being appended to the end of an existing file, the
      // partial chunk has now been filled: clear the flag and reset the
      // checksum chunk buffer to the full chunk size
      if (appendChunk) {
        appendChunk = false;
        resetChecksumChunk(bytesPerChecksum);
      }
      // Recompute the packet and chunk sizes for the next packet;
      // writePacketSize, the maximum packet size, is 64 KB
      int psize = Math.min((int) (blockSize - bytesCurBlock), writePacketSize);
      computePacketChunkSize(psize, bytesPerChecksum);
    }
  }
  // LOG.debug("DFSClient writeChunk done length " + len +
  //           " checksum length " + cklen);
}
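The waiting and waking on dataQueue is an ordinary bounded producer/consumer hand-off. The sketch below shows only that pattern and is a simplification, not DFSClient code: the class PacketQueueSketch and its methods are hypothetical, there is a single queue instead of a data queue plus an ack queue, the bound is checked with >= rather than >, and a plain consumer thread stands in for the thread that sends packets to the DataNode.

```java
import java.util.LinkedList;

/** Toy bounded queue using wait()/notifyAll() (hypothetical, not Hadoop code). */
class PacketQueueSketch {
    private final LinkedList<byte[]> dataQueue = new LinkedList<>();
    private final int maxPackets;

    PacketQueueSketch(int maxPackets) {
        this.maxPackets = maxPackets;
    }

    /** Producer side: blocks while the queue is full. */
    void enqueue(byte[] packet) throws InterruptedException {
        synchronized (dataQueue) {
            while (dataQueue.size() >= maxPackets) {
                dataQueue.wait();
            }
            dataQueue.addLast(packet);
            dataQueue.notifyAll();   // wake a consumer waiting for data
        }
    }

    /** Consumer side: blocks while the queue is empty. */
    byte[] dequeue() throws InterruptedException {
        synchronized (dataQueue) {
            while (dataQueue.isEmpty()) {
                dataQueue.wait();
            }
            byte[] packet = dataQueue.removeFirst();
            dataQueue.notifyAll();   // wake a producer waiting for space
            return packet;
        }
    }

    /** Pushes `packets` one-byte packets through the queue; returns how many arrived. */
    static int runDemo(int packets, int maxPackets) {
        PacketQueueSketch queue = new PacketQueueSketch(maxPackets);
        int[] received = {0};
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < packets; i++) {
                    queue.dequeue();
                    received[0]++;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        try {
            for (int i = 0; i < packets; i++) {
                queue.enqueue(new byte[] {(byte) i});
            }
            consumer.join();   // join makes the consumer's count visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return received[0];
    }
}
```

Both sides re-check their condition in a while loop after waking, which is the same reason writeChunk() wraps dataQueue.wait() in a while rather than an if.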
--------------------------------------------------------------------------------------------------------------- -------------------------------
Next we analyze another important class, Packet, which is an inner class of DFSOutputStream.
// Packet constructor
Packet(int pktSize, int chunksPerPkt, long offsetInBlock) {
  // Whether this packet is the last packet in the block
  this.lastPacketInBlock = false;
  // Number of chunks the packet currently contains
  this.numChunks = 0;
  // Offset of this packet within the block
  this.offsetInBlock = offsetInBlock;
  // Sequence number of this packet within the block
  this.seqno = currentSeqno;
  // Advance the sequence number for the next packet
  currentSeqno++;

  // Data buffer, not yet created
  buffer = null;
  // Byte buffer holding the whole packet
  buf = new byte[pktSize];

  // Starting position of the checksums
  checksumStart = DataNode.PKT_HEADER_LEN + SIZE_OF_INTEGER;
  // Current position in the checksum area
  checksumPos = checksumStart;
  // Starting position of the data
  dataStart = checksumStart + chunksPerPkt * checksum.getChecksumSize();
  // Current position in the data area
  dataPos = dataStart;
  // Maximum number of chunks the packet can contain
  maxChunks = chunksPerPkt;
}
--------------------------------------------------------------------------------------------------------------- -------------------------------------
Writing checksum information to the packet
/**
 * Writes checksum information into the packet.
 * @param inarray
 * @param off
 * @param len
 */
void writeChecksum(byte[] inarray, int off, int len) {
  if (checksumPos + len > dataStart) {
    throw new BufferOverflowException();
  }
  System.arraycopy(inarray, off, buf, checksumPos, len);
  checksumPos += len;
}
Writing data to the packet
/**
 * Writes data into the packet.
 * @param inarray
 * @param off
 * @param len
 */
void writeData(byte[] inarray, int off, int len) {
  if (dataPos + len > buf.length) {
    throw new BufferOverflowException();
  }
  System.arraycopy(inarray, off, buf, dataPos, len);
  dataPos += len;
}
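The packet buffer is therefore laid out as a header, then a checksum area, then a data area, each with its own write cursor. The following sketch is an illustration under assumptions rather than the real Packet class: PacketLayoutSketch is a made-up name, the header length is passed in as a parameter instead of being taken from DataNode.PKT_HEADER_LEN, and the buffer is sized to fit exactly chunksPerPkt full chunks.

```java
import java.nio.BufferOverflowException;

/** Toy packet buffer: [header | checksums | data] (hypothetical, not Hadoop code). */
class PacketLayoutSketch {
    final byte[] buf;
    final int checksumStart;   // first byte of the checksum area
    final int dataStart;       // first byte of the data area
    final int maxChunks;
    int checksumPos;           // next free byte in the checksum area
    int dataPos;               // next free byte in the data area
    int numChunks = 0;

    PacketLayoutSketch(int headerLen, int chunksPerPkt, int checksumSize, int bytesPerChecksum) {
        checksumStart = headerLen;
        dataStart = checksumStart + chunksPerPkt * checksumSize;
        buf = new byte[dataStart + chunksPerPkt * bytesPerChecksum];
        checksumPos = checksumStart;
        dataPos = dataStart;
        maxChunks = chunksPerPkt;
    }

    void writeChecksum(byte[] in, int off, int len) {
        // The checksum area ends where the data area begins.
        if (checksumPos + len > dataStart) {
            throw new BufferOverflowException();
        }
        System.arraycopy(in, off, buf, checksumPos, len);
        checksumPos += len;
    }

    void writeData(byte[] in, int off, int len) {
        // The data area ends at the end of the buffer.
        if (dataPos + len > buf.length) {
            throw new BufferOverflowException();
        }
        System.arraycopy(in, off, buf, dataPos, len);
        dataPos += len;
    }

    boolean isFull() {
        return numChunks == maxChunks;
    }
}
```

For example, with an assumed 25-byte header, 3 chunks per packet, 4-byte checksums and 8 data bytes per checksum, the checksum area occupies bytes 25 to 36 and the data area bytes 37 to 60. Keeping all checksums together in front of the data is what lets the two cursors advance independently as each chunk is added.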
