A fast data copy scheme for HDFS: FastCopy


Objective

When using HDFS, we sometimes need to do ad-hoc data copy operations. Within a single cluster we can simply use the built-in HDFS cp command, and for cross-cluster copies, or when the amount of data to copy is very large, we can use the DistCp tool. But are these tools actually efficient at copying data? In practice, no. At many companies that adopted Hadoop early on, large-scale data copying has likely been more or less inefficient. Facebook, for example, developed a fast data copy tool called FastCopy in its internal Hadoop version; there is a corresponding JIRA issue, HDFS-2139 (Fast copy for HDFS). This FastCopy tool is the topic of this article.

Introduction to the FastCopy principle

One of the major differences between FastCopy and a traditional data copy is that it minimizes cross-node data transfer by placing the copies locally whenever possible. Moreover, during FastCopy's local copy, the new block is created as a hard link to the source block's file, so no actual copying of the data is needed. For background on hard links in HDFS, see my earlier article, HDFS Symbolic Links and Hard Links. Hard links in HDFS have already been implemented inside Facebook.
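To make the hard-link idea concrete, here is a minimal, self-contained sketch using plain `java.nio.file` rather than the HDFS-internal API (the file names `blk_1001` and `HardLinkDemo` are purely illustrative). A hard link gives a second directory entry for the same underlying data, so "copying" a block this way moves no bytes:

```java
import java.nio.file.Files;
import java.nio.file.Path;

public class HardLinkDemo {
    // Writes a small "block" file, hard-links it under a new name, and
    // returns the content read back through the link. No data is copied:
    // both paths reference the same underlying inode.
    public static String linkAndRead(Path dir) throws Exception {
        Path block = dir.resolve("blk_1001");
        Files.write(block, "block-data".getBytes());
        Path link = dir.resolve("blk_1001.copy");
        Files.createLink(link, block); // O(1) metadata operation, no byte copy
        return new String(Files.readAllBytes(link));
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("fastcopy-demo");
        System.out.println(linkAndRead(dir));
    }
}
```

This is the same trick FastCopy's local copy relies on, just at the level of HDFS block files on a Datanode's disks instead of ordinary files.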

The main steps of the FastCopy fast-copy tool are as follows:

1) Query all block information of the file to be copied.
2) Get the location information of these source blocks.
3) For each block of the source file, create an empty target block on the Namenode, placing these target blocks as close as possible to the source blocks.
4) Then instruct the Datanodes to perform a local block copy.
5) Then wait for the block copy operations to complete and be reported to the Namenode.

In step 4 above, the copy can be done directly with a hard link. That is the block-level copy process inside the FastCopy tool. Viewed from a higher level, the overall flow of the FastCopy tool is as follows:
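The per-block steps above can be sketched in a few lines of plain Java. This is only an illustration of the placement logic: `Block` and `planTargets` are hypothetical simplified types, not the real `LocatedBlock`/`addBlock` API.

```java
import java.util.*;

public class FastCopySketch {
    // One block of the source file, with the datanodes holding its replicas
    // (a hypothetical stand-in for HDFS's LocatedBlock).
    static class Block {
        final String id;
        final List<String> locations;
        Block(String id, List<String> locations) {
            this.id = id;
            this.locations = locations;
        }
    }

    // Steps 1-3: for each source block, plan an empty target block whose
    // preferred locations are exactly the source block's locations, so that
    // step 4 can be a local (hard-link) copy on each of those nodes.
    public static Map<String, List<String>> planTargets(List<Block> srcBlocks) {
        Map<String, List<String>> plan = new LinkedHashMap<>();
        for (Block b : srcBlocks) {
            plan.put(b.id, new ArrayList<>(b.locations));
        }
        return plan;
    }

    public static void main(String[] args) {
        List<Block> blocks = Arrays.asList(
            new Block("blk_1", Arrays.asList("dn1", "dn2")),
            new Block("blk_2", Arrays.asList("dn2", "dn3")));
        System.out.println(planTargets(blocks));
    }
}
```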

1. First, input the target path to be copied, which can be a single file or a directory.
2. The paths entered in the first step are translated into FastCopy requests.
3. These requests are submitted to a thread pool for execution.
4. During the copy, depending on where the target block is located relative to the source block, either the normal DataCopy mode or the LocalCopy local mode is used.

This process works as shown in Figure 1-1.




Figure 1-1: FastCopy data copy process
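Steps 2-4 of the overall flow above can be sketched with a standard thread pool. This is a simplified illustration, not the real tool: `mode` and `run` are hypothetical helpers that only model the local-vs-network decision per block copy request.

```java
import java.util.*;
import java.util.concurrent.*;

public class FastCopyDispatch {
    // Decide the copy mode the way the Datanode does: if the target block
    // lands on the same node as a source replica, use a local (hard-link)
    // copy; otherwise a normal network data copy.
    public static String mode(String srcNode, String dstNode) {
        return srcNode.equals(dstNode) ? "LOCAL" : "DATA";
    }

    // Submit one copy request per (source node, target node) pair to a
    // shared thread pool, then collect the chosen mode of each request.
    public static List<String> run(List<String[]> pairs) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<Future<String>> futures = new ArrayList<>();
        for (String[] p : pairs) {
            futures.add(pool.submit(() -> mode(p[0], p[1])));
        }
        List<String> modes = new ArrayList<>();
        for (Future<String> f : futures) {
            modes.add(f.get());
        }
        pool.shutdown();
        return modes;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(Arrays.asList(
            new String[]{"dn1", "dn1"},    // same node: local copy
            new String[]{"dn1", "dn2"}))); // different nodes: data copy
    }
}
```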

The FastCopy principle mainly consists of the two flows above. The detailed steps can be matched against the key code analyzed below.

FastCopy Core Code Analysis

In the core code analysis section, we will focus on the implementation of two points:

    • How the FastCopy tool makes block replication happen locally as much as possible.
    • How FastCopy executes the data copy.

First, the first point: how does FastCopy ensure that blocks are replicated locally as far as possible? To answer this, let's return to the FastCopy principle, which includes this step:

For each block of the source file, an empty target block is created on the Namenode, and these target blocks are placed as close as possible to the source blocks.

When this step is performed, the previously queried location information of the source block is used as the preferred location for creating the target block. This ensures that the target block and the source block land on the same node. Why "preferred" rather than guaranteed? Because a candidate location becomes unavailable if its storage disk has insufficient free space; in that case the Namenode selects the next storage location.

This section of code is as follows:

    /**
     * Copy the file.
     * @return Result of the operation
     */
    private CopyResult copy() throws Exception {
      // Get source file information and prepare to create the empty destination file
      HdfsFileStatus srcFileStatus = srcNamenode.getFileInfo(src);
      if (srcFileStatus == null) {
        throw new FileNotFoundException("File : " + src + " does not exist");
      }
      LOG.info("Start to copy " + src + " to " + destination);
      try {
        ...
        LinkedList<LocatedBlock> blocksList = new LinkedList<LocatedBlock>();
        LocatedBlock previousAdded = null;
        do {
          lastStart = lastEnd;
          // Get the block objects of the source file
          LocatedBlocks blocks = srcNamenode.getBlockLocations(src, lastStart, addition);
          ...
          lastEnd = lastBlock.getStartOffset() + lastSize;
          // Traverse the list of block objects of this file
          for (LocatedBlock lb : blocks.getLocatedBlocks()) {
            if (previousAdded == null || !previousAdded.getBlock().equals(lb.getBlock())) {
              // Append the block object to the end of the block list
              blocksList.add(lb);
              previousAdded = lb;
            }
          }
        } while (lastEnd < fileLen);
        ...
        EnumSetWritable<CreateFlag> flagWritable = new EnumSetWritable<CreateFlag>(flag);
        // Create the target file on the Namenode
        HdfsFileStatus dstFileStatus = dstNamenode.create(destination,
            srcFileStatus.getPermission(), clientName, flagWritable, true,
            srcFileStatus.getReplication(), srcFileStatus.getBlockSize(),
            CryptoProtocolVersion.supported());

        // Instruct each Datanode to create a copy of the respective block.
        int blocksAdded = 0;
        ExtendedBlock previous = null;
        LocatedBlock destinationLocatedBlock = null;
        // Loop through each block and create copies.
        // Traverse the block list of the source file
        for (LocatedBlock srcLocatedBlock : blocksList) {
          UserGroupInformation.getCurrentUser().addToken(srcLocatedBlock.getBlockToken());
          String[] favoredNodes = new String[srcLocatedBlock.getLocations().length];
          // Get the location information of the source file block
          for (int i = 0; i < srcLocatedBlock.getLocations().length; i++) {
            favoredNodes[i] = srcLocatedBlock.getLocations()[i].getHostName() + ":"
                + srcLocatedBlock.getLocations()[i].getXferPort();
          }
          LOG.info("FavoredNodes for " + srcLocatedBlock + ": "
              + Arrays.toString(favoredNodes));
          for (int sleepTime = 60, retries = 10; retries > 0; retries -= 1) {
            try {
              // The Namenode creates the new target block, taking the locations of the
              // source file block as the preferred storage locations
              destinationLocatedBlock = dstNamenode.addBlock(destination, clientName,
                  previous, null, dstFileStatus.getFileId(), favoredNodes);
              break;
            } catch (RemoteException e) {
              ...
            }
          }
          if (destinationLocatedBlock == null) {
            throw new IOException("Get null located block from Namenode");
          }
          blocksAdded++;
          // Copy the actual data
          copyBlock(srcLocatedBlock, destinationLocatedBlock);
          // Wait for the block copy
          waitForBlockCopy(blocksAdded);
          ...
        }
        terminateExecutor();
        // Wait for all blocks of the file to be copied.
        waitForFile(src, destination, previous, dstFileStatus.getFileId());
      } catch (IOException e) {
        LOG.error("failed to copy src : " + src + " dst : " + destination, e);
        // If an IO exception occurs during the process, delete the target file
        dstNamenode.delete(destination, false);
        throw e;
      } finally {
        shutdown();
      }
      return CopyResult.SUCCESS;
    }

Next we look at the second key part: how the fast copy is executed on the Datanode node. Assuming the block creation on the Namenode described above is complete, only the block copy on the Datanode remains.

A FastCopy request ultimately triggers the Datanode's corresponding copyBlock method; the code is as follows:

    public void copyBlock(ExtendedBlock src, ExtendedBlock dst,
        DatanodeInfo dstDn) throws IOException {
      ...
      long onDiskLength = data.getLength(src);
      // Determine whether the block is corrupt by checking that the on-disk
      // length of the source block matches the expected length before copying
      if (src.getNumBytes() > onDiskLength) {
        // Shorter on-disk len indicates corruption so report NN the corrupt block
        String msg = "copyBlock: Can't replicate block " + src
            + " because on-disk length " + onDiskLength
            + " is shorter than provided length " + src.getNumBytes();
        LOG.info(msg);
        throw new IOException(msg);
      }
      LOG.info(getDatanodeInfo() + " copyBlock: Starting thread to transfer: "
          + "block: " + src + " from " + this.getDatanodeUuid() + " to "
          + dstDn.getDatanodeUuid() + " (" + dstDn + ")");
      Future<?> result;
      // Determine whether the target block's node is the same as the source block's node
      if (this.getDatanodeUuid().equals(dstDn.getDatanodeUuid())) {
        // Same node: do a local copy
        result = blockCopyExecutor.submit(new LocalBlockCopy(src, dst));
      } else {
        // Otherwise, do an ordinary data copy
        result = blockCopyExecutor.submit(new DataCopy(dstDn, src, dst));
      }
      try {
        // Wait up to 5 minutes for the copy to finish
        result.get(5 * 60, TimeUnit.SECONDS);
      } catch (Exception e) {
        LOG.error(e);
        throw new IOException(e);
      }
    }

From the implementation above, we can see that there are ultimately two copy modes: LocalBlockCopy and DataCopy.

First, the local copy, LocalBlockCopy, with the following code:

    class LocalBlockCopy implements Callable<Boolean> {
      // Source block
      private ExtendedBlock srcBlock = null;
      // Target block
      private ExtendedBlock dstBlock = null;
      ...
      public Boolean call() throws Exception {
        try {
          dstBlock.setNumBytes(srcBlock.getNumBytes());
          // Create a new hard link to the source block
          data.hardLinkOneBlock(srcBlock, dstBlock);
          FsVolumeSpi v = (FsVolumeSpi) (getFSDataset().getVolume(dstBlock));
          // Close the block
          closeBlock(dstBlock, DataNode.EMPTY_DEL_HINT, v.getStorageID());
          ...
        } catch (Exception e) {
          LOG.warn("Local block copy for src : " + srcBlock.getBlockName()
              + ", dst : " + dstBlock.getBlockName() + " failed", e);
          throw e;
        }
        return true;
      }
    }

The other is the normal copy mode, which involves data transfer between nodes, with the following code:

    private class DataCopy implements Runnable {
      // The node where the target block resides
      final DatanodeInfo target;
      // Source block
      final ExtendedBlock src;
      // Target block
      final ExtendedBlock dst;
      ...
      @Override
      public void run() {
        ...
        try {
          final String dnAddr = target.getXferAddr(connectToDnViaHostname);
          InetSocketAddress curTarget = NetUtils.createSocketAddr(dnAddr);
          if (LOG.isDebugEnabled()) {
            LOG.debug("Connecting to datanode " + dnAddr);
          }
          // First establish a connection to the target node
          sock = newSocket();
          NetUtils.connect(sock, curTarget, dnConf.socketTimeout);
          sock.setSoTimeout(dnConf.socketTimeout);
          ...
          long writeTimeout = dnConf.socketWriteTimeout;
          OutputStream unbufOut = NetUtils.getOutputStream(sock, writeTimeout);
          InputStream unbufIn = NetUtils.getInputStream(sock);
          DataEncryptionKeyFactory keyFactory = getDataEncryptionKeyFactoryForBlock(dst);
          IOStreamPair saslStreams = saslClient.socketSend(sock, unbufOut,
              unbufIn, keyFactory, accessToken, bpReg);
          unbufOut = saslStreams.out;
          unbufIn = saslStreams.in;
          // New input/output stream objects
          out = new DataOutputStream(new BufferedOutputStream(unbufOut,
              HdfsConstants.SMALL_BUFFER_SIZE));
          in = new DataInputStream(unbufIn);
          blockSender = new BlockSender(src, 0, src.getNumBytes(), false, false,
              true, DataNode.this, null, cachingStrategy);
          DatanodeInfo srcNode = new DatanodeInfo(bpReg);

          // Perform the write-block operation
          new Sender(out).writeBlock(dst, StorageType.DEFAULT, accessToken, "",
              new DatanodeInfo[] { target },
              new StorageType[] { StorageType.DEFAULT }, srcNode,
              BlockConstructionStage.PIPELINE_SETUP_CREATE, 0, 0, 0, 0,
              blockSender.getChecksum(), cachingStrategy, false, false, null);
          // Read the local data with the BlockSender object and transfer it to the target node
          blockSender.sendBlock(out, unbufOut, null);
          ...
        } catch (IOException ie) {
          LOG.warn(bpReg + ": Failed to transfer " + src + " to " + target
              + " " + dst + " got ", ie);
          // Check whether there is a disk problem
          checkDiskErrorAsync();
        } finally {
          // Close the individual resources
          xmitsInProgress.getAndDecrement();
          IOUtils.closeStream(blockSender);
          IOUtils.closeStream(out);
          IOUtils.closeStream(in);
          IOUtils.closeSocket(sock);
        }
      }
    }

If the hard-link facility used by the local copy above is not available, this DataCopy path can be reused as well.

The code above is only part of the FastCopy tool; the complete code can be found in the resources at the end of this article.

That is all for this article. I hope it gives you a good sense of how FastCopy makes blocks copy locally as far as possible, which is its central point.

Resources

[1] HDFS-2139: Fast copy for HDFS
[2] https://issues.apache.org/jira/secure/attachment/12784877/hdfs-2139-for-2.7.1.patch
