Hadoop 2.4.1 Study Notes: Source Code Analysis of fsimage and edits Creation

Source: Internet
Author: User

In Hadoop, the fsimage file stores the most recent checkpoint of the namespace, and the edits files record the namespace changes made after that checkpoint. Running hdfs namenode -format sets up the fsimage and edits storage according to the configuration. This article walks through the source code involved. The format method of NameNode contains the following code:

    FSImage fsImage = new FSImage(conf, nameDirsToFormat, editDirsToFormat);
    try {
      FSNamesystem fsn = new FSNamesystem(conf, fsImage);
      fsImage.getEditLog().initJournalsForWrite();

      if (!fsImage.confirmFormat(force, isInteractive)) {
        return true; // aborted
      }

      fsImage.format(fsn, clusterId);
    } catch (IOException ioe) {
      LOG.warn("Encountered exception during format: ", ioe);
      fsImage.close();
      throw ioe;
    }

This code mainly involves three classes: FSImage, FSNamesystem, and FSEditLog. FSImage is responsible for checkpointing, FSEditLog maintains the log of namespace changes, and FSNamesystem does the actual bookkeeping work for the DataNodes. The constructor that creates the FSImage object is:

    /**
     * Construct the FSImage. Set the default checkpoint directories.
     *
     * Setup storage and initialize the edit log.
     *
     * @param conf Configuration
     * @param imageDirs Directories the image can be stored in.
     * @param editsDirs Directories the editlog can be stored in.
     * @throws IOException if directories are invalid.
     */
    protected FSImage(Configuration conf,
                      Collection<URI> imageDirs,
                      List<URI> editsDirs)
        throws IOException {
      this.conf = conf;

      /* NNStorage manages the StorageDirectories used by the NameNode. */
      storage = new NNStorage(conf, imageDirs, editsDirs);

      /* The value of dfs.namenode.name.dir.restore decides whether to try
       * to restore failed storage directories. Default value: false. */
      if (conf.getBoolean(DFSConfigKeys.DFS_NAMENODE_NAME_DIR_RESTORE_KEY,
                          DFSConfigKeys.DFS_NAMENODE_NAME_DIR_RESTORE_DEFAULT)) {
        storage.setRestoreFailedStorage(true);
      }

      this.editLog = new FSEditLog(conf, storage, editsDirs);

      /* NNStorageRetentionManager inspects the NameNode storage directories
       * and enforces a retention policy on the fsimage and edits files. */
      archivalManager = new NNStorageRetentionManager(conf, storage, editLog);
    }

The FSImage constructor creates the NNStorage, FSEditLog, and NNStorageRetentionManager objects. The source code of the NNStorage constructor is as follows:

    public NNStorage(Configuration conf,
                     Collection<URI> imageDirs, Collection<URI> editsDirs)
        throws IOException {
      super(NodeType.NAME_NODE);

      storageDirs = new CopyOnWriteArrayList<StorageDirectory>();

      // this may modify the editsDirs, so copy before passing in
      setStorageDirectories(imageDirs,
                            Lists.newArrayList(editsDirs),
                            FSNamesystem.getSharedEditsDirs(conf));
    }

The setStorageDirectories method of NNStorage initializes the directories that store the fsimage and edits files. The method is not analyzed in full here; the part that initializes the fsimage directories is as follows:

    // Add all name dirs with appropriate NameNodeDirType
    for (URI dirName : fsNameDirs) {
      checkSchemeConsistency(dirName);
      boolean isAlsoEdits = false;
      for (URI editsDirName : fsEditsDirs) {
        if (editsDirName.compareTo(dirName) == 0) {
          isAlsoEdits = true;
          fsEditsDirs.remove(editsDirName);
          break;
        }
      }
      NameNodeDirType dirType = (isAlsoEdits) ?
                          NameNodeDirType.IMAGE_AND_EDITS :
                          NameNodeDirType.IMAGE;
      // Add to the list of storage directories, only if the
      // URI is of type file://
      if(dirName.getScheme().compareTo("file") == 0) {
        this.addStorageDir(new StorageDirectory(new File(dirName.getPath()),
            dirType,
            sharedEditsDirs.contains(dirName))); // Don't lock the dir if it's shared.
      }
    }
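The directory-typing decision in the loop above can be tried out in isolation. The following is a minimal sketch, not Hadoop code: DirTypeSketch, its DirType enum, and the example paths are hypothetical names invented for illustration. It tags a name directory as IMAGE_AND_EDITS when the same URI also appears in the edits list (removing it from that list, as the original loop does), and it also shows that constructing a java.io.File does not create anything on disk.

```java
import java.io.File;
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the directory typing done in NNStorage; not Hadoop code. */
final class DirTypeSketch {
    enum DirType { IMAGE, IMAGE_AND_EDITS }

    /** Tags each name dir and removes matched entries from the mutable edits list. */
    static Map<URI, DirType> classify(List<URI> nameDirs, List<URI> editsDirs) {
        Map<URI, DirType> result = new LinkedHashMap<>();
        for (URI dir : nameDirs) {
            // List.remove(Object) returns true only if the URI was present.
            boolean isAlsoEdits = editsDirs.remove(dir);
            result.put(dir, isAlsoEdits ? DirType.IMAGE_AND_EDITS : DirType.IMAGE);
        }
        return result;
    }

    public static void main(String[] args) {
        // Example paths are assumptions made for this sketch.
        URI shared = URI.create("file:///tmp/sketch-nn/name");
        URI imageOnly = URI.create("file:///tmp/sketch-nn/image-only");
        List<URI> nameDirs = Arrays.asList(shared, imageOnly);
        List<URI> editsDirs = new ArrayList<>(Arrays.asList(shared));

        System.out.println(classify(nameDirs, editsDirs));

        // Constructing a File only wraps a path; nothing is created on disk.
        File dir = new File(shared.getPath());
        System.out.println(dir.getPath() + " exists: " + dir.exists());
    }
}
```

The sketch uses List.remove(Object), which relies on URI.equals, where the original code compares with compareTo; for the simple file URIs shown here the two should agree.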

In the setStorageDirectories loop above, fsNameDirs and fsEditsDirs are the directories set by the dfs.namenode.name.dir and dfs.namenode.edits.dir parameters, respectively. By default, both point to the same directory. If a name directory also appears among the edits directories, its type is NameNodeDirType.IMAGE_AND_EDITS; otherwise it is NameNodeDirType.IMAGE. Note that although a File object representing each directory has been created from the configuration, the directory itself has not yet been created on the local file system. Back in the FSImage constructor, once the NNStorage object exists, the FSEditLog object is created. Its constructor is as follows:

    /**
     * Constructor for FSEditLog. Underlying journals are constructed, but
     * no streams are opened until open() is called.
     *
     * @param conf The namenode configuration
     * @param storage Storage object used by namenode
     * @param editsDirs List of journals to use
     */
    FSEditLog(Configuration conf, NNStorage storage, List<URI> editsDirs) {
      isSyncRunning = false;
      this.conf = conf;
      this.storage = storage;
      // In the format stage, the metrics value is null
      metrics = NameNode.getNameNodeMetrics();
      lastPrintTime = now();

      // If this list is empty, an error will be thrown on first use
      // of the editlog, as no journals will exist
      this.editsDirs = Lists.newArrayList(editsDirs);

      this.sharedEditsDirs = FSNamesystem.getSharedEditsDirs(conf);
    }

After the FSEditLog and NNStorageRetentionManager objects are created, the FSImage constructor finishes and control returns to the format method of NameNode. The next step is to create an FSNamesystem from the FSImage object. As mentioned above, FSNamesystem does the actual bookkeeping for the DataNodes. The constructor that does the real work is declared as follows; here the value of ignoreRetryCache is false:

    /**
     * Create an FSNamesystem associated with the specified image.
     *
     * Note that this does not load any data off of disk -- if you would
     * like that behavior, use {@link #loadFromDisk(Configuration)}
     *
     * @param conf configuration
     * @param fsImage The FSImage to associate with
     * @param ignoreRetryCache Whether or not should ignore the retry cache setup
     *                         step. For Secondary NN this should be set to true.
     * @throws IOException on bad configuration
     */
    FSNamesystem(Configuration conf, FSImage fsImage, boolean ignoreRetryCache)
        throws IOException

Because the constructor is long, it is not reproduced here. It instantiates objects such as BlockManager, DatanodeStatistics, FSDirectory, and CacheManager, and reads some settings from the configuration (this class will be studied later). After the FSNamesystem object is created, the initJournalsForWrite() method of FSEditLog is executed. The code for this method is as follows:

    private State state = State.UNINITIALIZED;

    public synchronized void initJournalsForWrite() {
      Preconditions.checkState(state == State.UNINITIALIZED ||
          state == State.CLOSED, "Unexpected state: %s", state);

      initJournals(this.editsDirs);
      state = State.BETWEEN_LOG_SEGMENTS;
    }
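The state check in initJournalsForWrite() behaves like a small state machine. The sketch below is a simplified model under stated assumptions: EditLogStateSketch is a hypothetical class, it keeps only the three states that appear in the excerpt, and a plain IllegalStateException stands in for the exception that Guava's Preconditions.checkState throws when the condition is false.

```java
/** Simplified, hypothetical model of the edit log state check; not Hadoop code. */
final class EditLogStateSketch {
    // Only the states that appear in the excerpt above are modelled here.
    enum State { UNINITIALIZED, BETWEEN_LOG_SEGMENTS, CLOSED }

    private State state = State.UNINITIALIZED;

    /** Allowed only from UNINITIALIZED or CLOSED, mirroring the excerpt. */
    synchronized void initJournalsForWrite() {
        if (state != State.UNINITIALIZED && state != State.CLOSED) {
            throw new IllegalStateException("Unexpected state: " + state);
        }
        // The real method builds the journal set at this point.
        state = State.BETWEEN_LOG_SEGMENTS;
    }

    synchronized State getState() {
        return state;
    }

    public static void main(String[] args) {
        EditLogStateSketch log = new EditLogStateSketch();
        System.out.println("before: " + log.getState());
        log.initJournalsForWrite();
        System.out.println("after:  " + log.getState());
        try {
            // A second call is rejected because the state is no longer
            // UNINITIALIZED or CLOSED.
            log.initJournalsForWrite();
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```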

initJournalsForWrite() first checks the state of the edit log. A newly created FSEditLog is in the UNINITIALIZED state. After initJournals() runs, the state becomes BETWEEN_LOG_SEGMENTS, meaning that no log segment has been opened yet. The journals themselves are initialized by initJournals(), whose code is as follows:

    private synchronized void initJournals(List<URI> dirs) {
      int minimumRedundantJournals = conf.getInt(
          DFSConfigKeys.DFS_NAMENODE_EDITS_DIR_MINIMUM_KEY,
          DFSConfigKeys.DFS_NAMENODE_EDITS_DIR_MINIMUM_DEFAULT);

      // The collection of journals
      journalSet = new JournalSet(minimumRedundantJournals);

      for (URI u : dirs) {
        boolean required = FSNamesystem.getRequiredNamespaceEditsDirs(conf)
            .contains(u);
        if (u.getScheme().equals(NNStorage.LOCAL_URI_SCHEME)) {
          StorageDirectory sd = storage.getStorageDirectory(u);
          if (sd != null) {
            journalSet.add(new FileJournalManager(conf, sd, storage),
                required, sharedEditsDirs.contains(u));
          }
        } else {
          journalSet.add(createJournal(u), required,
              sharedEditsDirs.contains(u));
        }
      }

      if (journalSet.isEmpty()) {
        LOG.error("No edits directories configured!");
      }
    }
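The shape of initJournals() (one journal entry per edits URI, with local file URIs treated differently from other schemes) can be sketched without any Hadoop dependency. Everything in this block is hypothetical and exists only for illustration: JournalSetSketch, its Journal holder, and the example URIs are invented, and the literal "file" is assumed here as the local scheme in place of NNStorage.LOCAL_URI_SCHEME.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

/** Hypothetical model of building a journal set from edits URIs; not Hadoop code. */
final class JournalSetSketch {
    /** One entry per edits location. */
    static final class Journal {
        final URI location;
        final boolean local;     // true for file:// URIs
        final boolean required;  // true if listed among the required dirs

        Journal(URI location, boolean local, boolean required) {
            this.location = location;
            this.local = local;
            this.required = required;
        }

        @Override
        public String toString() {
            return location + " (local=" + local + ", required=" + required + ")";
        }
    }

    static List<Journal> initJournals(List<URI> dirs, Set<URI> requiredDirs) {
        List<Journal> journals = new ArrayList<>();
        for (URI u : dirs) {
            // Assumption: "file" marks a local directory, as in the excerpt's
            // comparison against the local URI scheme.
            boolean local = "file".equals(u.getScheme());
            journals.add(new Journal(u, local, requiredDirs.contains(u)));
        }
        if (journals.isEmpty()) {
            System.err.println("No edits directories configured!");
        }
        return journals;
    }

    public static void main(String[] args) {
        URI localDir = URI.create("file:///tmp/sketch-nn/edits");
        URI remote = URI.create("qjournal://jn1:8485/sketch-cluster");
        List<Journal> journals =
            initJournals(Arrays.asList(localDir, remote), Collections.singleton(remote));
        for (Journal j : journals) {
            System.out.println(j);
        }
    }
}
```

Unlike the real method, the sketch does not skip local URIs that have no matching storage directory, and it records the required flag on the entry instead of passing it to a JournalSet.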

initJournals() creates a FileJournalManager for each local directory in the set passed in and adds it to the JournalSet object; non-local URIs are handled through createJournal(). Each FileJournalManager manages one directory that stores edits files. After the edits journals are initialized, the fsimage is formatted. The code is as follows:

    void format(FSNamesystem fsn, String clusterId) throws IOException {
      long fileCount = fsn.getTotalFiles();
      // Expect 1 file, which is the root inode
      Preconditions.checkState(fileCount == 1,
          "FSImage.format should be called with an uninitialized namesystem, has " +
          fileCount + " files");
      NamespaceInfo ns = NNStorage.newNamespaceInfo();
      LOG.info("Allocated new BlockPoolId: " + ns.getBlockPoolID());
      ns.clusterID = clusterId;

      storage.format(ns);
      editLog.formatNonFileJournals(ns);
      saveFSImageInAllDirs(fsn, 0);
    }

In FSImage.format, the fsimage file is created by the call saveFSImageInAllDirs(fsn, 0) on the last line. This method delegates the actual work to the following overload:

saveFSImageInAllDirs(source, NameNodeFile.IMAGE, txid, null);

This method starts one thread for each directory in which the fsimage is to be saved, and these threads carry out the task of writing the fsimage. Old edits and checkpoint files are then purged.
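The one-thread-per-directory pattern described above can be illustrated with plain Java threads. This is a sketch under stated assumptions and not how Hadoop serializes an image: ParallelSaveSketch and saveInAllDirs are hypothetical names, the payload is a dummy byte array, and the file name merely imitates the fsimage_<txid> naming used by Hadoop 2.x.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration of saving one file into several directories in parallel. */
final class ParallelSaveSketch {
    /** Starts one thread per directory, waits for all of them, returns the dirs that failed. */
    static List<Path> saveInAllDirs(List<Path> dirs, String fileName, byte[] payload)
            throws InterruptedException {
        List<Path> failed = Collections.synchronizedList(new ArrayList<Path>());
        List<Thread> savers = new ArrayList<>();
        for (Path dir : dirs) {
            Thread saver = new Thread(() -> {
                try {
                    Files.createDirectories(dir);
                    Files.write(dir.resolve(fileName), payload);
                } catch (IOException e) {
                    // Record the failure instead of aborting the other savers.
                    failed.add(dir);
                }
            }, "saver-" + dir.getFileName());
            savers.add(saver);
            saver.start();
        }
        for (Thread saver : savers) {
            saver.join();
        }
        return failed;
    }

    public static void main(String[] args) throws Exception {
        Path base = Files.createTempDirectory("fsimage-sketch");
        List<Path> dirs = Arrays.asList(base.resolve("name1"), base.resolve("name2"));
        List<Path> failed = saveInAllDirs(dirs, "fsimage_0000000000000000000",
                "demo".getBytes(StandardCharsets.UTF_8));
        System.out.println("saved under " + base + ", failed directories: " + failed);
    }
}
```

Collecting failures in a synchronized list lets the caller decide what to do with directories that could not be written, without one failing directory stopping the writes to the others.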

The analysis above shows that running hdfs namenode -format creates only the fsimage file; no edits file is created, although the related objects are. This can be confirmed on the local file system after formatting: the storage directory contains an fsimage file but no edits file.
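One way to perform this check is to list the storage directory and group the file names by prefix. The sketch below is a hypothetical helper, not part of Hadoop: StorageDirInspector is an invented name, the directory to inspect is supplied by the user (typically the current subdirectory under the configured dfs.namenode.name.dir), and the fsimage_ and edits_ prefixes are assumed from the Hadoop 2.x file naming.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper that lists files in a storage directory by name prefix. */
final class StorageDirInspector {
    /** Returns the sorted names of entries in dir whose name starts with prefix. */
    static List<String> namesWithPrefix(Path dir, String prefix) throws IOException {
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                String name = entry.getFileName().toString();
                if (name.startsWith(prefix)) {
                    names.add(name);
                }
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: StorageDirInspector <storage-dir>");
            return;
        }
        Path dir = Paths.get(args[0]);
        System.out.println("fsimage files: " + namesWithPrefix(dir, "fsimage_"));
        System.out.println("edits files:   " + namesWithPrefix(dir, "edits_"));
    }
}
```

On a freshly formatted NameNode, the edits list printed by this helper would be expected to be empty, in line with the conclusion above.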
