Hadoop 2.4.1 Learning: Source Code Analysis of fsimage and edits Creation
In Hadoop, fsimage stores the latest checkpoint of the namespace, and edits records the changes made to the namespace after that checkpoint. When hdfs namenode -format is executed, the fsimage and edits files are created according to the configuration. This article analyzes the source code that creates these files. The following code is taken from the format method of NameNode:
FSImage fsImage = new FSImage(conf, nameDirsToFormat, editDirsToFormat);
try {
  FSNamesystem fsn = new FSNamesystem(conf, fsImage);
  fsImage.getEditLog().initJournalsForWrite();

  if (!fsImage.confirmFormat(force, isInteractive)) {
    return true; // aborted
  }

  fsImage.format(fsn, clusterId);
} catch (IOException ioe) {
  LOG.warn("Encountered exception during format: ", ioe);
  fsImage.close();
  throw ioe;
}
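For context, this snippet is reached when the NameNode is started with the -format option. Below is a minimal sketch (an illustration, not part of the Hadoop source) of how the same path can be triggered programmatically; note that createNameNode terminates the JVM once formatting completes, and the name/edits directories are assumed to be set in the loaded configuration:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.HdfsConfiguration;
import org.apache.hadoop.hdfs.server.namenode.NameNode;

public class FormatDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new HdfsConfiguration(); // loads hdfs-site.xml, core-site.xml
    // Equivalent to running: hdfs namenode -format
    // Internally this dispatches to NameNode.format(conf, force, isInteractive),
    // the method containing the snippet above, then exits the JVM.
    NameNode.createNameNode(new String[] { "-format" }, conf);
  }
}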
The format code above mainly involves three classes: FSImage, FSNamesystem, and FSEditLog. FSImage is responsible for checkpointing, FSEditLog maintains the log of namespace changes, and FSNamesystem does the actual bookkeeping for the NameNode. The source code that creates the FSImage object is:
/**
 * Construct the FSImage. Set the default checkpoint directories.
 *
 * Setup storage and initialize the edit log.
 *
 * @param conf Configuration
 * @param imageDirs Directories the image can be stored in.
 * @param editsDirs Directories the editlog can be stored in.
 * @throws IOException if directories are invalid.
 */
protected FSImage(Configuration conf, Collection<URI> imageDirs,
    List<URI> editsDirs) throws IOException {
  this.conf = conf;
  /* NNStorage manages the StorageDirectories used by the NameNode. */
  storage = new NNStorage(conf, imageDirs, editsDirs);
  /* Decide whether to restore failed storage directories based on the value
   * of dfs.namenode.name.dir.restore. The default is false.
   */
  if (conf.getBoolean(DFSConfigKeys.DFS_NAMENODE_NAME_DIR_RESTORE_KEY,
                      DFSConfigKeys.DFS_NAMENODE_NAME_DIR_RESTORE_DEFAULT)) {
    storage.setRestoreFailedStorage(true);
  }

  this.editLog = new FSEditLog(conf, storage, editsDirs);
  /* NNStorageRetentionManager checks the NameNode storage directories and
   * enforces the retention policy on the fsimage and edits files.
   */
  archivalManager = new NNStorageRetentionManager(conf, storage, editLog);
}
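The dfs.namenode.name.dir.restore switch checked above can be set in hdfs-site.xml or programmatically. A minimal sketch (hypothetical usage, not part of the format path) of enabling it:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSConfigKeys;
import org.apache.hadoop.hdfs.HdfsConfiguration;

Configuration conf = new HdfsConfiguration();
// "dfs.namenode.name.dir.restore": when true, the NameNode attempts to bring
// previously failed name directories back into service at checkpoint time.
conf.setBoolean(DFSConfigKeys.DFS_NAMENODE_NAME_DIR_RESTORE_KEY, true); // default: false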
The FSImage constructor thus creates the NNStorage, FSEditLog, and NNStorageRetentionManager objects. The source code of the NNStorage constructor is as follows:
public NNStorage(Configuration conf, Collection<URI> imageDirs,
    Collection<URI> editsDirs) throws IOException {
  super(NodeType.NAME_NODE);

  storageDirs = new CopyOnWriteArrayList<StorageDirectory>();

  // this may modify the editsDirs, so copy before passing in
  setStorageDirectories(imageDirs,
                        Lists.newArrayList(editsDirs),
                        FSNamesystem.getSharedEditsDirs(conf));
}
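For reference, the imageDirs and editsDirs passed into this constructor come from the configuration. In the format path they are obtained as shown below (from NameNode.format; getNamespaceDirs reads dfs.namenode.name.dir, and getNamespaceEditsDirs reads dfs.namenode.edits.dir, which defaults to the name directories when unset):

Collection<URI> nameDirsToFormat = FSNamesystem.getNamespaceDirs(conf);
List<URI> editDirsToFormat = FSNamesystem.getNamespaceEditsDirs(conf);
FSImage fsImage = new FSImage(conf, nameDirsToFormat, editDirsToFormat);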
The setStorageDirectories method of NNStorage initializes the directories that store the fsimage and edits files. The full method is not analyzed here; we focus on the part that initializes the fsimage directories:
// Add all name dirs with appropriate NameNodeDirType
for (URI dirName : fsNameDirs) {
  checkSchemeConsistency(dirName);
  boolean isAlsoEdits = false;
  for (URI editsDirName : fsEditsDirs) {
    if (editsDirName.compareTo(dirName) == 0) {
      isAlsoEdits = true;
      fsEditsDirs.remove(editsDirName);
      break;
    }
  }
  NameNodeDirType dirType = (isAlsoEdits) ?
      NameNodeDirType.IMAGE_AND_EDITS :
      NameNodeDirType.IMAGE;
  // Add to the list of storage directories, only if the
  // URI is of type file://
  if (dirName.getScheme().compareTo("file") == 0) {
    this.addStorageDir(new StorageDirectory(new File(dirName.getPath()),
        dirType,
        sharedEditsDirs.contains(dirName))); // Don't lock the dir if it's shared.
  }
}
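To make the classification concrete, here is a standalone sketch (illustrative only, with hypothetical directory values) that mimics the loop above: a directory listed under both dfs.namenode.name.dir and dfs.namenode.edits.dir is typed IMAGE_AND_EDITS, the remaining name directories are typed IMAGE, and any leftover edits directories are registered later as EDITS only.

import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DirTypeDemo {
  public static void main(String[] args) {
    List<URI> nameDirs = Arrays.asList(URI.create("file:///data/dfs/name"));
    List<URI> editsDirs = new ArrayList<URI>(Arrays.asList(
        URI.create("file:///data/dfs/name"),
        URI.create("file:///data/dfs/edits")));

    for (URI dir : nameDirs) {
      // Same effect as the compareTo/remove loop in setStorageDirectories.
      boolean isAlsoEdits = editsDirs.remove(dir);
      System.out.println(dir + " -> " + (isAlsoEdits ? "IMAGE_AND_EDITS" : "IMAGE"));
    }
    // Directories still left in editsDirs would be added as EDITS-only.
    System.out.println("edits-only: " + editsDirs);
  }
}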