After Hadoop HDFS is deployed, it cannot be used immediately; you must first format the configured file system. Two concepts need attention here. The first is the file system: at this point it does not yet exist physically, and it may be more apt to think of it as a network disk. The second is formatting: here it does not mean formatting a local disk in the traditional sense, but rather some cleanup and preparation work. This article focuses on formatting on the namenode.
As we all know, the namenode manages the metadata of the namespace (in practice, directories and files) of the entire distributed file system. To keep this data reliable, it also records an operation log, so the namenode persists both to the local file system. If you are using HDFS for the first time, you must run the -format command before starting the namenode service. So what exactly does the namenode's format command do?
On the namenode, two paths matter most: one stores the metadata and the other stores the operation log. Both come from the configuration file, through the properties dfs.name.dir and dfs.name.edits.dir, and by default both resolve to a path of the form /tmp/hadoop/dfs/name (more precisely, ${hadoop.tmp.dir}/dfs/name). During formatting, the namenode clears all files under the two directories and then creates the following files under the dfs.name.dir directory:
- {dfs.name.dir}/current/fsimage
- {dfs.name.dir}/current/fstime
- {dfs.name.dir}/current/VERSION
- {dfs.name.dir}/image/fsimage
The following files are created under the dfs.name.edits.dir directory:
- {dfs.name.edits.dir}/current/edits
- {dfs.name.edits.dir}/current/fstime
- {dfs.name.edits.dir}/current/VERSION
- {dfs.name.edits.dir}/image/fsimage
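The two formatting steps described above (clear both directories, then create the listed files) can be sketched in Python. This is only a simulation under stated assumptions, not Hadoop's own implementation: the function name format_namenode_dirs is hypothetical, the files are created empty as placeholders (the real namenode writes actual image, log, and version data), and the version file is assumed to be named VERSION on disk.

```python
import shutil
import tempfile
from pathlib import Path

# Relative paths the article lists for each directory after formatting.
# Assumption: the version file is spelled VERSION on disk.
NAME_DIR_FILES = ["current/fsimage", "current/fstime", "current/VERSION", "image/fsimage"]
EDITS_DIR_FILES = ["current/edits", "current/fstime", "current/VERSION", "image/fsimage"]


def format_namenode_dirs(name_dir, edits_dir):
    """Simulate formatting: clear both directories, then create placeholder files."""
    name_dir, edits_dir = Path(name_dir), Path(edits_dir)

    # Step 1: remove everything under both directories (a set avoids
    # clearing the same directory twice when both paths are equal).
    for root in {name_dir, edits_dir}:
        if root.exists():
            shutil.rmtree(root)
        root.mkdir(parents=True)

    # Step 2: create the expected files as empty placeholders.
    created = set()
    for root, files in ((name_dir, NAME_DIR_FILES), (edits_dir, EDITS_DIR_FILES)):
        for rel in files:
            path = root / rel
            path.parent.mkdir(parents=True, exist_ok=True)
            path.touch()
            created.add(path)
    return sorted(created)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for p in format_namenode_dirs(Path(tmp) / "name", Path(tmp) / "edits"):
            print(p.relative_to(tmp))
```

Clearing both directories before creating any file matters when the two paths are the same; otherwise the second cleanup would delete files that were just created.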
So what are these files used for?
Before introducing the purpose of these files, consider the case where dfs.name.dir and dfs.name.edits.dir are configured to the same directory. After the namenode is formatted, the following files are generated: {dfs.name.dir}/current/fsimage, {dfs.name.dir}/current/edits, {dfs.name.dir}/current/fstime, {dfs.name.dir}/current/VERSION, and {dfs.name.dir}/image/fsimage. As you can see, files that share a name are in fact the same file. Configuring dfs.name.dir and dfs.name.edits.dir with the same value can improve the efficiency of the namenode. Now let's look at what each of these files is for.
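The effect of pointing both properties at the same directory can be checked with a small Python sketch. It rests on several assumptions: the configuration snippet and its path /data/hadoop/name are made up for illustration, read_property and files_after_format are hypothetical helpers rather than Hadoop APIs, and dfs.name.edits.dir is assumed to fall back to the value of dfs.name.dir when it is not set.

```python
import xml.etree.ElementTree as ET

# Hypothetical hdfs-site.xml content; the directory path is made up.
HDFS_SITE = """
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/data/hadoop/name</value>
  </property>
  <property>
    <name>dfs.name.edits.dir</name>
    <value>/data/hadoop/name</value>
  </property>
</configuration>
"""

# Relative paths created in each directory (VERSION spelling is an assumption).
NAME_DIR_FILES = ["current/fsimage", "current/fstime", "current/VERSION", "image/fsimage"]
EDITS_DIR_FILES = ["current/edits", "current/fstime", "current/VERSION", "image/fsimage"]


def read_property(xml_text, key, default=None):
    """Return the <value> of the property named key, or default if it is absent."""
    root = ET.fromstring(xml_text.strip())
    for prop in root.findall("property"):
        if prop.findtext("name") == key:
            return prop.findtext("value")
    return default


def files_after_format(xml_text):
    """List the distinct file paths expected after formatting."""
    name_dir = read_property(xml_text, "dfs.name.dir")
    # Assumption: the edits directory defaults to the name directory.
    edits_dir = read_property(xml_text, "dfs.name.edits.dir", default=name_dir)
    paths = {f"{name_dir}/{rel}" for rel in NAME_DIR_FILES}
    paths |= {f"{edits_dir}/{rel}" for rel in EDITS_DIR_FILES}
    return sorted(paths)


if __name__ == "__main__":
    for path in files_after_format(HDFS_SITE):
        print(path)
```

Because the paths are collected in a set, identically named files under the shared directory collapse into one entry, which leaves five distinct files instead of eight.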
fsimage: stores the metadata of the namespace (in practice, directories and files). The file structure is as follows:
edits: stores the log of namespace operations, which is used to restore the namenode;
fstime: stores the time of the last checkpoint of the metadata;
VERSION: stores the namenode version information, such as the namespace ID (version number). The content is as follows:
/image/fsimage: the /current/fsimage file as it was before the last commit.
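As a rough illustration of the VERSION entry, the sketch below parses a key=value text file of the kind Java writes in properties format. The sample text is hypothetical: the key names (namespaceID, cTime, storageType, layoutVersion) are an assumption based on what older Hadoop releases commonly write, the values are made up, and the exact keys may differ between versions. The helper parse_version_file is illustrative and not a Hadoop API.

```python
# Hypothetical VERSION content; keys are assumed and values are made up.
SAMPLE_VERSION = """\
#Tue Jan 01 00:00:00 UTC 2013
namespaceID=123456789
cTime=0
storageType=NAME_NODE
layoutVersion=-32
"""


def parse_version_file(text):
    """Parse simple key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so values may contain '=' themselves.
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


if __name__ == "__main__":
    for key, value in sorted(parse_version_file(SAMPLE_VERSION).items()):
        print(f"{key} = {value}")
```

On a real cluster, the same helper could be pointed at the text of {dfs.name.dir}/current/VERSION to inspect the stored namespace ID.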