Hadoop HDFS's Three Components: NameNode, SecondaryNameNode, and DataNode

Source: Internet
Author: User
Tags: time interval


HDFS consists primarily of three components: the NameNode, the SecondaryNameNode, and the DataNode. The NameNode and SecondaryNameNode run on the master node, while DataNodes run on the slave nodes.



The HDFS architecture is shown below:

[Figure: HDFS architecture]

1. NameNode



The NameNode manages the namespace of the HDFS file system: it maintains the file system tree and the metadata for all files and directories in the tree. The NameNode also handles opening, closing, moving, and renaming files and directories. The actual file data is handled by the DataNodes.



When a client initiates a request, the request first arrives at the NameNode. The NameNode analyzes the request and tells the client which DataNodes hold the relevant data blocks. The client then interacts directly with those DataNodes.
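The lookup-then-fetch pattern above can be sketched as follows. This is a minimal in-memory simulation with made-up class names, not the real Hadoop client API: the client asks the NameNode once for block locations, then streams each block directly from a DataNode.

```python
# Hypothetical in-memory stand-ins for the NameNode and DataNodes.
class NameNode:
    def __init__(self):
        # filename -> ordered list of (block_id, [datanode addresses])
        self.block_map = {}

    def get_block_locations(self, path):
        return self.block_map[path]


class DataNode:
    def __init__(self):
        self.blocks = {}  # block_id -> bytes

    def read_block(self, block_id):
        return self.blocks[block_id]


def client_read(namenode, datanodes, path):
    """Fetch a file by asking the NameNode once for block locations,
    then reading each block directly from the first DataNode that holds it."""
    data = b""
    for block_id, locations in namenode.get_block_locations(path):
        data += datanodes[locations[0]].read_block(block_id)
    return data
```

Note how the NameNode never touches file contents; it only brokers the metadata lookup, which is why it stays off the data path in HDFS.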



The types of metadata stored in the NameNode are:



(1) the file and directory names and their hierarchical relationship; (2) the owner and permissions of each file and directory; (3) the names of the blocks and which blocks each file is composed of.



It is important to note that the metadata saved by the NameNode does not include the location of each block; it records only the block names and which blocks each file consists of. Block location information is reconstructed from the DataNodes each time the NameNode restarts. The NameNode then maintains communication with the DataNodes through a heartbeat mechanism, monitoring the file system in real time for normal operation.
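The split described above can be sketched as follows. This is a simplified illustration with assumed structures, not Hadoop's actual internals: the namespace and block lists are persisted, while block locations live only in memory and are rebuilt from DataNode block reports.

```python
# Persisted in the fsimage: the namespace tree, ownership/permissions,
# and the block list for each file -- but NOT block locations.
persistent_metadata = {
    "/user/alice/data.txt": {
        "owner": "alice",
        "permissions": "rw-r--r--",
        "blocks": ["blk_1001", "blk_1002"],
    },
}

# Rebuilt in memory from DataNode block reports after every restart.
block_locations = {}


def on_block_report(datanode_id, reported_blocks):
    """Handle a DataNode block report: record which node holds which block."""
    for block_id in reported_blocks:
        block_locations.setdefault(block_id, set()).add(datanode_id)
```

Because `block_locations` can always be regenerated from the DataNodes themselves, there is no need to persist it, which keeps the fsimage smaller and restart recovery simple.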



2. DataNode



DataNodes run on the slave nodes, also known as worker nodes. A DataNode is responsible for storing data blocks, serving read and write requests from clients, and carrying out block creation, deletion, and replication on instruction from the NameNode. Each DataNode also periodically sends its list of stored blocks to the NameNode through the heartbeat mechanism. In addition, DataNodes communicate with one another to copy data blocks between nodes, achieving redundancy.
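How the NameNode might turn the block reports into replication instructions can be sketched as follows. This is a toy illustration with assumed names and a hard-coded replication factor, not Hadoop code: for each under-replicated block, pick a current holder as the source and a node that lacks the block as the target.

```python
REPLICATION_FACTOR = 3  # HDFS's commonly-cited default replication factor


def replication_commands(block_locations, live_datanodes):
    """For each block with fewer than REPLICATION_FACTOR replicas, emit
    (block_id, source_node, target_node) copy commands, choosing targets
    from live nodes that do not yet hold the block."""
    commands = []
    for block_id, holders in block_locations.items():
        needed = REPLICATION_FACTOR - len(holders)
        candidates = [d for d in live_datanodes if d not in holders]
        for target in candidates[:needed]:
            commands.append((block_id, next(iter(holders)), target))
    return commands
```

The real NameNode also weighs rack placement and node load when choosing targets; the sketch only shows the under-replication check itself.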



3. SecondaryNameNode



The NameNode's metadata is stored in the fsimage file, which the NameNode reads into memory on each restart. To prevent data loss while running, every operation the NameNode performs is also continuously appended to a local editlog file.



When a checkpoint is triggered, the operations in the editlog are applied to the fsimage, the new version of the fsimage is written back to disk, and the old transactions are removed from the editlog. Checkpoints have two trigger mechanisms: (1) a time interval in seconds (dfs.namenode.checkpoint.period), and (2) a threshold on the number of file system transactions accumulated since the last checkpoint (dfs.namenode.checkpoint.txns).



The merging of the fsimage and editlog files is performed by the SecondaryNameNode, which works as follows:



(1) Before the merge, the SecondaryNameNode notifies the NameNode to write all new operations to a fresh editlog file, named edits.new;



(2) the SecondaryNameNode fetches the fsimage and editlog files from the NameNode;



(3) the SecondaryNameNode merges the fsimage and editlog into a new fsimage file;



(4) the NameNode retrieves the merged new fsimage from the SecondaryNameNode, replaces the old fsimage with it, and replaces the editlog with the edits.new file created in step (1);



(5) the checkpoint timestamp in the fstime file is updated.
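The five steps above can be sketched end to end. This is an illustrative simulation using a plain dictionary in place of the real on-disk files: roll the editlog, replay the captured edits onto a copy of the fsimage, swap the merged image in, and stamp fstime.

```python
import time


def checkpoint(namenode):
    """namenode is assumed to be a dict with 'fsimage' (path -> metadata),
    'editlog' (a list of (op, path, value) entries), and 'fstime'
    (the last checkpoint timestamp)."""
    # (1) Roll the log: new operations go to a fresh editlog (edits.new).
    edits_to_merge = namenode["editlog"]
    namenode["editlog"] = []  # plays the role of edits.new

    # (2)+(3) Replay the captured edits onto a copy of the fsimage.
    merged = dict(namenode["fsimage"])
    for op, path, value in edits_to_merge:
        if op == "put":
            merged[path] = value
        elif op == "delete":
            merged.pop(path, None)

    # (4) Swap the merged image in; edits.new is already the live editlog.
    namenode["fsimage"] = merged

    # (5) Record the checkpoint time in fstime.
    namenode["fstime"] = time.time()
```

In Hadoop this replay work runs on the SecondaryNameNode precisely so the NameNode does not have to pause to do the merge itself; the sketch collapses both roles into one function for brevity.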



In conclusion:



(1) fsimage: stores the HDFS metadata as of the last checkpoint;



(2) editlog: stores the changes to HDFS metadata that have occurred since the last checkpoint;



(3) fstime: stores the timestamp of the last checkpoint.

