NameNode and DataNode in HDFS


An HDFS cluster operates in master-slave mode, with two main types of nodes: one NameNode (the master) and multiple DataNodes. The NameNode manages the file system namespace. It maintains the file system tree and the metadata for all files and directories in that tree.


HDFS architecture:

[Figure: HDFS architecture diagram. Image: http://s3.51cto.com/wyfs02/M02/6E/94/wKiom1V_-OPSgEV_AATTZm5yVSc993.jpg]



NameNode:

The NameNode manages the file system namespace. It maintains the file system tree and the metadata for all files and directories in that tree. This information is kept in two files: the namespace image and the edit log. The metadata is held in RAM, and both files are also persisted on the local disk. The NameNode also knows which DataNodes hold the blocks of each file, but it does not persist these block locations, because they are rebuilt from the DataNodes when the system restarts.
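To make the split between persisted and in-memory state concrete, here is a minimal toy sketch. It is not Hadoop code: the class `ToyNameNode`, its methods, and the JSON layout are all hypothetical names chosen for illustration. It assumes the namespace can be modeled as a mapping from path to block ids, and that the current namespace is the image plus the replayed edit log.

```python
import json

VALID_OPS = ("create", "delete")


class ToyNameNode:
    """Toy model: namespace = image + replayed edit log; block locations live in RAM only."""

    def __init__(self, image=None, edit_log=None):
        # Persisted: the namespace as of the last checkpoint (path -> list of block ids).
        self.image = dict(image or {})
        # Persisted: operations recorded since that checkpoint, as [op, path, block_ids].
        self.edit_log = [list(entry) for entry in (edit_log or [])]
        # Not persisted: block id -> set of DataNode names.
        self.block_locations = {}
        # Rebuild the in-memory namespace by replaying the edit log over the image.
        self.namespace = {path: list(blocks) for path, blocks in self.image.items()}
        for op, path, block_ids in self.edit_log:
            self._apply(op, path, block_ids)

    def _apply(self, op, path, block_ids):
        if op == "create":
            self.namespace[path] = list(block_ids)
        elif op == "delete":
            self.namespace.pop(path, None)

    def log(self, op, path, block_ids=()):
        """Record the operation in the edit log, then apply it in memory."""
        if op not in VALID_OPS:
            raise ValueError(f"unknown operation: {op}")
        self.edit_log.append([op, path, list(block_ids)])
        self._apply(op, path, block_ids)

    def persisted_state(self):
        """What would be written to disk: image and edit log, but no block locations."""
        return json.dumps({"image": self.image, "edit_log": self.edit_log})


nn = ToyNameNode(image={"/a.txt": ["blk_1"]})
nn.log("create", "/b.txt", ["blk_2", "blk_3"])
nn.log("delete", "/a.txt")
nn.block_locations["blk_2"] = {"dn1"}  # known only in memory

# Simulate a restart: only the persisted state survives.
restarted = ToyNameNode(**json.loads(nn.persisted_state()))
```

After the simulated restart, `restarted.namespace` contains only `/b.txt`, while `restarted.block_locations` is empty until the DataNodes report their blocks again.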


NameNode structure diagram:

[Figure: abstract diagram of the NameNode structure. Image: http://s3.51cto.com/wyfs02/M00/6E/94/wKiom1V_-PnxnGs8AAPoPz9H2zo240.jpg]

The client accesses the file system on behalf of the user by interacting with the NameNode and the DataNodes. It exposes a set of file system interfaces, so application code can do what it needs with little knowledge of the NameNode and DataNodes.


DataNode:

DataNodes are the worker nodes of the file system. They store and retrieve blocks as directed by clients or the NameNode, and they periodically send the NameNode a list of the blocks they store.
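The block list that each DataNode sends is what lets the NameNode know where every block lives. The following is a rough sketch of that idea, not the real Hadoop protocol: `ToyDataNode`, `block_report`, and `build_block_map` are hypothetical names, and block ids are plain strings.

```python
from collections import defaultdict


class ToyDataNode:
    """Toy worker node that stores blocks and can list what it holds."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}  # block id -> bytes

    def store(self, block_id, data):
        self.blocks[block_id] = data

    def retrieve(self, block_id):
        return self.blocks[block_id]

    def block_report(self):
        """The list of block ids this node currently stores."""
        return sorted(self.blocks)


def build_block_map(datanodes):
    """NameNode side: rebuild block id -> set of DataNode names from the reports."""
    locations = defaultdict(set)
    for datanode in datanodes:
        for block_id in datanode.block_report():
            locations[block_id].add(datanode.name)
    return dict(locations)


dn1, dn2 = ToyDataNode("dn1"), ToyDataNode("dn2")
dn1.store("blk_1", b"hello")
dn2.store("blk_1", b"hello")  # a second replica of the same block
dn2.store("blk_2", b"world")

block_map = build_block_map([dn1, dn2])
```

Here `block_map` ends up listing both nodes for `blk_1` and only `dn2` for `blk_2`, built purely from what the nodes reported.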


NameNode fault tolerance mechanisms:

HDFS cannot work without the NameNode. In fact, if the machine running the NameNode breaks down, all the files in the file system are lost, because there is no other way to reconstruct the files from the blocks spread across the DataNodes. The fault tolerance of the NameNode is therefore very important, and Hadoop provides two mechanisms for it.


The first way is to back up the file system metadata that is persisted on disk. Hadoop can be configured so that the NameNode writes its persistent state to several file systems. These writes are synchronous and atomic. A common configuration is to write the persistent state to the local disk and also to a remotely mounted network file system (NFS).
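In Hadoop 2 and later, the list of metadata directories is set with the `dfs.namenode.name.dir` property, which accepts a comma-separated list. What "synchronous and atomic" means for such a write can be sketched in a few lines. This is only an analogy under stated assumptions, not how the NameNode is implemented: `write_to_all` is a hypothetical helper, and two temporary directories stand in for the local disk and the NFS mount.

```python
import os
import tempfile


def write_to_all(directories, filename, data):
    """Write data to filename under every directory; return only once all copies are on disk."""
    for directory in directories:
        final_path = os.path.join(directory, filename)
        tmp_path = final_path + ".tmp"
        with open(tmp_path, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # synchronous: force the bytes to storage before continuing
        # Atomic: a reader sees either the old file or the complete new one, never a partial write.
        os.replace(tmp_path, final_path)


local_dir = tempfile.mkdtemp(prefix="local_")
nfs_dir = tempfile.mkdtemp(prefix="nfs_")  # stand-in for a remotely mounted NFS directory

write_to_all([local_dir, nfs_dir], "fsimage", b"toy namespace image")
```

The call returns only after both directories hold a complete copy, so losing the local disk still leaves an up-to-date copy on the second file system.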


The second way is to run a secondary NameNode. Despite its name, the secondary NameNode cannot act as a NameNode. Its primary role is to periodically merge the namespace image with the edit log, so that the edit log does not grow too large. The secondary NameNode typically runs on a separate physical machine, and it keeps a copy of the merged namespace image that can be used if the NameNode goes down. However, the state of the secondary NameNode always lags behind that of the NameNode, so some data loss is unavoidable when the NameNode fails. In this case, the usual approach is to copy the NameNode metadata files from the remotely mounted network file system (NFS) described in the first way to the secondary NameNode, and then run the secondary NameNode as the new NameNode.
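The merge and the resulting lag can be illustrated with a small sketch. It is a simplified model rather than the actual checkpoint procedure: `checkpoint` is a hypothetical function, the image is a mapping from path to block ids, and the edit log is a list of `(op, path, block_ids)` tuples.

```python
def checkpoint(image, edit_log):
    """Merge the edit log into a copy of the image; return the new image and an empty edit log."""
    new_image = {path: list(blocks) for path, blocks in image.items()}
    for op, path, block_ids in edit_log:
        if op == "create":
            new_image[path] = list(block_ids)
        elif op == "delete":
            new_image.pop(path, None)
        else:
            raise ValueError(f"unknown operation: {op}")
    return new_image, []


# State on the NameNode before the checkpoint.
image = {"/a.txt": ["blk_1"]}
edit_log = [("create", "/b.txt", ["blk_2"]), ("delete", "/a.txt", [])]

# The secondary NameNode merges the two and keeps a copy of the merged image.
secondary_image, edit_log = checkpoint(image, edit_log)

# An edit made after the checkpoint exists only in the NameNode's edit log.
edit_log.append(("create", "/c.txt", ["blk_3"]))

# Recovery from a full copy of the metadata (for example the NFS copy):
# merged image plus the edits recorded since the checkpoint.
recovered, _ = checkpoint(secondary_image, edit_log)
```

The secondary's copy does not contain `/c.txt`, which is the lag described above. Replaying the newer edit log from the NFS copy brings it back.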


This article is from the "David" blog. Please keep this source attribution: http://davidbj.blog.51cto.com/4159484/1662449
