HDFS Learning Experience

Source: Internet
Author: User

HDFS: the Hadoop File System

Section One: The file structure of HDFS

To learn HDFS, you first need to understand its file structure and how it updates and saves data. HDFS is made up of three main parts: the NameNode, the DataNodes, and the SecondaryNameNode.

As I understand it, the three relate to each other as master, servants, and secretary. The NameNode plays the master: it is the manager and the decision-maker. The DataNodes are the servants, or more accurately a whole group of them; their job is to carry out the master's orders, and they are always the rank and file. The SecondaryNameNode is the secretary. The master has a large pile of records to maintain, and the secretary's job is to step in at fixed intervals, or once the records have piled up to a certain size, and help the master tidy them so that they stay in their most efficient form.

1.1 NameNode

As the head of HDFS, the NameNode's main job is to accept client requests and decide how data is read, split, and stored. The NameNode keeps its own data in two places: memory and hard disk.

The data in memory is mainly metadata. Metadata works like an index: through it you can easily find the location of the data you need, including the locations of its replicas. The metadata exists primarily to make reading data in HDFS easier.
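The index idea can be sketched in a few lines of Python. This is only an illustration of a path-to-block-to-replica lookup, not how the NameNode is actually implemented; the class names, the file path, and the DataNode names (dn1, dn2, ...) are all made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class BlockInfo:
    """One block of a file and the DataNodes that hold a replica of it."""
    block_id: int
    datanodes: List[str]


@dataclass
class Namespace:
    """Hypothetical in-memory index: file path -> ordered list of blocks."""
    files: Dict[str, List[BlockInfo]] = field(default_factory=dict)

    def add_file(self, path: str, blocks: List[BlockInfo]) -> None:
        self.files[path] = blocks

    def locate(self, path: str) -> List[Tuple[int, List[str]]]:
        """Return (block_id, replica locations) for every block of a file."""
        return [(b.block_id, list(b.datanodes)) for b in self.files[path]]


ns = Namespace()
ns.add_file(
    "/logs/a.txt",
    [
        BlockInfo(1, ["dn1", "dn2", "dn3"]),
        BlockInfo(2, ["dn2", "dn3", "dn4"]),
    ],
)
locations = ns.locate("/logs/a.txt")
```

A reader only needs the path: the index answers which blocks make up the file and which DataNodes hold each replica.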

More data is kept on the hard disk. A freshly formatted NameNode generates the following directory structure:

${dfs.name.dir}/current/VERSION
                       /edits
                       /fsimage
                       /fstime

That is four files under one folder. The VERSION file contains some basic information about the NameNode, including a namespaceID that uniquely identifies this NameNode. Every DataNode must carry the same ID: in a normally running HDFS, the namespaceID of the NameNode and of the DataNodes is exactly the same.
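The ID check can be sketched as follows, assuming the VERSION file is plain key=value text with a namespaceID entry. The sample file contents and ID values are invented for this example, and the helper names are hypothetical.

```python
def parse_version(text):
    """Parse key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def same_namespace(namenode_text, datanode_text):
    """True when both VERSION texts carry the same namespaceID."""
    nn_id = parse_version(namenode_text).get("namespaceID")
    dn_id = parse_version(datanode_text).get("namespaceID")
    return nn_id is not None and nn_id == dn_id


# Sample contents, invented for illustration only.
NAMENODE_VERSION = """# sample NameNode VERSION
namespaceID=123456789
storageType=NAME_NODE
"""
DATANODE_VERSION = """namespaceID=123456789
storageType=DATA_NODE
"""
MISMATCHED_VERSION = "namespaceID=42\nstorageType=DATA_NODE\n"

matches = same_namespace(NAMENODE_VERSION, DATANODE_VERSION)
mismatch = same_namespace(NAMENODE_VERSION, MISMATCHED_VERSION)
```

Running the same comparison against each DataNode's VERSION file is a quick way to spot a node that belongs to a different namespace.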

The remaining three files are:

edits is the log file of HDFS. It records the write operations (changes to the file system) performed on the NameNode. It is the first important piece of information the NameNode stores: it holds all recent operation records, along with the status and content of each operation. edits is very important because it is the file used to synchronize the NameNode's data.

fsimage is the on-disk image of the metadata that the NameNode holds in memory. The image file is not kept in sync with the metadata in real time; it is updated to match the in-memory metadata only when certain conditions are reached. The basis for this synchronization is edits.

fstime records when the image was last generated or modified; nothing more needs to be said about it.

The NameNode is mainly involved in read and write operations. Reads are relatively simple; writes go through a process.

For a write, the client first sends a write request to the NameNode. The NameNode finds suitable storage locations for the data and its replicas, records the operation in the log file, and returns the location details to the client. The client then hands the data to the DataNodes chosen by the NameNode's rules. When the write succeeds, but before the success code is returned to the client, the log file is flushed and synchronized; in effect a status flag is added to the log entry. After success has been returned to the client, the NameNode generates the in-memory metadata, from which fsimage is later updated.
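The ordering described here (log first, flush and sync, then update memory) is the usual write-ahead logging pattern. Below is a minimal sketch of that ordering only. The TinyNameNode class and the JSON-lines log format are assumptions made for the example; the real edits file uses its own format.

```python
import json
import os
import tempfile


class TinyNameNode:
    """Hypothetical stand-in that logs an operation before applying it."""

    def __init__(self, edits_path):
        self.edits_path = edits_path
        self.metadata = {}  # in-memory index: path -> DataNode list

    def record_write(self, path, datanodes):
        entry = {"op": "add", "path": path, "datanodes": datanodes}
        # 1. Append the operation to the log and force it to disk.
        with open(self.edits_path, "a", encoding="utf-8") as log:
            log.write(json.dumps(entry) + "\n")
            log.flush()
            os.fsync(log.fileno())
        # 2. Only after the log is durable, update the in-memory metadata.
        self.metadata[path] = datanodes
        # 3. Report success to the caller.
        return True


workdir = tempfile.mkdtemp()
nn = TinyNameNode(os.path.join(workdir, "edits"))
ok = nn.record_write("/data/a.txt", ["dn1", "dn2", "dn3"])

with open(nn.edits_path, encoding="utf-8") as f:
    logged = [json.loads(line) for line in f]
```

Because the log entry reaches disk before success is reported, an acknowledged write can be recovered from the log after a crash.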

While all this is going on, the SecondaryNameNode, as the subordinate, is working the whole time. Its job is to resolve the inconsistency between the metadata and fsimage. Why use a SecondaryNameNode to help manage the NameNode? Because the NameNode writes constantly, it produces a lot of log data, and if the NameNode restarted, loading these logs would take a long time. Once the SecondaryNameNode has processed edits and fsimage, the edits file always stays relatively small, so even if the NameNode restarts it can start quickly and return to its previous state.

How the SecondaryNameNode updates fsimage and edits

(1) The SecondaryNameNode first fetches fsimage and edits from the NameNode via HTTP.

(2) The SecondaryNameNode reads fsimage into memory, then replays all the operations in edits and creates a new fsimage file.

(3) The SecondaryNameNode sends the new fsimage to the NameNode via HTTP.

(4) On the NameNode, the new fsimage replaces the old fsimage, the edits file is replaced as well, and fstime is updated.
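The merge in steps (2) to (4) can be sketched as below. The HTTP transfers are left out, and the dictionary image and the "add"/"delete" operation names are assumptions made for the example rather than the real file formats.

```python
def apply_edit(image, edit):
    """Apply one logged operation to the image in place."""
    op = edit["op"]
    if op == "add":
        image[edit["path"]] = edit["datanodes"]
    elif op == "delete":
        image.pop(edit["path"], None)
    else:
        raise ValueError("unknown operation: " + repr(op))


def checkpoint(fsimage, edits):
    """Replay edits on a copy of fsimage; return (new image, empty edits)."""
    new_image = dict(fsimage)  # leave the old image untouched
    for edit in edits:
        apply_edit(new_image, edit)
    return new_image, []  # the log starts over after a checkpoint


old_image = {"/a": ["dn1"]}
edits = [
    {"op": "add", "path": "/b", "datanodes": ["dn2"]},
    {"op": "delete", "path": "/a"},
]
new_image, remaining_edits = checkpoint(old_image, edits)
```

After the checkpoint the new image already contains every logged change, which is why the log can start over and stay small.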

1.2 Secondary NameNode

The SecondaryNameNode's file directory is exactly the same as the NameNode's, except that its root directory may be different, depending on the configuration in Hdfs-.xml.

While helping the NameNode, the SecondaryNameNode also keeps its own local copy of the NameNode's latest edits and fsimage files. Its role is to assist the NameNode and, when the NameNode goes down, to help restore it. There are two ways to do this:

(1) Copy all the local files on the SecondaryNameNode (edits, fsimage, fstime) directly to a new NameNode.

(2) Use the SecondaryNameNode as the new NameNode.
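Option (1) amounts to a plain file copy. The sketch below shows it on temporary directories with placeholder files. The directory layout and the restore_from_checkpoint helper are hypothetical, and a real recovery would also need the NameNode configuration pointed at the restored directory.

```python
import os
import shutil
import tempfile

CHECKPOINT_FILES = ("edits", "fsimage", "fstime")


def restore_from_checkpoint(secondary_dir, new_namenode_dir):
    """Copy the checkpoint files into the new NameNode's directory."""
    os.makedirs(new_namenode_dir, exist_ok=True)
    copied = []
    for name in CHECKPOINT_FILES:
        src = os.path.join(secondary_dir, name)
        if os.path.isfile(src):
            shutil.copy2(src, os.path.join(new_namenode_dir, name))
            copied.append(name)
    return copied


# Build a throwaway SecondaryNameNode directory with placeholder files.
root = tempfile.mkdtemp()
secondary = os.path.join(root, "secondary", "current")
os.makedirs(secondary)
for name in CHECKPOINT_FILES:
    with open(os.path.join(secondary, name), "w", encoding="utf-8") as f:
        f.write("placeholder " + name)

target = os.path.join(root, "namenode", "current")
restored = restore_from_checkpoint(secondary, target)
```

Anything written to the NameNode after the last checkpoint is not in these files, so this kind of restore can only return the state as of that checkpoint.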

