An in-depth analysis of HDFS


Overview: The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on common (commodity) hardware. It has a great deal in common with existing distributed file systems, but the differences are also significant: HDFS is a highly fault-tolerant system intended to be deployed on inexpensive machines.
I. Background of HDFS

As the amount of data grows, it no longer fits within the storage that a single operating system can manage. Spreading it across disks managed by more operating systems solves the capacity problem but makes the files hard to manage and maintain, so there is an urgent need for a system that manages files spread across multiple machines: the distributed file management system.

Formally, a distributed file system is a system that allows files to be shared across multiple hosts over a network, so that multiple users on multiple machines can share files and storage space. There are many distributed file management systems, and HDFS is just one of them. It suits workloads that write a file once and then read it many times; it does not support concurrent writes, and it is a poor fit for large numbers of small files, because every file occupies at least one block. The more small files there are (for example, 1,000 files of 1 KB each), the more blocks there are, and the greater the pressure on the NameNode.

II. Basic concepts of HDFS

The files we upload through the Hadoop shell are stored as blocks on the DataNodes; from the Linux shell the original file is invisible, and only the blocks can be seen. HDFS can be described in one sentence: a client's large files are stored as data blocks on many nodes. That sentence contains three keywords: file, node, and data block. HDFS is designed around these three keywords, and it pays to keep them in mind while learning.
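To make the "client file becomes blocks on nodes" idea concrete, here is a minimal upload sketch using the Hadoop Java FileSystem API. The NameNode address hdfs://localhost:9000 and both paths are assumptions for illustration only: the client works with one logical file while HDFS cuts it into blocks behind the scenes.

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsUploadSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Connect to the NameNode (the address is an assumption for this sketch).
            FileSystem fs = FileSystem.get(URI.create("hdfs://localhost:9000"), conf);
            // Copy a local file into HDFS; HDFS splits it into blocks and places
            // the blocks on DataNodes, but the client only ever sees one path.
            fs.copyFromLocalFile(new Path("/tmp/bigfile.dat"),
                                 new Path("/user/hadoop/bigfile.dat"));
            fs.close();
        }
    }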

III. The basic structure of HDFS: NameNode

1. Role

The NameNode manages the file directory structure, accepts operation requests from users, and manages the data nodes. It maintains two sets of data: the mapping between files/directories and data blocks, and the mapping between data blocks and nodes. The first set is static; it is persisted on disk and maintained through the fsimage and edits files. The second set is dynamic; it is not persisted to disk but is rebuilt automatically every time the cluster starts, and it is kept in memory.

So the NameNode is the management node of the entire file system. It maintains the directory tree of the whole file system, the metadata of every file and directory, and the list of data blocks belonging to each file, and it receives the users' operation requests.
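Both mappings can be observed from a client: the Hadoop Java API can ask the NameNode which blocks make up a file and which DataNodes currently hold each block. A rough sketch (the path and the cluster configuration are assumptions):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ListBlocksSketch {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            FileStatus status = fs.getFileStatus(new Path("/user/hadoop/bigfile.dat"));
            // file -> blocks (offset/length) and block -> DataNodes (hosts)
            BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
            for (BlockLocation b : blocks) {
                System.out.println("offset=" + b.getOffset()
                        + " length=" + b.getLength()
                        + " hosts=" + String.join(",", b.getHosts()));
            }
            fs.close();
        }
    }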

The files the NameNode keeps on disk include:

①fsimage (file system image): the metadata image file. It stores a snapshot of the NameNode's in-memory metadata as of a certain point in time.

②edits: Operation log file.

③fstime: records the time of the last checkpoint.

All of these files are stored in the local Linux file system.

2. Features

<1> HDFS is a file system that allows files to be shared across multiple hosts over a network, so that multiple users on multiple machines can share files and storage space.

<2> Transparency. Files are actually accessed over the network, but to programs and users it feels just like accessing a local disk (a short read sketch follows this list).

<3> Fault tolerance. Even if some nodes in the system go offline, the system as a whole can keep operating without losing any data.

<4> It is suited to write-once, read-many workloads; it does not support concurrent writes, and it is not appropriate for large numbers of small files.
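To illustrate the transparency point above, here is a minimal read sketch using the Hadoop Java API; the path /user/hadoop/hello.txt is an assumption for illustration. The caller reads an ordinary stream while HDFS fetches the bytes block by block from whichever DataNodes hold them.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class TransparentReadSketch {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            // Reading looks like reading any local stream, even though the bytes
            // come from blocks scattered across DataNodes.
            FSDataInputStream in = fs.open(new Path("/user/hadoop/hello.txt"));
            BufferedReader reader = new BufferedReader(new InputStreamReader(in, "UTF-8"));
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
            reader.close();
            fs.close();
        }
    }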

3. Directory Structure

<1> Since the NameNode maintains so much information, where is it all stored?

In the Hadoop source code there is a file called hdfs-default.xml.

<2> Open this file.

Around lines 149 and 158 there are two configuration properties: dfs.name.dir and dfs.name.edits.dir. They specify where the NameNode's core files, fsimage and edits, are stored.

The corresponding values contain ${}, which is variable-substitution syntax: when the program reads the file, the variable is replaced by its value. The variable here is hadoop.tmp.dir (around line 150), that is, Hadoop's temporary storage path.

In the previous chapter we configured its value in core-site.xml as /usr/local/hadoop/tmp.
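The substitution can be reproduced in a few lines of Java, because org.apache.hadoop.conf.Configuration expands ${...} references when a property is read. A small sketch, with values mirroring the configuration above:

    import org.apache.hadoop.conf.Configuration;

    public class ConfExpansionSketch {
        public static void main(String[] args) {
            Configuration conf = new Configuration(false); // skip default resources
            conf.set("hadoop.tmp.dir", "/usr/local/hadoop/tmp");     // as in core-site.xml
            conf.set("dfs.name.dir", "${hadoop.tmp.dir}/dfs/name");  // as in hdfs-default.xml
            // Variable expansion happens on read; prints /usr/local/hadoop/tmp/dfs/name
            System.out.println(conf.get("dfs.name.dir"));
        }
    }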

<3> We can check this from the Linux file system.

Execute cd /usr/local/hadoop/conf and then more core-site.xml to view the file.

As you can see, these two files therefore end up under the /usr/local/hadoop/tmp/dfs/name directory of the Linux file system.

<4> We enter this directory.

Viewing the contents of this directory, we find the following.

The NameNode's core files, fsimage and edits, are stored in this directory. The name directory also contains a file called in_use.lock; viewing it shows that its contents are empty. Its purpose is to ensure that only one NameNode process can use the directory at a time. Readers can try this themselves: when Hadoop is not running, the in_use.lock file does not exist; it is created when Hadoop starts.

<5> The fsimage file is the NameNode's core file.

This file is extremely important: if it is lost, the NameNode cannot work. So how do we keep it from being lost and causing serious consequences? Look again at hdfs-default.xml.

As its description explains, this property determines where the NameNode stores the name table (fsimage) on the local file system. If the value is a comma-delimited list of directories, the name table is replicated to all of them for redundancy, that is, backed up to keep the data safe. For example, with ${hadoop.tmp.dir}/dfs/name,~/name2,~/name3,~/name4, the fsimage is copied into each of those directories. In practice these directories should be on different machines, different disks, or at least different folders; the more dispersed they are, the safer the data. How can directories on multiple machines be used? Linux provides the NFS file-sharing system for that, which we will not cover in detail here.
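As a small sketch of what such a redundant setting amounts to (the extra directories /mnt/nfs/name and /data2/name are purely illustrative assumptions), the same comma-delimited value can be set and read back through the Configuration API:

    import org.apache.hadoop.conf.Configuration;

    public class NameDirRedundancySketch {
        public static void main(String[] args) {
            Configuration conf = new Configuration(false);
            conf.set("hadoop.tmp.dir", "/usr/local/hadoop/tmp");
            // Comma-delimited list: the fsimage is written to every directory listed.
            conf.set("dfs.name.dir", "${hadoop.tmp.dir}/dfs/name,/mnt/nfs/name,/data2/name");
            System.out.println(conf.get("dfs.name.dir"));
        }
    }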

<6> Take a look at the description of edits.

Look at the corresponding section of hdfs-default.xml.

As its description explains, this property determines where the NameNode stores the transaction file (edits) on the local file system. If the value is a comma-delimited list of directories, the transaction file is duplicated in all of them for redundancy. The default value is the same as dfs.name.dir. (edits records each transaction as it happens.)

IV. The basic structure of HDFS: DataNode

1. Role

The role of the DataNode is to actually store HDFS data.

2. Block

<1> If a file is very large, say 100 GB, how is it stored on the DataNodes? DataNodes read and write data in units of blocks; the block is the basic unit of reading and writing data in HDFS.

<2> Assume the file size is 100 GB. Starting at byte 0, every 64 MB of bytes is cut into one block, and so on, so the file is divided into many blocks, each 64 MB in size.
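As a quick check of the arithmetic: 100 GB is 102,400 MB, and 102,400 / 64 = 1,600, so a 100 GB file is stored as 1,600 blocks of 64 MB each (in general, the last block of a file may be smaller than 64 MB).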

2.1 Let's take a look at the org.apache.hadoop.hdfs.protocol.Block class

Its main attributes are shown in the sketch below.

None of the properties in this class can hold file data. A block is therefore essentially a logical concept: it does not actually store the data, it only describes how the file is divided.
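As a rough sketch of what the class contains (field names taken from the Hadoop 1.x sources; they may differ in other releases), note that there is no buffer or byte array anywhere, only bookkeeping fields:

    // Simplified sketch of org.apache.hadoop.hdfs.protocol.Block (Hadoop 1.x era).
    public class Block {
        private long blockId;         // identifier, visible in the blk_<id> file name
        private long numBytes;        // length of this block in bytes
        private long generationStamp; // version stamp used during block recovery
        // constructors, getters and comparison methods omitted; no data buffer at all
    }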

2.2 Why is the block size 64 MB?

Because this is set in the default configuration file, we look at hdfs-default.xml.

The parameter dfs.block.size specifies the size of a block; its value is 67,108,864 bytes, which works out to 64 MB. If we do not want to use a 64 MB block size, we can override this value in core-site.xml. Note that the unit is bytes.
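For example, a client-side override might look like the following sketch (dfs.block.size is the Hadoop 1.x property name; newer releases call it dfs.blocksize, so check your version). A changed block size only affects files written after the change:

    import org.apache.hadoop.conf.Configuration;

    public class BlockSizeSketch {
        public static void main(String[] args) {
            Configuration conf = new Configuration();
            // 128 MB instead of the 64 MB default; the unit is bytes.
            conf.setLong("dfs.block.size", 128L * 1024 * 1024);
            System.out.println(conf.getLong("dfs.block.size", 67108864L)); // 134217728
        }
    }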

2.3 Replicas

<1> A replica is a backup copy, kept for safety. Because a cluster environment is unreliable, the replica mechanism is used to guarantee the safety of the data.

<2> The drawback of replicas is that they consume storage space: the more replicas, the more space is occupied. Compared with the risk of losing data, however, the cost in storage space is worth paying.

<3> So how many replicas of a file are appropriate? Look at hdfs-default.xml.

As Figure 4.3 showed, the default number of replicas is 3, meaning every data block in HDFS has 3 copies, and HDFS tries to place each copy on a different DataNode server. Imagine if all 3 copies were on the same server: if that server went down, all of the data would be lost.
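The replication factor can also be adjusted per file through the Java API, for instance after a file has been written. A sketch (the path is an assumption):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ReplicationSketch {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            // Ask for 4 copies of this file instead of the default 3; the NameNode
            // schedules the extra replica on a different DataNode where possible.
            fs.setReplication(new Path("/user/hadoop/bigfile.dat"), (short) 4);
            fs.close();
        }
    }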

3. Directory Structure

3.1 The DataNode stores files after they have been broken down into blocks.

So where exactly are these block files stored? Look at hdfs-default.xml.

The value of the parameter dfs.data.dir is the location in the Linux file system where blocks are stored. The value of hadoop.tmp.dir was described earlier and is /usr/local/hadoop/tmp, so the full path for dfs.data.dir is /usr/local/hadoop/tmp/dfs/data. This can be confirmed with Linux commands, as Figure 4.5 showed.

3.2 Uploading a file

We first open another Linux terminal with PieTTY and upload the file jdk-6u24-linux-i586.bin, whose size is 84,927,175 bytes (about 81 MB).

Then, back in the original terminal, we can look at the uploaded data, which lives under the /usr/local/hadoop/tmp/dfs/data directory of the Linux file system.

The files whose names begin with blk_ are the blocks in which the data is stored. The naming follows a pattern: alongside each block file there is a file with the .meta suffix, which is the block's metadata file and holds some metadata about the block. For this upload there are only 2 block files (one full 64 MB block plus a remaining block of about 17 MB).

Note: we uploaded a complete file from the Linux disk into HDFS. The file is visible as a single file in Linux, but once it is uploaded into HDFS there is no longer a single corresponding file; it exists only as many blocks. Also, because our Hadoop installation is pseudo-distributed, with only one node hosting both the DataNode and the NameNode, the uploaded blocks all end up on this one Linux system.

V. The basic structure of HDFS: SecondaryNameNode

The SecondaryNameNode is one part of an HA (high availability) solution, but it is not a hot standby. As more operations are performed, the edits file keeps growing, and it cannot be allowed to grow without bound, so the logged operations must periodically be folded into the fsimage. The NameNode has to accept users' operation requests and respond to them quickly; to keep it responsive, this merging work is handed off to the SecondaryNameNode, which as a side effect also holds a backup of part of the fsimage content.

Execution process: the SecondaryNameNode downloads the metadata files (fsimage and edits) from the NameNode, merges the two, generates a new fsimage, saves it locally, and pushes it back to the NameNode, resetting the NameNode's edits file at the same time. By default it runs on the same node as the NameNode, but that is... not safe!
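How often this download-and-merge cycle runs is controlled by two checkpoint properties. The names below are the Hadoop 1.x ones (fs.checkpoint.period and fs.checkpoint.size); treat them as an assumption to verify against your release:

    import org.apache.hadoop.conf.Configuration;

    public class CheckpointSketch {
        public static void main(String[] args) {
            Configuration conf = new Configuration();
            // Checkpoint every hour, or earlier if edits grows past 64 MB,
            // whichever condition is reached first (Hadoop 1.x semantics, assumed).
            conf.setLong("fs.checkpoint.period", 3600);    // seconds
            conf.setLong("fs.checkpoint.size", 67108864L); // bytes
        }
    }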

That, in outline, is how the merging (checkpoint) process works.
