Hadoop Diary Day 5 --- In-depth Analysis of HDFS

This article refers to the Hadoop source code. For details on how to import the Hadoop source code into Eclipse, see the first installment of this series.

I. Background of HDFS

As the amount of data grows, it can no longer fit on the disks managed by a single operating system, so it has to be spread across disks on many machines. Files scattered like this are inconvenient to manage and maintain, so a distributed file management system is urgently needed to manage files across multiple machines.

Academic definition: a distributed file system is a file system that allows files to be shared across multiple hosts over a network, letting multiple users on multiple machines share files and storage space. There are many distributed file management systems, and HDFS is only one of them. It is suited to write-once, read-many access, does not support concurrent writes, and is not a good fit for small files: every file occupies at least one block, so the more small files there are (say, 1,000 files of 1 KB each), the greater the pressure on the NameNode, which must keep metadata for every one of them.

II. Basic Concepts of HDFS

Files uploaded through the Hadoop shell are stored as blocks on the DataNodes; through the Linux shell you cannot see the original files, only the blocks. HDFS can be described in one sentence: it stores large client files in data blocks spread across many nodes. Three keywords appear here: file, node, and data block. HDFS is designed around these three keywords, and we should keep all three in mind while studying it.
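
As a quick illustration, here is a hedged sketch of that workflow using the Hadoop shell (the file name is made up, and this assumes the pseudo-distributed setup used later in this series):

    # upload a local file into HDFS with the Hadoop shell
    hadoop fs -put somefile.log /somefile.log

    # HDFS lists it as one logical file ...
    hadoop fs -ls /

    # ... but on the Linux side the DataNode only holds anonymous blk_* block files,
    # not a file called somefile.log (section IV shows where they live)
    ls /usr/local/hadoop/tmp/dfs/data/current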

III. Basic Structure of HDFS: NameNode

1. Role

The NameNode manages the file directory structure, receives user operation requests, and manages the DataNodes. It maintains two sets of data: one is the mapping between files/directories and data blocks, the other is the mapping between data blocks and nodes. The former is static, persisted on disk, and maintained through the fsimage and edits files; the latter is dynamic and is not persisted to disk. It is rebuilt automatically when the cluster starts, so it is kept only in memory.

The NameNode is therefore the management node of the entire file system. It maintains the file directory tree of the whole file system, the metadata of each file and directory, and the list of data blocks belonging to each file, and it receives user operation requests.
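
To make the two mappings concrete, here is a hedged sketch using the fsck tool shipped with Hadoop 1.x (the path, sizes, block IDs and hosts are illustrative, and the exact output format varies by version):

    # ask the NameNode which blocks make up a file and which DataNodes hold them
    hadoop fsck /somefile.log -files -blocks -locations

    # typical (abridged) output:
    # /somefile.log 134217728 bytes, 2 block(s):  OK
    # 0. blk_8527089454240436124_1001 len=67108864 repl=3 [h1:50010, h2:50010, h3:50010]
    # 1. blk_-2170192902723062755_1002 len=67108864 repl=3 [h1:50010, h2:50010, h4:50010]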

Files include:

① fsimage (file system image): the metadata image file. It stores a snapshot of the NameNode's in-memory metadata at a point in time.

② Edits: operation log file.

③ fstime: records the time of the last checkpoint.

These files are stored in the Linux File System.

2. Features

<1> It is a file system that allows files to be shared across multiple hosts over a network, letting multiple users on multiple machines share files and storage space.

<2> Transparency. Files are accessed over the network, but to programs and users it feels just like accessing a local disk.

<3> Fault tolerance. Even if some nodes in the system go offline, the system as a whole can keep running without losing data.

<4> Suited to write-once, read-many access. Concurrent writes are not supported, and it is not a good fit for large numbers of small files.

3. Directory structure

<1> Since the NameNode maintains so much information, where is it stored? In the Hadoop source code there is a file called hdfs-default.xml, as shown in Figure 3.1.

Fig 3.1

<2> Open the file. Around lines 149 and 158 there are two configuration properties, dfs.name.dir and dfs.name.edits.dir. They specify where the NameNode's core files, fsimage and edits, are stored, as shown in Figure 3.2.

Fig 3.2

The value of each property uses the ${} form, which is a variable expression: when the program reads the file, the expression is replaced by the variable's value. The variable here is hadoop.tmp.dir (the Hadoop temporary storage path), defined around line 150, as shown in Figure 3.3.

Fig 3.3

In the core-site.xml we configured in the previous chapter, however, hadoop.tmp.dir is set to /usr/local/hadoop/tmp.
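
Putting the two pieces side by side, they look roughly like this (a sketch; the description text of the real entries is omitted):

    <!-- hdfs-default.xml: the shipped default, expressed via a variable -->
    <property>
      <name>dfs.name.dir</name>
      <value>${hadoop.tmp.dir}/dfs/name</value>
    </property>

    <!-- core-site.xml (our setup from the previous chapter): the variable's value -->
    <property>
      <name>hadoop.tmp.dir</name>
      <value>/usr/local/hadoop/tmp</value>
    </property>

After substitution, dfs.name.dir resolves to /usr/local/hadoop/tmp/dfs/name.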

<3> We can go into the Linux file system and run cd /usr/local/hadoop/conf followed by more core-site.xml to view the content shown in Figure 3.4.

Fig 3.4

We can see that the two files are stored in the/usr/local/hadoop/tmp/dfs/name directory of the Linux File System.

<4> Enter this directory and view its contents, as shown in Figure 3.5.

Fig 3.5

It can be seen that the NameNode's core files, fsimage and edits, are stored here. The name directory also contains a file called in_use.lock whose content is empty; it ensures that only one NameNode process can access this directory. You can verify this yourself: when Hadoop is not running, there is no in_use.lock file in this directory; the file is generated only after Hadoop starts.
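
For reference, a rough sketch of what that listing looks like on a Hadoop 1.x pseudo-distributed install (the exact files and sub-layout vary by version):

    ls /usr/local/hadoop/tmp/dfs/name
    # current  image  in_use.lock

    ls /usr/local/hadoop/tmp/dfs/name/current
    # VERSION  edits  fsimage  fstime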

<5> The fsimage file is the NameNode's core file and is very important; if it is lost, the NameNode cannot work. So how do we keep its loss from causing serious consequences? Let us look again at a snippet in hdfs-default.xml, shown in Figure 3.6.

Fig 3.6

According to the description, this property determines where the NameNode stores the name table (fsimage) on the local file system. If the value is a comma-separated list of directories, the name table is replicated into every directory for redundancy (a backup that keeps the data safe). For example, with the value ${hadoop.tmp.dir}/dfs/name,~/name2,~/name3,~/name4, the fsimage is copied into ${hadoop.tmp.dir}/dfs/name, ~/name2, ~/name3, and ~/name4. These directories are therefore usually placed on different machines, disks, or folders; the more dispersed, the better, which keeps the data safe. Someone may ask how to spread them across multiple hosts: Linux provides the NFS file sharing system for that, which we will not detail here.
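
A sketch of such a redundant configuration in hdfs-site.xml (the second directory is purely illustrative, e.g. an NFS mount from another machine):

    <property>
      <name>dfs.name.dir</name>
      <value>${hadoop.tmp.dir}/dfs/name,/mnt/nfs-backup/dfs/name</value>
      <description>fsimage is written to every directory in this comma-separated list.</description>
    </property>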

<6> Now take a look at the description for edits, another snippet in hdfs-default.xml, shown in Figure 3.7.

Fig 3.7

According to the description, this property determines where the NameNode stores the transaction file (edits) on the local file system. If the value is a comma-separated list of directories, the transaction file is replicated into every directory for redundancy. The default value is the same as dfs.name.dir. (The edits file records the transaction log.)

IV. Basic Structure of HDFS: DataNode

1. Role: the DataNode is where HDFS actually stores the data.

2. Block

<1> If a file is very large, say 100 GB, how is it stored on the DataNodes? The DataNode reads and writes data in blocks, and the block is the basic unit in which HDFS reads and writes data.

<2> Assume the file is 100 GB. Starting from byte offset 0, every 64 MB of data is cut into one block, so the file is divided into many blocks of 64 MB each: 100 GB = 102,400 MB, giving 102,400 / 64 = 1,600 blocks.

2.1 Let's take a look at the org.apache.hadoop.hdfs.protocol.Block class; its attributes are shown in Figure 4.1.

Fig 4.1

None of the attributes in this class holds file data, so a block is essentially a logical concept: the Block object does not actually store data, it only describes how a file is divided.

2.2 Why exactly 64 MB? Because that is what the default configuration sets. Look at the core-default.xml file, as shown in Figure 4.2.

Fig 4.2

The dfs.block.size parameter here specifies the block size. Its value is 67,108,864 bytes, which works out to 64 MB. If we do not want 64 MB blocks, we can override this value in core-site.xml; note that the unit is bytes.
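
For example, a hedged sketch of such an override (128 MB here, chosen only for illustration; remember the unit is bytes):

    <property>
      <name>dfs.block.size</name>
      <value>134217728</value>
      <description>Use 128 MB blocks instead of the 64 MB default.</description>
    </property>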

2.3 Copies

<1> Copies (replicas) are backups made for safety. Because the cluster environment is not fully reliable, the replication mechanism is used to keep data safe.

<2> The disadvantage of replicas is that they consume a lot of storage space; the more replicas, the more space used. Compared with the risk of losing data, however, the storage cost is worth it.

<3> How many copies of a file are appropriate? Look at the hdfs-default.xml file, shown in Figure 4.3.

Fig 4.3

As Figure 4.3 shows, the default number of replicas is 3, meaning each data block in HDFS has three copies. Each copy is, of course, placed on a different DataNode server whenever possible. Imagine the alternative: if all three copies of a block lived on the same server and that server went down, wouldn't all of that data be lost?
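
A sketch of tuning this in a site configuration file (on the single-node, pseudo-distributed setup used in this series, 1 is the only value that makes sense, since there is only one DataNode to hold copies):

    <property>
      <name>dfs.replication</name>
      <value>1</value>
      <description>Pseudo-distributed cluster: keep one copy of each block.</description>
    </property>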

3. Directory structure

3.1 Since the DataNode stores files split into blocks, where do those blocks live on disk? Look at the core-default.xml file, as shown in Figure 4.4.

Fig 4.4

The value of dfs.data.dir is the location in the Linux file system where blocks are stored. The hadoop.tmp.dir variable was described earlier; its value is /usr/local/hadoop/tmp, so the full path for dfs.data.dir is /usr/local/hadoop/tmp/dfs/data. Run the Linux command and look at the result, as shown in Figure 4.5.
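
The relevant default looks roughly like this (a sketch; the real entry also carries a longer description). Like dfs.name.dir, it can also be a comma-separated list of directories, typically one per local disk:

    <property>
      <name>dfs.data.dir</name>
      <value>${hadoop.tmp.dir}/dfs/data</value>
    </property>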

3.2 First open another Linux terminal with PieTTY and upload a file, jdk-6u24-linux-i586.bin, whose size is 84,927,175 bytes (about 81 MB), as shown in Figure 4.5.

Fig 4.5

Then we can view the uploaded file on the original terminal, that is, in the /usr/local/hadoop/tmp/dfs/data directory of the Linux file system, as shown in Figure 4.6.
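
Roughly what that looks like (a sketch; the block IDs are made up and the subdirectory layout varies slightly by version):

    # upload the file into HDFS from the new terminal
    hadoop fs -put jdk-6u24-linux-i586.bin /jdk-6u24-linux-i586.bin

    # back on the original terminal, inspect the DataNode's storage directory
    ls /usr/local/hadoop/tmp/dfs/data/current
    # VERSION
    # blk_2925001160443027430        blk_2925001160443027430_1001.meta
    # blk_7330699459614426280        blk_7330699459614426280_1001.meta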

Fig 4.6

Files whose names start with "blk_" are the blocks that actually store the data; the names follow a fixed pattern. Alongside each block file there is a file with the ".meta" suffix, which is the block's metadata file. Because the roughly 81 MB file fits into two 64 MB blocks, there are only two block files here.

Note: the file we uploaded from Linux into HDFS still exists as a complete file on the Linux side, but once in HDFS there is no single corresponding file; it exists only as a set of blocks. Also, because our Hadoop installation is pseudo-distributed with only one node, the DataNode and NameNode are on the same machine, so the uploaded blocks still end up on this Linux system.

V. Basic Structure of HDFS: SecondaryNameNode

The SecondaryNameNode is one answer to the availability problem, but it is not a hot standby; it only needs to be configured. The more operations the cluster performs, the larger the edits file grows, and it cannot be allowed to grow without limit, so the log must periodically be merged into the fsimage. Meanwhile the NameNode must stay quick to respond to user operation requests, so to keep it responsive this merge work is handed over to the SecondaryNameNode, which as a result also holds a backup of part of the fsimage.

Execution process: it downloads the metadata (fsimage and edits) from the NameNode, merges the two into a new fsimage, saves it locally, and pushes it back to the NameNode, which then resets its edits file. By default it runs on the same node as the NameNode, which is... not very safe!
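
A hedged sketch of the Hadoop 1.x settings that control how often this checkpoint runs (the values shown are the commonly documented defaults):

    <property>
      <name>fs.checkpoint.period</name>
      <value>3600</value>
      <description>Merge edits into fsimage at least every 3600 seconds (1 hour).</description>
    </property>
    <property>
      <name>fs.checkpoint.size</name>
      <value>67108864</value>
      <description>Also trigger a merge once the edits file reaches 64 MB.</description>
    </property>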

The merging process is shown in Figure 5.1.

Fig 5.1
