Understanding the HDFS storage mechanism
1. HDFS's file storage design splits a file into pieces first and then stores the pieces separately.
2. HDFS splits each large file to be stored into fixed-size Blocks and distributes those Blocks across the cluster. This preset splitting and preprocessing scheme addresses the need to both store and compute over very large files.
3. An HDFS cluster consists of two kinds of nodes, NameNode and DataNode; typically one NameNode and many DataNodes work together in a cluster.
4. The NameNode is the master server of the cluster. It maintains the metadata for all files and directories in HDFS and continuously tracks the status of the DataNode hosts in the cluster; this metadata is persisted by reading and writing the namespace image (fsimage) and edit log files.
5. The DataNodes are the worker nodes of the HDFS cluster and perform the actual storage tasks. A file is divided into data blocks of equal size that are stored on several DataNodes. Each DataNode regularly reports its running status and the blocks it stores to the NameNode, and carries out the commands the NameNode sends back.
6. The NameNode receives requests from clients and replies with the storage locations of the requested file's blocks; the client then contacts the DataNodes directly to operate on the file data.
7. The Block is the basic storage unit of HDFS. The default size is 64 MB (raised to 128 MB in Hadoop 2.x and later).
8. HDFS also replicates the stored Blocks: by default each Block is copied to three mutually independent machines, so that damaged data can be recovered quickly (the second sketch after this list shows how the replication factor can be set).
9. Files in HDFS can be operated on through the provided APIs, such as the Java FileSystem API; a minimal example follows this list.
10. When an error occurs during a client's read operation, the client reports it to the NameNode and asks the NameNode to exclude the faulty DataNode; the remaining replicas are sorted by distance again, giving the client a new DataNode to read from. Only if every DataNode holding a replica reports a read failure does the whole read fail.
11. When a problem occurs during a write operation, FSDataOutputStream is not closed immediately. The client reports the error to the NameNode and keeps writing directly to the DataNodes that hold the backups; a backup DataNode is promoted to the preferred DataNode, and the data is copied to the remaining two DataNodes. The NameNode marks the faulty DataNode for later processing.
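
To make point 9 concrete, here is a minimal sketch of operating on HDFS files through Hadoop's Java FileSystem API. The class name and the path /user/demo/sample.txt are illustrative assumptions; the cluster address is taken from the configuration files on the classpath.

    // Minimal sketch: write and read a file in HDFS via the Java FileSystem API.
    // Assumes a reachable cluster configured in core-site.xml / hdfs-site.xml;
    // the path below is hypothetical.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsApiSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();      // picks up site config
            FileSystem fs = FileSystem.get(conf);          // handle to HDFS

            Path path = new Path("/user/demo/sample.txt"); // hypothetical path

            // Write: the NameNode supplies block locations; the bytes are
            // streamed directly to DataNodes (points 6 and 9 above).
            try (FSDataOutputStream out = fs.create(path, true)) {
                out.writeUTF("hello HDFS");
            }

            // Read: block locations again come from the NameNode, but the
            // data is read straight from a DataNode.
            try (FSDataInputStream in = fs.open(path)) {
                System.out.println(in.readUTF());
            }

            fs.close();
        }
    }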
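
Points 7 and 8 (block size and replication) can also be set per file rather than cluster-wide. The sketch below uses the FileSystem.create overload that accepts a replication factor and a block size; all concrete values here are illustrative assumptions.

    // Sketch: choosing a replication factor and block size when creating a file.
    // The values mirror the defaults described above (3 replicas, 64 MB blocks).
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class BlockSettingsSketch {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            Path path = new Path("/user/demo/big-file.dat"); // hypothetical path

            // create(path, overwrite, bufferSize, replication, blockSize)
            try (FSDataOutputStream out = fs.create(
                    path, true, 4096, (short) 3, 64L * 1024 * 1024)) {
                out.write(new byte[]{1, 2, 3});              // placeholder payload
            }

            // The replication factor of an existing file can be changed later.
            fs.setReplication(path, (short) 2);

            fs.close();
        }
    }

In practice these values usually come from the dfs.replication and block-size properties in hdfs-site.xml rather than from code; setting them programmatically is only needed for per-file overrides.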