On the HDFS File System under Hadoop

Last Update:2014-12-09 Source: Internet

Author: User


The HDFS file system under Hadoop

This article does not dwell on Hadoop's basic concepts, history, or features. It focuses instead on understanding and explaining its file system.

HDFS (Hadoop Distributed File System) is a distributed file system. It is highly fault-tolerant, which allows it to be deployed on inexpensive hardware, and it provides high-throughput access to application data. HDFS relaxes some of the requirements of the Portable Operating System Interface (POSIX) so that data in the file system can be accessed as a stream.

Design objectives of HDFS:

Detect hardware failures and recover from them quickly
Streaming data access
A simplified consistency model
Communication protocols

HDFS Architecture


HDFS uses a master/slave architecture. An HDFS cluster consists of one NameNode and several DataNodes. The NameNode is the master server: it manages the file system namespace and client access to files. The DataNodes manage the stored data. HDFS lets users store data in the form of files. Internally, a file is split into blocks, and the blocks are stored on a set of DataNodes. The NameNode centrally coordinates operations such as creating, deleting, and replicating files. (User data never passes through the NameNode.)
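The division of labor described above can be sketched as a toy model. This is not the Hadoop API: the class names, the block IDs (blk_0, blk_1), and the node names (dn1, dn2, dn3) are all hypothetical and exist only to show that the NameNode keeps metadata while the DataNodes keep the bytes.

```python
from dataclasses import dataclass, field


@dataclass
class DataNode:
    """Stores block contents; knows nothing about file names."""
    name: str
    blocks: dict = field(default_factory=dict)  # block_id -> bytes


@dataclass
class NameNode:
    """Holds metadata only: the namespace and the block-to-DataNode map."""
    namespace: dict = field(default_factory=dict)  # path -> [block_id, ...]
    locations: dict = field(default_factory=dict)  # block_id -> [node name, ...]

    def register(self, path, block_id, datanode_names):
        """Record that a block of `path` lives on the given DataNodes."""
        self.namespace.setdefault(path, []).append(block_id)
        self.locations[block_id] = list(datanode_names)


namenode = NameNode()
datanodes = {n: DataNode(n) for n in ("dn1", "dn2", "dn3")}

# The bytes go straight to the DataNodes ...
datanodes["dn1"].blocks["blk_0"] = b"hello "
datanodes["dn2"].blocks["blk_1"] = b"world"
# ... and the NameNode records only where each block lives.
namenode.register("/demo.txt", "blk_0", ["dn1"])
namenode.register("/demo.txt", "blk_1", ["dn2"])

print(namenode.namespace)  # {'/demo.txt': ['blk_0', 'blk_1']}
print(namenode.locations)  # {'blk_0': ['dn1'], 'blk_1': ['dn2']}
```

Nothing in the NameNode object ever holds file contents, which is the point of the parenthetical remark above.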

Hadoop and distributed development

What we usually call a distributed system is a distributed software system, that is, a software system that supports distributed processing. Such systems include:

Distributed operating system

Distributed programming languages and their compilers (or interpreters)

Distributed File System

Distributed Database System

Hadoop sits at the file system layer of a distributed software system. It implements a distributed file system and some of the functions of a distributed database.

Within Hadoop, HDFS provides efficient storage and management of data across a cluster of computers.

Characteristics that HDFS shares with other distributed file systems:

A single namespace for the entire cluster
Data consistency: a write-once, read-many model is used, and a client cannot see a file until the file has been successfully created
Files are divided into multiple blocks, each block is assigned to DataNodes, and data safety is ensured by replicating blocks according to the configured number of copies
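To make the last point concrete, the arithmetic of block splitting and replication can be worked through in a few lines. The 64 MB block size and replication factor of 3 below are illustrative values chosen for this sketch, not figures taken from this article, and block_layout is a hypothetical helper name.

```python
def block_layout(file_size, block_size, replication):
    """Return (number of blocks, size of the last block, raw bytes stored).

    All sizes are in bytes. The last block only occupies as much space
    as the data it actually holds.
    """
    if file_size == 0:
        return 0, 0, 0
    n_blocks = -(-file_size // block_size)  # integer ceiling division
    last_block = file_size - (n_blocks - 1) * block_size
    return n_blocks, last_block, file_size * replication


MB = 1024 * 1024
# A 200 MB file with 64 MB blocks and 3 copies of every block.
print(block_layout(200 * MB, 64 * MB, 3))  # (4, 8388608, 629145600)
```

The file needs four blocks (three full ones and one of 8 MB), and the cluster as a whole stores 600 MB of raw data for it.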

The following sections are offered as a reference for study.

How HDFS manages data in specific operations

(1) File write

The client sends a request to the NameNode to write a file.
Based on the file size and the block configuration, the NameNode returns information about the DataNodes it manages to the client.
The client splits the file into blocks and, using the DataNode address information, writes them in order to the DataNodes.
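The three write steps above can be sketched as a small simulation. This is a simplified model rather than real HDFS client code: the function names are hypothetical, each block is written once (replication is left out), and the round-robin choice of DataNode is an assumption made to keep the example short.

```python
def allocate(file_size, block_size, datanodes):
    """NameNode side: choose one target DataNode per block.

    Round-robin is used here purely for illustration.
    """
    n_blocks = -(-file_size // block_size)  # integer ceiling division
    return [datanodes[i % len(datanodes)] for i in range(n_blocks)]


def write_file(data, block_size, datanodes, storage):
    """Client side: get targets from the NameNode, then write block by block."""
    targets = allocate(len(data), block_size, datanodes)  # steps 1 and 2
    for i, node in enumerate(targets):                    # step 3
        chunk = data[i * block_size:(i + 1) * block_size]
        storage.setdefault(node, []).append((i, chunk))
    return targets


storage = {}  # DataNode name -> [(block index, bytes), ...]
targets = write_file(b"abcdefghij", 4, ["dn1", "dn2"], storage)
print(targets)  # ['dn1', 'dn2', 'dn1']
print(storage)  # {'dn1': [(0, b'abcd'), (2, b'ij')], 'dn2': [(1, b'efgh')]}
```

The allocate function only ever sees the file size, never the data, which mirrors the NameNode's role in the write path.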

(2) File read

The client sends a request to the NameNode to read a file.
The NameNode returns information about the DataNodes that store the file.
The client reads the file data from those DataNodes.
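The read path can be sketched in the same spirit. The dictionaries below stand in for the NameNode's metadata and the DataNodes' storage. All names (read_file, block_map, blk_0, dn1) are hypothetical, and a real client would also have to handle replicas and failures.

```python
def read_file(path, block_map, datanodes):
    """Client side: look up block locations, then fetch the blocks in order."""
    locations = block_map[path]  # the NameNode answers with metadata only
    parts = []
    for block_id, node_name in locations:
        parts.append(datanodes[node_name][block_id])  # bytes come from DataNodes
    return b"".join(parts)


# NameNode metadata: path -> ordered list of (block id, DataNode name).
block_map = {"/demo.txt": [("blk_0", "dn1"), ("blk_1", "dn2")]}
# DataNode storage: node name -> {block id: bytes}.
datanodes = {"dn1": {"blk_0": b"hello "}, "dn2": {"blk_1": b"world"}}

print(read_file("/demo.txt", block_map, datanodes))  # b'hello world'
```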

(3) Block replication

The NameNode finds that some file blocks do not meet the minimum replica count, or that some DataNodes have failed.
It notifies DataNodes to replicate the affected blocks to one another.
The DataNodes then replicate the blocks directly among themselves.
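The re-replication steps above can be sketched as two small functions: one finds blocks with too few live copies, the other produces copy instructions from a surviving holder to a new node. The function names and the simple "first available node" target choice are assumptions for illustration. Real HDFS placement is more involved.

```python
def find_under_replicated(block_locations, live_nodes, min_replicas):
    """Return {block: [live holders]} for blocks with too few live copies."""
    under = {}
    for block, holders in block_locations.items():
        alive = [h for h in holders if h in live_nodes]
        if len(alive) < min_replicas:
            under[block] = alive
    return under


def plan_replication(under, live_nodes, min_replicas):
    """Return (block, source, target) copy instructions for the DataNodes."""
    plan = []
    for block, alive in sorted(under.items()):
        if not alive:
            continue  # no surviving copy to replicate from
        candidates = [n for n in sorted(live_nodes) if n not in alive]
        needed = min_replicas - len(alive)
        for target in candidates[:needed]:
            plan.append((block, alive[0], target))
    return plan


block_locations = {"blk_0": ["dn1", "dn2"], "blk_1": ["dn2", "dn3"]}
live = {"dn1", "dn3", "dn4"}  # dn2 has failed

under = find_under_replicated(block_locations, live, 2)
print(under)  # {'blk_0': ['dn1'], 'blk_1': ['dn3']}
print(plan_replication(under, live, 2))
# [('blk_0', 'dn1', 'dn3'), ('blk_1', 'dn3', 'dn1')]
```

Each instruction names a source and a target DataNode, so the copy itself happens between DataNodes and the block data does not pass through the NameNode.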

Functions of HDFS in system management

Heartbeat detection
Data replication
Data validation
Single NameNode: if it fails, task processing information is recorded in both the local file system and a remote file system
Pipelined writing of data
Safe mode

This has been a brief introduction to HDFS. Please forgive any shortcomings; this document is intended only as a learning reference.

This article is from the "Round Circle dot point" blog. Reprinting is declined.


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
