Big Data Technology Hadoop Introductory Theory Series, Part 2: HDFS Architecture


A Brief Introduction to HDFS

HDFS, short for the Hadoop Distributed File System, is a distributed file system that can run on ordinary commodity hardware.
Notable differences from other distributed file systems are:

    • HDFS is a highly fault-tolerant system and can operate on a variety of low-cost hardware;
    • It provides high throughput, making it suitable for storing large data sets;
    • It provides streaming data access.

HDFS originated in Apache Nutch and is now a core subproject of the Apache Hadoop project.
HDFS Design Assumptions and Goals
    • Hardware failure is the norm
      In a data center, hardware failure should be treated as normal rather than exceptional.
      In a big-data environment, an HDFS cluster contains a large number of physical machines, each built from many components, so the probability that some component somewhere has failed is high.
      One core design goal of the HDFS architecture is therefore the ability to detect hardware failures quickly and to recover from them quickly.
    • Streaming access requirements
      Applications running on an HDFS cluster access their data in a streaming fashion. HDFS is designed for batch processing rather than interactive use, so the architecture emphasizes high throughput over low latency.
      Standard POSIX access mechanisms such as random access would severely degrade throughput, so HDFS omits them.
    • Large data sets
      Typical HDFS files are assumed to be gigabytes or even terabytes in size, so the design focuses on supporting large files and on growing the number of machines to support larger clusters.
      A single cluster should support a massive number of files.
    • Simple consistency model
      HDFS provides a write-once, read-many access model. A file, once written, remains unchanged, which simplifies the data consistency model and enables higher application throughput.
      Appending to a file is also supported.
    • Moving computation is cheaper than moving data
      HDFS exploits the principle of data locality: the closer the data is to the CPU that processes it, the higher the performance.
      HDFS therefore provides interfaces that let applications learn the physical storage location of their data.
    • Portability across heterogeneous hardware and software platforms
      HDFS is designed to be easy to migrate from one platform to another.
HDFS Application Scenarios

Combining the design assumptions above with the architecture analysis that follows, HDFS is particularly well suited to the following scenarios:

    • Sequential access
      For example, streaming-media services and other large-file storage scenarios.
    • Full scans of large files
      Workloads that need to read massive data sets in full, such as OLAP.
    • Limited overall budget
      Scenarios that want the convenience of distributed computing but lack the budget for HPC systems, high-performance minicomputers, and the like.
Conversely, HDFS does not perform well in the following scenarios:

    • Low-latency data access
      Low latency implies fast data lookup, for example responses at the 10 ms level. Keeping the system busy serving such requests
      conflicts with the assumption that it should quickly return large volumes of data.

    • Large numbers of small files
      A large number of small files consumes block metadata out of proportion to the data stored, wasting resources and placing a serious burden on the NameNode's metadata.
    • Multi-user concurrent writes
      Concurrent writes violate the data consistency model and can leave data inconsistent.
    • Real-time updates
      HDFS supports appends only; real-time updates would reduce data throughput and increase the cost of keeping data consistent.
HDFS Architecture

This article analyzes the HDFS architecture from the following aspects, exploring how it meets the design goals above.

HDFS Overall Architecture

The following HDFS architecture diagram comes from the official Hadoop website.


As the diagram shows, HDFS adopts a master/slave, client-server architecture, with HDFS nodes divided into two roles:
    • NameNode
      The NameNode stores and manages file metadata, access logs, and other file attributes.
      The basic information about every file is stored on the NameNode, using a centralized storage scheme.
    • DataNode
      DataNodes store and serve the actual file contents.
      The file data blocks themselves are stored on different DataNodes, which may be distributed across different racks.

      An HDFS client contacts the NameNode and the DataNodes separately to obtain a file's metadata and its contents. Clients of an HDFS cluster
      access the NameNode and DataNodes directly, and the relevant data is transferred directly from the NameNode or DataNode to the client.
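The metadata/data split described above can be sketched as a toy model. This is illustrative only: all class names, node IDs, and method calls here are made up, and the real system communicates over Hadoop RPC rather than direct method calls.

```python
class ToyNameNode:
    """Holds only metadata: which blocks make up a file, and where they live."""
    def __init__(self):
        self.files = {}            # file name -> ordered list of block ids
        self.block_locations = {}  # block id -> list of DataNode ids

    def add_file(self, name, blocks):
        # blocks: list of (block_id, [datanode ids]) pairs
        self.files[name] = [block_id for block_id, _ in blocks]
        for block_id, locations in blocks:
            self.block_locations[block_id] = locations

    def get_block_locations(self, name):
        return [(b, self.block_locations[b]) for b in self.files[name]]


class ToyDataNode:
    """Holds only block contents, keyed by block id."""
    def __init__(self, node_id):
        self.node_id = node_id
        self.blocks = {}  # block id -> bytes

    def read_block(self, block_id):
        return self.blocks[block_id]


def toy_read(namenode, datanodes, filename):
    """A client asks the NameNode for locations, then streams each block
    directly from a DataNode; metadata and data never travel together."""
    data = b""
    for block_id, locations in namenode.get_block_locations(filename):
        data += datanodes[locations[0]].read_block(block_id)
    return data
```

The key property this sketch shows is that the NameNode never touches file contents, and DataNodes never know file names.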

HDFS Data Organization

HDFS organizes its data in two parts, the NameNode part and the DataNode part, charted as follows:

    • NameNode
      In the high-availability design of HDFS, the NameNode follows an active/standby (master/slave) arrangement. The active node serves client metadata requests and stores block information.
      The standby keeps a near-real-time backup of the active node and periodically merges the recorded user operations and file records into the file-system image on block storage, writing the result back to the active node.
      When the active node fails, the standby takes over all of its work. The two NameNodes cooperate as follows:

    • DataNode
      DataNodes are responsible for storing the real data. Files on a DataNode are kept as data blocks of a fixed size. Each data block in the cluster
      is saved in multiple copies, stored on different DataNodes. The block size and the number of replicas are set by parameters in the Hadoop configuration files.
      Both can be modified after the cluster has started; once the modified parameters take effect after a restart, existing files are unaffected.
      Each DataNode scans the physical blocks in its local file system and reports the corresponding block information to the NameNode.
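As a sketch, the fixed-size splitting works like this. The 128 MB block size and replication factor of 3 are common defaults for the `dfs.blocksize` and `dfs.replication` configuration properties; treat the exact values as assumptions about a typical configuration.

```python
DEFAULT_BLOCK_SIZE = 128 * 1024 * 1024  # bytes, a common dfs.blocksize default
DEFAULT_REPLICATION = 3                 # a common dfs.replication default

def split_into_blocks(data: bytes, block_size: int = DEFAULT_BLOCK_SIZE):
    """Split a byte payload into fixed-size blocks; only the final block
    may be shorter than block_size (HDFS does not pad it)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]
```

Each resulting block would then be stored `DEFAULT_REPLICATION` times on distinct DataNodes.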
HDFS Data Access Mechanism

HDFS uses a streaming file access mechanism: after opening a file through the API, you can read or write it sequentially, but you cannot seek to
an arbitrary position in the file and then operate on it.
Because HDFS involves several roles, and its target scenario is write-once, read-many, its read and write paths differ significantly. Both read and write operations are
initiated and driven by the client, with the server-side roles (NameNode and DataNodes) responding passively.
The two processes are described separately below:

    • Read process
      When a client initiates a read request, it first connects to the NameNode machine; the HDFS configuration files are needed for this connection, since they identify each server. Once the connection is
      established, the client asks to read a data block of a file. The NameNode checks in memory whether the corresponding file and file block exist. If they do not, it
      notifies the client that the file or block does not exist. If they do, it tells the client which servers hold the data block; after receiving this information, the client
      connects to one of them and starts the network transfer, arbitrarily selecting one replica to read from.

Process analysis:
    1. The client uses the client library provided by HDFS to issue an RPC request to the remote NameNode.
    2. The NameNode returns part or all of the file's block list as appropriate and, for each block, the addresses of the DataNodes holding its replicas.
    3. The client library picks the DataNode closest to the client to read each block; if the client itself is a DataNode holding the block, the data is fetched locally.
    4. After reading the current block, the client closes the connection to that DataNode and finds the best DataNode for the next block.
    5. When the blocks in the list have been read but the file is not finished, the client library requests the next batch of blocks from the NameNode.
    6. Each block is checksum-verified after reading. If an error occurs while reading from a DataNode, the client notifies the NameNode and continues reading from the next DataNode that holds a replica of the block.
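Steps 3 and 6 above, picking the nearest replica and falling back on error, can be sketched as follows. The node names, the `fetch` callback, and the `distance` function are hypothetical stand-ins for the client library's internals.

```python
def read_block_with_failover(block_id, replicas, fetch, distance):
    """Try replicas nearest-first, skipping any DataNode that fails,
    as the HDFS client does when a read error occurs."""
    last_error = None
    for node in sorted(replicas, key=distance):
        try:
            return fetch(node, block_id)  # stream the block from this node
        except IOError as exc:
            last_error = exc              # remember the failure, try the next replica
    raise IOError(f"all replicas failed for block {block_id}") from last_error
```

In the real client, "distance" is derived from the network topology (same node, same rack, then off-rack), and the NameNode is also told about the failed replica.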

    • Write process
      When a client initiates a write request, the NameNode checks in memory whether the corresponding file and file block exist; if they do, it notifies the client that the file or block already exists.
      If not, it designates a server as the primary write server and notifies that server to expect the client. The client then communicates with the primary server and writes the data.
      The primary write server writes the data to its physical disk and, once the write completes, obtains the address of the next replica server from the NameNode, confirms the address, and passes the data on, in a
      relay fashion, until the configured number of replicas is reached. When the last replica has been written, success or failure is relayed back along the chain to the client, and finally
      the client notifies the NameNode that the data block was written successfully. If any replica fails, the entire write fails.

Process analysis:
    1. The client uses the client library provided by HDFS to issue an RPC request to the remote NameNode.
    2. The NameNode checks whether the file to be created already exists and whether the creator has permission to operate; on success it creates a record for the file, otherwise the client throws an exception.
    3. As the client writes the file, it splits it into multiple packets, manages them internally as a data queue, and requests new blocks from the NameNode, obtaining a list of DataNodes suitable for storing the replicas according to the replication setting on the NameNode.
    4. Packets are written to all replicas in pipeline fashion: the client streams each packet to the first DataNode, which stores it and passes it to the next DataNode in the pipeline, and so on until the last DataNode. In this way the data is written as a pipeline.
    5. After the last DataNode stores a packet successfully, it returns an ACK packet that travels back through the pipeline to the client. The client library maintains an "ack queue" and, on receiving the ACK packet returned by the DataNodes, removes the corresponding packet from the queue.
    6. If a DataNode fails during transmission, the current pipeline is closed and the failed DataNode is removed from it. The remaining blocks continue to be transmitted through the pipeline of surviving DataNodes, while the NameNode allocates a new DataNode to maintain the configured number of replicas.
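Steps 3 to 5 above can be sketched as follows. This is a synchronous simplification: real DataNodes forward packets concurrently over the network and ACKs flow back asynchronously, and the node names here are illustrative.

```python
def pipeline_write(packet: bytes, pipeline: list, storage: dict) -> bool:
    """Forward one packet hop-by-hop along the DataNode chain; the True
    return value stands in for the ACK that travels back to the client."""
    for node in pipeline:  # client -> dn1 -> dn2 -> dn3, in order
        storage.setdefault(node, []).append(packet)
    return True

def write_file(data: bytes, packet_size: int, pipeline: list, storage: dict) -> bool:
    """Split the data into packets (the client's data queue) and push each
    one through the pipeline; the write succeeds only if every packet is ACKed."""
    packets = [data[i:i + packet_size] for i in range(0, len(data), packet_size)]
    return all(pipeline_write(p, pipeline, storage) for p in packets)
```

After a successful run, every DataNode in the pipeline holds an identical sequence of packets, which is exactly the replica invariant the pipeline exists to establish.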

HDFS Data Security Mechanism

The HDFS file system adopts a Linux-like ACL security access mechanism. By default, each file inherits the access rights of its parent directory, and the default owner and group come from
the user on the uploading client. Control works much as on Linux: a user can be granted read or write access to a file through a command or the API. When a user lacks the corresponding permission,
read and write operations on the file return the corresponding error.
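A minimal sketch of the owner/group/other read check in this POSIX-like model follows. The mode values use the same octal notation as `hdfs dfs -chmod`; this simplification ignores ACL extensions, execute bits, and the superuser.

```python
def can_read(mode: int, is_owner: bool, in_group: bool) -> bool:
    """Check the read bit for the most specific matching class:
    owner first, then group, then other."""
    if is_owner:
        return bool(mode & 0o400)  # owner read bit
    if in_group:
        return bool(mode & 0o040)  # group read bit
    return bool(mode & 0o004)      # other read bit
```

For a file with mode 640, for example, the owner and group members may read it while everyone else is refused.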

HDFS High-Availability Mechanisms

HDFS is built as a highly available cluster, and much of its design effort goes into availability, mainly reflected in:

    • NameNode active/standby design
      The master/slave design keeps the metadata reliable and removes the single point of failure present in HDFS 1.0. See the description above for details.
    • Data replication
      The replication mechanism ensures that when a file block stored on one server is lost for whatever reason, the cluster as a whole can still provide
      access to the file. See the data access section above.
    • Data recovery
      Data recovery here means that HDFS provides a grace window: deleted files are by default first moved to a trash directory and
      purged by HDFS only after some time, a mechanism also common in cloud storage. If a data block is lost, the replication mechanism can restore it.
    • Rack awareness
      Large clusters are organized in racks: a fixed number of servers plus the corresponding network equipment form a cabinet. Generally, cross-rack network I/O costs more than I/O within a rack, and crossing data centers is more expensive still.
      HDFS therefore tries to keep data on better-performing servers to improve performance, while also spreading replicas across different racks to preserve fault tolerance. A typical rack topology and replica placement looks as follows:
      When an application reads data, HDFS always prefers a server close to the application.
    • Snapshot mechanism
    • Automatic failure detection and recovery
      Machine failures are detected through heartbeats. If a DataNode or standby NameNode fails to return a heartbeat within a certain period, the active NameNode marks it as a downed server, and new I/O requests are no longer forwarded to it. At the same time, if the outage leaves any file with fewer replicas than the specified number, HDFS re-copies the affected blocks to maintain the reliability of the whole cluster.
    • Checksum mechanism
      A checksum is generated for each data block. When the data is read back, the client recomputes the checksum and compares it with the one stored on the server, ensuring that the data was not corrupted or tampered with in network transmission or otherwise.
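The rack-aware placement heuristic can be sketched as follows, following the default Hadoop policy: first replica on the writer's own node, second on a node in a different rack, third on the second rack but a different node. The topology and node names here are made up.

```python
def place_replicas(writer, topology):
    """Pick up to three target nodes for a block written from `writer`.
    topology maps node -> rack name."""
    local_rack = topology[writer]
    remote = [n for n, r in topology.items() if r != local_rack]
    targets = [writer]                       # 1st replica: the writer's node
    if remote:
        second = remote[0]
        targets.append(second)               # 2nd replica: a different rack
        same_remote_rack = [n for n, r in topology.items()
                            if r == topology[second] and n != second]
        if same_remote_rack:
            targets.append(same_remote_rack[0])  # 3rd: same rack as 2nd, other node
    return targets
```

This balances write cost (only one cross-rack hop) against fault tolerance (the block survives the loss of a whole rack).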
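The heartbeat-based failure detection above can be sketched as a simple timeout scan. The 600-second timeout is an assumption for illustration; Hadoop derives its actual dead-node interval from two configuration properties, defaulting to roughly ten minutes.

```python
def find_dead_nodes(last_heartbeat: dict, now: float, timeout: float = 600.0):
    """Return the nodes whose most recent heartbeat (a timestamp in seconds)
    is older than `timeout`; these are marked down and excluded from new I/O."""
    return [node for node, t in last_heartbeat.items() if now - t > timeout]
```

Any blocks whose replica count drops because of a node on this list would then be queued for re-replication.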
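The checksum mechanism can be sketched with CRC32. Real HDFS checksums each small chunk of a block and stores the checksums in sidecar metadata; this simplified version checksums a whole block at once.

```python
import zlib

def store_block(data: bytes) -> dict:
    """Store a block together with the checksum computed at write time."""
    return {"data": data, "checksum": zlib.crc32(data)}

def verify_and_read(stored: dict) -> bytes:
    """Recompute the checksum and compare it with the stored one before
    handing the data to the caller, as the client does on every read."""
    if zlib.crc32(stored["data"]) != stored["checksum"]:
        raise IOError("checksum mismatch: block is corrupt")
    return stored["data"]
```

On a mismatch, the real client would report the corrupt replica to the NameNode and retry from another replica, as described in the read process above.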
HDFS Cluster Scaling Mechanism

Dynamic cluster scaling lets users grow and shrink the cluster online. When a new server is added, subsequent I/O has more opportunities to be
sent to the new server. Redistributing the cluster's existing files can be triggered by command, and the rebalancing occupies only a small amount of network I/O, so applications on the cluster are not significantly affected by it. Decommissioning a machine is likewise done by command: the cluster treats it much as it would a failed machine, no longer sends it I/O requests, and re-replicates its data to maintain the replica count.


