1. What is HDFS?
The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on general-purpose (commodity) hardware. It has much in common with existing distributed file systems.
2. Basic Concepts in HDFS
(1) Blocks
"Block" is a fixed-size storage unit, HDFS files are partitioned into blocks for storage, HDFs block default size is 64MB. After the file is delivered, HDFs splits the file into blocks for management, and "block" is the logical unit of file storage processing.
(2) HDFS has two types of nodes: NameNode and DataNode
The NameNode is the management node of HDFS; it stores the file system's metadata.
This metadata consists of two parts (see the sketch below):
--->1. The mapping table from files to data blocks
--->2. The mapping table from data blocks to DataNodes
The DataNode is the worker node of HDFS; it stores the actual data blocks.
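To make the two mapping tables concrete, the sketch below models them as plain in-memory Java maps. The class and field names are hypothetical illustrations of the structure, not Hadoop's actual NameNode internals.

import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the NameNode's two metadata tables; the names
// here are illustrative, not real Hadoop classes.
public class NameNodeMetadata {
    // Table 1: file path -> ordered list of block IDs that make up the file.
    Map<String, List<Long>> fileToBlocks = new HashMap<>();

    // Table 2: block ID -> DataNodes holding a replica of that block.
    Map<Long, List<String>> blockToDataNodes = new HashMap<>();
}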
3. HDFS Architecture
A client that wants to read data first sends a request to the NameNode to query the metadata. From the returned results, the client learns which DataNodes hold the file's blocks. It then fetches the data blocks from those nodes and, after downloading them, assembles the blocks into the complete file.
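In the Hadoop Java client API, this entire read path hides behind FileSystem.open(): the call asks the NameNode for block locations and then streams the blocks from the DataNodes. Below is a minimal read sketch; the cluster address hdfs://namenode:9000 and the file path /data/example.txt are assumptions for illustration.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.IOException;
import java.net.URI;

// Minimal HDFS read sketch. The URI and file path are hypothetical;
// FileSystem.open() internally queries the NameNode for block
// locations and streams the blocks from the DataNodes.
public class HdfsReadExample {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:9000"), conf);

        try (FSDataInputStream in = fs.open(new Path("/data/example.txt"))) {
            byte[] buffer = new byte[4096];
            int bytesRead;
            while ((bytesRead = in.read(buffer)) > 0) {
                System.out.write(buffer, 0, bytesRead); // stream file contents to stdout
            }
            System.out.flush();
        }
    }
}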