Objective
This document provides a starting point for users of the Hadoop Distributed File System (HDFS), whether HDFS is used as part of a Hadoop cluster or as a stand-alone distributed file system. Although HDFS is designed to work correctly in many environments, a working knowledge of HDFS helps greatly with performance tuning and error diagnosis on a specific cluster.
Overview
HDFS is the primary distributed storage used by Hadoop applications. An HDFS cluster consists primarily of a NameNode and many DataNodes: the NameNode manages the file system metadata, while the DataNodes store the actual data. The HDFS architecture is described in detail elsewhere; this document focuses on how users and administrators interact with HDFS. The diagram in the HDFS architecture design illustrates the basic interaction between the NameNode, the DataNodes, and clients. Essentially, a client contacts the NameNode for file metadata or file modifications, while actual file I/O goes directly to the DataNodes.
Some of the salient features that many users may be interested in are listed below. The italicized terms are described in later sections.
Hadoop (including HDFS) is well suited to distributed storage and processing on commodity hardware: it is fault tolerant, scalable, and extremely simple to expand. The MapReduce framework, well known for its simplicity and applicability to large distributed applications, is an integral part of Hadoop.

HDFS is highly configurable, and its default configuration accommodates many installations; in most cases the parameters need to be tuned only on very large clusters. Other notable characteristics:

- Written in Java, so it is supported on all major platforms.
- Shell-like commands allow direct interaction with HDFS.
- The NameNode and DataNodes have built-in web servers that let users check the current status of the cluster.

New features and improvements are regularly added to HDFS. The following is a subset of useful features in HDFS:

- File permissions and authorization.
- Rack awareness: the physical location of a node is taken into account when scheduling tasks and allocating storage.
- Safe mode: an administrative mode for maintenance.
- fsck: a utility that diagnoses the health of the file system and finds missing files or blocks.
- Rebalancer: a tool to balance the data load across the cluster when it is unevenly distributed among DataNodes.
- Upgrade and rollback: after a software upgrade, it is possible to roll back to the pre-upgrade state of HDFS in case of unexpected problems.
- Secondary NameNode: keeps the size of the HDFS change log file (edits) on the NameNode within a limit.

Prerequisites
The following documents describe how to install and set up a Hadoop cluster:

Hadoop Quick Start for first-time users. Hadoop Cluster Setup for large, distributed clusters.
The remainder of this document assumes that the user has a running HDFS installation with at least one DataNode. For the purposes of this document, the NameNode and a DataNode may run on the same physical machine.
Web Interface
The NameNode and each DataNode run a built-in web server that displays the basic status and information of the cluster. With the default configuration, the NameNode front page is at http://namenode:50070/. It lists the DataNodes in the cluster and basic statistics of the cluster. The web interface can also be used to browse the entire file system (using the "Browse the file system" link on the NameNode front page).
Shell Commands
Hadoop includes a family of shell-like commands that interact directly with HDFS and the other file systems Hadoop supports. The command bin/hadoop fs -help lists all the commands supported by the Hadoop shell, and bin/hadoop fs -help command displays detailed help for a particular command. These commands support most normal file system operations, such as copying files and changing file permissions, as well as a few HDFS-specific operations, such as changing the replication factor of a file.
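For illustration, a typical session might look like the following (the user directory and file names here are hypothetical examples):

```shell
# List the commands supported by the Hadoop shell
bin/hadoop fs -help

# Copy a local file into HDFS, list the directory, then read the file back
bin/hadoop fs -put input.txt /user/alice/input.txt
bin/hadoop fs -ls /user/alice
bin/hadoop fs -cat /user/alice/input.txt

# HDFS-specific: change the replication factor of a file to 2
bin/hadoop fs -setrep 2 /user/alice/input.txt
```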
dfsadmin Command
The 'bin/hadoop dfsadmin' command supports a few HDFS administration operations. bin/hadoop dfsadmin -help lists all the commands currently supported. For example:

- -report: reports the basic status of HDFS. Some of this information is also available on the NameNode web front page.
- -safemode: though usually not required, an administrator can manually make the NameNode enter or leave safe mode.
- -finalizeUpgrade: removes the cluster backup made during the previous upgrade.

Secondary Namenode
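The operations above can be sketched as a short command sequence (run on a node with Hadoop installed; the output depends on the cluster):

```shell
# Print overall HDFS status: capacity, DataNodes, replication state
bin/hadoop dfsadmin -report

# Query whether the NameNode is currently in safe mode
bin/hadoop dfsadmin -safemode get

# After a successful upgrade, remove the pre-upgrade backup
bin/hadoop dfsadmin -finalizeUpgrade
```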
The NameNode appends modifications to the file system to a log file (edits) on its local file system. When a NameNode starts, it first reads the state of HDFS from an image file (fsimage), then applies the operations recorded in the edits log. It then writes the new HDFS state back to fsimage and begins normal operation with an empty edits file. Because the NameNode merges fsimage and edits only during startup, the edits file can grow very large over time, especially on a large cluster. A side effect of a large edits file is that the next NameNode restart takes a long time.
The Secondary NameNode periodically merges fsimage and the edits log, keeping the size of the edits log within a limit. Because its memory requirements are on the same order as the primary NameNode's, the Secondary NameNode is usually run on a different machine from the primary NameNode. The Secondary NameNode is started by bin/start-dfs.sh on the nodes specified in conf/masters.
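As an illustration, the checkpoint frequency can be tuned in the site configuration. The parameter names below (fs.checkpoint.period and fs.checkpoint.size) are those used by older Hadoop releases, and the values shown are examples rather than recommendations; check conf/hadoop-default.xml for the names and defaults in your version:

```xml
<!-- hadoop-site.xml fragment: example values, not recommendations -->
<property>
  <name>fs.checkpoint.period</name>
  <value>3600</value> <!-- merge fsimage and edits at most every hour -->
</property>
<property>
  <name>fs.checkpoint.size</name>
  <value>67108864</value> <!-- or whenever edits reaches 64 MB -->
</property>
```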
Rebalancer
HDFS data may not always be distributed evenly across DataNodes. One common reason is the addition of new DataNodes to an existing cluster. When placing new blocks (the data for a file is stored as a series of blocks), the NameNode considers several factors before choosing the DataNodes to receive them. Some of these considerations are:

- Keeping one replica of a block on the node that is writing the block.
- Spreading different replicas of a block across racks, so that the cluster can survive the total loss of one rack.
- Placing one replica on a node in the same rack as the writing node, to reduce cross-rack network I/O.
- Spreading HDFS data as uniformly as possible across the DataNodes in the cluster.

Because these considerations require trade-offs, data may not be placed uniformly across the DataNodes. HDFS provides the administrator with a tool that analyzes block placement and rebalances data across the DataNodes. A brief administrator's guide for the rebalancer is available as a PDF attached to HADOOP-1652.
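A sketch of invoking the rebalancer from the command line (the threshold value is an example; it specifies the allowed deviation, in percent, of each DataNode's utilization from the cluster average):

```shell
# Run the balancer in the foreground; it works until the cluster is
# balanced to within the threshold or it is interrupted
bin/hadoop balancer -threshold 10

# Or use the helper scripts to start and stop it in the background
bin/start-balancer.sh
bin/stop-balancer.sh
```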
Rack Awareness
Typically, a large Hadoop cluster is arranged in racks, and network traffic between nodes within the same rack is more desirable than traffic across racks. In addition, the NameNode tries to place replicas of a block on multiple racks to improve fault tolerance. Hadoop lets cluster administrators decide which rack a node belongs to through the configuration variable dfs.network.script. When this script is configured, each node runs it to determine its rack ID. A default installation assumes all nodes belong to the same rack. This feature and its configuration parameters are described in more detail in a PDF attached to HADOOP-692.
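As an illustration, a topology script configured via dfs.network.script might map node addresses to rack IDs like this (the subnets and rack names are made-up examples; a real script must cover every node in the cluster):

```shell
# Hypothetical topology script for dfs.network.script.
# HDFS passes one or more DataNode IP addresses or host names as
# arguments; the script must print one rack ID per argument, in order.
rack_of() {
  case "$1" in
    10.1.*) echo "/rack1" ;;          # example subnet -> example rack
    10.2.*) echo "/rack2" ;;
    *)      echo "/default-rack" ;;   # fallback for unknown nodes
  esac
}

for node in "$@"; do
  rack_of "$node"
done
```

The script must print exactly one rack ID per argument, in the order the arguments were given; nodes it cannot classify should fall back to a default rack rather than produce no output.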
Safe Mode
During startup, the NameNode loads the file system state from fsimage and the edits log file, and then waits for the DataNodes to report their blocks, so that it does not prematurely start replicating blocks even when enough replicas already exist. During this time the NameNode is in safe mode. Safe mode is essentially a read-only mode for the HDFS cluster: no modifications to the file system or its blocks are allowed. Normally the NameNode leaves safe mode automatically at the end of the startup phase. If required, HDFS can be placed in safe mode explicitly with the 'bin/hadoop dfsadmin -safemode' command. The NameNode front page shows whether safe mode is currently on. More details and configuration are described in the JavaDoc for setSafeMode().
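The safe mode subcommands look like this (state shown depends on the cluster):

```shell
# Query, enter, and leave safe mode explicitly
bin/hadoop dfsadmin -safemode get
bin/hadoop dfsadmin -safemode enter
bin/hadoop dfsadmin -safemode leave

# Block until the NameNode has left safe mode (useful in admin scripts)
bin/hadoop dfsadmin -safemode wait
```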
Fsck
HDFS supports the fsck command to check for various inconsistencies. It is designed to report problems with files, such as missing blocks or under-replicated blocks. Unlike the traditional fsck utility for native file systems, this command does not correct the errors it detects. Normally the NameNode automatically corrects most recoverable failures. fsck is not a Hadoop shell command; it is run as 'bin/hadoop fsck'. fsck can be run on the whole file system or on a subset of files.
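Two illustrative invocations (the directory path is a hypothetical example):

```shell
# Check the entire file system and print its overall health
bin/hadoop fsck /

# Check one directory, showing per-file status, the list of blocks,
# and the DataNodes each block resides on
bin/hadoop fsck /user/alice -files -blocks -locations
```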
Upgrade and Rollback
When Hadoop is upgraded on an existing cluster, as with any software upgrade, there may be new bugs or incompatible changes that affect existing applications. In any non-trivial HDFS installation, losing any data is not an option, let alone restarting HDFS from scratch. HDFS allows administrators to go back to the earlier version of Hadoop and roll back the cluster to the state it was in before the upgrade. HDFS upgrades are described in more detail on the upgrade wiki. HDFS can have one such backup at a time. Before upgrading, administrators need to remove the existing backup using the bin/hadoop dfsadmin -finalizeUpgrade command (the upgrade finalization operation). The following briefly describes a typical upgrade procedure:
- Before upgrading the Hadoop software, finalize the existing backup if one exists. The dfsadmin -upgradeProgress status command tells whether the cluster still needs to be finalized.
- Stop the cluster and deploy the new version of Hadoop.
- Run the new version with the -upgrade option (bin/start-dfs.sh -upgrade).
- Most of the time the cluster works just fine. Once the new HDFS is considered to be working well (perhaps after a few days of operation), finalize the upgrade. Note that until the cluster is finalized, deleting files that existed before the upgrade does not free up real disk space on the DataNodes.
- If there is a need to go back to the old version:
  - Stop the cluster and deploy the earlier version of Hadoop.
  - Start the cluster with the rollback option (bin/start-dfs.sh -rollback).

File permissions and Security
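The steps above can be sketched as a command sequence (run from the Hadoop installation directory; deploying the new or old release itself is outside the sketch):

```shell
# 1. Make sure there is no leftover pre-upgrade backup
bin/hadoop dfsadmin -upgradeProgress status
bin/hadoop dfsadmin -finalizeUpgrade     # only if a backup still exists

# 2. Stop the cluster, deploy the new Hadoop release, then start
#    DFS with the -upgrade flag
bin/stop-dfs.sh
bin/start-dfs.sh -upgrade

# 3. After the cluster has run reliably for a while, finalize
bin/hadoop dfsadmin -finalizeUpgrade

# To abandon the upgrade instead: redeploy the old release, then
bin/stop-dfs.sh
bin/start-dfs.sh -rollback
```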
File permissions here are similar to those on other familiar platforms such as Linux. Currently, security is limited to simple file permissions. The user that starts the NameNode is treated as the HDFS superuser. Future versions of HDFS will support network authentication protocols such as Kerberos for user authentication and encryption of data transfers. For details, refer to the permissions guide.
Scalability
Hadoop currently runs on clusters with thousands of nodes. The PoweredBy Hadoop page lists some of the organizations that deploy Hadoop on large clusters. An HDFS cluster has a single NameNode, and currently the total memory available on the NameNode is the primary scalability limitation. On very large clusters, increasing the average size of the files stored in HDFS allows the cluster to grow without increasing the memory requirements on the NameNode. The default configuration may not be suitable for very large clusters; the Hadoop FAQ page lists suggested configuration improvements for large Hadoop clusters.
Related Documents
This user guide is intended to be a starting point for learning and working with HDFS. While it continues to improve, users can refer to the wealth of other Hadoop and HDFS documentation. The following list is a starting point for further exploration:
- Hadoop home page: the starting point for everything Hadoop.
- Hadoop Wiki: the front page of the Hadoop Wiki documentation. Unlike this guide, which is part of the Hadoop source tree, the Hadoop Wiki is regularly edited by the Hadoop community.
- FAQ from the Hadoop Wiki.
- Hadoop JavaDoc API.
- Hadoop user mailing list: core-user[at]hadoop.apache.org.
- Explore conf/hadoop-default.xml. It includes a brief description of most of the configuration variables.