Hadoop machine room down after ...

Source: Internet
Author: User
Tags: safe mode
After an unexpected power outage in the company's machine room, every node of the test-environment Hadoop cluster went down. After restarting the cluster, we visited http://xxx.xxx.xxx.xx:50070/dfshealth.html
The page showed this message:
NameNode is still loading. Redirecting to the Startup Progress page.


This is not an error. When Hadoop starts, the NameNode enters safe mode and runs a self-check; the message goes away once the self-check has finished.

Let's look at how the NameNode's state changes during a restart.

When the NameNode boots, the cluster is in safe mode.
To check the status of the cluster:
[hduser@hadoop0 bin]$ hdfs dfsadmin -safemode get
Safe mode is ON
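
The same check can be scripted. The following is a minimal Python sketch, assuming the hdfs binary is on the PATH; parse_safemode, get_safemode and wait_until_off are hypothetical helper names, not part of Hadoop.

```python
import subprocess
import time


def parse_safemode(output):
    """Return True if `hdfs dfsadmin -safemode get` output reports ON."""
    text = output.lower()
    if "safe mode is on" in text:
        return True
    if "safe mode is off" in text:
        return False
    raise ValueError("unrecognized safemode output: %r" % output)


def get_safemode():
    """Run the dfsadmin command and report whether safe mode is on."""
    result = subprocess.run(
        ["hdfs", "dfsadmin", "-safemode", "get"],
        capture_output=True, text=True, check=True,
    )
    return parse_safemode(result.stdout)


def wait_until_off(check=get_safemode, poll_seconds=10.0, max_polls=180,
                   sleep=time.sleep):
    """Poll until safe mode is off; return False if max_polls is exhausted."""
    for _ in range(max_polls):
        if not check():
            return True
        sleep(poll_seconds)
    return False
```

If no timeout is needed, `hdfs dfsadmin -safemode wait` blocks until safe mode ends and may be simpler than polling.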


Every file, directory, and block in HDFS is represented as an object held in NameNode memory, and each object occupies a fixed number of bytes (about 150 bytes is the commonly cited rule of thumb). At startup, the NameNode loads the fsimage file (filesystem image: a persisted checkpoint of file metadata) into memory and then replays the operations recorded in the edit log. The fsimage contains file metadata, but it does not contain the locations of the file blocks.
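
As a rough illustration of "load the checkpoint, then replay the edits", here is a toy Python sketch. It is not Hadoop's on-disk format: the dictionary layout, the operation names ("create", "delete") and the function load_namespace are all assumptions made for the example.

```python
def load_namespace(fsimage, edit_log):
    """Rebuild the in-memory namespace from a checkpoint plus later edits."""
    namespace = dict(fsimage)  # copy so the checkpoint itself is not mutated
    for op, path, meta in edit_log:  # edits must be replayed in order
        if op == "create":
            namespace[path] = meta
        elif op == "delete":
            namespace.pop(path, None)
        else:
            raise ValueError("unknown edit op: %r" % op)
    return namespace


# Hypothetical checkpoint: path -> metadata (note: no block locations).
fsimage = {
    "/data/a.txt": {"replication": 3},
    "/data/b.txt": {"replication": 3},
}
# Hypothetical edits recorded after the checkpoint was written.
edit_log = [
    ("create", "/data/c.txt", {"replication": 2}),
    ("delete", "/data/a.txt", None),
]

namespace = load_namespace(fsimage, edit_log)
print(sorted(namespace))  # ['/data/b.txt', '/data/c.txt']
```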

Block locations come from the DataNodes instead. When each DataNode starts, it scans its local disks, records which blocks it holds, and reports that list to the NameNode. The NameNode combines every block report it receives with the identity of the DataNode that sent it, assembles the result into the BlockMap, and keeps it in NameNode memory.
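
The idea of assembling block reports into a block map can be sketched in a few lines of Python. This is a simplified model, not the NameNode's actual data structure; the DataNode names, block IDs and function names are hypothetical.

```python
from collections import defaultdict


def build_block_map(block_reports):
    """Map each block ID to the set of DataNodes that reported holding it."""
    block_map = defaultdict(set)
    for datanode, block_ids in block_reports.items():
        for block_id in block_ids:
            block_map[block_id].add(datanode)
    return dict(block_map)


def unreported_blocks(expected_block_ids, block_map):
    """Blocks the namespace knows about that no DataNode has reported."""
    return sorted(b for b in expected_block_ids if b not in block_map)


# Hypothetical block reports: DataNode -> blocks found on its local disks.
reports = {
    "datanode1": ["blk_1", "blk_2"],
    "datanode2": ["blk_2", "blk_3"],
}
block_map = build_block_map(reports)
print(sorted(block_map["blk_2"]))  # ['datanode1', 'datanode2']
print(unreported_blocks(["blk_1", "blk_2", "blk_3", "blk_4"], block_map))  # ['blk_4']
```

Blocks that are never reported by any DataNode are the ones that later show up as missing or corrupt.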

Finally, the NameNode leaves safe mode and is fully available.
Checking the status of the cluster again:
[hduser@hadoop0 bin]$ hdfs dfsadmin -safemode get
Safe mode is OFF

To view the Namenode log:
2015-08-17 10:05:22,298 INFO org.apache.hadoop.hdfs.StateChange: STATE* Safe mode ON, in safe mode extension.
The reported blocks 450638 has reached the threshold 0.9990 of total blocks 450676. The number of live datanodes 5 has reached the minimum number 0. In safe mode extension. Safe mode will be turned off automatically in 9 seconds.
2015-08-17 10:05:25,844 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30000 milliseconds
2015-08-17 10:05:25,846 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 1 millisecond(s).
2015-08-17 10:05:32,299 INFO org.apache.hadoop.hdfs.StateChange: STATE* Leaving safe mode after 727 secs
2015-08-17 10:05:32,299 INFO org.apache.hadoop.hdfs.StateChange: STATE* Safe mode is OFF
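
To pull the block counts out of a log line like the one above, a small parser is enough. This is a sketch in Python; the regular expression is written against the wording of this particular log line, and SafeModeReport and parse_safemode_report are hypothetical names.

```python
import re
from collections import namedtuple

SafeModeReport = namedtuple("SafeModeReport", "reported threshold total")

# Accepts both "has reached" and "have reached" in the message text.
_PATTERN = re.compile(
    r"reported blocks (\d+) ha(?:s|ve) reached the threshold "
    r"([0-9.]+) of total blocks (\d+)",
    re.IGNORECASE,
)


def parse_safemode_report(line):
    """Return the block counts from a safe mode log line, or None."""
    match = _PATTERN.search(line)
    if match is None:
        return None
    return SafeModeReport(
        reported=int(match.group(1)),
        threshold=float(match.group(2)),
        total=int(match.group(3)),
    )


line = ("The reported blocks 450638 has reached the threshold 0.9990 "
        "of total blocks 450676.")
report = parse_safemode_report(line)
print(report.total - report.reported)  # 38 blocks not yet reported
print(report.reported / report.total >= report.threshold)  # True
```

In this log, 38 of the 450676 blocks had not been reported when the threshold was reached.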


The NameNode tracks how many blocks the DataNodes have reported. Once 99.9% of the blocks have been reported (the threshold 0.9990 in the log), it waits out a short extension period and then leaves safe mode.
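
The rule can be worked through with the numbers from the log. The Python sketch below is a simplification under stated assumptions: that the threshold corresponds to dfs.namenode.safemode.threshold-pct (default 0.999), that the extension corresponds to dfs.namenode.safemode.extension (default 30000 ms), and that the required block count is truncated to an integer. The function names are hypothetical.

```python
def block_threshold(total_blocks, threshold_pct=0.999):
    """Number of reported blocks needed (assumed to truncate to an int)."""
    return int(total_blocks * threshold_pct)


def threshold_reached(reported_blocks, total_blocks, threshold_pct=0.999):
    """True once enough blocks have been reported to start the extension."""
    return reported_blocks >= block_threshold(total_blocks, threshold_pct)


def seconds_remaining(elapsed_since_threshold_s, extension_ms=30000):
    """Time left in the extension period before safe mode is turned off."""
    return max(0.0, extension_ms / 1000.0 - elapsed_since_threshold_s)


total, reported = 450676, 450638  # values taken from the log
print(block_threshold(total))              # 450225
print(threshold_reached(reported, total))  # True
# With a 30 s extension, "off in 9 seconds" means about 21 s had elapsed.
print(seconds_remaining(21))               # 9.0
```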
Once the self-check has finished, visit http://xxx.xxx.xxx.xx:50070/dfshealth.html again. You may find that some blocks could not be recovered.

These unrecoverable files are listed on the page every time you open it. You can delete the corrupted files with the following command:
./hdfs fsck / -delete

You can also list the corrupted files with the following command:
./hdfs fsck /
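
Because -delete is destructive, it can help to build the fsck command line deliberately and check the result before deleting anything. The Python sketch below only constructs the argument list and inspects the summary text; build_fsck_command and is_healthy are hypothetical helpers, and the "./hdfs" path assumes the command is run from Hadoop's bin directory as in the examples above.

```python
def build_fsck_command(path="/", list_corrupt=False, delete=False,
                       hdfs_bin="./hdfs"):
    """Build an argument list for `hdfs fsck`, one action at a time."""
    if list_corrupt and delete:
        # Design choice of this sketch: inspect first, delete separately.
        raise ValueError("choose either list_corrupt or delete, not both")
    cmd = [hdfs_bin, "fsck", path]
    if list_corrupt:
        cmd.append("-list-corruptfileblocks")  # only list corrupt blocks
    if delete:
        cmd.append("-delete")  # permanently removes corrupted files
    return cmd


def is_healthy(fsck_output):
    """fsck ends its summary with '... is HEALTHY' or '... is CORRUPT'."""
    return "is HEALTHY" in fsck_output


print(build_fsck_command())             # ['./hdfs', 'fsck', '/']
print(build_fsck_command(delete=True))  # ['./hdfs', 'fsck', '/', '-delete']
```

The returned list can be passed to subprocess.run; running the plain check or -list-corruptfileblocks first shows what -delete would remove.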
