Hadoop NameNode SafeMode

Source: Internet
Author: User
Tags: hadoop fs

Recently, while running a Hadoop cluster, I ran into a problem: jobs submitted to the cluster stayed stuck in the ACCEPTED state for a long time and had trouble obtaining resources. After analyzing the logs and checking the cluster state, I found that the cause was a NameNode active/standby failover. The previously active NameNode had gone down for some reason, which triggered the failover. It happened right at the business peak, and some data blocks had not been synchronized (or there was some other cause: judging from the logs, the NameNode seemed unable to communicate with the JournalNode cluster, which suggests a network anomaly, but no log entries confirmed this). After the failover, the new NameNode was forced into safe mode and did not leave it, so subsequent jobs found it harder and harder to obtain resources. Although the cluster itself was intact, the business almost collapsed.

The cause of the outage still has to be analyzed, but restoring the business was more urgent, so getting the NameNode out of safe mode became the most important issue.

After some searching, I found this blog post: "Hadoop Safe Mode in Detail and Its Configuration"

In simple terms, while HDFS is in safe mode it only allows read access to metadata; operations that modify the namespace, such as creating or deleting files, are not supported. The NameNode also checks a large number of data blocks during this time, so resource requests and allocation take far longer than expected.

Therefore, when the business urgently needs to recover, you can try lowering the following two parameters so that safe mode ends as soon as possible:

dfs.namenode.replication.min: the minimum number of replicas a block must have to count as safe
dfs.namenode.safemode.threshold-pct: the proportion of blocks in the cluster that must meet that minimum replication
<property>
  <name>dfs.namenode.replication.min</name>
  <value>1</value>
  <description>Minimal block replication.</description>
</property>
<property>
  <name>dfs.namenode.safemode.threshold-pct</name>
  <value>0.999f</value>
  <description>
    Specifies the percentage of blocks that should satisfy the minimal
    replication requirement defined by dfs.namenode.replication.min.
    Values less than or equal to 0 mean not to wait for any particular
    percentage of blocks before exiting safemode.
    Values greater than 1 will make safe mode permanent.
  </description>
</property>
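
To make the interaction between the two parameters concrete, here is a minimal Python sketch of the threshold arithmetic. It is a simplified model under stated assumptions, not the NameNode's actual code: the function names are hypothetical, and it ignores the other exit conditions (for example the minimum number of live DataNodes and the safe mode extension period).

```python
import math


def blocks_needed(total_blocks: int, threshold_pct: float) -> int:
    """Blocks that must reach minimal replication before safe mode can end.

    Simplified model (an assumption, not the NameNode source): the count is
    total_blocks * threshold_pct, rounded down and never below zero.
    """
    return max(0, math.floor(total_blocks * threshold_pct))


def can_leave_safemode(total_blocks: int, safe_blocks: int,
                       threshold_pct: float) -> bool:
    """True when enough blocks satisfy dfs.namenode.replication.min."""
    # safe_blocks can never exceed total_blocks, so on a non-empty cluster
    # a threshold above 1 asks for more blocks than exist and never passes.
    safe_blocks = min(safe_blocks, total_blocks)
    return safe_blocks >= blocks_needed(total_blocks, threshold_pct)
```

In this model, a threshold of 0 or less lets the check pass no matter how many blocks have been reported, and lowering the threshold below 0.999 reduces the number of blocks the NameNode has to wait for.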


If some abnormal blocks keep failing verification and hold the NameNode in safe mode, you can set dfs.namenode.safemode.threshold-pct to 0 or a negative value and then restart the NameNode, so that it does not wait for any particular percentage of blocks before leaving safe mode. Alternatively, exit safe mode manually with the following command:

hdfs dfsadmin -safemode leave
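
For scripted recovery it can help to check the state before forcing an exit. The following Python sketch wraps the hdfs dfsadmin -safemode subcommands get and leave. It assumes the hdfs CLI is on the PATH, that the caller has HDFS superuser rights, and that the status output contains the phrase "Safe mode is ON" or "Safe mode is OFF"; the function names are hypothetical.

```python
import subprocess


def parse_safemode_status(output: str) -> bool:
    """Return True if any line of the dfsadmin output reports safe mode ON."""
    # Assumed output format: "Safe mode is ON" / "Safe mode is OFF".
    # In HA setups there may be one line per NameNode, with its address.
    return any("Safe mode is ON" in line for line in output.splitlines())


def safemode_is_on() -> bool:
    """Ask the NameNode for its safe mode state via the hdfs CLI."""
    result = subprocess.run(
        ["hdfs", "dfsadmin", "-safemode", "get"],
        capture_output=True, text=True, check=True,
    )
    return parse_safemode_status(result.stdout)


def leave_safemode_if_on() -> bool:
    """Leave safe mode only when it is currently ON; True if we left it."""
    if not safemode_is_on():
        return False
    subprocess.run(["hdfs", "dfsadmin", "-safemode", "leave"], check=True)
    return True
```

Calling leave_safemode_if_on() on a cluster host would issue the leave command only when needed. Forcing an exit skips the block checks, so it is best reserved for emergencies like the one described here.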
