Introduction:
I have been working with Hadoop and Lucene recently, and here I summarize the solutions I used for Hadoop problems encountered in operation. Please advise!
Emergency solutions for HDFS (0.20.2) operation
1 Namenode goes down (the secondarynamenode is intact)
If the namenode fails but the server can be brought back up immediately, simply run bin/start-dfs.sh again. Otherwise, follow the steps below. All of the following operations assume an intact secondarynamenode.
1) Use a datanode on a server other than the secondarynamenode as the new namenode. (This method is not found in the official documentation; the second method is recommended, although no problems were found with this one in testing.) A command sketch of steps E) through G) follows this list.
A) Kill all services.
B) Modify the configuration files on the new namenode server: core-site.xml, masters, slaves, and other related files.
C) Modify the hosts files.
D) Reconfigure SSH on each node so that the new namenode can log in to every datanode without a password.
E) Copy ${hadoop.tmp.dir}/dfs/namesecondary from the machine running the secondarynamenode to the ${hadoop.tmp.dir}/dfs directory on the new namenode server.
F) Rename namesecondary to name.
G) Run bin/start-dfs.sh to start HDFS.
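A minimal shell sketch of steps E) through G), assuming the secondarynamenode runs on a host called snn-host, hadoop.tmp.dir is /home/hadoop-data, and Hadoop is installed under /usr/local/hadoop (all of these names are assumptions for illustration):
# Run on the new namenode server; hostnames and paths below are assumptions.
# E) Pull the latest checkpoint from the secondarynamenode host.
scp -r hadoop@snn-host:/home/hadoop-data/dfs/namesecondary /home/hadoop-data/dfs/
# F) Rename the checkpoint directory so the namenode uses it as its image directory.
mv /home/hadoop-data/dfs/namesecondary /home/hadoop-data/dfs/name
# G) Start HDFS.
cd /usr/local/hadoop
bin/start-dfs.sh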
2) Use a datanode on a server other than the secondarynamenode as the new namenode, and restore the namenode by importing the previous checkpoint. A command sketch of steps F) through H) follows this list.
A) Kill all services.
B) Modify the configuration files on the new namenode server: core-site.xml, masters, slaves, and other related files.
C) Modify the hosts files.
D) Reconfigure SSH on each node so that the new namenode can log in to every datanode without a password.
E) Configure fs.checkpoint.dir in the new namenode server's core-site.xml (the default is ${hadoop.tmp.dir}/dfs/namesecondary):
<property>
<name>fs.checkpoint.dir</name>
<value>/home/hadoop-data/dfs/namesecondary</value>
</property>
F) Copy ${hadoop.tmp.dir}/dfs/namesecondary from the machine running the secondarynamenode to the fs.checkpoint.dir directory on the new namenode server.
G) Run bin/hadoop namenode -importCheckpoint to import the checkpoint.
H) Run bin/start-dfs.sh to start HDFS.
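A minimal sketch of steps F) through H), using the same assumed host and path names as above:
# Run on the new namenode server.
# F) Copy the checkpoint from the secondarynamenode host into fs.checkpoint.dir.
scp -r hadoop@snn-host:/home/hadoop-data/dfs/namesecondary /home/hadoop-data/dfs/
# G) Import the checkpoint; dfs.name.dir must not already contain a valid image, or the import fails.
bin/hadoop namenode -importCheckpoint
# H) Start HDFS.
bin/start-dfs.sh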
2 Datanode goes down (the node does not run the secondarynamenode)
1) The original server is completely damaged and cannot be restarted; a new datanode must be brought in. A command sketch follows this list.
I. Copy the Hadoop configuration from another datanode to the new server.
II. Update the hosts file on the new server and on all datanodes and the namenode.
III. Set up password-less SSH login and test it.
IV. Add the new datanode to the slaves file in the namenode's conf directory.
V. On the new datanode, run bin/hadoop-daemon.sh start datanode to start it.
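A minimal sketch of these steps, assuming an existing datanode called dn1, a new server called dn-new, the user hadoop, and Hadoop installed under /usr/local/hadoop (all hypothetical names):
# I. On dn-new: copy the Hadoop configuration from an existing datanode.
scp -r hadoop@dn1:/usr/local/hadoop/conf /usr/local/hadoop/
# II./III. Update /etc/hosts on every node, then verify password-less SSH from the namenode.
ssh dn-new hostname
# IV. On the namenode: append the new host to conf/slaves.
echo dn-new >> /usr/local/hadoop/conf/slaves
# V. On dn-new: start the datanode daemon.
bin/hadoop-daemon.sh start datanode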
2) The original server can be brought back up immediately.
I. Because the namenode's slaves file already lists this datanode, you can simply run bin/start-dfs.sh on the namenode.
II. Alternatively, run bin/hadoop-daemon.sh start datanode on the datanode itself.
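Either way, a quick check that the datanode has rejoined the cluster (run on the namenode; a standard 0.20 command, shown here only as a suggested verification):
# Lists live datanodes together with their capacity and usage.
bin/hadoop dfsadmin -report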
3 Datanode goes down (the node also runs the secondarynamenode)
1) If the namenode is running normally and this datanode can be put back into service immediately, simply run bin/start-dfs.sh on the namenode.
2) If the namenode is running normally but the datanode cannot be recovered, consider adding a new datanode and configuring the secondarynamenode on it. A command sketch follows at the end of this section.
On the new node, configure the following in hdfs-site.xml:
<property>
<name>dfs.http.address</name>
<value>Netease-namenode-test:50070</value>
</property>
The namenode itself keeps the default configuration. Note that if Netease-namenode-test:50070 is accessed from a different network segment (for example over the external network), the access may fail.
This setting is what allows the secondarynamenode to post requests to the namenode.
Add the new secondarynamenode to the namenode's masters file and configure the hosts files.
Start with bin/start-dfs.sh.
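A minimal sketch of registering the new secondarynamenode, assuming the new host is called dn-new and Hadoop is installed under /usr/local/hadoop (hypothetical names; the dfs.http.address value is the one shown above):
# On dn-new: set dfs.http.address in conf/hdfs-site.xml to Netease-namenode-test:50070
# (see the property snippet above) and update /etc/hosts on all nodes.
# On the namenode: register the new secondarynamenode host.
echo dn-new >> /usr/local/hadoop/conf/masters
# On the namenode: restart HDFS (run bin/stop-dfs.sh first if it is still running)
# so that the secondarynamenode daemon is started on dn-new.
bin/start-dfs.sh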