Workaround for DataNodes failing to start in fully distributed Hadoop


Problem Description:

After changing nodes in cluster mode, I started the cluster and found that the DataNodes had not come up.

My cluster configuration: the nodes are master and slave1 through slave5.

On master, as the hadoop user, I ran: start-all.sh

Running jps on master to check what had started:

NameNode

JobTracker

SecondaryNameNode

All of these had started normally, but the NameNode web UI at master:50070 showed Live Nodes as 0, so I went to slave1:

ssh slave1, ran jps, and found only TaskTracker, no DataNode. Then I read the DataNode log.
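The source does not quote the log, but the classic symptom that the fix below addresses is a namespaceID mismatch left behind after a NameNode reformat. In Hadoop 1.x the DataNode log line typically looks like this (your path and IDs will differ; the numbers here are placeholders):

    ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException:
    Incompatible namespaceIDs in /usr/hadoop/tmp/dfs/data:
    namenode namespaceID = 1394864454; datanode namespaceID = 687263854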

I searched online and finally solved it. The solution is as follows (a consolidated command sketch follows the list):

1. Run stop-all.sh to stop all services first.

2. On all slave nodes, delete the tmp directory (that is, the directory specified by dfs.data.dir in hdfs-site.xml, where the DataNode stores its data blocks) and the logs directory, then recreate empty tmp and logs directories.

3. Delete /usr/hadoop/conf/core-site.xml on all slave nodes and copy core-site.xml from the master node to each slave node (the remote user was obscured in the source; with the hadoop user, the command for slave1 looks like this):

scp /usr/hadoop/conf/core-site.xml hadoop@slave1:/usr/hadoop/conf/

4. Reformat the NameNode on master: hadoop namenode -format

5. Start the cluster: start-all.sh
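A minimal sketch of steps 1-5 as shell commands, assuming Hadoop is installed under /usr/hadoop, the hadoop user exists on every node, and dfs.data.dir points at /usr/hadoop/tmp (adjust the paths to whatever your hdfs-site.xml actually specifies):

    stop-all.sh
    for host in slave1 slave2 slave3 slave4 slave5; do
        # wipe stale DataNode state and logs, then recreate the empty directories
        ssh hadoop@$host 'rm -rf /usr/hadoop/tmp /usr/hadoop/logs; mkdir /usr/hadoop/tmp /usr/hadoop/logs'
        # push the master's core-site.xml so every node agrees on fs.default.name
        scp /usr/hadoop/conf/core-site.xml hadoop@$host:/usr/hadoop/conf/
    done
    hadoop namenode -format    # note: this erases all HDFS metadata
    start-all.sh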

In addition, you may also run into the following DataNode errors on the slaves:

Error 1: the DataNode log keeps printing: INFO org.apache.hadoop.ipc.RPC: Server at master/172.16.0.100:9000 not available yet, Zzzzz...

For a workaround, see: http://blog.sina.com.cn/s/blog_893ee27f0100zoh7.html
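This message means the DataNode resolves the NameNode's address but nothing is listening there yet. A common cause (my addition, not stated in the source) is the NameNode binding to the loopback interface, either because fs.default.name points at localhost or because /etc/hosts on master maps the hostname to 127.0.0.1. A minimal check of core-site.xml, assuming the Hadoop 1.x property name:

    <property>
      <name>fs.default.name</name>
      <!-- use the master's hostname, never localhost -->
      <value>hdfs://master:9000</value>
    </property>

Also confirm with grep master /etc/hosts that master resolves to 172.16.0.100 rather than 127.0.0.1.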

Error 2: the slave's DataNode cannot connect to master; the log shows: INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.100:9000. Already tried 0 time(s);

Workaround:

1. ping master succeeds but telnet master 9000 fails, which indicates the firewall is on.
2. Shut down the firewall on the master host (see the command sketch below). You can temporarily disable it by clearing all rules with /sbin/iptables -F; to do this safely, first run /sbin/iptables -P INPUT ACCEPT, then /sbin/iptables -F.
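A quick way to run this diagnosis and fix from a shell (the iptables commands are the ones from the article; run them as root on master):

    ping -c 1 master        # reachable at the IP level?
    telnet master 9000      # connection refused or timeout => port 9000 is blocked
    /sbin/iptables -P INPUT ACCEPT   # set the default policy to accept first...
    /sbin/iptables -F                # ...so flushing the rules cannot lock you out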

Note: this is the situation I encountered, not necessarily the one you are facing. In general, troubleshoot from the following angles (a few verification commands follow the list):

1. Check that every XML configuration file is correct.

2. Check that the Java environment variables are configured correctly.

3. Check that passwordless SSH works between all nodes.

4. Take HDFS out of safe mode: hadoop dfsadmin -safemode leave
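A minimal sketch for checking items 3 and 4, using standard Hadoop 1.x commands (slave1 stands in for any slave):

    ssh slave1 date                  # passwordless SSH: should print the date with no password prompt
    hadoop dfsadmin -safemode get    # reports whether HDFS is currently in safe mode
    hadoop dfsadmin -safemode leave  # forces HDFS out of safe mode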

You can also refer to: http://blog.sina.com.cn/s/blog_76fbd24d01017qmc.html

This article is reproduced from http://blog.csdn.net/daniel_ustc/article/details/10834413
