I finally got the Hadoop fully distributed environment running. The problem that delayed me the longest, the DataNode failing to start, turned out to be a trivial mistake.
The cluster consists of three virtual machines: one master and two slaves.
The first error, found by checking the logs, was a typo in the hdfs-site.xml configuration.
From the second attempt onward, the slaves could not connect to the master. Here is how I narrowed it down:
1. Check whether the firewall is disabled on both the master and the slaves.
2. Recheck the configuration files: make sure the fs.default.name and mapred.job.tracker values point to the master's hostname or IP.
3. After fixing the first error, formatting the NameNode failed because the directories configured as dfs.name.dir and dfs.data.dir had not been deleted. You need to manually delete those folders on both the master and the slaves.
4. SSH connectivity. After configuring passwordless SSH login, you still need to connect once manually (the first connection asks you to type "yes" to accept the host key). This is where I spent the most time. The master must connect to each slave, and each slave must connect back to the master. Then format the NameNode, start Hadoop, and check with jps that the processes started successfully.
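For reference, the two properties checked in step 2 look roughly like this in Hadoop 1.x. The hostname `master` and the ports 9000/9001 are assumptions for illustration; use whatever your cluster actually resolves:

```xml
<!-- core-site.xml: where the NameNode listens (hostname and port are assumptions) -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://master:9000</value>
</property>

<!-- mapred-site.xml: where the JobTracker listens -->
<property>
  <name>mapred.job.tracker</name>
  <value>master:9001</value>
</property>
```

A typo in either value, or a hostname the slaves cannot resolve via /etc/hosts, produces exactly the connection failures described above.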
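The checklist above can be sketched as a sequence of commands. This is a sketch under assumptions, not my exact session: the data directory `/usr/local/hadoop/tmp/dfs`, the user `hadoop`, and the hostname `slave1` are placeholders for illustration.

```shell
# 1. Disable the firewall on every node (Ubuntu's ufw shown here)
sudo ufw disable

# 3. Remove the old dfs.name.dir / dfs.data.dir contents before reformatting
#    (example path; use whatever your hdfs-site.xml actually configures)
rm -rf /usr/local/hadoop/tmp/dfs

# 4. Set up passwordless SSH and accept each host key once
ssh-keygen -t rsa            # press Enter to accept the defaults
ssh-copy-id hadoop@slave1    # repeat for every node, and from each slave back to the master
ssh slave1 exit              # first login: answer "yes" to cache the host key

# Format the NameNode, start the cluster, and verify the daemons
hadoop namenode -format
start-all.sh
jps    # expect NameNode/JobTracker on the master, DataNode/TaskTracker on the slaves
```

Run the SSH steps in both directions (master to each slave, and each slave to the master) before starting the cluster, since the start scripts launch the remote daemons over SSH.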
From this exercise I learned to analyze the logs to locate a problem and to eliminate possible causes step by step until I found the root cause.