Trivial: Hadoop 2.2.0 pseudo-distributed and fully distributed installation on CentOS 6.4
The environment is CentOS 6.4 (32-bit) with Hadoop 2.2.0.
Pseudo-distributed documentation: http://pan.baidu.com/s/1kTrAcWB
Fully distributed documentation: http://pan.baidu.com/s/1hqIeBGw
Hadoop 2.2.0 differs somewhat from the 1.x and 0.x releases, especially in YARN.
A side note: when configuring YARN in fully distributed mode, you must specify the ResourceManager address.
This address does not need to be specified in pseudo-distributed mode, because the default is 0.0.0.0, i.e. the local machine.
In fully distributed mode, however, this parameter is mandatory: without it, the slave nodes do not know which server is the ResourceManager.
In 2.x, a slave node not only runs a DataNode that reports to the NameNode for HDFS, but also a NodeManager that reports to the ResourceManager.
If the address is missing, the NameNode and DataNode processes start up normally, but the cluster web UI shows 0 active nodes.
There are two ways to configure this. If you use the default ports, you only need to set the property yarn.resourcemanager.hostname.
This property works much like JAVA_HOME: the other ResourceManager address properties reference it.
If you do not use the default ports, replace ${yarn.resourcemanager.hostname} in each address property with the master's hostname or IP address.
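As a sketch of the first (default-port) method, assuming the master's hostname is `master` (a placeholder; substitute your own), the relevant fragment of `yarn-site.xml` on every node would look like this:

```xml
<!-- yarn-site.xml: tell every NodeManager where the ResourceManager runs.
     "master" is a placeholder; replace it with your master's hostname or IP.
     With default ports this one property suffices, because the other
     yarn.resourcemanager.*.address properties reference it. -->
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>master</value>
</property>
```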
How do you trace errors in a pseudo-distributed Hadoop 2.2.0 environment?
Enable debug logging with export HADOOP_ROOT_LOGGER=DEBUG,console and watch the output on the console.
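For example, setting the variable affects only the current shell session, and every Hadoop command run afterwards prints DEBUG-level logs to the console (the `hadoop fs -ls /` command below is just an illustrative command to trace):

```shell
# Route Hadoop's root logger to the console at DEBUG level.
# This is per-session; it does not change any config file.
export HADOOP_ROOT_LOGGER=DEBUG,console

# Any subsequent hadoop/hdfs command now emits DEBUG output,
# e.g. listing the HDFS root directory:
hadoop fs -ls /
```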
Thank you!
Can Hadoop fully distributed and pseudo-distributed modes run on one machine at the same time? How?
It should be possible, because each Hadoop installation lives in its own folder,
and you can run start-all.sh from the different Hadoop folders at startup. Note, though, that the two instances must not share ports or data directories, or they will conflict.
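A minimal sketch of that idea, assuming two independently unpacked copies at the hypothetical paths /opt/hadoop-pseudo and /opt/hadoop-cluster, each already configured with non-overlapping ports (fs.defaultFS, dfs.*.address, yarn.*.address) and separate data directories:

```shell
# Hypothetical layout: two independent copies of Hadoop 2.2.0,
# each with its own etc/hadoop configs, distinct ports, and data dirs.

# Start the pseudo-distributed instance from its folder:
export HADOOP_HOME=/opt/hadoop-pseudo
"$HADOOP_HOME"/sbin/start-dfs.sh
"$HADOOP_HOME"/sbin/start-yarn.sh

# Start the second instance from its own folder:
export HADOOP_HOME=/opt/hadoop-cluster
"$HADOOP_HOME"/sbin/start-dfs.sh
"$HADOOP_HOME"/sbin/start-yarn.sh
```

In 2.2.0 the combined start-all.sh script still exists but is deprecated in favor of start-dfs.sh and start-yarn.sh, which is why the sketch uses the latter.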