The directory layout in the latest Hadoop 2.5 release has been reorganized to make installation easier.
First, install the prerequisite tools:
$ sudo apt-get install ssh
$ sudo apt-get install rsync
Configure SSH
$ ssh localhost
If you cannot ssh to localhost without a passphrase, execute the following commands:
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
Edit etc/hadoop/hadoop-env.sh to configure the runtime environment:
# set to the root of your Java installation
export JAVA_HOME=/usr/java/latest
# assuming your installation directory is /usr/local/hadoop
export HADOOP_PREFIX=/usr/local/hadoop
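Before editing hadoop-env.sh, it can help to confirm the two directories actually exist on your machine. A minimal sketch (/usr/java/latest and /usr/local/hadoop are the example paths assumed above; adjust them to your layout):

```shell
# check that the assumed Java and Hadoop install directories exist
# (/usr/java/latest and /usr/local/hadoop are the example paths from the text)
for d in /usr/java/latest /usr/local/hadoop; do
  if [ -d "$d" ]; then echo "$d: ok"; else echo "$d: missing"; fi
done
```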
Configure the HDFS port and replication count
etc/hadoop/core-site.xml:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <!-- used when ClientDatanodeProtocol calls getBlockLocalPathInfo -->
    <name>dfs.block.local-path-access.user</name>
    <value>infomorrow</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/infomorrow/hadoop-tmp</value>
  </property>
</configuration>
etc/hadoop/hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
Configure YARN
etc/hadoop/mapred-site.xml:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
etc/hadoop/yarn-site.xml:
When the NodeManager starts, it loads the shuffle service, a Jetty/Netty-based server. Reduce tasks use this service to remotely copy the intermediate results generated by map tasks from each NodeManager.
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
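As a rough sanity check over the four files edited above, the following sketch greps each file for its key property name (run from your Hadoop installation directory; it only checks that the property names are present, it does not validate the XML):

```shell
# grep each config file for the property it is expected to set
for pair in core-site.xml:fs.defaultFS hdfs-site.xml:dfs.replication \
            mapred-site.xml:mapreduce.framework.name yarn-site.xml:yarn.nodemanager.aux-services; do
  f=etc/hadoop/${pair%%:*}; p=${pair#*:}
  grep -q "<name>$p</name>" "$f" 2>/dev/null && echo "$f: $p ok" || echo "$f: $p missing"
done
```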
Startup Process:
HDFS
$ bin/hdfs namenode -format (only needed on first use)
$ sbin/start-dfs.sh
View the monitoring page at http://localhost:50070/
Create a folder on HDFS:
$ bin/hdfs dfs -mkdir /user
$ bin/hdfs dfs -mkdir /user/<username>
List the folders created on HDFS:
$ bin/hadoop fs -ls /
YARN
$ sbin/start-yarn.sh
View the monitoring page at http://localhost:8088/
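With both HDFS and YARN running, a quick end-to-end check is to run one of the bundled MapReduce examples, following the standard single-node walkthrough (a sketch; the relative path `input` resolves to /user/<username>/input, and the version in the jar filename must match the exact release installed):

```shell
$ bin/hdfs dfs -put etc/hadoop input
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0.jar grep input output 'dfs[a-z.]+'
$ bin/hdfs dfs -cat output/*
```

If the job succeeds, the last command prints the matched property names with their counts.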
Shutdown:
$ sbin/stop-dfs.sh
$ sbin/stop-yarn.sh
Hadoop 2.5 Pseudo-Distributed Installation