If the hosts file has not been modified, follow the previous steps to modify it.
If Java has not been installed, follow the previous configuration steps.
(1) Add a user and set up a public key:
sudo addgroup hadoop
sudo adduser --ingroup hadoop hduser
su - hduser
Generate an RSA key first if one does not already exist:
ssh-keygen -t rsa -P ""
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
ssh localhost
exit
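To confirm that key-based login actually works (a quick optional check, not part of the original steps), force SSH to skip password prompts; if it prints ok, the key is set up correctly:
ssh -o BatchMode=yes localhost echo ok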
(2) Copy the compiled Hadoop to the /usr/local directory and change the directory ownership:
cp -r /root/hadoop-2.4.0-src/hadoop-dist/target/hadoop-2.4.0 /usr/local
cd /usr/local
chown -R hduser:hadoop hadoop-2.4.0
(3) Disable IPv6:
su
vi /etc/sysctl.conf
Add:
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
Restart:
reboot
Test:
cat /proc/sys/net/ipv6/conf/all/disable_ipv6
An output of 1 indicates that IPv6 is disabled.
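If you prefer not to reboot, the new settings can usually be applied in place by reloading /etc/sysctl.conf (run as root), then re-running the cat test above:
sysctl -p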
(4) Modify the startup configuration file ~/.bashrc:
su hduser
vi ~/.bashrc
Add the following code:
JAVA_HOME=/usr/lib/jvm/jdk1.7.0_55
JRE_HOME=${JAVA_HOME}/jre
export ANDROID_JAVA_HOME=$JAVA_HOME
export CLASSPATH=.:${JAVA_HOME}/lib:$JRE_HOME/lib:${JAVA_HOME}/lib/tools.jar:$CLASSPATH
export JAVA_PATH=${JAVA_HOME}/bin:${JRE_HOME}/bin
export JAVA_HOME;
export JRE_HOME;
export CLASSPATH;
HOME_BIN=~/bin/
export PATH=${PATH}:${JAVA_PATH}:${HOME_BIN};
export PATH=${JAVA_HOME}/bin:$PATH
export HADOOP_HOME=/usr/local/hadoop-2.4.0
unalias fs &> /dev/null
alias fs="hadoop fs"
unalias hls &> /dev/null
alias hls="fs -ls"
lzohead() {
    hadoop fs -cat $1 | lzop -dc | head -1000 | less
}
export PATH=$PATH:$HADOOP_HOME/bin
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
# export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
Make the changes take effect:
source ~/.bashrc
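A quick sanity check (optional, assuming the paths above match your installation) that the variables took effect:
echo $JAVA_HOME
echo $HADOOP_HOME
hadoop version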
(5) Create the datanode and namenode directories under the Hadoop directory:
mkdir -p $HADOOP_HOME/yarn/yarn_data/hdfs/namenode
mkdir -p $HADOOP_HOME/yarn/yarn_data/hdfs/datanode
(6) Modify the Hadoop configuration parameters
For convenience, you can cd $HADOOP_CONF_DIR first; the commands below are run from $HADOOP_HOME:
vi etc/hadoop/hadoop-env.sh
Add the JAVA_HOME variable:
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_55
vi etc/hadoop/yarn-site.xml
Add the following:
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
Create hadoop.tmp.dir:
sudo mkdir -p /app/hadoop/tmp
(If the error "hduser is not in the sudoers file. This incident will be reported." occurs:
su
vi /etc/sudoers   (preferably with visudo)
Add: hduser ALL=(ALL) ALL
)
# sudo chown hduser:hadoop /app/hadoop/tmp
sudo chown -R hduser:hadoop /app
sudo chmod 750 /app/hadoop/tmp
cd $HADOOP_HOME
vi etc/hadoop/core-site.xml
<property>
  <name>hadoop.tmp.dir</name>
  <value>/app/hadoop/tmp</value>
  <description>A base for other temporary directories.</description>
</property>
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:9000</value>
</property>
vi etc/hadoop/hdfs-site.xml
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/usr/local/hadoop-2.4.0/yarn/yarn_data/hdfs/namenode</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/usr/local/hadoop-2.4.0/yarn/yarn_data/hdfs/datanode</value>
</property>
vi etc/hadoop/mapred-site.xml
(In 2.4.0 this file may exist only as mapred-site.xml.template; copy it to mapred-site.xml first.)
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
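Before moving on, it can be worth checking that each edited file is still well-formed XML (assuming the <configuration> root element is intact). A sketch using xmllint, which on Ubuntu comes from the libxml2-utils package (an assumption about your setup); run it from $HADOOP_HOME:
for f in core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml; do xmllint --noout etc/hadoop/$f && echo "$f OK"; done
Note that xmllint only checks well-formedness; it does not validate the property names.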
(7) Format the namenode:
bin/hadoop namenode -format
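If the format succeeds, the output should end with a line similar to the following (the path comes from the dfs.namenode.name.dir configured above):
INFO common.Storage: Storage directory /usr/local/hadoop-2.4.0/yarn/yarn_data/hdfs/namenode has been successfully formatted.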
(8) Start the Hadoop daemons:
sbin/hadoop-daemon.sh start namenode
sbin/hadoop-daemon.sh start datanode
sbin/hadoop-daemon.sh start secondarynamenode
sbin/yarn-daemon.sh start resourcemanager
sbin/yarn-daemon.sh start nodemanager
sbin/mr-jobhistory-daemon.sh start historyserver
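Hadoop 2.4.0 also ships wrapper scripts that start the same HDFS and YARN daemons in one step each, as an alternative to the individual commands above (the history server is still started with the last command):
sbin/start-dfs.sh
sbin/start-yarn.sh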
(9) Monitor the running processes:
jps
netstat -ntlp
http://localhost:50070/ for the NameNode
http://localhost:8088/cluster for the ResourceManager
http://localhost:19888/jobhistory for the Job History Server
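If you want to script these checks, curl (assuming it is installed) can probe the web UIs; a 200 status code means the page is up:
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:50070/
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8088/cluster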
(10) Error handling:
Log file directory:
cd $HADOOP_HOME/logs
Or view the logs from the NameNode web page:
http://192.168.85.136:50070/logs/hadoop-hduser-datanode-ubuntu.log
1. Error:
After the datanode is started, its jps process disappears. View the log at the following page:
http://192.168.85.136:50070/logs/hadoop-hduser-datanode-ubuntu.log
The error message is as follows:
03:03:41,446 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid unassigned) service to localhost/127.0.0.1:9000
Cause: ./bin/hadoop namenode -format generates a new namespace/cluster ID each time, while the tmp/dfs/data directory that stores the datanode data still holds the ID from the previous format. namenode -format clears the namenode data but does not clear the datanode data, so the datanode fails to start. The fix is to clear those directories before each format.
Reference: http://stackoverflow.com/questions/22316187/datanode-not-starts-correctly
Solution:
rm -rf /usr/local/hadoop-2.4.0/yarn/yarn_data/hdfs/*
./bin/hadoop namenode -format
2. Warning debugging:
14/07/03 06:13:25 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Debugging:
export HADOOP_ROOT_LOGGER=DEBUG,console
hadoop fs -text /test/data/origz/access.log.gz
Solution:
cp /usr/local/hadoop-2.4.0/lib/native/* /usr/local/hadoop-2.4.0/lib/
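This warning often just means the bundled native library does not match your platform (for example, a 32-bit libhadoop.so on a 64-bit OS). A quick way to check for that mismatch, assuming the standard library file name:
file $HADOOP_HOME/lib/native/libhadoop.so.1.0.0
uname -m
If file reports a 32-bit binary but uname -m prints x86_64, the native library must be rebuilt for your platform.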
(11) Create a text file and put it into HDFS:
mkdir in
vi in/file
Hadoop is fast
Hadoop is cool
bin/hadoop dfs -copyFromLocal in/ /in
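To verify that the upload worked (using the file created above):
bin/hadoop fs -ls /in
bin/hadoop fs -cat /in/file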
(12) Run the wordcount sample program:
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.0.jar wordcount /in /out
(13) View the running results:
bin/hadoop fs -ls /out
bin/hadoop dfs -cat /out/part-r-00000
Alternatively, you can check on the NameNode web page:
http://localhost:50070/dfshealth.jsp
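With the two-line input file above, the cat command should print counts like:
Hadoop  2
cool    1
fast    1
is      2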
(14) Stop the daemons:
sbin/hadoop-daemon.sh stop namenode
sbin/hadoop-daemon.sh stop datanode
sbin/hadoop-daemon.sh stop secondarynamenode
sbin/yarn-daemon.sh stop resourcemanager
sbin/yarn-daemon.sh stop nodemanager
sbin/mr-jobhistory-daemon.sh stop historyserver
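As with startup, the wrapper scripts can stop the HDFS and YARN daemons in one step each; the history server is still stopped with the last command above:
sbin/stop-yarn.sh
sbin/stop-dfs.sh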
This article references two very good blog posts, which are listed below for reference:
http://www.thecloudavenue.com/2012/01/getting-started-with-nextgen-mapreduce.html
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/