Run Hadoop 2.4.0 in a single-node configuration on Ubuntu


If the hosts file has not been modified, follow the earlier steps to modify it.

If Java has not been installed, follow the earlier steps to install and configure it.

(1) Add a user and set up an SSH public key for passwordless login:

sudo addgroup hadoop

sudo adduser --ingroup hadoop hduser

su - hduser

(If no key pair exists yet, generate one first with an empty passphrase:)

ssh-keygen -t rsa -P ""

cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys

ssh localhost

exit

(2) Copy the compiled Hadoop to the /usr/local directory and change the directory ownership.

cp -r /root/hadoop-2.4.0-src/hadoop-dist/target/hadoop-2.4.0 /usr/local

cd /usr/local

chown -R hduser:hadoop hadoop-2.4.0

(3) Disable IPv6

su

vi /etc/sysctl.conf

Add:

net.ipv6.conf.all.disable_ipv6 = 1

net.ipv6.conf.default.disable_ipv6 = 1

net.ipv6.conf.lo.disable_ipv6 = 1

Restart:

reboot

Test:

cat /proc/sys/net/ipv6/conf/all/disable_ipv6

An output of 1 indicates that IPv6 is disabled.
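Alternatively (an optional shortcut not in the original steps), the new settings can usually be applied without a reboot by reloading /etc/sysctl.conf:

sudo sysctl -p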

(4) Modify the startup configuration file ~/.bashrc

su hduser

vi ~/.bashrc

Add the following code:

JAVA_HOME=/usr/lib/jvm/jdk1.7.0_55
JRE_HOME=${JAVA_HOME}/jre
export ANDROID_JAVA_HOME=$JAVA_HOME
export CLASSPATH=.:${JAVA_HOME}/lib:$JRE_HOME/lib:${JAVA_HOME}/lib/tools.jar:$CLASSPATH
export JAVA_PATH=${JAVA_HOME}/bin:${JRE_HOME}/bin
export JAVA_HOME
export JRE_HOME
export CLASSPATH
HOME_BIN=~/bin/
export PATH=${PATH}:${JAVA_PATH}:${HOME_BIN}
export PATH=${JAVA_HOME}/bin:$PATH

export HADOOP_HOME=/usr/local/hadoop-2.4.0

unalias fs &> /dev/null
alias fs="hadoop fs"
unalias hls &> /dev/null
alias hls="fs -ls"

lzohead() {
    hadoop fs -cat $1 | lzop -dc | head -1000 | less
}

export PATH=$PATH:$HADOOP_HOME/bin
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
# export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop

Make the modification take effect:

source ~/.bashrc
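As an optional sanity check (not in the original write-up; assumes the exports above took effect), open a new shell and verify the environment:

echo $HADOOP_HOME

java -version

hadoop version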

 

(5) Create the DataNode and NameNode directories in the Hadoop directory.

mkdir -p $HADOOP_HOME/yarn/yarn_data/hdfs/namenode
mkdir -p $HADOOP_HOME/yarn/yarn_data/hdfs/datanode

 

(6) Modify the Hadoop configuration parameters

For convenience, you can cd $HADOOP_CONF_DIR; the commands below are run from $HADOOP_HOME:

vi etc/hadoop/hadoop-env.sh

Add the JAVA_HOME variable:

export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_55

vi etc/hadoop/yarn-site.xml

Add the following properties:

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>

Create hadoop.tmp.dir:

sudo mkdir -p /app/hadoop/tmp

(If this fails with the error "hduser is not in the sudoers file. This incident will be reported.", then:

su

vi /etc/sudoers   (or, more safely, visudo)

Add the line: hduser ALL=(ALL) ALL

)

# sudo chown hduser:hadoop /app/hadoop/tmp

sudo chown -R hduser:hadoop /app

sudo chmod 750 /app/hadoop/tmp
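Optionally, verify the ownership and mode of the new directory (an illustrative check, not in the original steps):

ls -ld /app/hadoop/tmp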

cd $HADOOP_HOME

vi etc/hadoop/core-site.xml

<property>
  <name>hadoop.tmp.dir</name>
  <value>/app/hadoop/tmp</value>
  <description>A base for other temporary directories.</description>
</property>
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:9000</value>
</property>

 

vi etc/hadoop/hdfs-site.xml

<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/usr/local/hadoop-2.4.0/yarn/yarn_data/hdfs/namenode</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/usr/local/hadoop-2.4.0/yarn/yarn_data/hdfs/datanode</value>
</property>

 

vi etc/hadoop/mapred-site.xml

(If mapred-site.xml does not exist, create it by copying etc/hadoop/mapred-site.xml.template.)

<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>

That completes the configuration.

 

(7) Format the NameNode:

bin/hadoop namenode -format
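If the format succeeds, the NameNode directory configured above should now contain a current/ subdirectory (an optional quick check):

ls $HADOOP_HOME/yarn/yarn_data/hdfs/namenode/current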

 

(8) Start the Hadoop daemons:

sbin/hadoop-daemon.sh start namenode

sbin/hadoop-daemon.sh start datanode

sbin/hadoop-daemon.sh start secondarynamenode

sbin/yarn-daemon.sh start resourcemanager

sbin/yarn-daemon.sh start nodemanager

sbin/mr-jobhistory-daemon.sh start historyserver
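As a shortcut (not used in the original write-up), Hadoop 2.x also ships aggregate scripts that start the same HDFS and YARN daemons in one go:

sbin/start-dfs.sh

sbin/start-yarn.sh

sbin/mr-jobhistory-daemon.sh start historyserver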

 

(9) Monitor the running daemons:

jps

netstat -ntlp

http://localhost:50070/ for the NameNode

http://localhost:8088/cluster for the ResourceManager

http://localhost:19888/jobhistory for the Job History Server
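If everything started correctly, jps should list roughly the following processes (the PIDs below are only illustrative):

2345 NameNode
2399 DataNode
2456 SecondaryNameNode
2503 ResourceManager
2561 NodeManager
2620 JobHistoryServer
2688 Jps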

 

(10) Error handling:

Log file storage directory:

cd $HADOOP_HOME/logs

Or view the logs through the NameNode web page:

http://192.168.85.136:50070/logs/hadoop-hduser-datanode-ubuntu.log

 

1. Error:

After the DataNode is started, its process disappears from the jps output. View the log at the following page:

http://192.168.85.136:50070/logs/hadoop-hduser-datanode-ubuntu.log

The error message is as follows:

03:03:41,446 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid unassigned) service to localhost/127.0.0.1:9000

Cause: ./bin/hadoop namenode -format generates a new NameNode cluster ID, while the directory holding the DataNode data (tmp/dfs/data by default, or the configured dfs.datanode.data.dir) still contains the ID from the previous format. Formatting clears the NameNode data but does not clear the DataNode data, so the DataNode fails to start. The fix is to clear the DataNode data directories before each format.

Reference: http://stackoverflow.com/questions/22316187/datanode-not-starts-correctly

Solution:

rm -rf /usr/local/hadoop-2.4.0/yarn/yarn_data/hdfs/*

./bin/hadoop namenode -format

 

2. Warning:

14/07/03 06:13:25 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Debugging:

export HADOOP_ROOT_LOGGER=DEBUG,console

hadoop fs -text /test/data/origz/access.log.gz

Solution:

cp /usr/local/hadoop-2.4.0/lib/native/* /usr/local/hadoop-2.4.0/lib/
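To confirm the native library files are now in the directory referenced by java.library.path (an optional check; exact file names depend on the build):

ls /usr/local/hadoop-2.4.0/lib/libhadoop*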

 

(11) Create a text file and put it into HDFS:

mkdir in

vi in/file

Hadoop is fast

Hadoop is cool

bin/hadoop dfs -copyFromLocal in/ /in
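To confirm the upload (optional; the paths follow the example above):

bin/hadoop fs -ls /in

bin/hadoop fs -cat /in/file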

 

(12) Run the wordcount sample program:

bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.0.jar wordcount /in /out

 

(13) View the running result:

bin/hadoop fs -ls /out

bin/hadoop dfs -cat /out/part-r-00000
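For the two-line sample file created in step (11), the counts should look roughly like this (illustrative; wordcount is case-sensitive and the output is tab-separated):

Hadoop	2
cool	1
fast	1
is	2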

Alternatively, you can view the result through the NameNode web UI:

http://localhost:50070/dfshealth.jsp

 

(14) Stop the daemons:

sbin/hadoop-daemon.sh stop namenode

sbin/hadoop-daemon.sh stop datanode

sbin/hadoop-daemon.sh stop secondarynamenode

sbin/yarn-daemon.sh stop resourcemanager

sbin/yarn-daemon.sh stop nodemanager

sbin/mr-jobhistory-daemon.sh stop historyserver

 

This article references two very good blog articles, which are listed below for reference:

http://www.thecloudavenue.com/2012/01/getting-started-with-nextgen-mapreduce.html

http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
