Centos6.4 install hadoop-2.5.1 (fully distributed)


Environment Introduction:

Install a fully distributed Hadoop-2.5.1 cluster on two servers running CentOS 6.4 (32-bit) (only 2 machines, just for trial use, haha).

1. Modify the hostname and the /etc/hosts file

1) Modify the hostname (optional)

vi /etc/sysconfig/network
HOSTNAME=XXX

The change takes effect after a reboot.

2) /etc/hosts maps IP addresses to host names, so that each machine knows the mapping between the IP address and the hostname. The format is as follows:

#IPAddress     HostName
192.168.1.67   MasterServer
192.168.1.241  SlaveServer

2. Configure passwordless SSH login

1) generate a key:

ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
Note that the '' above is two single quotes, i.e. an empty passphrase.

2) Append id_dsa.pub (the public key) to the authorized keys file:

cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

3) copy the authentication file to another node:

scp ~/.ssh/authorized_keys hadooper@192.168.1.241:~/.ssh/

4) test:

ssh SlaveServer 

Enter yes to confirm the connection the first time.

If you are still prompted for a password, the permissions on ~/.ssh and authorized_keys are probably incorrect; see: http://blog.csdn.net/hwwn2009/article/details/39852457
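A common cause, sketched below under the assumption that you are logged in as hadooper on the node that received the key: sshd refuses to honor the key until the directory and file permissions are tightened.

```shell
# Probable fix (not from the original article): sshd silently ignores
# authorized_keys when ~/.ssh or the file itself is group/world writable.
mkdir -p ~/.ssh
touch ~/.ssh/authorized_keys
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
```

Run this on every node that receives the key, then retry ssh SlaveServer.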

3. Install jdk on each node
1) The selected version is jdk-6u27-linux-i586.bin, download: http://pan.baidu.com/s/1mgICcFA
2) Upload the file to the hadooper user's home directory and add execute permission:

chmod 777 jdk-6u27-linux-i586.bin
3) Installation
./jdk-6u27-linux-i586.bin
4) Configure environment variables: edit /etc/profile with vi and append the following three lines:

#JAVA_HOME
export JAVA_HOME=/usr/lib/jvm/jdk1.6/jdk1.6.0_27
export PATH=$JAVA_HOME/bin:$PATH

5) Run source /etc/profile to make the environment variable changes take effect.
6) Run java -version to check that the JDK is installed successfully.

4. Hadoop Installation

Hadoop must be installed on each node. Upload hadoop-2.5.1.tar.gz to the hadooper user's home directory.

1) decompress
tar -zvxf hadoop-2.5.1.tar.gz
2) Add environment variables: vi /etc/profile and append the following at the end:

export HADOOP_HOME=/home/hadooper/hadoop/hadoop-2.5.1
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export CLASSPATH=.:$JAVA_HOME/lib:$HADOOP_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
The setting takes effect immediately:
source /etc/profile
3) modify the Hadoop configuration file

(1) core-site.xml

<property>
    <name>fs.defaultFS</name>
    <value>hdfs://MasterServer:9000</value>
</property>
(2) hdfs-site.xml
<property>
    <name>dfs.replication</name>
    <value>3</value>
</property>
(3) mapred-site.xml
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
<property>
    <name>mapreduce.jobhistory.address</name>
    <value>MasterServer:10020</value>
</property>
<property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>MasterServer:19888</value>
</property>

The JobHistory Server is Hadoop's built-in history server, which records completed MapReduce jobs. It is not started by default; start it with the following command:

 sbin/mr-jobhistory-daemon.sh start historyserver
(4) yarn-site.xml
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<property>
    <name>yarn.resourcemanager.address</name>
    <value>MasterServer:8032</value>
</property>
<property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>MasterServer:8030</value>
</property>
<property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>MasterServer:8031</value>
</property>
<property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>MasterServer:8033</value>
</property>
<property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>MasterServer:8088</value>
</property>
(5) slaves
SlaveServer
(6) add JAVA_HOME in hadoop-env.sh and yarn-env.sh respectively
export JAVA_HOME=/usr/lib/jvm/jdk1.6/jdk1.6.0_27
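Since the identical line goes into both files, the edit can also be scripted. A minimal sketch, assuming HADOOP_CONF_DIR is set as in /etc/profile above (it falls back to the current directory so the snippet can be tried anywhere), using the JDK path from earlier in this article:

```shell
# Append the JAVA_HOME export to both env scripts in one pass.
# CONF_DIR falls back to "." when HADOOP_CONF_DIR is unset.
CONF_DIR="${HADOOP_CONF_DIR:-.}"
for f in hadoop-env.sh yarn-env.sh; do
  echo 'export JAVA_HOME=/usr/lib/jvm/jdk1.6/jdk1.6.0_27' >> "$CONF_DIR/$f"
done
```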

5. Run Hadoop

1) Format the NameNode

hdfs namenode -format
2) Start Hadoop
start-dfs.sh
start-yarn.sh
You can also run the following command:
start-all.sh
3) Stop Hadoop
stop-all.sh
4) View processes with jps
7692 ResourceManager
8428 JobHistoryServer
7348 NameNode
14874 Jps
7539 SecondaryNameNode
5) View the cluster's running status in a browser

(1) http://192.168.1.67:50070

(2) http://192.168.1.67:8088/

(3) http://192.168.1.67:19888

6. Run the wordcount example provided by Hadoop.

1) Create an input file:

echo "My first hadoop example. Hello Hadoop in input. " > input
2) create a directory
hadoop fs -mkdir -p /user/hadooper
3) upload files
hadoop fs -put input /user/hadooper
4) execute the wordcount Program
 hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar wordcount /user/hadooper/input /user/hadooper/output
5) view results
hadoop fs -cat /user/hadooper/output/part-r-00000
Hadoop	1
Hello	1
My	1
example.	1
first	1
hadoop	1
in	1
input.	1
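As a sanity check, the same tally can be reproduced locally with plain shell tools (this reads the local input file, not the copy in HDFS, so it is only a cross-check of the MapReduce result):

```shell
# Split on spaces, drop empty tokens, and count unique words --
# the same tally the wordcount job produced above.
echo "My first hadoop example. Hello Hadoop in input. " > input
tr -s ' ' '\n' < input | sed '/^$/d' | sort | uniq -c
```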
Reprint notice: please credit http://blog.csdn.net/hwwn2009/article/details/39889465
