Centos6.4 install hadoop-2.5.1 (fully distributed)


Environment Introduction:

Install a fully distributed Hadoop-2.5.1 cluster on two servers running CentOS 6.4 (32-bit) (only 2 machines, just for trial use, haha).

1. Modify the hostname and the /etc/hosts file

1) Modify the hostname (optional)

vi /etc/sysconfig/network
HOSTNAME=XXX

The change takes effect after a reboot.

2) /etc/hosts maps IP addresses to host names, so that each machine knows the mapping between the IP address and the hostname. The format is as follows:

#IPAddress     HostName
192.168.1.67   MasterServer
192.168.1.241  SlaveServer

2. Configure passwordless SSH login

1) generate a key:

ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
Note that the '' above is two single quotes, i.e. an empty passphrase.

2) Append id_dsa.pub (the public key) to the authorized keys file:

cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

3) copy the authentication file to another node:

scp ~/.ssh/authorized_keys hadooper@192.168.1.241:~/.ssh/

4) test:

ssh SlaveServer 

Enter yes to confirm the connection the first time.

If you are still prompted for a password, the permissions on ~/.ssh and authorized_keys are probably incorrect; see: http://blog.csdn.net/hwwn2009/article/details/39852457
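A common cause, sketched below under the assumption that you are logged in as hadooper on the node that received the key: sshd refuses to honor the key until the directory and file permissions are tightened.

```shell
# Probable fix (not from the original article): sshd silently ignores
# authorized_keys when ~/.ssh or the file itself is group/world writable.
mkdir -p ~/.ssh
touch ~/.ssh/authorized_keys
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
```

Run this on every node that receives the key, then retry ssh SlaveServer.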

3. Install jdk on each node
1) The selected version is jdk-6u27-linux-i586.bin, download: http://pan.baidu.com/s/1mgICcFA
2) Upload the file to the hadooper user's home directory and add execute permission:

chmod 777 jdk-6u27-linux-i586.bin
3) Installation
./jdk-6u27-linux-i586.bin
4) Configure environment variables: edit /etc/profile with vi and append the following three lines:

#JAVA_HOME
export JAVA_HOME=/usr/lib/jvm/jdk1.6/jdk1.6.0_27
export PATH=$JAVA_HOME/bin:$PATH

5) Run source /etc/profile to make the environment variable changes take effect.
6) Run java -version to check that the JDK is installed successfully.

4. Hadoop Installation

Hadoop must be installed on each node. Upload hadoop-2.5.1.tar.gz to the hadooper user's home directory.

1) decompress
tar -zvxf hadoop-2.5.1.tar.gz
2) Add environment variables: vi /etc/profile and append the following at the end:

export HADOOP_HOME=/home/hadooper/hadoop/hadoop-2.5.1
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export CLASSPATH=.:$JAVA_HOME/lib:$HADOOP_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
The setting takes effect immediately:
source /etc/profile
3) modify the Hadoop configuration file

(1) core-site.xml

<property>
    <name>fs.defaultFS</name>
    <value>hdfs://MasterServer:9000</value>
</property>
(2) hdfs-site.xml
<property>
    <name>dfs.replication</name>
    <value>3</value>
</property>
(3) mapred-site.xml
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
<property>
    <name>mapreduce.jobhistory.address</name>
    <value>MasterServer:10020</value>
</property>
<property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>MasterServer:19888</value>
</property>

The JobHistory Server is Hadoop's built-in history server, which records completed MapReduce jobs. It is not started by default; start it with the following command:

 sbin/mr-jobhistory-daemon.sh start historyserver
(4) yarn-site.xml
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<property>
    <name>yarn.resourcemanager.address</name>
    <value>MasterServer:8032</value>
</property>
<property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>MasterServer:8030</value>
</property>
<property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>MasterServer:8031</value>
</property>
<property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>MasterServer:8033</value>
</property>
<property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>MasterServer:8088</value>
</property>
(5) slaves
SlaveServer
(6) add JAVA_HOME in hadoop-env.sh and yarn-env.sh respectively
export JAVA_HOME=/usr/lib/jvm/jdk1.6/jdk1.6.0_27
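Since the identical line goes into both files, the edit can also be scripted. A minimal sketch, assuming HADOOP_CONF_DIR is set as in /etc/profile above (it falls back to the current directory so the snippet can be tried anywhere), using the JDK path from earlier in this article:

```shell
# Append the JAVA_HOME export to both env scripts in one pass.
# CONF_DIR falls back to "." when HADOOP_CONF_DIR is unset.
CONF_DIR="${HADOOP_CONF_DIR:-.}"
for f in hadoop-env.sh yarn-env.sh; do
  echo 'export JAVA_HOME=/usr/lib/jvm/jdk1.6/jdk1.6.0_27' >> "$CONF_DIR/$f"
done
```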

5. Run Hadoop

1) Format the NameNode

hdfs namenode -format
2) Start Hadoop
start-dfs.sh
start-yarn.sh
You can also run the following command:
start-all.sh
3) Stop Hadoop
stop-all.sh
4) View processes with jps
7692 ResourceManager
8428 JobHistoryServer
7348 NameNode
14874 Jps
7539 SecondaryNameNode
5) View the cluster's running status in a browser

(1) http://192.168.1.67:50070

(2) http://192.168.1.67:8088/

(3) http://192.168.1.67:19888

6. Run the wordcount example provided by Hadoop.

1) Create an input file:

echo "My first hadoop example. Hello Hadoop in input. " > input
2) create a directory
hadoop fs -mkdir -p /user/hadooper
3) upload files
hadoop fs -put input /user/hadooper
4) execute the wordcount Program
 hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar wordcount /user/hadooper/input /user/hadooper/output
5) view results
hadoop fs -cat /user/hadooper/output/part-r-00000
Hadoop	1
Hello	1
My	1
example.	1
first	1
hadoop	1
in	1
input.	1
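As a sanity check, the same tally can be reproduced locally with plain shell tools (this reads the local input file, not the copy in HDFS, so it is only a cross-check of the MapReduce result):

```shell
# Split on spaces, drop empty tokens, and count unique words --
# the same tally the wordcount job produced above.
echo "My first hadoop example. Hello Hadoop in input. " > input
tr -s ' ' '\n' < input | sed '/^$/d' | sort | uniq -c
```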
Reprint notice: please credit http://blog.csdn.net/hwwn2009/article/details/39889465
