This article covers only the installation of Hadoop Common, Hadoop HDFS, Hadoop MapReduce, and Hadoop YARN; it does not cover HBase, Hive, or Pig.
http://blog.csdn.net/aquester/article/details/24621005
1. Planning
1.1. List of machines
NameNode | SecondaryNameNode | DataNodes
172.16.0.100 | 172.16.0.101 | 172.16.0.110
| | 172.16.0.111
| | 172.16.0.112
1.2. Host names
Machine IP | Host Name
172.16.0.100 | NameNode
172.16.0.101 | SecondaryNameNode
172.16.0.110 | DataNode110
172.16.0.111 | DataNode111
172.16.0.112 | DataNode112
2. Setting the IP and host name
# rm -rf /etc/udev/rules.d/*.rules
# vi /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
TYPE=Ethernet
ONBOOT=yes
NM_CONTROLLED=yes
BOOTPROTO=static
NETMASK=255.255.0.0
GATEWAY=192.168.0.6
IPADDR=192.168.1.20
DNS1=192.168.0.3
DNS2=192.168.0.6
DEFROUTE=yes
PEERDNS=yes
PEERROUTES=yes
IPV4_FAILURE_FATAL=yes
IPV6INIT=no
NAME="System eth0"
# vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=namenode.smartmap
# vi /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
192.168.1.20 NameNode namenode.smartmap
192.168.1.50 SecondaryNameNode secondarynamenode.smartmap
192.168.1.70 DataNode110 datanode110.smartmap
192.168.1.90 DataNode111 datanode111.smartmap
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
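To confirm that name resolution works as configured, a quick check on each node (a minimal sketch; host names as listed in /etc/hosts above) is:
# getent hosts NameNode SecondaryNameNode DataNode110 DataNode111
# ping -c 1 DataNode110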
3. Password-free login
3.1. Password-free login scope
Password-free SSH login must work using both the IP addresses and the host names for the following:
1) NameNode can log in to all DataNodes without a password
2) SecondaryNameNode can log in to all DataNodes without a password
3) NameNode can log in to itself without a password
4) SecondaryNameNode can log in to itself without a password
5) NameNode can log in to SecondaryNameNode without a password
6) SecondaryNameNode can log in to NameNode without a password
7) Each DataNode can log in to itself without a password
8) DataNodes do not need password-free login to the NameNode, the SecondaryNameNode, or other DataNodes.
3.2. Software installation
# yum install openssh-clients (executed on the NameNode, the SecondaryNameNode, and all DataNodes)
# yum install wget
3.3. SSH configuration
Vi/etc/ssh/sshd_config (NameNode, Secondarynamenode and other datanode are performed)
Rsaauthentication Yes
Pubkeyauthentication Yes
Authorizedkeysfile. Ssh/authorized_keys
Service sshd Restart
3.4. SSH passwordless configuration
# ssh-keygen -t rsa (executed on the NameNode, the SecondaryNameNode, and all DataNodes)
If the key fingerprint and randomart image are printed, key generation succeeded and two new files appear in the ~/.ssh directory:
Private key file: id_rsa
Public key file: id_rsa.pub
Append the contents of the public key file id_rsa.pub to the authorized_keys file:
# cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys (executed on the NameNode and the SecondaryNameNode)
Distribute the authorized_keys file to each DataNode node:
(executed on the NameNode and the SecondaryNameNode)
# scp authorized_keys [email protected]:/root/.ssh/
# scp authorized_keys [email protected]:/root/.ssh/
# scp authorized_keys [email protected]:/root/.ssh/
# scp authorized_keys [email protected]:/root/.ssh/
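Note that sshd refuses key-based login when the permissions on ~/.ssh are too open. If password-free login still prompts for a password after the key has been distributed, the standard OpenSSH permission fix below is worth applying on every node:
# chmod 700 ~/.ssh
# chmod 600 ~/.ssh/authorized_keys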
3.5. Verifying password-free SSH login
Verify that SSH login works without a password:
(executed on the NameNode)
# ssh [email protected]
# ssh [email protected]
# ssh [email protected]
# ssh [email protected]
# ssh [email protected]
(executed on the SecondaryNameNode)
# ssh [email protected]
# ssh [email protected]
# ssh [email protected]
# ssh [email protected]
# ssh [email protected]
(executed on DataNode110)
# ssh [email protected]
(executed on DataNode111)
# ssh [email protected]
(executed on DataNode112)
# ssh [email protected]
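The same checks can be run non-interactively with a short loop such as the one below (a sketch; the host list is an assumption and should match the actual cluster). With BatchMode enabled, ssh fails instead of prompting for a password, so any broken key setup is reported immediately:
# for h in NameNode SecondaryNameNode DataNode110 DataNode111 DataNode112; do ssh -o BatchMode=yes root@$h hostname || echo "password-free login to $h failed"; done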
4. JDK installation and environment variable configuration
(Performed on the NameNode, the SecondaryNameNode, and all DataNodes)
4.1. JDK download
Download jdk-7u72-linux-x64.tar.gz and copy it to each node:
# scp jdk-7u72-linux-x64.tar.gz [email protected]:/opt/
4.2. Uninstalling the system's own open-source JDK
# rpm -qa | grep java
# rpm -e java (use the package name reported by the previous command)
4.3. Copying the installation file to the target directory
For example, under the /opt/java directory.
4.4. Extracting the file
# tar -xzvf jdk-7u72-linux-x64.tar.gz
After decompression, a new directory jdk1.7.0_72 is created under /opt/java, containing the extracted files.
At this point the installation is basically done; what follows are the environment variable settings.
Note: If the file you downloaded is in RPM format, you can install it with the following command:
# rpm -ivh jdk-7u72-linux-x64.rpm
4.5. Environment variable settings
Modify the /etc/profile file (this is recommended so that other programs can also use the JDK):
# vi /etc/profile
Locate the line export PATH USER LOGNAME MAIL HOSTNAME HISTSIZE INPUTRC in the file, and change it to the following form:
export JAVA_HOME=/opt/java/jdk1.7.0_72
export PATH=$PATH:$JAVA_HOME/bin:$JAVA_HOME/jre/bin
export CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
4.6. Making the environment variables take effect
Execute the configuration file to make it take effect immediately:
# source /etc/profile
Then execute the following command to verify that the installation was successful:
# java -version
If the following message appears, the installation was successful:
java version "1.7.0_72"
Java(TM) SE Runtime Environment (build 1.7.0_72-b14)
Java HotSpot(TM) 64-Bit Server VM (build 24.72-b04, mixed mode)
5. Hadoop installation and configuration
5.1. Hadoop download
# wget http://mirrors.hust.edu.cn/apache/hadoop/common/stable/hadoop-2.5.1.tar.gz
# scp hadoop-2.5.1.tar.gz [email protected]:/opt/
5.2. Extracting the file
# tar -zxvf hadoop-2.5.1.tar.gz
5.3. Configuration
# cd /opt/hadoop/hadoop-2.5.1/etc/hadoop
# cp /opt/hadoop/hadoop-2.5.1/share/doc/hadoop/hadoop-project-dist/hadoop-common/core-default.xml /opt/hadoop/hadoop-2.5.1/etc/hadoop/core-site.xml
# cp /opt/hadoop/hadoop-2.5.1/share/doc/hadoop/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml /opt/hadoop/hadoop-2.5.1/etc/hadoop/hdfs-site.xml
# cp /opt/hadoop/hadoop-2.5.1/share/doc/hadoop/hadoop-yarn/hadoop-yarn-common/yarn-default.xml /opt/hadoop/hadoop-2.5.1/etc/hadoop/yarn-site.xml
# cp /opt/hadoop/hadoop-2.5.1/share/doc/hadoop/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml /opt/hadoop/hadoop-2.5.1/etc/hadoop/mapred-site.xml
# mkdir -p /opt/hadoop/tmp/dfs/name
# mkdir -p /opt/hadoop/tmp/dfs/data
# mkdir -p /opt/hadoop/tmp/dfs/namesecondary
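The same *-site.xml files and the same tmp directories are needed on every node. After the files below have been edited on the NameNode, a loop like the following (a sketch; the host list is an assumption) can create the directories and push the configuration to the remaining nodes over the password-free SSH set up in section 3:
# for h in SecondaryNameNode DataNode110 DataNode111 DataNode112; do ssh root@$h "mkdir -p /opt/hadoop/tmp/dfs/name /opt/hadoop/tmp/dfs/data /opt/hadoop/tmp/dfs/namesecondary"; scp /opt/hadoop/hadoop-2.5.1/etc/hadoop/*-site.xml root@$h:/opt/hadoop/hadoop-2.5.1/etc/hadoop/; done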
5.3.1. core-site.xml
# vi core-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/hadoop/tmp</value>
<description>abase for other temporary directories.</description>
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://192.168.1.20:9000</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>4096</value>
</property>
</configuration>
Property name | Property value | Nodes affected
fs.defaultFS | hdfs://192.168.1.20:9000 | All nodes
hadoop.tmp.dir | /opt/hadoop/tmp | All nodes
fs.default.name | hdfs://192.168.1.20:9000 |
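Once the environment variables from section 5.4 are in place, the value that actually takes effect can be read back with the getconf tool, which should print the hdfs://192.168.1.20:9000 configured above:
# hdfs getconf -confKey fs.defaultFS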
5.3.2. hdfs-site.xml
# vi hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.http-address</name>
<value>192.168.1.20:50070</value>
</property>
<property>
<name>dfs.namenode.http-bind-host</name>
<value>192.168.1.20</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>192.168.1.50:50090</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>
Property name | Property value | Nodes affected
dfs.namenode.http-address | 192.168.1.20:50070 | All nodes
dfs.namenode.http-bind-host | 192.168.1.20 | All nodes
dfs.namenode.secondary.http-address | 192.168.1.50:50090 | NameNode, SecondaryNameNode
dfs.replication | 2 |
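After HDFS has been started (section 5.5), the file system health and the replication settings can be verified with fsck; its summary reports the default replication factor, which should match the value of 2 configured above:
# hdfs fsck /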
5.3.3. mapred-site.xml
# vi mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobtracker.http.address</name>
<value>192.168.1.20:50030</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>192.168.1.20:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>192.168.1.20:19888</value>
</property>
<property>
<name>mapreduce.jobhistory.admin.address</name>
<value>192.168.1.20:10033</value>
</property>
</configuration>
Property name | Property value | Nodes affected
mapreduce.framework.name | yarn | All nodes
mapreduce.jobtracker.http.address | 192.168.1.20:50030 |
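Once HDFS and YARN are running (section 5.5), the MapReduce configuration can be exercised with the example jar bundled in the 2.5.1 distribution; the pi job below is a minimal smoke test that submits two map tasks to YARN:
# yarn jar /opt/hadoop/hadoop-2.5.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar pi 2 10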
5.3.4. yarn-site.xml
# vi yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>192.168.1.20</value>
</property>
</configuration>
Property name | Property value | Nodes affected
yarn.resourcemanager.hostname | 192.168.1.20 | All nodes
yarn.nodemanager.aux-services | mapreduce_shuffle | All nodes
yarn.nodemanager.hostname | 0.0.0.0 | All nodes
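After YARN has been started (section 5.5), running the command below from any node confirms both that the ResourceManager at yarn.resourcemanager.hostname is reachable and that the NodeManagers on the DataNode machines have registered:
# yarn node -list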
5.3.5. slaves
# vi slaves
DataNode110
DataNode111
5.3.6. SecondaryNameNode
# vi masters
SecondaryNameNode
5.3.7. Modifying JAVA_HOME
Add the JAVA_HOME configuration to hadoop-env.sh and yarn-env.sh, respectively:
# vi hadoop-env.sh
export JAVA_HOME=/opt/java/jdk1.7.0_72
export HADOOP_HOME=/opt/hadoop/hadoop-2.5.1
export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_HOME}/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
# vi yarn-env.sh
export JAVA_HOME=/opt/java/jdk1.7.0_72
export HADOOP_HOME=/opt/hadoop/hadoop-2.5.1
export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_HOME}/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
5.4. Environment variable settings
Modify the /etc/profile file (this is recommended so that other programs can also find the JDK and Hadoop):
# vi /etc/profile
Locate the line export PATH USER LOGNAME MAIL HOSTNAME HISTSIZE INPUTRC in the file, and change it to the following form:
export JAVA_HOME=/opt/java/jdk1.7.0_72
export JRE_HOME=$JAVA_HOME/jre
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export HADOOP_HOME=/opt/hadoop/hadoop-2.5.1
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export CLASSPATH=$HADOOP_HOME/lib:$CLASSPATH
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=/opt/hadoop/hadoop-2.5.1/lib/native"
5.4.1. Making the environment variables take effect
Execute the configuration file to make it take effect immediately:
# source /etc/profile
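A quick way to confirm that the Hadoop variables have been picked up is to check that the hadoop command resolves from the PATH and reports the expected release:
# which hadoop
# hadoop version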
5.5. Starting HDFS and YARN
5.5.1. Formatting the NameNode
# hdfs namenode -format
5.5.2. Starting HDFS
# /opt/hadoop/hadoop-2.5.1/sbin/start-dfs.sh
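If the script succeeds, jps should show a NameNode process on the master, a SecondaryNameNode process on 192.168.1.50, and a DataNode process on each slave; a datanode summary can also be pulled from the NameNode:
# jps
# hdfs dfsadmin -report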
5.5.3. Starting YARN
# /opt/hadoop/hadoop-2.5.1/sbin/start-yarn.sh
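Likewise, jps should now also show a ResourceManager on 192.168.1.20 and a NodeManager on each slave, and the ResourceManager can be queried for running applications (the list will simply be empty on a fresh cluster):
# jps
# yarn application -list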
If startup fails, set the logger level to DEBUG to see the specific reason:
# export HADOOP_ROOT_LOGGER=DEBUG,console