Build a fully distributed Hadoop 2.6.3 environment on CentOS 6.7 x64, tested successfully on DigitalOcean.
This article assumes:
Master node (NameNode) domain name (hostname): m.fredlab.org
Child nodes (DataNodes) domain names (hostnames): s1.fredlab.org, s2.fredlab.org, s3.fredlab.org
First, configure SSH Trust
1. Generate a public/private key pair (id_rsa and id_rsa.pub) on the master machine:
ssh-keygen
2. Upload the key files to each node machine's .ssh/ directory:
.ssh/
|-- id_rsa
|-- id_rsa.pub
|-- authorized_keys
3. Change the private key permissions to 0600:
chmod 0600 id_rsa
4. Append the public key to authorized_keys:
cat id_rsa.pub >> authorized_keys
5. Configure SSH to skip the host-key confirmation prompt (yes/no) on first login (optional)
Modify the following two lines in /etc/ssh/ssh_config:
StrictHostKeyChecking no
GSSAPIAuthentication no
Restart the SSH service:
service sshd restart
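Steps 1-4 above can be combined into one script run on the master. This is only a sketch: the key directory default and the node list are assumptions taken from this walkthrough, and the distribution loop is left commented out because it requires the cluster to exist.

```shell
# Sketch of steps 1-4, run on the master. KEYDIR defaults to ~/.ssh;
# the node list below is an assumption taken from this article.
KEYDIR="${KEYDIR:-$HOME/.ssh}"
mkdir -p "$KEYDIR" && chmod 700 "$KEYDIR"
# Generate the key pair non-interactively if it does not exist yet
[ -f "$KEYDIR/id_rsa" ] || ssh-keygen -t rsa -N '' -q -f "$KEYDIR/id_rsa"
chmod 0600 "$KEYDIR/id_rsa"
# Append the public key to authorized_keys (also enables login to this host)
cat "$KEYDIR/id_rsa.pub" >> "$KEYDIR/authorized_keys"
chmod 0600 "$KEYDIR/authorized_keys"
# Then push the public key to every node, e.g. (uncomment on the cluster):
# for h in s1.fredlab.org s2.fredlab.org s3.fredlab.org; do
#   ssh-copy-id -i "$KEYDIR/id_rsa.pub" "root@$h"
# done
```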
Second, configure the domain name (hostname) for each host
1. Modify the hostname
vim /etc/sysconfig/network
Set HOSTNAME=s1.fredlab.org (the node's own hostname); do the same on the other nodes.
2. Add each hostname and its corresponding IP address to /etc/hosts, in a format like:
104.236.142.235 m.fredlab.org
104.236.143.22  s1.fredlab.org
104.236.143.54  s2.fredlab.org
107.170.224.199 s3.fredlab.org
Third, install the Java JDK Environment
1. Download Java JDK rpm package
http://www.oracle.com/technetwork/java/javase/downloads/index.html
2. Installation
rpm -ih jdk-8u72-linux-x64.rpm
3. Verify the Java path and version:
which java
java -version
4. The default JAVA_HOME is /usr
Fourth, install Hadoop 2.6.3
(The following operations are performed on the master.)
1. Download Hadoop 2.6.3
wget http://www.eu.apache.org/dist/hadoop/common/hadoop-2.6.3/hadoop-2.6.3.tar.gz
2. Extract and install (install location: /usr/local/hadoop2)
tar zxf hadoop-2.6.3.tar.gz
mv hadoop-2.6.3 /usr/local/hadoop2
3. Create the data directory and temporary directory (locations are your choice, but they must match the configuration files)
mkdir /usr/local/hadoop2/tmp
mkdir /usr/local/hadoop2/hdfs
4. Modify the configuration files (located in the /usr/local/hadoop2/etc/hadoop/ directory)
Master configuration file: core-site.xml, where m.fredlab.org is the NameNode's domain name (hostname)
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop2/tmp</value>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://m.fredlab.org:9000</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
</configuration>
HDFS configuration file: hdfs-site.xml
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/hadoop2/hdfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/hadoop2/hdfs/data</value>
  </property>
  <property>
    <name>dfs.blocksize</name>
    <value>134217728</value>
  </property>
  <property>
    <name>dfs.namenode.handler.count</name>
    <value>10</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
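Note that dfs.blocksize is given in bytes; the 134217728 above is the common 128 MiB HDFS block size, which is easy to check with shell arithmetic:

```shell
# dfs.blocksize in bytes: 128 MiB = 128 * 1024 * 1024
echo $((128 * 1024 * 1024))   # prints 134217728
```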
yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
mapred-site.xml (if it does not exist, copy it from mapred-site.xml.template in the same directory)
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
Add the Java path to hadoop-env.sh and yarn-env.sh:
echo "export JAVA_HOME=/usr" >> hadoop-env.sh
echo "export JAVA_HOME=/usr" >> yarn-env.sh
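The two appends can also be done in a loop. A minimal sketch, assuming this article's install path (override HADOOP_CONF if your layout differs):

```shell
# Append JAVA_HOME to both env scripts; path taken from this article.
HADOOP_CONF="${HADOOP_CONF:-/usr/local/hadoop2/etc/hadoop}"
mkdir -p "$HADOOP_CONF"   # no-op on a real install where the dir exists
for f in hadoop-env.sh yarn-env.sh; do
  echo 'export JAVA_HOME=/usr' >> "$HADOOP_CONF/$f"
done
```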
Add the domain name (hostname) of each node in the slaves file, one per line, similar to:
s1.fredlab.org
s2.fredlab.org
s3.fredlab.org
Fifth, replicate the configured Hadoop to each slave
Copy the /usr/local/hadoop2 directory to each DataNode machine:
scp -r /usr/local/hadoop2 root@s1.fredlab.org:/usr/local/
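The copy can be repeated for every slave with a loop. This is a dry-run sketch: the node list is an assumption matching this article's slaves file, and the leading echo only prints each command; remove it to actually copy.

```shell
# Dry run: print the scp command for every slave node.
# Remove the 'echo' prefix to perform the copies for real.
SLAVES="s1.fredlab.org s2.fredlab.org s3.fredlab.org"   # assumption: matches the slaves file
for h in $SLAVES; do
  echo scp -r /usr/local/hadoop2 "root@$h:/usr/local/"
done
```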
Sixth, start the Hadoop cluster
1. Format the file system:
/usr/local/hadoop2/bin/hdfs namenode -format
2. Start the cluster
Start HDFS:
/usr/local/hadoop2/sbin/start-dfs.sh
Start YARN:
/usr/local/hadoop2/sbin/start-yarn.sh
3. Check the file system:
/usr/local/hadoop2/bin/hdfs dfsadmin -report
4. Observe the cluster's operation through the web UI
Cluster running status: http://m.fredlab.org:50070
http://master_ip:50070
Cluster Application Status: http://m.fredlab.org:8088
http://master_ip:8088
5. View the Java processes on each node
On the NameNode (m.fredlab.org), run jps:
12058 ResourceManager
22298 NameNode
11914 SecondaryNameNode
11180 Jps
On a DataNode (s1.fredlab.org), run jps:
13909 Jps
13494 DataNode
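Once the processes look right, a quick HDFS smoke test confirms reads and writes work end to end. A hypothetical sketch: with RUN=echo (the default) it only prints the commands as a dry run; set RUN to an empty string on the master to execute them.

```shell
# Hypothetical smoke test for the new cluster. With RUN=echo (the default)
# it only prints the commands; set RUN= on the master to run them for real.
HDFS="${HDFS:-/usr/local/hadoop2/bin/hdfs}"   # assumption: install path from this article
RUN="${RUN:-echo}"
$RUN "$HDFS" dfs -mkdir -p /smoke
$RUN "$HDFS" dfs -put /etc/hosts /smoke/hosts.txt
$RUN "$HDFS" dfs -cat /smoke/hosts.txt
```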