1. Download: https://dist.apache.org/repos/dist/release/hadoop/commo
2. Detailed reference: http://blog.csdn.net/aircraftxjd/article/details/44624219
One, installation environment
1. Virtual machine: VMware-12.5.2
2. Operating system: Linux CentOS 6.8, 64-bit
3. JDK: JDK 7u79, 64-bit
4. Hadoop version: hadoop-2.7.1
5. Install three CentOS 6.8 machines: one master server and two slave nodes
master:192.168.1.149
node2:192.168.1.150
node3:192.168.1.151
Two, modify the host name (optional)
1. Modify the host name
a) Edit: vi /etc/sysconfig/network
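On the master node, for example, the file would typically look like the sketch below (the hostname master comes from the environment list above; the other two nodes would use node2 and node3):
NETWORKING=yes
HOSTNAME=master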
b) Save and exit: press Esc, then type :wq
c) Reboot for the change to take effect: reboot
d) After the reboot you can see that the host name has changed
2. Configure the hosts file on all three nodes
a) Edit: vi /etc/hosts
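A sketch of the entries to add, using the three hosts from the environment list (the hostnames are assumed to match the ones set in section Two):
192.168.1.149 master
192.168.1.150 node2
192.168.1.151 node3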
b) Save and exit: press Esc, then type :wq
Three, configure SSH password-free login
Because Hadoop logs in to each node over SSH (I use root here), each server generates a public key, and the keys are then merged into authorized_keys.
1. CentOS does not enable password-free SSH login by default; on every server, uncomment the following two lines in /etc/ssh/sshd_config so that they read:
a) Edit: vim /etc/ssh/sshd_config
RSAAuthentication yes
PubkeyAuthentication yes
b) Save and exit
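If the change does not seem to take effect, the sshd service usually has to be reloaded; on CentOS 6 this is typically done with:
service sshd restart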
2. Run ssh-keygen -t rsa to generate a key pair; do not enter a passphrase, just press Enter at every prompt. A .ssh folder will be created under /root. Do this on every server.
3. Merge the public keys into the authorized_keys file: on the master server, enter the /root/.ssh directory and merge them via SSH
a) cd /root/.ssh/
b) cat id_rsa.pub >> authorized_keys
c) ssh root@192.168.1.150 cat ~/.ssh/id_rsa.pub >> authorized_keys
d) ssh root@192.168.1.151 cat ~/.ssh/id_rsa.pub >> authorized_keys
4. Copy the master server's authorized_keys and known_hosts to the /root/.ssh directory of each slave server
a) scp -r authorized_keys known_hosts root@192.168.1.150:/root/.ssh/
b) scp -r authorized_keys known_hosts root@192.168.1.151:/root/.ssh/
5. Done. ssh root@192.168.1.150 and ssh root@192.168.1.151 should now log in without asking for a password
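If a password is still requested, a common cause is overly loose permissions on the key files; tightening them on each node usually helps (an extra check, not part of the original steps):
chmod 700 /root/.ssh
chmod 600 /root/.ssh/authorized_keys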
Four, install JDK
The JDK has to be installed on every server.
1. Create two folders under /home: java and hadoop
2. Put the JDK archive in the java folder, unpack it, and configure the environment variables
a) First uninstall the bundled OpenJDK
(1) Find the installed JDK: rpm -qa | grep jdk
(2) Uninstall it with rpm -e followed by the package name found above, for example: rpm -e java-1.7.0-openjdk-1.7.0.99-2.6.5.1.el6.x86_64
b) Unpack: tar -zxvf jdk-7u79-linux-x64.tar.gz
c) cd into the unpacked folder and copy its path (pwd)
d) Edit the environment variables: vim /etc/profile and append the following lines:
export JAVA_HOME=/home/java/jdk1.7.0_79
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
e) Save and exit: press Esc, then type :wq
f) Apply the environment variables: source /etc/profile
g) Verify the JDK: java -version
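If /etc/profile was sourced correctly, the output should report version 1.7.0_79; the first line looks roughly like this (the remaining build details vary):
java version "1.7.0_79"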
Five, download and install Hadoop
Place the downloaded Hadoop archive in the /home/hadoop directory, then unpack it.
1. Unpack: tar -zxvf hadoop-2.7.1.tar.gz
2. Create folders for data storage under /home/hadoop: tmp, hdfs, hdfs/data, hdfs/name
a) mkdir tmp hdfs hdfs/data hdfs/name
3. Modify the configuration files (this is done on the master node)
Note: whenever these files are modified on the master node, they must be copied over to the other two nodes (node2, node3):
scp -r /home/hadoop/hadoop-2.7.1/etc/hadoop/* root@192.168.1.150:/home/hadoop/hadoop-2.7.1/etc/hadoop/
scp -r /home/hadoop/hadoop-2.7.1/etc/hadoop/* root@192.168.1.151:/home/hadoop/hadoop-2.7.1/etc/hadoop/
a) The files to edit are core-site.xml, hdfs-site.xml, mapred-site.xml, hadoop-env.sh, yarn-env.sh, and slaves
b) Enter the folder: cd /home/hadoop/hadoop-2.7.1/etc/hadoop/
c) Edit: vim core-site.xml and add:
<property>
<name>fs.defaultFS</name>
<value>hdfs://192.168.1.149:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/home/hadoop/tmp</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131702</value>
</property>
d) Edit: vim hdfs-site.xml and add:
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.support.append</name>
<value>true</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/hadoop/hdfs/name</value>
</property>
<property>
<name>dfs.blocksize</name>
<value>1048576</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/hadoop/hdfs/data</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<!-- The following configures the SecondaryNameNode. If nothing is configured it defaults to the main node; to keep the NameNode and SecondaryNameNode on separate machines, I configure it on 192.168.1.150 here -->
<property>
<name>dfs.http.address</name>
<value>192.168.1.149:50070</value>
<description>
The address and the base port where the dfs namenode web ui will listen on.
If the port is 0 then the server will start on a free port.
</description>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>192.168.1.150:50090</value>
</property>
e) Edit: vim mapred-site.xml and add:
Note: mapred-site.xml may not exist yet; there may only be mapred-site.xml.template. In that case make a copy and rename it to mapred-site.xml.
Execute: cp mapred-site.xml.template mapred-site.xml
Then add:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>192.168.1.149:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>192.168.1.149:19888</value>
</property>
<property>
<name>mapreduce.tasktracker.map.tasks.maximum</name>
<value>10</value>
</property>
<property>
<name>mapreduce.tasktracker.reduce.tasks.maximum</name>
<value>4</value>
</property>
</configuration>
f) Configure JAVA_HOME in both hadoop-env.sh and yarn-env.sh:
export JAVA_HOME=/home/java/jdk1.7.0_79
g) Finally configure the slaves file: remove the default localhost and add the two slave nodes
192.168.1.150
192.168.1.151
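Note that these steps leave yarn-site.xml at its defaults. If the NodeManagers on node2/node3 fail to register with the ResourceManager, a minimal yarn-site.xml along the following lines is commonly added to all nodes (the ResourceManager address 192.168.1.149 is assumed to be the master; this is not part of the original walkthrough):
<property>
<name>yarn.resourcemanager.hostname</name>
<value>192.168.1.149</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>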
4. Copy the configured Hadoop to the same location on each node via scp:
scp -r /home/hadoop/ 192.168.1.150:/home/
scp -r /home/hadoop/ 192.168.1.151:/home/
5. Start Hadoop on the master server; the slave nodes are started automatically. Enter the /home/hadoop/hadoop-2.7.1 directory
a) Initialize (format) the NameNode: bin/hdfs namenode -format
b) Start everything: sbin/start-all.sh, or start the pieces separately: sbin/start-dfs.sh and sbin/start-yarn.sh
c) Stop: sbin/stop-all.sh
d) Run jps to see the running processes
(1) If the processes below are present, the nodes have all started correctly:
Note: the NameNode and SecondaryNameNode should not run on the same server; I configured this above, and the result is as follows:
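With the configuration above, the jps output should look roughly as follows (process IDs omitted; the SecondaryNameNode runs on node2 because of the dfs.namenode.secondary.http-address setting):
master (192.168.1.149): NameNode, ResourceManager, Jps
node2 (192.168.1.150): DataNode, NodeManager, SecondaryNameNode, Jps
node3 (192.168.1.151): DataNode, NodeManager, Jps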
6. Test:
a) In a browser, open 192.168.1.149:8088 (the YARN ResourceManager web UI)
b) Open 192.168.1.149:50070 (the HDFS NameNode web UI); the number of live DataNodes shown should be two. If it is 0, something went wrong.
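The same check can also be done from the command line on the master; hdfs dfsadmin -report prints the number of live DataNodes:
bin/hdfs dfsadmin -report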