CentOS Hadoop Installation and Configuration Details


General idea: prepare a master and two slave servers, configure the master so it can SSH into each slave without a password, unpack and install the JDK, unpack and install Hadoop, then configure HDFS, MapReduce and the master/slave relationships.

1. Environment: 3 CentOS 7 64-bit machines and Hadoop 2.7.3, which requires 64-bit Linux. The CentOS 7 minimal ISO is only about 600 MB, and the operating system installs in a little over 10 minutes.
Master 192.168.6.220
Slave1 192.168.6.221
Slave2 192.168.6.222

2. Passwordless SSH login. Hadoop needs to SSH into the various nodes to operate; I use the root user, generate a key pair on each server, and then merge the public keys into authorized_keys.
(1) CentOS does not enable public-key SSH login by default. Uncomment these 2 lines in /etc/ssh/sshd_config; this must be done on every server:
RSAAuthentication yes
PubkeyAuthentication yes
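After editing sshd_config, restart the SSH service so the change takes effect (a step the original leaves implicit):
systemctl restart sshd.service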
(2) Enter the command ssh-keygen -t rsa to generate a key pair; do not enter a passphrase, just press Enter at each prompt. A .ssh folder will be created under /root. This must be done on every server.
(3) Merge the public keys into the authorized_keys file. On the master server, enter the /root/.ssh directory and merge them via SSH:
cat id_rsa.pub >> authorized_keys
ssh root@192.168.6.221 cat ~/.ssh/id_rsa.pub >> authorized_keys
ssh root@192.168.6.222 cat ~/.ssh/id_rsa.pub >> authorized_keys
(4) Copy the master server's authorized_keys and known_hosts to the /root/.ssh directory on each slave server (see the sketch after this list).
(5) Verify: ssh root@192.168.6.221 and ssh root@192.168.6.222 should no longer ask for a password.
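A minimal sketch of step (4), assuming the default /root/.ssh paths and the root account used above:
scp /root/.ssh/authorized_keys /root/.ssh/known_hosts root@192.168.6.221:/root/.ssh/
scp /root/.ssh/authorized_keys /root/.ssh/known_hosts root@192.168.6.222:/root/.ssh/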

3. Install the JDK. Hadoop 2.7.3 needs JDK 7. Because my CentOS is a minimal installation there is no OpenJDK, so just unpack the downloaded JDK and configure the environment variables.
(1) Download "jdk-7u79-linux-x64.gz" and put it in the/home/java directory
(2) Decompression, input command, TAR-ZXVF jdk-7u79-linux-x64.gz
(3) Edit/etc/profile
export JAVA_HOME=/home/java/jdk1.7.0_79
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
(4) Make the configuration take effect with the command source /etc/profile
(5) Run java -version to verify.
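If the JDK is set up correctly, the output should look roughly like this (exact build numbers may differ):
java version "1.7.0_79"
Java(TM) SE Runtime Environment (build 1.7.0_79-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.79-b02, mixed mode)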

4. Install Hadoop 2.7.3. Unpack it on the master server only; it will be copied to the slave servers later.
(1) Download "hadoop-2.7.3.tar.gz" and put it in the /home/hadoop directory.
(2) Unpack it with the command tar -xzvf hadoop-2.7.3.tar.gz
(3) Create the data storage directories under /home/hadoop: tmp, dfs, dfs/data, dfs/name (the names must match the paths used in core-site.xml and hdfs-site.xml below).
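A one-line equivalent, assuming the paths referenced by the configuration files below:
mkdir -p /home/hadoop/tmp /home/hadoop/dfs/data /home/hadoop/dfs/name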

5. Configure core-site.xml under the /home/hadoop/hadoop-2.7.3/etc/hadoop directory:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://192.168.6.220:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/home/hadoop/tmp</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131702</value>
</property>
</configuration>

6. Configure hdfs-site.xml under the /home/hadoop/hadoop-2.7.3/etc/hadoop directory:
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/hadoop/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/hadoop/dfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>192.168.6.220:9001</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>

7. Configure mapred-site.xml under the /home/hadoop/hadoop-2.7.3/etc/hadoop directory:
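Note that Hadoop 2.7.3 ships only mapred-site.xml.template, so create the file first if it does not exist:
cp mapred-site.xml.template mapred-site.xml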
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>192.168.6.220:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>192.168.6.220:19888</value>
</property>
</configuration>


8. Configure yarn-site.xml under the /home/hadoop/hadoop-2.7.3/etc/hadoop directory:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>192.168.6.220:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>192.168.6.220:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>192.168.6.220:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>192.168.6.220:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>192.168.6.220:8088</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>768</value>
</property>
</configuration>

9. Configure JAVA_HOME in hadoop-env.sh and yarn-env.sh under the /home/hadoop/hadoop-2.7.3/etc/hadoop directory; if it is not set, Hadoop will not start:
export JAVA_HOME=/home/java/jdk1.7.0_79

10. Configure the slaves file in the /home/hadoop/hadoop-2.7.3/etc/hadoop directory: delete the default localhost and add the 2 slave nodes:
192.168.6.221
192.168.6.222

11. Copy the configured Hadoop directory to the same location on each slave node via scp:
scp -r /home/hadoop 192.168.6.221:/home/
scp -r /home/hadoop 192.168.6.222:/home/
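The slaves also need the JDK from step 3. If it was installed only on the master, copy it over and append the same export lines to /etc/profile on each slave (a sketch, assuming the paths above):
scp -r /home/java 192.168.6.221:/home/
scp -r /home/java 192.168.6.222:/home/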

12. Start Hadoop on the master server; the slave nodes are started automatically. Enter the /home/hadoop/hadoop-2.7.3 directory.
(1) Initialize with the command bin/hdfs namenode -format
(2) Start everything with sbin/start-all.sh, or start the pieces separately with sbin/start-dfs.sh and sbin/start-yarn.sh
(3) To stop, enter the command sbin/stop-all.sh
(4) Enter the command jps to see the running daemons (a rough expectation is sketched after this list).
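With the configuration above, jps on the master typically lists something like the following (process IDs omitted), while each slave shows DataNode and NodeManager:
NameNode
SecondaryNameNode
ResourceManager
Jps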

13. Web access: open the required ports first, or simply shut down the firewall.
(1) Enter the command systemctl stop firewalld.service
(2) Open http://192.168.6.220:8088/ in a browser (YARN ResourceManager UI)
(3) Open http://192.168.6.220:50070/ in a browser (HDFS NameNode UI)
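If you only want browser access without disabling the firewall, opening the two web ports on the master is enough (a sketch; note that the daemons also need their RPC ports, e.g. 9000 and 8030-8033, reachable between nodes, which is why the original simply stops firewalld):
firewall-cmd --permanent --add-port=50070/tcp
firewall-cmd --permanent --add-port=8088/tcp
firewall-cmd --reload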

14. The installation is complete. This is only the beginning of the big data application; the follow-up work is to write programs against the Hadoop interfaces, in light of your own situation, and put HDFS and MapReduce to work.
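As a quick smoke test (not part of the original steps), the example jar bundled with 2.7.3 can be run from /home/hadoop/hadoop-2.7.3:
bin/hdfs dfs -mkdir -p /input
bin/hdfs dfs -put etc/hadoop/core-site.xml /input
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount /input /output
bin/hdfs dfs -cat /output/part-r-00000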
