Reference document: http://blog.csdn.net/licongcong_0224/article/details/12972889
Reference document: http://www.powerxing.com/install-hadoop/
Reference document: http://www.powerxing.com/install-hadoop-cluster/
Hadoop Cluster Installation and Configuration Tutorial
Critical: all host names must follow the naming rules. Underscores cannot be used as delimiters; only the minus sign (hyphen) is allowed. Also, do not use IP addresses in the configuration files; use host names instead.
During my installation these two mistakes prevented the cluster from being established correctly: hdfs dfsadmin -report could not display the correct information.
Preparation: three virtual machines with the CentOS 7 system installed
IP | Host Name | User Name | Password
192.168.1.151 | hadoop-master-001 | hadoop |
192.168.1.152 | hadoop-slave-001 | hadoop |
192.168.1.153 | hadoop-slave-002 | hadoop |
Add Users
useradd hadoop
passwd hadoop
Disable Transparent Hugepage
View the status of Transparent Hugepage
cat /sys/kernel/mm/transparent_hugepage/enabled
The result is:
[always] madvise never
Disable it permanently:
vim /etc/rc.local
Add the following code:
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
  echo never > /sys/kernel/mm/transparent_hugepage/enabled
fi
if test -f /sys/kernel/mm/transparent_hugepage/defrag; then
  echo never > /sys/kernel/mm/transparent_hugepage/defrag
fi
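Note (not spelled out in the original steps, but commonly required): on CentOS 7, /etc/rc.local is a symlink to /etc/rc.d/rc.local and is not executable by default, so the snippet above will not run at boot unless the file is made executable. A minimal sketch, assuming the stock CentOS 7 layout:
chmod +x /etc/rc.d/rc.local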
Restart the machine.
View the status again:
cat /sys/kernel/mm/transparent_hugepage/enabled
The result is:
always madvise [never]
System Software Installation
Java Installation
Install Oracle's Java JDK
Download Link: http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
wget http://download.oracle.com/otn-pub/java/jdk/8u72-b15/jdk-8u72-linux-x64.rpm?AuthParam=1453706601_fb0540fefe22922be611d401fbbf4d75
After downloading it locally, install it via yum:
yum localinstall jdk-8u72-linux-x64.rpm
Verify the Java version:
java -version
Set the JAVA_HOME environment variable:
vim /etc/profile
export JAVA_HOME="/usr/java/default"
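To make the new variable take effect in the current shell without logging out again, the profile can be reloaded and the value checked. A minimal sketch, assuming the export line above was saved to /etc/profile:
source /etc/profile
echo $JAVA_HOME    # should print /usr/java/default
java -version      # should report version 1.8.0_72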
Set up the hosts file on both the master and slave servers:
192.168.1.151 hadoop-master-001
192.168.1.152 hadoop-slave-001
192.168.1.153 hadoop-slave-002
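A minimal sketch of appending these entries on each node, run as root and assuming they are not already present in /etc/hosts:
cat >> /etc/hosts <<'EOF'
192.168.1.151 hadoop-master-001
192.168.1.152 hadoop-slave-001
192.168.1.153 hadoop-slave-002
EOF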
Set up passwordless SSH login from 001 to 001, 002, and 003
Install SSH on the 001 machine. In the hadoop user's ~/.ssh directory, use ssh-keygen -t rsa to generate the public and private keys (press Enter repeatedly; do not set a passphrase). Then copy the public key file to the ~/.ssh directory of the hadoop user on each machine to be accessed:
scp ~/.ssh/id_rsa.pub hadoop@hadoop-master-001:/home/hadoop/.ssh/authorized_keys
scp ~/.ssh/id_rsa.pub hadoop@hadoop-slave-001:/home/hadoop/.ssh/authorized_keys
scp ~/.ssh/id_rsa.pub hadoop@hadoop-slave-002:/home/hadoop/.ssh/authorized_keys
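Note that scp overwrites any existing authorized_keys on the target. If more keys will be appended later (for example for additional masters, as described further below), ssh-copy-id appends instead of overwriting; a sketch, assuming the openssh-clients package is installed:
ssh-copy-id hadoop@hadoop-slave-001
ssh-copy-id hadoop@hadoop-slave-002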
Check that you can log in without being prompted for a password:
ssh localhost
ssh hadoop@hadoop-master-001
ssh hadoop@hadoop-slave-001
ssh hadoop@hadoop-slave-002
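If a password is still requested, one common cause (an assumption, not stated in the original) is that sshd rejects the key because of loose permissions on the hadoop user's home directory, ~/.ssh, or authorized_keys. A sketch of tightening them on each node:
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys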
Here only 001 is the master. If you have more than one NameNode or ResourceManager, you need to set up passwordless login from every master to all of the remaining nodes (append 001's authorized_keys to the authorized_keys of 002 and 003).
Note: under CentOS 7, while setting up passwordless ssh localhost for the local machine, the commonly suggested method
cat ./id_rsa.pub >> ./authorized_keys
did not result in passwordless login; the reason is unknown. Using
scp ~/.ssh/id_rsa.pub hadoop@hadoop-master-001:/home/hadoop/.ssh/authorized_keys
instead completed the passwordless SSH setup for the local machine.
Hadoop Download
Download Hadoop from http://hadoop.apache.org/releases.html; the download page points to http://apache.fayea.com/hadoop/common/
wget http://apache.fayea.com/hadoop/common/hadoop-2.7.2/hadoop-2.7.2.tar.gz
Move hadoop-2.7.2.tar.gz to the /usr/local directory with sudo, then:
cd /usr/local
sudo tar -xzvf hadoop-2.7.2.tar.gz
sudo chown -R hadoop:hadoop ./hadoop-2.7.2
ln -s /usr/local/hadoop-2.7.2 /usr/local/hadoop
Check the Hadoop version:
/usr/local/hadoop/bin/hadoop version
Configure environment variables:
vim /etc/profile
export PATH=$PATH:/usr/local/hadoop/bin:/usr/local/hadoop/sbin
Hadoop configuration Process
Before configuring, create the following folders on the local file system of 001 (a sketch of the commands follows the list):
/home/hadoop/name
/home/hadoop/data
/home/hadoop/temp
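A minimal sketch of creating them as the hadoop user, assuming /home/hadoop is that user's home directory:
mkdir -p /home/hadoop/name /home/hadoop/data /home/hadoop/temp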
There are 7 configuration files to be covered here:
/usr/local/hadoop/etc/hadoop/hadoop-env.sh
/usr/local/hadoop/etc/hadoop/yarn-env.sh
/usr/local/hadoop/etc/hadoop/slaves
/usr/local/hadoop/etc/hadoop/core-site.xml
/usr/local/hadoop/etc/hadoop/hdfs-site.xml
/usr/local/hadoop/etc/hadoop/mapred-site.xml
/usr/local/hadoop/etc/hadoop/yarn-site.xml
Modify the configuration (even if the system already sets JAVA_HOME, configure it in the env.sh files as well)
On the 192.168.1.151 server, enter /usr/local/hadoop/etc/hadoop
Configuration file 1: hadoop-env.sh
Modify the JAVA_HOME value (export JAVA_HOME=/usr/java/default)
Configuration file 2: yarn-env.sh
Modify the JAVA_HOME value (export JAVA_HOME=/usr/java/default)
Configuration file 3: slaves (this file lists all the slave nodes)
Write the following content:
hadoop-slave-001
hadoop-slave-002
Configuration file 4: core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop-master-001:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/home/hadoop/temp</value>
    <description>A base for other temporary directories.</description>
  </property>
</configuration>
Configuration file 5: hdfs-site.xml
<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>hadoop-master-001:9001</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/hadoop/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/hadoop/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>
Configuration file 6: mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>hadoop-master-001:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>hadoop-master-001:19888</value>
  </property>
</configuration>
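Note: the Hadoop 2.7.2 distribution ships only mapred-site.xml.template by default, so this file usually has to be created first. A sketch, assuming the /usr/local/hadoop layout used above:
cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml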
Configuration file 7: yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>hadoop-master-001:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>hadoop-master-001:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>hadoop-master-001:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>hadoop-master-001:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>hadoop-master-001:8088</value>
  </property>
</configuration>
Replicate Hadoop to the other nodes
scp -r /usr/local/hadoop hadoop@hadoop-slave-001:/usr/local/hadoop-2.7.2
scp -r /usr/local/hadoop hadoop@hadoop-slave-002:/usr/local/hadoop-2.7.2
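The /usr/local/hadoop symlink created on the master is not reproduced by the commands above, so PATH entries and scripts that reference /usr/local/hadoop would not resolve on the slaves. A sketch of recreating it (an assumption based on the layout used earlier, not a step from the original; use root or sudo if /usr/local is not writable by the hadoop user):
ssh hadoop@hadoop-slave-001 'ln -s /usr/local/hadoop-2.7.2 /usr/local/hadoop'
ssh hadoop@hadoop-slave-002 'ln -s /usr/local/hadoop-2.7.2 /usr/local/hadoop'
The JAVA_HOME and PATH settings in /etc/profile also need to be repeated on each slave.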
Start Hadoop
Enter the installation directory:
cd /usr/local/hadoop
Format the NameNode:
./bin/hdfs namenode -format (only one minus sign in front of "format")
If it succeeds, you will see "successfully formatted" and "Exiting with status 0"; "Exiting with status 1" indicates an error.
Start HDFS:
./sbin/start-dfs.sh
The processes running on 001 are: NameNode, SecondaryNameNode
The processes running on 002 and 003 are: DataNode
Start YARN:
./sbin/start-yarn.sh
YARN was split out from MapReduce and is responsible for resource management and task scheduling; MapReduce runs on top of YARN, which provides high availability and high scalability.
The processes running on 001 are: NameNode, SecondaryNameNode, ResourceManager
The processes running on 002 and 003 are: DataNode, NodeManager
After startup is complete, the jps command can be used to check whether it succeeded; if successful, processes such as "NameNode", "DataNode", and "SecondaryNameNode" will be listed.
View the cluster status:
./bin/hdfs dfsadmin -report
If the output contains
Live datanodes (2):
the cluster was established successfully.
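If the report instead shows 0 live datanodes, one common cause on CentOS 7 (an assumption, not covered in the original) is that firewalld blocks the ports the nodes use to talk to each other. A sketch of ruling this out on every node:
systemctl stop firewalld
systemctl disable firewalld
(Alternatively, open only the required ports instead of disabling the firewall entirely.)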
After a successful start, you can open the web interface at http://192.168.1.151:50070 to view NameNode and DataNode information, and to browse the files in HDFS online.
After starting YARN, you can see how tasks are running through the web interface: http://192.168.1.151:8088/cluster
Commands for operating on HDFS
hadoop fs
This command lists the help for all of the HDFS sub-commands. The syntax is largely similar to file operations on Linux.
Example: copy local files to HDFS
hadoop fs -copyFromLocal *.log hdfs://192.168.1.151:9000/data/weblogs
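A few more sketches of everyday operations (the paths and file names are illustrative only, not from the original):
hadoop fs -mkdir -p /data/weblogs                # create a directory tree in HDFS
hadoop fs -ls /data                              # list a directory
hadoop fs -cat /data/weblogs/app.log             # print a file's contents
hadoop fs -copyToLocal /data/weblogs/app.log .   # copy a file back to the local disk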
For command details, see the official documentation:
https://hadoop.apache.org/docs/r1.0.4/cn/hdfs_shell.html
Install Hadoop Ambari 2.2.0
Official installation documentation for 2.2.0:
https://cwiki.apache.org/confluence/display/AMBARI/Install+Ambari+2.2.0+from+Public+Repositories
Host Preparation
Host Name | IP Address
hadoop-ambari | 192.168.1.150
Update the yum repository
yum update
cd /etc/yum.repos.d/
wget http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.2.0.0/ambari.repo
Install your own JDK environment
yum localinstall jdk-8u72-linux-x64.rpm
Configure the JAVA_HOME environment variable:
vim /etc/profile
export JAVA_HOME="/usr/java/default"
Installing Ambari
yum install ambari-server
ambari-server setup
(select the JDK you installed yourself during configuration)
Start the service:
ambari-server start
Access the service:
http://192.168.1.150:8080 (the default user name and password are: admin/admin)
MapReduce Program Development
To develop MapReduce programs with Maven, configure the Hadoop dependencies in the pom.xml file:
<!-- Hadoop configuration -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-mapreduce-client-core</artifactId>
  <scope>provided</scope>
  <version>2.7.2</version>
</dependency>
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-mapreduce-client-common</artifactId>
  <scope>provided</scope>
  <version>2.7.2</version>
</dependency>
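A sketch of building and submitting a job that uses these dependencies; the jar name, main class, and HDFS paths are placeholders, not from the original:
mvn clean package
hadoop jar target/my-mr-job-1.0.jar com.example.WordCount /data/weblogs /data/weblogs-out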