Reference document: http://blog.csdn.net/licongcong_0224/article/details/12972889
Reference document: http://www.powerxing.com/install-hadoop/
Reference document: http://www.powerxing.com/install-hadoop-cluster/
Hadoop Cluster Installation and Configuration Tutorial
Critical: all host names must follow the naming rules. Underscores cannot be used as delimiters; only the minus sign (hyphen) is allowed. Also, do not use IP addresses in the configuration files; use host names instead.
During my installation these two mistakes prevented the cluster from being established correctly: hdfs dfsadmin -report could not display the correct information.
Preparation: three virtual machines with the CentOS 7 system installed
IP | Host Name | User Name | Password
192.168.1.151 | hadoop-master-001 | hadoop |
192.168.1.152 | hadoop-slave-001 | hadoop |
192.168.1.153 | hadoop-slave-002 | hadoop |
Add Users
useradd hadoop
passwd hadoop
Disable Transparent Hugepage
View the status of Transparent Hugepage
cat /sys/kernel/mm/transparent_hugepage/enabled
The result is:
[always] madvise never
Disable it permanently:
vim /etc/rc.local
Add the following code:
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
  echo never > /sys/kernel/mm/transparent_hugepage/enabled
fi
if test -f /sys/kernel/mm/transparent_hugepage/defrag; then
  echo never > /sys/kernel/mm/transparent_hugepage/defrag
fi
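Note (not spelled out in the original steps, but commonly required): on CentOS 7, /etc/rc.local is a symlink to /etc/rc.d/rc.local and is not executable by default, so the snippet above will not run at boot unless the file is made executable. A minimal sketch, assuming the stock CentOS 7 layout:
chmod +x /etc/rc.d/rc.local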
Restart the machine.
View the status again:
cat /sys/kernel/mm/transparent_hugepage/enabled
The result is:
always madvise [never]
System Software Installation
Java Installation
Install Oracle's Java JDK
Download Link: http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
wget http://download.oracle.com/otn-pub/java/jdk/8u72-b15/jdk-8u72-linux-x64.rpm?AuthParam=1453706601_fb0540fefe22922be611d401fbbf4d75
After downloading it locally, install it via yum:
yum localinstall jdk-8u72-linux-x64.rpm
Verify the Java version:
java -version
Set the JAVA_HOME environment variable:
vim /etc/profile
export JAVA_HOME="/usr/java/default"
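To make the new variable take effect in the current shell without logging out again, the profile can be reloaded and the value checked. A minimal sketch, assuming the export line above was saved to /etc/profile:
source /etc/profile
echo $JAVA_HOME    # should print /usr/java/default
java -version      # should report version 1.8.0_72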
Set up the hosts file on both the master and slave servers:
192.168.1.151 hadoop-master-001
192.168.1.152 hadoop-slave-001
192.168.1.153 hadoop-slave-002
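A minimal sketch of appending these entries on each node, run as root and assuming they are not already present in /etc/hosts:
cat >> /etc/hosts <<'EOF'
192.168.1.151 hadoop-master-001
192.168.1.152 hadoop-slave-001
192.168.1.153 hadoop-slave-002
EOF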
Set up passwordless SSH login from 001 to 001, 002, and 003
Install SSH on the 001 machine. In the hadoop user's ~/.ssh directory, use ssh-keygen -t rsa to generate the public and private keys (press Enter repeatedly; do not set a passphrase). Then copy the public key file to the ~/.ssh directory of the hadoop user on each machine to be accessed:
scp ~/.ssh/id_rsa.pub hadoop@hadoop-master-001:/home/hadoop/.ssh/authorized_keys
scp ~/.ssh/id_rsa.pub hadoop@hadoop-slave-001:/home/hadoop/.ssh/authorized_keys
scp ~/.ssh/id_rsa.pub hadoop@hadoop-slave-002:/home/hadoop/.ssh/authorized_keys
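Note that scp overwrites any existing authorized_keys on the target. If more keys will be appended later (for example for additional masters, as described further below), ssh-copy-id appends instead of overwriting; a sketch, assuming the openssh-clients package is installed:
ssh-copy-id hadoop@hadoop-slave-001
ssh-copy-id hadoop@hadoop-slave-002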
Check that you can log in without being prompted for a password:
ssh localhost
ssh hadoop@hadoop-master-001
ssh hadoop@hadoop-slave-001
ssh hadoop@hadoop-slave-002
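If a password is still requested, one common cause (an assumption, not stated in the original) is that sshd rejects the key because of loose permissions on the hadoop user's home directory, ~/.ssh, or authorized_keys. A sketch of tightening them on each node:
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys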
Here only 001 is the master. If you have more than one NameNode or ResourceManager, you need to set up passwordless login from every master to all of the remaining nodes (append 001's authorized_keys to the authorized_keys of 002 and 003).
Note: under CentOS 7, while setting up passwordless ssh localhost for the local machine, the commonly suggested method
cat ./id_rsa.pub >> ./authorized_keys
did not result in passwordless login; the reason is unknown. Using
scp ~/.ssh/id_rsa.pub hadoop@hadoop-master-001:/home/hadoop/.ssh/authorized_keys
instead completed the passwordless SSH setup for the local machine.
Hadoop Download
Download Hadoop from http://hadoop.apache.org/releases.html; the download page points to http://apache.fayea.com/hadoop/common/
wget http://apache.fayea.com/hadoop/common/hadoop-2.7.2/hadoop-2.7.2.tar.gz
Move hadoop-2.7.2.tar.gz to the /usr/local directory with sudo, then:
cd /usr/local
sudo tar -xzvf hadoop-2.7.2.tar.gz
sudo chown -R hadoop:hadoop ./hadoop-2.7.2
ln -s /usr/local/hadoop-2.7.2 /usr/local/hadoop
Check the Hadoop version:
/usr/local/hadoop/bin/hadoop version
Configure environment variables:
vim /etc/profile
export PATH=$PATH:/usr/local/hadoop/bin:/usr/local/hadoop/sbin
Hadoop configuration Process
Before configuring, create the following folders on the local file system of 001 (a sketch of the commands follows the list):
/home/hadoop/name
/home/hadoop/data
/home/hadoop/temp
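A minimal sketch of creating them as the hadoop user, assuming /home/hadoop is that user's home directory:
mkdir -p /home/hadoop/name /home/hadoop/data /home/hadoop/temp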
There are 7 configuration files to be covered here:
/usr/local/hadoop/etc/hadoop/hadoop-env.sh
/usr/local/hadoop/etc/hadoop/yarn-env.sh
/usr/local/hadoop/etc/hadoop/slaves
/usr/local/hadoop/etc/hadoop/core-site.xml
/usr/local/hadoop/etc/hadoop/hdfs-site.xml
/usr/local/hadoop/etc/hadoop/mapred-site.xml
/usr/local/hadoop/etc/hadoop/yarn-site.xml
Modify the configuration (even if the system already sets JAVA_HOME, configure it in the env.sh files as well)
On the 192.168.1.151 server, enter /usr/local/hadoop/etc/hadoop
Configuration file 1: hadoop-env.sh
Modify the JAVA_HOME value (export JAVA_HOME=/usr/java/default)
Configuration file 2: yarn-env.sh
Modify the JAVA_HOME value (export JAVA_HOME=/usr/java/default)
Configuration file 3: slaves (this file lists all the slave nodes)
Write the following content:
hadoop-slave-001
hadoop-slave-002
Configuration file 4: core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop-master-001:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/home/hadoop/temp</value>
    <description>A base for other temporary directories.</description>
  </property>
</configuration>
Configuration file 5: hdfs-site.xml
<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>hadoop-master-001:9001</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/hadoop/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/hadoop/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>
Configuration file 6: mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>hadoop-master-001:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>hadoop-master-001:19888</value>
  </property>
</configuration>
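Note: the Hadoop 2.7.2 distribution ships only mapred-site.xml.template by default, so this file usually has to be created first. A sketch, assuming the /usr/local/hadoop layout used above:
cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml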
Configuration file 7: yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>hadoop-master-001:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>hadoop-master-001:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>hadoop-master-001:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>hadoop-master-001:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>hadoop-master-001:8088</value>
  </property>
</configuration>
Replicate Hadoop to the other nodes
scp -r /usr/local/hadoop hadoop@hadoop-slave-001:/usr/local/hadoop-2.7.2
scp -r /usr/local/hadoop hadoop@hadoop-slave-002:/usr/local/hadoop-2.7.2
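The /usr/local/hadoop symlink created on the master is not reproduced by the commands above, so PATH entries and scripts that reference /usr/local/hadoop would not resolve on the slaves. A sketch of recreating it (an assumption based on the layout used earlier, not a step from the original; use root or sudo if /usr/local is not writable by the hadoop user):
ssh hadoop@hadoop-slave-001 'ln -s /usr/local/hadoop-2.7.2 /usr/local/hadoop'
ssh hadoop@hadoop-slave-002 'ln -s /usr/local/hadoop-2.7.2 /usr/local/hadoop'
The JAVA_HOME and PATH settings in /etc/profile also need to be repeated on each slave.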
Start Hadoop
Enter the installation directory:
cd /usr/local/hadoop
Format the NameNode:
./bin/hdfs namenode -format (only one minus sign in front of "format")
If it succeeds, you will see "successfully formatted" and "Exiting with status 0"; "Exiting with status 1" indicates an error.
Start HDFS:
./sbin/start-dfs.sh
The processes running on 001 are: NameNode, SecondaryNameNode
The processes running on 002 and 003 are: DataNode
Start YARN:
./sbin/start-yarn.sh
YARN was split out from MapReduce and is responsible for resource management and task scheduling; MapReduce runs on top of YARN, which provides high availability and high scalability.
The processes running on 001 are: NameNode, SecondaryNameNode, ResourceManager
The processes running on 002 and 003 are: DataNode, NodeManager
After startup is complete, the jps command can be used to check whether it succeeded; if successful, processes such as "NameNode", "DataNode", and "SecondaryNameNode" will be listed.
View the cluster status:
./bin/hdfs dfsadmin -report
If the output contains
Live datanodes (2):
the cluster was established successfully.
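If the report instead shows 0 live datanodes, one common cause on CentOS 7 (an assumption, not covered in the original) is that firewalld blocks the ports the nodes use to talk to each other. A sketch of ruling this out on every node:
systemctl stop firewalld
systemctl disable firewalld
(Alternatively, open only the required ports instead of disabling the firewall entirely.)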
After a successful start, you can open the web interface at http://192.168.1.151:50070 to view NameNode and DataNode information, and to browse the files in HDFS online.
After starting YARN, you can see how tasks are running through the web interface: http://192.168.1.151:8088/cluster
Commands for operating on HDFS
hadoop fs
This command lists the help for all of the HDFS sub-commands. The syntax is largely similar to file operations on Linux.
Example: copy local files to HDFS
hadoop fs -copyFromLocal *.log hdfs://192.168.1.151:9000/data/weblogs
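A few more sketches of everyday operations (the paths and file names are illustrative only, not from the original):
hadoop fs -mkdir -p /data/weblogs                # create a directory tree in HDFS
hadoop fs -ls /data                              # list a directory
hadoop fs -cat /data/weblogs/app.log             # print a file's contents
hadoop fs -copyToLocal /data/weblogs/app.log .   # copy a file back to the local disk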
For command details, see the official documentation:
https://hadoop.apache.org/docs/r1.0.4/cn/hdfs_shell.html
Install Hadoop Ambari 2.2.0
Official installation documentation for 2.2.0:
https://cwiki.apache.org/confluence/display/AMBARI/Install+Ambari+2.2.0+from+Public+Repositories
Host Preparation
Host Name | IP Address
hadoop-ambari | 192.168.1.150
Update the yum repository
yum update
cd /etc/yum.repos.d/
wget http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.2.0.0/ambari.repo
Install your own JDK environment
yum localinstall jdk-8u72-linux-x64.rpm
Configure the JAVA_HOME environment variable:
vim /etc/profile
export JAVA_HOME="/usr/java/default"
Installing Ambari
yum install ambari-server
ambari-server setup
(select the JDK you installed yourself during configuration)
Start the service:
ambari-server start
Access the service:
http://192.168.1.150:8080 (the default user name and password are: admin/admin)
MapReduce Program Development
To develop MapReduce programs with Maven, configure the Hadoop dependencies in the pom.xml file:
<!-- Hadoop configuration -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-mapreduce-client-core</artifactId>
  <scope>provided</scope>
  <version>2.7.2</version>
</dependency>
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-mapreduce-client-common</artifactId>
  <scope>provided</scope>
  <version>2.7.2</version>
</dependency>
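A sketch of building and submitting a job that uses these dependencies; the jar name, main class, and HDFS paths are placeholders, not from the original:
mvn clean package
hadoop jar target/my-mr-job-1.0.jar com.example.WordCount /data/weblogs /data/weblogs-out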