This installation uses Cloudera Hadoop CDH 5.2.3 on CentOS 6.6. The process recorded here is rough and far from professional; it is kept for the record only and should not be used as a reference.
I. Preparations before installation
1. Update the system
yum update
2. Install JDK
A. Download and install the RPM Package
cd /usr/local/src
wget --no-cookies --no-check-certificate --header "Cookie: gpw_e24=http%3A%2F%2Fwww.oracle.com%2F; oraclelicense=accept-securebackup-cookie" "http://download.oracle.com/otn-pub/java/jdk/7u75-b13/jdk-7u75-linux-x64.rpm"
rpm -ivh jdk-7u75-linux-x64.rpm
Note: Oracle checks a cookie before serving the download, so a plain wget http://download.oracle.com/otn-pub/java/jdk/7u75-b13/jdk-7u75-linux-x64.rpm does not work; the Cookie header above simulates accepting the license.
Note: Do not use JDK 1.8, which may cause compatibility issues.
B. Configure environment variables
Create a symbolic link (to make later JDK upgrades easier)
ln -s /usr/java/jdk1.7.0_75 /usr/java/latest
Add Environment variables
vim /etc/profile
Append the following to the end of the profile file:
export JAVA_HOME=/usr/java/latest
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
Save and exit. Execute:
source /etc/profile
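To confirm that the variables took effect, a quick check along these lines can help. This is only a minimal sketch; check_java_home is a hypothetical helper name, not something shipped with the JDK.

```shell
# check_java_home DIR: succeed if DIR looks like a JDK home,
# i.e. it contains an executable bin/java. (Hypothetical helper.)
check_java_home() {
    if [ -x "$1/bin/java" ]; then
        echo "OK: $1/bin/java found"
    else
        echo "MISSING: $1/bin/java" >&2
        return 1
    fi
}

# Possible usage after `source /etc/profile`:
#   check_java_home "$JAVA_HOME" && java -version
```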
3. Plan the installation
We have prepared three virtual machines with IP addresses:
192.168.150.136
192.168.150.137
192.168.150.138
4. System configuration
A. Disable IPv6.
vim /etc/sysctl.conf
Append the following content to the file:
# Disable ipv6
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
Reload the configuration so that it takes effect:
sysctl -p
Check whether IPv6 is disabled (a value of 1 means disabled):
cat /proc/sys/net/ipv6/conf/all/disable_ipv6
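The same check can be run for all three entries set in sysctl.conf. A minimal sketch, assuming the usual /proc layout; ipv6_report is a hypothetical helper name, and the optional argument exists only so the function can be pointed at another directory:

```shell
# ipv6_report [DIR]: print the disable_ipv6 state for all, default and lo.
# DIR defaults to /proc/sys/net/ipv6/conf. (Hypothetical helper.)
ipv6_report() {
    conf_dir=${1:-/proc/sys/net/ipv6/conf}
    for iface in all default lo; do
        # A value of 1 means IPv6 is disabled for that entry.
        if [ "$(cat "$conf_dir/$iface/disable_ipv6" 2>/dev/null)" = "1" ]; then
            echo "$iface: disabled"
        else
            echo "$iface: enabled"
        fi
    done
}

# Possible usage:
#   ipv6_report
```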
B. Disable the firewall.
setenforce 0 # temporary, takes effect without a restart
iptables -F # flush the iptables rules
vim /etc/sysconfig/selinux # set SELINUX=disabled
chkconfig iptables off # keeps iptables off after a restart
Check whether the firewall is disabled:
/etc/init.d/iptables status
chkconfig --list
You can see that ip6tables is still on. Run the following command:
chkconfig ip6tables off
C. Hostname settings
vim /etc/sysconfig/network
Change HOSTNAME=localhost.localdomain in the file to HOSTNAME=h1.hadoop (and likewise h2.hadoop and h3.hadoop on the other machines). Run the hostname command to check whether the setting has taken effect. It still returns localhost.localdomain, because the file is only read at boot:
[root@localhost qw]# hostname
localhost.localdomain
The solution is to also set the name on the running system with the hostname command:
hostname h1.hadoop
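Both steps are needed: the file covers the next boot, the hostname command covers the running system. A minimal sketch that scripts the file edit; set_hostname_line is a hypothetical helper name, and it assumes the file already contains a HOSTNAME= line:

```shell
# set_hostname_line FILE NAME: rewrite the HOSTNAME= line in FILE to NAME.
# Writes to a temporary file first, then replaces the original.
# (Hypothetical helper; lines other than HOSTNAME= are left untouched.)
set_hostname_line() {
    sed "s/^HOSTNAME=.*/HOSTNAME=$2/" "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Possible usage on the first node, as root:
#   set_hostname_line /etc/sysconfig/network h1.hadoop
#   hostname h1.hadoop
```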
D. Modify hosts
vim /etc/hosts
192.168.150.136 h1.hadoop
192.168.150.137 h2.hadoop
192.168.150.138 h3.hadoop
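Since every node needs the same entries, it can be worth checking that none is missing. A minimal sketch; missing_hosts is a hypothetical helper name, and it only looks at the hosts file, not at DNS:

```shell
# missing_hosts FILE NAME...: print each NAME that has no entry in FILE.
# (Hypothetical helper; comment lines are ignored.)
missing_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        # Look for NAME as a whole field after the IP address column.
        awk -v n="$name" '
            /^[[:space:]]*#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit (found ? 0 : 1) }
        ' "$hosts_file" || echo "$name"
    done
}

# Possible usage:
#   missing_hosts /etc/hosts h1.hadoop h2.hadoop h3.hadoop
```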
E. Clock synchronization
Use the h1.hadoop node as the clock synchronization server; the other nodes act as clients and synchronize their time from it. Before setting up clock synchronization, set the time zone. Check whether the machine's time zone is correct:
date -R
If the offset is not "+0800", change the time zone:
cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
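The time zone check can be scripted so it is easy to repeat on every node. A minimal sketch; check_tz is a hypothetical helper name, and the optional argument only exists so the comparison can be tried with a given offset:

```shell
# check_tz [OFFSET]: compare a UTC offset against +0800.
# OFFSET defaults to the local offset reported by `date +%z`.
# (Hypothetical helper.)
check_tz() {
    offset=${1:-$(date +%z)}
    if [ "$offset" = "+0800" ]; then
        echo "time zone OK ($offset)"
    else
        echo "time zone is $offset, expected +0800"
        return 1
    fi
}

# Possible usage:
#   check_tz || cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
```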
Install ntp:
yum install ntp
Modify the configuration file /etc/ntp.conf on h1.hadoop:
vim /etc/ntp.conf
The modified content is as follows:
# restrict default kod nomodify notrap nopeer noquery
# restrict -6 default kod nomodify notrap nopeer noquery
restrict default nomodify
# restrict 192.168.1.0 mask 255.255.255.0 nomodify notrap
restrict 192.168.150.0 mask 255.255.255.0 nomodify notrap
# server 0.centos.pool.ntp.org iburst
# server 1.centos.pool.ntp.org iburst
# server 2.centos.pool.ntp.org iburst
# server 3.centos.pool.ntp.org iburst
server 127.127.1.0
fudge 127.127.1.0 stratum 10
Start ntp:
service ntpd start
Enable it at boot:
chkconfig ntpd on
Client settings (synchronize the time once an hour):
vim /etc/crontab
Add the following content:
# Example of job definition:
# .---------------- minute (0 - 59)
# |  .------------- hour (0 - 23)
# |  |  .---------- day of month (1 - 31)
# |  |  |  .------- month (1 - 12) OR jan,feb,mar,apr ...
# |  |  |  |  .---- day of week (0 - 6) (Sunday=0 or 7) OR sun,mon,tue,wed,thu,fri,sat
# |  |  |  |  |
# *  *  *  *  * user-name command to be executed
1 * * * * root ntpdate h1.hadoop && hwclock -w
F. SSH password-less authentication configuration
Create a dedicated hadoop user for the related operations:
groupadd hadoop
useradd -g hadoop hadoop
passwd hadoop
Hadoop needs to manage its daemons remotely: the NameNode connects to each DataNode over SSH (Secure Shell) to start or stop their processes, so these SSH logins must work without a password. We therefore configure password-less login from the NameNode to the DataNodes and, likewise, from each DataNode to the NameNode.
Configure on each machine:
vim /etc/ssh/sshd_config
Modify the following content:
RSAAuthentication yes # enable RSA authentication
PubkeyAuthentication yes # enable public/private key pair authentication
Generate an RSA key pair on each machine:
su hadoop
ssh-keygen -t rsa -P ''
Operations on h1.hadoop
cat ~/.ssh/id_rsa.pub > ~/.ssh/authorized_keys
scp /home/hadoop/.ssh/authorized_keys h2.hadoop:/home/hadoop/.ssh/authorized_keys
scp /home/hadoop/.ssh/authorized_keys h3.hadoop:/home/hadoop/.ssh/authorized_keys
Operations on h2.hadoop
cat ~/.ssh/id_rsa.pub > ~/.ssh/authorized_keys
scp /home/hadoop/.ssh/authorized_keys h1.hadoop:/home/hadoop/.ssh/authorized_keys
scp /home/hadoop/.ssh/authorized_keys h3.hadoop:/home/hadoop/.ssh/authorized_keys
Operations on h3.hadoop
cat ~/.ssh/id_rsa.pub > ~/.ssh/authorized_keys
scp /home/hadoop/.ssh/authorized_keys h1.hadoop:/home/hadoop/.ssh/authorized_keys
scp /home/hadoop/.ssh/authorized_keys h2.hadoop:/home/hadoop/.ssh/authorized_keys
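One thing to watch in the steps above: both > and scp replace the target authorized_keys file, so each copy discards any key that was placed there earlier. A minimal sketch of an appending variant follows; add_key is a hypothetical helper name, not an OpenSSH tool.

```shell
# add_key PUBKEY_FILE AUTH_FILE: append the key in PUBKEY_FILE to AUTH_FILE
# unless that exact line is already present. (Hypothetical helper; assumes
# PUBKEY_FILE holds a single key line, as written by ssh-keygen.)
add_key() {
    touch "$2"
    # -F fixed string, -x whole-line match, -q quiet
    if ! grep -qxF -- "$(cat "$1")" "$2"; then
        cat "$1" >> "$2"
    fi
}

# Possible usage on h1.hadoop, as the hadoop user:
#   add_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
#   ssh h2.hadoop 'cat >> ~/.ssh/authorized_keys' < ~/.ssh/id_rsa.pub
```

The ssh line in the usage comment still asks for the password once, since the key is not yet installed on the target. Where it is available, ssh-copy-id does the same job.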
Run the following command on each server:
chmod 400 ~/.ssh/authorized_keys
Test:
ssh h2.hadoop
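To test every node in one go, a loop along these lines may be convenient. It is a minimal sketch; check_ssh is a hypothetical helper name, and the SSH_CMD variable is an assumption added only so the ssh client can be swapped out.

```shell
# check_ssh HOST...: try a password-less login to each HOST and report.
# BatchMode=yes makes ssh fail instead of prompting for a password.
# (Hypothetical helper; SSH_CMD overrides the ssh client if set.)
check_ssh() {
    ssh_cmd=${SSH_CMD:-ssh}
    for host in "$@"; do
        if $ssh_cmd -o BatchMode=yes -o ConnectTimeout=5 "$host" true; then
            echo "$host: ok"
        else
            echo "$host: FAILED"
        fi
    done
}

# Possible usage, as the hadoop user:
#   check_ssh h1.hadoop h2.hadoop h3.hadoop
```

With BatchMode enabled, the first connection to a host whose key is not yet in known_hosts also fails, so connect to each node once by hand before relying on this check.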
G. Build a local Yum source
Start a new machine and set up a Tengine environment on it. Configure it as follows:
vim /usr/local/nginx/conf/nginx.conf
location / {
    root /usr/local/nginx/html;   # absolute path of the directory to serve
    autoindex on;                 # enable Tengine directory browsing
    autoindex_exact_size off;
    autoindex_localtime on;
}
Restart the service:
service nginx restart
Download the corresponding source:
cd /usr/local/nginx/html
wget http://archive.cloudera.com/cdh5/repo-as-tarball/5.3.2/cdh5.3.2-centos6.tar.gz
wget http://archive-primary.cloudera.com/cm5/repo-as-tarball/5.3.2/cm5.3.2-centos6.tar.gz
tar zxvf cdh5.3.2-centos6.tar.gz
Open http://192.168.150.128/cdh/ to view the extracted content.
Using the local source is very simple:
vim /etc/yum.repos.d/cloudera-cdh5.repo
Add the following content:
[cloudera-cdh5]
# Packages for Cloudera's Distribution for Hadoop, Version 5, on RedHat or CentOS 6 x86_64
name=Cloudera's Distribution for Hadoop, Version 5
baseurl=http://192.168.150.128/cdh/5.3.2/
enabled=1
gpgcheck=0
After adding the repository, you can install packages with yum install xxx.
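Because the same repo file is needed on every node, it can be generated rather than typed by hand. A minimal sketch; write_cdh_repo is a hypothetical helper name, and the base URL is passed in as a parameter:

```shell
# write_cdh_repo BASEURL FILE: write the cloudera-cdh5 repo definition
# to FILE, pointing at BASEURL. (Hypothetical helper.)
write_cdh_repo() {
    cat > "$2" <<EOF
[cloudera-cdh5]
# Packages for Cloudera's Distribution for Hadoop, Version 5, on RedHat or CentOS 6 x86_64
name=Cloudera's Distribution for Hadoop, Version 5
baseurl=$1
enabled=1
gpgcheck=0
EOF
}

# Possible usage on each node, as root:
#   write_cdh_repo http://192.168.150.128/cdh/5.3.2/ /etc/yum.repos.d/cloudera-cdh5.repo
#   yum clean all
#   yum makecache
```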