Hadoop cluster installation-CDH5 (three server clusters)
CDH5 package download: http://archive.cloudera.com/cdh5/
Host planning:
IP | Host | Deployment module | Process
192.168.107.82 | Hadoop-NN-01 | NameNode ResourceManager | NameNode DFSZKFailoverController ResourceManager
192.168.107.83 | Hadoop-DN-01 Zookeeper-01 | DataNode NodeManager Zookeeper | DataNode NodeManager JournalNode QuorumPeerMain
192.168.107.84 | Hadoop-DN-02 Zookeeper-02 | DataNode NodeManager Zookeeper | DataNode NodeManager JournalNode QuorumPeerMain
Process description:
- NameNode
- ResourceManager
- DFSZKFC: the DFS ZooKeeper Failover Controller (DFSZKFailoverController), which promotes the Standby NameNode to Active on failover
- DataNode
- NodeManager
- JournalNode: the node service through which the NameNodes share the edit log (if the edit log is shared over NFS instead, this process and all of its startup configuration can be omitted).
- QuorumPeerMain: the ZooKeeper main process
Directory planning:
Name | Path
$HADOOP_HOME | /home/hadoopuser/hadoop-2.6.0-cdh5.6.0
Data | $HADOOP_HOME/data
Log | $HADOOP_HOME/logs
Configuration:
1. Disable the firewall (the firewall can be configured later)
2. Install the JDK (omitted)
3. Set the hostname and configure /etc/hosts (on all 3 hosts)
[root@Linux01 ~]# vim /etc/sysconfig/network
[root@Linux01 ~]# vim /etc/hosts
192.168.107.82 Hadoop-NN-01
192.168.107.83 Hadoop-DN-01 Zookeeper-01
192.168.107.84 Hadoop-DN-02 Zookeeper-02
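Editing /etc/hosts by hand on three machines invites typos, so the same entries can be added with a small script. The following is a minimal sketch, not part of the original procedure: add_host_entry is a hypothetical helper name, and it assumes each IP address appears at most once as the first field of a line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append "IP name..." to a hosts file unless a line
# whose first field is exactly that IP already exists.
add_host_entry() {
    local hosts_file="$1" ip="$2"
    shift 2
    # awk compares the first field exactly, so 192.168.107.8 does not
    # match 192.168.107.82. A missing file counts as "not found".
    if ! awk -v ip="$ip" '$1 == ip { found = 1 } END { exit !found }' \
            "$hosts_file" 2>/dev/null; then
        printf '%s %s\n' "$ip" "$*" >> "$hosts_file"
    fi
}

# Example (run as root on each host):
#   add_host_entry /etc/hosts 192.168.107.82 Hadoop-NN-01
#   add_host_entry /etc/hosts 192.168.107.83 Hadoop-DN-01 Zookeeper-01
#   add_host_entry /etc/hosts 192.168.107.84 Hadoop-DN-02 Zookeeper-02
```

Because the helper skips IPs that are already present, it can be re-run safely after a partial setup.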
4. For security, create a dedicated Hadoop login user (on every host)
[root@Linux01 ~]# useradd hadoopuser
[root@Linux01 ~]# passwd hadoopuser
[root@Linux01 ~]# su - hadoopuser # switch user
5. Configure passwordless SSH login (2 NameNode servers)
[hadoopuser@Linux05 hadoop-2.6.0-cdh5.6.0]$ ssh-keygen # generate a public/private key pair
[hadoopuser@Linux05 hadoop-2.6.0-cdh5.6.0]$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoopuser@Hadoop-NN-01
-i specifies the identity (public key) file to copy
~/.ssh/id_rsa.pub is the public key that gets installed on the remote host
Or omit the -i option:
[hadoopuser@Linux05 hadoop-2.6.0-cdh5.6.0]$ ssh-copy-id Hadoop-NN-01 # copy the public key to the remote server (an IP address such as 10.10.51.231 also works in place of the hostname)
[hadoopuser@Linux05 hadoop-2.6.0-cdh5.6.0]$ ssh-copy-id "-p 6000 Hadoop-NN-01" # use this form if SSH listens on a non-default port
Note: if SSH uses a non-default port, also set it in Hadoop's configuration file hadoop-env.sh:
export HADOOP_SSH_OPTS="-p 6000"
[hadoopuser@Linux05 hadoop-2.6.0-cdh5.6.0]$ ssh Hadoop-NN-01 # verify (to leave the connection: exit or logout)
[hadoopuser@Linux05 hadoop-2.6.0-cdh5.6.0]$ ssh Hadoop-NN-01 -p 6000 # use this form if SSH listens on a non-default port
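The manual verification above can also be scripted so that a forgotten host shows up immediately. This is a rough sketch under stated assumptions: SSH_CMD and SSH_PORT are hypothetical variables introduced only for this example (they are not Hadoop settings), and the function names are illustrative. BatchMode=yes is a standard OpenSSH option that makes ssh fail rather than prompt for a password.

```shell
#!/usr/bin/env bash
# Hypothetical knobs for this sketch; override them before calling.
SSH_CMD="${SSH_CMD:-ssh}"
SSH_PORT="${SSH_PORT:-22}"

# Return 0 only if key-based login to $1 succeeds without any prompt.
can_login_without_password() {
    local host="$1"
    "$SSH_CMD" -o BatchMode=yes -o ConnectTimeout=5 -p "$SSH_PORT" "$host" true
}

# Print OK/FAIL per host; return 1 if any host failed.
check_hosts() {
    local host failed=0
    for host in "$@"; do
        if can_login_without_password "$host"; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    return "$failed"
}

# Example:
#   SSH_PORT=6000 check_hosts Hadoop-NN-01 Hadoop-DN-01 Hadoop-DN-02
```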
6. Configure environment variables: edit ~/.bashrc, then run source ~/.bashrc (on every host)
[hadoopuser@Linux01 ~]$ vi ~/.bashrc
# hadoop cdh5
export HADOOP_HOME=/home/hadoopuser/hadoop-2.6.0-cdh5.6.0
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
[hadoopuser@Linux01 ~]$ source ~/.bashrc # apply the changes
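After sourcing ~/.bashrc it is worth confirming that both Hadoop directories actually ended up on PATH. A small sketch of such a check follows; path_contains is a hypothetical helper name, not a Hadoop or shell built-in.

```shell
#!/usr/bin/env bash
# Return 0 if directory $1 is one of the colon-separated entries of $PATH.
# Wrapping PATH in colons lets the first and last entries match as well.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example:
#   path_contains "$HADOOP_HOME/sbin" && path_contains "$HADOOP_HOME/bin" \
#       || echo "PATH was not updated; re-check ~/.bashrc"
```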
7. Install ZooKeeper (2 DataNode servers)
1. Extract the archive
2. Configure environment variables: vi ~/.bashrc
[hadoopuser@Linux01 ~]$ vi ~/.bashrc
# zookeeper cdh5
export ZOOKEEPER_HOME=/home/hadoopuser/zookeeper-3.4.5-cdh5.6.0
export PATH=$PATH:$ZOOKEEPER_HOME/bin
[hadoopuser@Linux01 ~]$ source ~/.bashrc # apply the changes
3. Change the log output directory
[hadoopuser@Linux01 ~]$ vi $ZOOKEEPER_HOME/libexec/zkEnv.sh
Line 56: find the ZOO_LOG_DIR statement and change it to:
ZOO_LOG_DIR="$ZOOKEEPER_HOME/logs"
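The same change can be made non-interactively with sed, which helps when it has to be repeated on both ZooKeeper hosts. This is only a sketch: set_zoo_log_dir is a hypothetical helper, and it assumes the ZOO_LOG_DIR assignment sits on its own line in zkEnv.sh. It rewrites every such line and keeps a .bak backup.

```shell
#!/usr/bin/env bash
# Hypothetical helper: point every ZOO_LOG_DIR assignment in the given file
# at $ZOOKEEPER_HOME/logs. The single quotes keep $ZOOKEEPER_HOME literal,
# so it is expanded later, when zkEnv.sh itself runs.
set_zoo_log_dir() {
    local env_file="$1"
    sed -i.bak -E \
        's|^([[:space:]]*)ZOO_LOG_DIR=.*|\1ZOO_LOG_DIR="$ZOOKEEPER_HOME/logs"|' \
        "$env_file"
}

# Example:
#   set_zoo_log_dir "$ZOOKEEPER_HOME/libexec/zkEnv.sh"
```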
4. Modify the configuration file
[hadoopuser@Linux01 ~]$ vi $ZOOKEEPER_HOME/conf/zoo.cfg
# zookeeper
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/home/hadoopuser/zookeeper-3.4.5-cdh5.6.0/data
clientPort=2181
# cluster
server.1=Zookeeper-01:2888:3888
server.2=Zookeeper-02:2888:3888
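To keep zoo.cfg identical on both nodes, the file can be generated rather than typed. The sketch below reproduces the settings shown above; write_zoo_cfg is a hypothetical helper, and numbering the servers in argument order (server.1, server.2, ...) is an assumption of this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a zoo.cfg with the values used in this guide.
#   $1 = output file, $2 = dataDir, remaining args = ensemble hostnames.
write_zoo_cfg() {
    local cfg="$1" data_dir="$2"
    shift 2
    local id=1 host
    {
        echo "tickTime=2000"
        echo "initLimit=10"
        echo "syncLimit=5"
        echo "dataDir=$data_dir"
        echo "clientPort=2181"
        # One server.N line per host: 2888 is the quorum port,
        # 3888 the leader-election port.
        for host in "$@"; do
            echo "server.$id=$host:2888:3888"
            id=$((id + 1))
        done
    } > "$cfg"
}

# Example:
#   write_zoo_cfg "$ZOOKEEPER_HOME/conf/zoo.cfg" "$ZOOKEEPER_HOME/data" \
#       Zookeeper-01 Zookeeper-02
```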
5. Set myid
(1) Hadoop-DN-01:
mkdir $ZOOKEEPER_HOME/data
echo 1 > $ZOOKEEPER_HOME/data/myid
(2) Hadoop-DN-02:
mkdir $ZOOKEEPER_HOME/data
echo 2 > $ZOOKEEPER_HOME/data/myid
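The number written to myid must match the N of that host's server.N line in zoo.cfg, and a mismatch is an easy mistake to make. One way to avoid it is to look the id up in zoo.cfg instead of typing it. This is a sketch only: myid_for is a hypothetical helper, and it assumes the server.N=host:port:port layout shown in this guide.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the N of the "server.N=<name>:..." line in
# zoo.cfg ($1) whose host part equals <name> ($2). Prints nothing if absent.
myid_for() {
    awk -F= -v name="$2" '
        $1 ~ /^server\.[0-9]+$/ {
            split($2, parts, ":")          # parts[1] is the hostname
            if (parts[1] == name) {
                sub(/^server\./, "", $1)   # keep only the numeric id
                print $1
            }
        }' "$1"
}

# Example (on Hadoop-DN-01):
#   mkdir -p "$ZOOKEEPER_HOME/data"
#   myid_for "$ZOOKEEPER_HOME/conf/zoo.cfg" Zookeeper-01 > "$ZOOKEEPER_HOME/data/myid"
```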
6. Start each node:
[hadoopuser@Linux01 ~]$ zkServer.sh start
7. Verification
[hadoopuser@Linux01 ~]$ jps
3051 Jps
2829 QuorumPeerMain
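The jps check can be turned into a pass/fail test, for example in a startup script. A minimal sketch, assuming the default "PID ClassName" output format of jps shown above; has_java_process is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Read jps output on stdin; return 0 if a process whose class name
# (second column) equals $1 is listed, 1 otherwise.
has_java_process() {
    awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

# Example:
#   jps | has_java_process QuorumPeerMain \
#       && echo "ZooKeeper is running" \
#       || echo "ZooKeeper is NOT running"
```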
8. Check the status
[hadoopuser@Linux01 ~]$ zkServer.sh status
JMX enabled by