Hadoop Foundation----Hadoop in Practice (VI)-----Hadoop Management Tools---Cloudera Manager---CDH Introduction
We introduced CDH in the previous article. Here we install CDH 5.8 for the study that follows. CDH 5.8 is a relatively recent release based on Hadoop 2.x, and it already bundles many of the Hadoop ecosystem components we will learn next.
Environment
Cloudera Manager is designed to simplify deploying Hadoop and its components, but because so many components are involved, its memory requirements are high. Give the primary node (master) and the secondary nodes as much memory and disk space as you can; otherwise you may run into errors that are hard to diagnose.
Officially recommended configuration
Primary node: 10G memory or more, 30G disk
Secondary node: 4G memory or more, 30G disk
If you cannot meet these figures, you can reduce the configuration somewhat, but the installation is then not guaranteed to succeed.
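To see where a machine stands against these figures before installing, a small check like the following can help. This is only a sketch: the function names are my own, the thresholds are passed in as parameters, and it assumes a Linux system with /proc/meminfo and a POSIX df.

```shell
#!/usr/bin/env bash
# Sketch: compare a node's memory and free disk space with the sizes quoted above.
# All function names here are illustrative, not part of Cloudera Manager.

# meets_minimum ACTUAL_KB MIN_GB -> status 0 when ACTUAL_KB is at least MIN_GB gigabytes
meets_minimum() {
    local actual_kb=$1 min_gb=$2
    [ "$actual_kb" -ge $(( min_gb * 1024 * 1024 )) ]
}

# Total RAM in kB as reported by the kernel.
# Note: MemTotal is usually a little below the installed RAM.
total_mem_kb() {
    awk '/^MemTotal:/ {print $2}' /proc/meminfo
}

# Free space in kB on the filesystem that holds the given path.
free_disk_kb() {
    df -Pk "$1" | awk 'NR==2 {print $4}'
}

# report_node MIN_MEM_GB MIN_DISK_GB: print one line per resource.
report_node() {
    local min_mem_gb=$1 min_disk_gb=$2
    if meets_minimum "$(total_mem_kb)" "$min_mem_gb"; then
        echo "memory: OK"
    else
        echo "memory: below ${min_mem_gb}G"
    fi
    if meets_minimum "$(free_disk_kb /)" "$min_disk_gb"; then
        echo "disk: OK"
    else
        echo "disk: less than ${min_disk_gb}G free on /"
    fi
}
```

On the primary node you would call report_node 10 30, and on a secondary node report_node 4 30. Because MemTotal is slightly lower than the memory assigned to the machine, a node configured with exactly the minimum may still be reported as below it.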
We again use VMware to create three virtual machines for the deployment.
Because of hardware limits, my virtual machines are set up as follows:
Host PC system: Windows 10 (an arbitrary choice; it should not matter)
Host PC memory: 16G
Virtualization software: VMware Workstation 11
Virtual machine system: CentOS-6.4-x86_64, that is, 64-bit CentOS 6.4
Virtual machine: Cloudera SCM Server, 8G memory, 30G disk, the primary node (master)
Virtual machine 1: Cloudera SCM Agent, 2G memory, 30G disk, a secondary node (slave1)
Virtual machine 2: Cloudera SCM Agent, 2G memory, 30G disk, a secondary node (slave2)
Cloudera Manager 5.8.2
CDH 5.8.2
JDK 1.8
MySQL 5.6.34
Select Installation Method
Method 1: online installation with cloudera-manager-installer.bin
Method 2: online installation with packages (rpm, yum, apt-get)
Method 3: offline installation with tarballs
As mentioned before, Cloudera offers three installation methods. The first is the most convenient: it works much like installing ordinary client software and is easy to operate.
However, the first and second methods are both online installations. They depend on network speed and require the virtual machines to reach the external network, and some of the resources are blocked, so an online installation is very slow and quite likely to fail.
So we use the third method. An offline installation does not need the virtual machines to reach the external network at all; it only requires that the three virtual machines and the PC can ping each other.
In my case the PC can reach the external network but the three virtual machines cannot (IP restrictions), so I use a completely offline installation.
Download Related Packages
Because the downloads can take a while, start them all first. You can configure the Linux systems while they download; the packages below are not needed until the installation steps at the end.
Oracle JDK
Oracle's JDK, version 1.7 or later, is required.
Download address
http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
I chose the rpm package here.
MySQL offline installation package
Open http://dev.mysql.com/downloads/mysql/ and, under Select Platform, choose Linux - Generic.
Then select the Linux - Generic (glibc 2.5) (x86, 64-bit) RPM download.
I downloaded version 5.6.34. If you want the same version, you can use this link:
http://cdn.mysql.com//Downloads/MySQL-5.6/MySQL-5.6.34-1.linux_glibc2.5.x86_64.rpm-bundle.tar
JDBC
Hive, Oozie, Hue and other components use MySQL, so MySQL must be installed, and the JDBC driver is what connects them to it.
Any version from the MySQL website will do. I chose 5.1.40; the version here is arbitrary and does not need to match anything else:
http://download.softagency.net/MySQL/Downloads/Connector-J/
or
http://dev.mysql.com/downloads/connector/j/
http://download.softagency.net/MySQL/Downloads/Connector-J/mysql-connector-java-5.1.40.zip
Cloudera Manager installation package
Resource link
http://archive.cloudera.com/cm5/
Choose the version that matches the Linux system you use.
Since our virtual machines run CentOS 6.4, we need the following file:
Cloudera Manager 5.8.2 installation package
http://archive.cloudera.com/cm5/cm/5/cloudera-manager-el6-cm5.8.2_x86_64.tar.gz
CDH installation package
Resource link: http://archive.cloudera.com/cdh5/
The parcel must match the operating system: CentOS 6.x uses CDH-x.x.x-1.cdhx.x.x.p0.22-el6.parcel, while CentOS 5.x uses CDH-x.x.x-1.cdhx.x.x.p0.22-el5.parcel.
Also note that the CDH version must be less than or equal to the CM version. Our CM is 5.8.2, so choose a CDH version no higher than 5.8.2.
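The "CDH version must not exceed the CM version" rule can be checked mechanically. The sketch below is my own helper, not a Cloudera tool; it assumes GNU sort with the -V (version sort) option and compares plain dotted version numbers only.

```shell
#!/usr/bin/env bash
# Sketch: check that a CDH version is not newer than the CM version.
# version_le is a hypothetical helper name; it relies on GNU "sort -V".

# version_le A B -> status 0 when version A is less than or equal to version B
version_le() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

cm_version=5.8.2
for cdh_version in 5.8.0 5.8.2 5.9.0; do
    if version_le "$cdh_version" "$cm_version"; then
        echo "CDH $cdh_version: not newer than CM $cm_version"
    else
        echo "CDH $cdh_version: newer than CM $cm_version, pick an older parcel"
    fi
done
```

A version sort is used instead of a plain string comparison so that a version such as 5.10.0 is correctly treated as newer than 5.8.2.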
Since we are on CentOS, open the parcels folder.
Pick the package that matches your own system and environment.
I install CDH 5.8.0 here.
Note: under 5.8.2 I could not find a cdh5.8.2-0 build, and the cdh5.8.2-1 build is newer than CM 5.8.2, so I use CDH 5.8.0.
http://archive.cloudera.com/cdh5/parcels/5.8.0/CDH-5.8.0-1.cdh5.8.0.p0.42-el6.parcel
http://archive.cloudera.com/cdh5/parcels/5.8.0/CDH-5.8.0-1.cdh5.8.0.p0.42-el6.parcel.sha1
http://archive.cloudera.com/cdh5/parcels/5.8.0/manifest.json
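Since the parcel is large and is downloaded outside the virtual machines, it is worth confirming that it arrived intact by comparing it with the .sha1 file. A minimal sketch, assuming the .sha1 file holds the hex digest as the first field of its first line and that sha1sum (GNU coreutils) is available; verify_parcel is a name I made up:

```shell
#!/usr/bin/env bash
# Sketch: compare a downloaded parcel with its published SHA-1 digest.

# verify_parcel PARCEL SHA_FILE -> status 0 when the digests match
verify_parcel() {
    local parcel=$1 sha_file=$2 expected actual
    # Take the first field only, so a "digest  filename" layout also works.
    expected=$(awk 'NR==1 {print $1}' "$sha_file")
    actual=$(sha1sum "$parcel" | awk '{print $1}')
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Example call (run it in the directory that holds the downloaded files):
#   verify_parcel CDH-5.8.0-1.cdh5.8.0.p0.42-el6.parcel \
#                 CDH-5.8.0-1.cdh5.8.0.p0.42-el6.parcel.sha1 \
#       && echo "parcel OK" || echo "parcel corrupt, download it again"
```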
Once all the downloads finish, you should have the files listed above.
System Installation
If you are using physical machines or Alibaba Cloud servers, you can skip this step. Ours is a learning environment, so we build three virtual machines with VMware Workstation 11.
For detailed steps, see:
(Note: when choosing the installation type, select Minimal Desktop; the plain Minimal installation has no graphical interface.)
Hadoop Foundation------Virtual Machines (II)---Installing a virtual machine and a Linux system
We can first create one virtual machine with 2G of memory and a 30G disk, then clone it twice.
Then change the memory of one of them to 8G; it becomes the master host that runs the Cloudera SCM Server.
Now we have 3 virtual machines.
Configured as follows
CM0 Memory 8G Disk 30G
CM1 Memory 2G Disk 30G
CM2 Memory 2G Disk 30G
I installed all of the component services here. Judging by the disk and memory usage after installation, too little memory or disk space really will not do.
Installing VMware Tools
If you are using physical machines, you can skip this step.
VMware Tools lets you copy and paste between the virtual machines and the PC host; without it every command has to be typed by hand, which is inconvenient.
Detailed steps for installing VMware Tools are in:
Hadoop Foundation-------Virtual Machines (III)-----Copy and paste between a VMware Linux virtual machine and the Windows host
and
Hadoop Foundation------Virtual Machines (IV)-----Switching between the graphical interface and the command-line text interface on Linux under VMware
Network Configuration
So that the virtual machines can ping each other, we need to configure each virtual machine's network to match the PC host. On physical machines you also need to configure the network and test that the machines can ping each other.
We have covered this before, so you can follow the detailed steps in the article below.
Note: because the virtual machines are clones, the network card number is incremented automatically. The original virtual machine has eth0, and the clones have eth1, eth2 and so on.
The network card number in the commands must therefore match,
and so must the DEVICE=ethX setting.
Linux Basics (10)----Detailed Linux network configuration steps---Bridged mode and remote communication between two machines
If you are unfamiliar with virtual machine networking, you can also read:
Hadoop Foundation-------Virtual Machines (V)-----The three network modes for Linux virtual machines
PS: If, after everything is set up, both the physical and virtual machines can ping 192.168.x.1 and the physical machine can ping the virtual machines, but the virtual machines cannot ping the physical machine, the cause is usually a firewall.
Problems that you may encounter
Problems encountered----The eth0 network in the Linux system is gone: the ifcfg-eth0 configuration is not loaded on reboot and has to be reactivated
After the configuration is complete, use ifconfig to check the network settings.
Shutting down the firewall
The firewall and SELinux must be turned off on both the physical and the virtual machines during the installation; otherwise the installation will run into errors.
Use the getenforce command to check whether SELinux is off.
Edit the /etc/selinux/config file:
change SELINUX=enforcing to SELINUX=disabled, then restart the machine.
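With three machines to change, the same edit can be scripted instead of made by hand in an editor. The following is a sketch only: the function names are mine, and it assumes GNU sed (for the -i option). Each function takes the file path as an argument so it can be tried on a copy first.

```shell
#!/usr/bin/env bash
# Sketch: switch the SELINUX= setting in an SELinux config file to "disabled".

# disable_selinux_in CONFIG_FILE
disable_selinux_in() {
    local config=$1
    # Only the SELINUX= line is rewritten; SELINUXTYPE= and comments are left alone.
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$config"
}

# selinux_mode_in CONFIG_FILE: print the mode currently configured in the file
selinux_mode_in() {
    sed -n 's/^SELINUX=//p' "$1"
}

# Example, as root on each machine:
#   disable_selinux_in /etc/selinux/config
#   selinux_mode_in /etc/selinux/config
```

The file is only read at boot, which is why a restart is needed before getenforce reports Disabled. If a restart is not possible right away, setenforce 0 puts SELinux into permissive mode until the next reboot.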
View the firewall state:
service iptables status
Keep the firewall from starting at boot:
chkconfig iptables off
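Note that chkconfig only affects the next boot, while service iptables stop is what stops a firewall that is already running. The sketch below wraps both and adds a small check of the chkconfig --list output. The function names are hypothetical, and the parser assumes the usual CentOS 6 output layout of a service name followed by fields such as 0:off and 3:on.

```shell
#!/usr/bin/env bash
# Sketch: turn iptables off now and at boot, then confirm no runlevel still enables it.

# Must be run as root on a CentOS 6 style system; not called automatically here.
firewall_off_now_and_at_boot() {
    service iptables stop      # stop the running firewall
    chkconfig iptables off     # do not start it at boot
}

# runlevels_on: read "chkconfig --list <service>" output on stdin and
# print the runlevels in which the service is still enabled.
runlevels_on() {
    awk '{
        out = ""
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^[0-6]:on$/) {
                split($i, parts, ":")
                out = out (out == "" ? "" : " ") parts[1]
            }
        }
        print out
    }'
}

# Example:
#   chkconfig --list iptables | runlevels_on
# An empty line means the firewall is disabled in every runlevel.
```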
Modify the host name and the hosts file
Command to modify the host name:
vi /etc/sysconfig/network
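The same change can also be made without opening an editor. This is a sketch under assumptions: the function names are mine, the host name master and the IP address in the example are placeholders for your own values, and GNU sed is assumed. The HOSTNAME= line is the setting CentOS 6 reads from /etc/sysconfig/network.

```shell
#!/usr/bin/env bash
# Sketch: set the host name in a sysconfig network file and add entries to a hosts file.

# set_hostname_in NETWORK_FILE NAME: replace the HOSTNAME= line, or add one if missing.
set_hostname_in() {
    local file=$1 name=$2
    if grep -q '^HOSTNAME=' "$file"; then
        sed -i "s/^HOSTNAME=.*/HOSTNAME=$name/" "$file"
    else
        echo "HOSTNAME=$name" >> "$file"
    fi
}

# add_host_entry HOSTS_FILE IP NAME: append "IP NAME" unless NAME is already listed.
add_host_entry() {
    local file=$1 ip=$2 name=$3
    if ! grep -qw -- "$name" "$file"; then
        printf '%s\t%s\n' "$ip" "$name" >> "$file"
    fi
}

# Example, as root (placeholder name and address):
#   set_hostname_in /etc/sysconfig/network master
#   add_host_entry /etc/hosts 192.168.1.100 master
```

Running add_host_entry for every node on every machine gives all three machines the same name-to-address mapping, and running it twice does not create duplicate lines.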
Command to view the host name:
hostname
The changes are as follows (they take effect after a reboot):