Hadoop Foundation - Hadoop Combat (VII) - Hadoop Management Tools - Installing Hadoop - Offline Installation of Cloudera Manager and CDH 5.8


Previous article: Hadoop Foundation - Hadoop Combat (VI) - Hadoop Management Tools - Cloudera Manager - CDH Introduction

We introduced CDH in the previous article. Here we install CDH 5.8, which the following articles build on. CDH 5.8 is a relatively recent release built on a Hadoop version later than 2.0, and it already bundles many of the Hadoop ecosystem components we will study next.


Cloudera Manager is designed to simplify the deployment of Hadoop and its components, but because so many components are involved, its memory requirements are high. Give the primary (master) node and the secondary nodes as much memory and disk space as you can; otherwise you may run into errors that are hard to diagnose.

Officially Recommended Configuration

Primary node: 10 GB of memory or more, 30 GB of disk

Secondary node: 4 GB of memory or more, 30 GB of disk

If your machines cannot meet these requirements, you can reduce the configuration somewhat, but the installation is then not guaranteed to succeed.

As before, we use VMware to create three virtual machines for the deployment.

Given my hardware limits, the environment is set up as follows:

Host PC system: Windows 10 (any system will do; it should not matter)

Host PC memory: 16 GB

Virtualization software: VMware Workstation 11

Virtual machine system: CentOS-6.4-x86_64, that is, the 64-bit version of CentOS 6.4

Virtual machine (Cloudera SCM Server): 8 GB memory, 30 GB disk, the primary node master

Virtual machine 1 (Cloudera SCM Agent): 2 GB memory, 30 GB disk, the secondary node slave1

Virtual machine 2 (Cloudera SCM Agent): 2 GB memory, 30 GB disk, the secondary node slave2

Cloudera Manager 5.8.2

CDH 5.8.2

JDK 1.8

MySQL 5.6.34

Choosing an Installation Method

Method 1: online installation using cloudera-manager-installer.bin
Method 2: online installation using packages (RPM with yum, or apt-get)
Method 3: offline installation using tarballs

Cloudera Manager offers three installation methods. The first is the most convenient: it works like installing ordinary client software and is easy to operate.

However, the first and second methods are both online installations. They depend on network speed and require the virtual machines to reach the external network, and some of the resources are blocked, so an online installation is very slow and quite likely to fail.

We therefore use the third method. With an offline installation the virtual machines do not need external network access at all, but the three virtual machines and the host PC must be able to ping each other.

In my case the host PC can reach the external network but the three virtual machines cannot (because of IP restrictions), so I use a fully offline installation.

Download Related Packages

The downloads may take a while, so start downloading everything first. You can install and configure the Linux systems while the downloads run; the packages below are not needed until the final installation section.

Oracle JDK

Oracle's JDK, version 1.7 or later, is required.

Download Address


I chose the RPM package here.

MySQL offline installation package

Open http://dev.mysql.com/downloads/mysql/ and, under Select Platform, choose Linux - Generic.
Then select the Linux - Generic (glibc 2.5) (x86, 64-bit) RPM package for download.

I downloaded version 5.6.34. If you want the same version as mine, you can use the link



Hive, Oozie, Hue and other components use MySQL, so installing MySQL is necessary. They connect to MySQL through the JDBC driver.

Any version of the driver from MySQL's official website will do. I chose 5.1.40; the version is arbitrary and does not need to match anything else:





Cloudera Manager installation package

Resource Links


Choose the version to download according to the Linux system you use.

Our virtual machines run CentOS 6.4, so we need to download the following file:

Cloudera Manager 5.8.2 installation package


CDH Installation Package

Resource link: http://archive.cloudera.com/cdh5/

The parcel must match the operating system: CentOS 6.x uses CDH-x.x.x-1.cdhx.x.x.p0.22-el6.parcel, while CentOS 5.x uses CDH-x.x.x-1.cdhx.x.x.p0.22-el5.parcel.
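To pick the right suffix without guessing, we can derive it from the release string of the system. This is only a small sketch: the function name parcel_suffix is my own, and it assumes the release string looks like the one CentOS writes to /etc/redhat-release.

```shell
# Map a CentOS release string to the parcel suffix (el5, el6, ...).
# parcel_suffix is a hypothetical helper, not part of any Cloudera tool.
parcel_suffix() {
    # Extract the first number in the string,
    # e.g. "6" from "CentOS release 6.4 (Final)".
    major=$(printf '%s\n' "$1" | sed 's/[^0-9]*\([0-9][0-9]*\).*/\1/')
    echo "el$major"
}

parcel_suffix "CentOS release 6.4 (Final)"   # prints el6
```

On a real machine the argument could be the output of cat /etc/redhat-release.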

Also note that the CDH version must be less than or equal to the CM version. Our CM is version 5.8.2, so choose a CDH version no higher than 5.8.2.
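The rule "CDH version less than or equal to CM version" can be checked with a few lines of shell. The following is a minimal sketch under my own assumptions: version_le is a name I made up, and it only handles plain dotted numbers such as 5.8.2, not build suffixes.

```shell
# Return 0 (true) when dotted version $1 is less than or equal to $2.
# version_le is a hypothetical helper; it compares numeric components only.
version_le() {
    a=$1 b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        # Take the leading component of each version; a missing part counts as 0.
        x=${a%%.*}; y=${b%%.*}
        [ -n "$x" ] || x=0
        [ -n "$y" ] || y=0
        if [ "$x" -lt "$y" ]; then return 0; fi
        if [ "$x" -gt "$y" ]; then return 1; fi
        # Drop the component that was just compared.
        case $a in *.*) a=${a#*.} ;; *) a= ;; esac
        case $b in *.*) b=${b#*.} ;; *) b= ;; esac
    done
    return 0
}

if version_le 5.8.0 5.8.2; then
    echo "CDH 5.8.0 can be managed by CM 5.8.2"
fi
```

The components are compared as numbers rather than as strings, so 5.10.0 is correctly treated as newer than 5.8.2.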

Since we are on CentOS, open the parcels folder.

Select the installation package that matches your own system and environment.

I install CDH 5.8.0 here.

Note that for CDH 5.8.2 I could not find a cdh5.8.2-0 build, and the cdh5.8.2-1 build is newer than CM 5.8.2, so I use CDH 5.8.0 here.




The files after the downloads complete

System Installation

If you are using physical machines or cloud servers such as Alibaba Cloud, you can skip this step. This is a learning environment, so we build three virtual machines with VMware Workstation 11.

For detailed steps, see:

(Note: choose Minimal Desktop as the installation type. If you choose Minimal, the system will have no graphical interface.)

Hadoop Foundation - Virtual Machines (II) - Virtual machine installation and installation of Linux systems

We can first create one virtual machine with 2 GB of memory and a 30 GB disk, and then clone it twice.

Then change the memory of one of them to 8 GB; it becomes the master host running Cloudera SCM Server.

Now we have 3 virtual machines.

They are configured as follows:

CM0: 8 GB memory, 30 GB disk

CM1: 2 GB memory, 30 GB disk

CM2: 2 GB memory, 30 GB disk

I installed all of the component services. The disk and memory usage after installation is as follows; with too little memory or disk space it really will not work.

Installing VMware Tools

If you are using physical machines, you can skip this step.

VMware Tools lets you copy and paste between the virtual machines and the host PC. Without it, every command has to be typed by hand, which is inconvenient.

Detailed steps for installing VMware Tools can be found in:

Hadoop Foundation - Virtual Machines (III) - Copy and paste between a VMware virtual machine Linux system and the Windows host


Hadoop Foundation - Virtual Machines (IV) - Switching between the graphical interface and the command-line text interface of Linux systems under VMware virtual machines

Network Configuration

For the virtual machines to ping each other, we need to configure each virtual machine's network to match the network of the host PC. On physical machines you also need to configure the network and test that the machines can ping each other.
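Once the network is configured, the ping test can be repeated quickly with a small loop. This is just a sketch: check_hosts and the PING_CMD variable are names I made up, and the host names in the usage comment are placeholders for your own machines.

```shell
# Ping each host once and report which ones do not answer.
# PING_CMD can be overridden, for example to try the script without a network.
PING_CMD=${PING_CMD:-ping -c 1 -W 2}

check_hosts() {
    failed=0
    for host in "$@"; do
        # $PING_CMD is deliberately unquoted so that its options are split.
        if $PING_CMD "$host" >/dev/null 2>&1; then
            echo "ok   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    return $failed
}

# Usage (host names are placeholders):
#   check_hosts cm0 cm1 cm2
```

Run it on every machine, including the host PC if it has a shell, because the ping has to work in both directions.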

We covered this earlier, so you can follow the detailed steps in this article:

Note: because the other virtual machines are clones, the network card number is incremented automatically. The original virtual machine has eth0, and the clones have eth1, eth2 and so on.

The network card name used in the commands must therefore match.

The DEVICE=ethX setting must match as well.
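As an illustration of where DEVICE=ethX appears, here is a sketch that prints the body of a static-IP interface file such as /etc/sysconfig/network-scripts/ifcfg-eth1. The function name render_ifcfg is mine, and the addresses and netmask are example values that I assumed, not values from this article.

```shell
# Print a static-IP ifcfg file body for the given device.
# render_ifcfg is a hypothetical helper; all addresses are example values.
render_ifcfg() {
    device=$1 ipaddr=$2 gateway=$3
    cat <<EOF
DEVICE=$device
TYPE=Ethernet
ONBOOT=yes
BOOTPROTO=static
IPADDR=$ipaddr
NETMASK=255.255.255.0
GATEWAY=$gateway
EOF
}

# On a clone whose card is eth1:
render_ifcfg eth1 192.168.1.101 192.168.1.1
```

The point is only that the device name passed in, the DEVICE= line and the ifcfg-ethX file name should all refer to the same card.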

Linux Basics (10) - Detailed steps for Linux network configuration - Bridging mode and remote communication between two machines

If you are unfamiliar with virtual machine networking, you can also read:

Hadoop Foundation - Virtual Machines (V) - Three modes of network configuration for virtual machine Linux systems

PS: If, after everything is set up, the physical machine and the virtual machines can all ping 192.168.x.1 and the physical machine can ping the virtual machines, but the virtual machines cannot ping the physical machine, the cause is usually a firewall.

Problems that you may encounter

Problems encountered - The eth0 network in the Linux system is gone - The ifcfg-eth0 configuration is not loaded after reboot and needs to be reactivated

After the configuration is complete, use ifconfig to view the network state, as follows:

Shutting Down the Firewall

The firewall and SELinux must be turned off on both the physical machine and the virtual machines during the installation; otherwise the installation will run into errors.

Use the getenforce command to check whether SELinux is off

Modify the /etc/selinux/config file
Change SELINUX=enforcing to SELINUX=disabled, then restart the machine for the change to take effect
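The edit to /etc/selinux/config can also be scripted. This is a minimal sketch, assuming the file contains a line starting with SELINUX=; the function name disable_selinux_config is my own, and it keeps a .bak copy so the change can be undone.

```shell
# Rewrite the SELINUX= line of an SELinux config file to "disabled".
# $1 is the config path; it defaults to the real file on CentOS.
# disable_selinux_config is a hypothetical helper name.
disable_selinux_config() {
    conf=${1:-/etc/selinux/config}
    # Keep a backup, then write the file back with the mode replaced.
    # The pattern does not touch the SELINUXTYPE= line.
    cp "$conf" "$conf.bak" &&
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$conf.bak" > "$conf"
}

# Usage (as root):
#   disable_selinux_config
#   getenforce   # check again after the reboot
```

If a reboot is inconvenient right away, setenforce 0 switches SELinux to permissive mode for the running system until the next restart.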

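Use service iptables status to view the firewall state

Use service iptables stop to stop the firewall now, and chkconfig iptables off to keep it from starting at boot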

Modifying the Host Name and hosts File

Command to modify the host name


Command to view the host name


The result is as follows (after reboot)
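On CentOS 6 the host name is normally set through the HOSTNAME= line in /etc/sysconfig/network and shown by the hostname command, and every node needs the same name-to-address entries in /etc/hosts. The sketch below only prints such entries. The function name render_hosts, the lowercase host names cm0, cm1, cm2 and the 192.168.1.x addresses are placeholders of mine, so substitute the values from your own network configuration.

```shell
# Print /etc/hosts lines for the cluster nodes.
# Arguments are "ip:hostname" pairs; render_hosts is a hypothetical helper
# and the addresses below are placeholders.
render_hosts() {
    for pair in "$@"; do
        # Everything before the first colon is the address, the rest is the name.
        printf '%s\t%s\n' "${pair%%:*}" "${pair#*:}"
    done
}

render_hosts 192.168.1.100:cm0 192.168.1.101:cm1 192.168.1.102:cm2
```

The printed lines would be appended to /etc/hosts on all three machines, so that each node can resolve the others by name.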
