CDH5.7 Quick Offline Installation tutorial

Source: Internet
Author: User

First, Introduction

CDH is a commercial product developed by Cloudera for rapidly deploying and efficiently managing Hadoop and its components. It consists of two parts: Cloudera Manager and the CDH packages. Cloudera Manager is responsible for deploying and managing the cluster, while the CDH packages contain the installation packages for the various Hadoop components, such as Hive, HDFS, and Spark.

Because the lab's server cluster has moved to virtualized hardware, we needed to rebuild the CDH cluster on the virtual resources. Cloudera offers three installation methods: online installation, yum installation, and offline installation. I started with the offline installation, which is also the method most blog tutorials currently use. With that method, however, every time I reached the final service-installation step, deploying the configuration files failed with the error:

I suspect the cause is a permissions problem, but none of the fixes I found online solved it. As an aside, the official Cloudera community has very few active users. I therefore gave up on the offline method and switched to online installation. Online installation spends a lot of time downloading packages, but we can download the packages manually in advance, which greatly speeds up the installation.

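Second, the basic Environment

Software environment
1. Operating system: CentOS 6.5
2. CDH package version 5.6, Cloudera Manager version 5.7
3. JDK version: Oracle JDK 1.7.0_67

Hardware environment
9 virtual machine nodes, with the following hardware configuration: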

Third, the basic configuration

All of the following steps are performed as root.

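1. Host configuration
1) Change the hostname: edit /etc/sysconfig/network with vim, set the hostname of each host to its corresponding name, and run service network restart to restart the network service so that the change takes effect.
2) Add the hostname-to-IP mappings as shown:

3) Distribute the hosts file from the master node to each slave node:
    scp /etc/hosts [email protected]:/etc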
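
The mapping and distribution steps above can be sketched as a small script. The node names and IP addresses below are hypothetical placeholders, not the values used in the original cluster, and the script only prints the scp commands instead of running them, so they can be reviewed first.

```shell
#!/bin/bash
# Hypothetical "IP:hostname" pairs; replace them with your own nodes.
NODES="192.168.1.10:master 192.168.1.11:slave1 192.168.1.12:slave2"

# Print one /etc/hosts line ("IP hostname") per node.
render_hosts() {
    for entry in $NODES; do
        printf '%s %s\n' "${entry%%:*}" "${entry##*:}"
    done
}

# Print the scp command that would copy /etc/hosts to every node except master.
print_distribute_commands() {
    for entry in $NODES; do
        name="${entry##*:}"
        [ "$name" = "master" ] && continue
        echo "scp /etc/hosts root@${name}:/etc"
    done
}

render_hosts
print_distribute_commands
```

On the master, the output of render_hosts could be appended to /etc/hosts, and the printed scp commands run once the file looks right.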
2. Turn off the firewall and SELinux
1) Turn off the firewall (on every node):
    service iptables stop
    chkconfig iptables off
2) Disable SELinux (takes effect after a reboot) by editing /etc/selinux/config with vim:
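
As a sketch of the SELinux edit, the function below rewrites the SELINUX= line of a config file to SELINUX=disabled. The function name is made up for illustration, and the demo runs against a scratch copy rather than the real /etc/selinux/config.

```shell
#!/bin/bash
# Set SELINUX=disabled in the given file (default: /etc/selinux/config).
# The SELINUXTYPE= line is left untouched because the pattern requires "SELINUX=".
disable_selinux() {
    conf="${1:-/etc/selinux/config}"
    tmp="$(mktemp)"
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$conf" > "$tmp" && cat "$tmp" > "$conf"
    rm -f "$tmp"
}

# Demo on a scratch file that mimics the relevant lines of the real config.
demo_conf="$(mktemp)"
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$demo_conf"
disable_selinux "$demo_conf"
cat "$demo_conf"
```

Calling disable_selinux with no argument as root would edit the real file. The change still needs a reboot; setenforce 0 switches SELinux to permissive mode in the meantime.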

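3. SSH login without a password
1) Install ssh on every node and generate a key pair, pressing Enter at every prompt:
    ssh-keygen -t rsa
2) Add the public key to authorized_keys (on the master only):
    cat id_rsa.pub >authorized_keys
3) Fix the permissions:
    chmod 600 authorized_keys
4) Distribute authorized_keys from the master to each slave:
    scp authorized_keys [email protected]:~/.ssh/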
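
Steps 2) and 3) can be wrapped in a helper that is safe to run more than once. This is only a sketch: authorize_key is a hypothetical name, it appends instead of overwriting so that existing keys survive, and the demo uses a scratch directory with a placeholder string in place of a real public key.

```shell
#!/bin/bash
# Append a public key to authorized_keys in the given .ssh directory,
# skipping keys that are already present, then tighten the permissions.
authorize_key() {
    ssh_dir="$1"
    pub_key="$2"
    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"
    touch "$ssh_dir/authorized_keys"
    grep -qxF -e "$(cat "$pub_key")" "$ssh_dir/authorized_keys" \
        || cat "$pub_key" >> "$ssh_dir/authorized_keys"
    chmod 600 "$ssh_dir/authorized_keys"
}

# Demo in a scratch directory; the key below is a placeholder, not a real key.
demo_dir="$(mktemp -d)"
echo "ssh-rsa AAAAB3placeholder root@master" > "$demo_dir/id_rsa.pub"
authorize_key "$demo_dir/.ssh" "$demo_dir/id_rsa.pub"
authorize_key "$demo_dir/.ssh" "$demo_dir/id_rsa.pub"   # second call adds nothing
cat "$demo_dir/.ssh/authorized_keys"
```

On the master the call would be authorize_key ~/.ssh ~/.ssh/id_rsa.pub, followed by the scp in step 4).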
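4. JDK installation
1) Remove the Java that ships with the system:
    rpm -qa | grep java
    yum remove java*
2) Install the JDK (rpm install on every node):
    rpm -ivh jdk1.7.0_67.rpm
3) Configure the Java environment (on every node; alternatively, configure one node and distribute the file with scp). Add the following to /etc/profile:
    export JAVA_HOME=/usr/java/jdk1.7.0_67
    export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib
    export PATH=$PATH:$JAVA_HOME/bin
4) Apply the configuration (on every node):
    source /etc/profile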
5. NTP time synchronization
1) Install NTP (on every node):
    yum install ntp
2) Configure NTP by editing /etc/ntp.conf with vim.
master configuration (using the Fudan University NTP server):

slave configuration (synchronizing with the master):

3) Start the NTP service and enable it at boot:
    service ntpd start
    chkconfig ntpd on
4) Check the synchronization status with the command: ntpstat
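
A minimal sketch of the server lines that step 2) adds to /etc/ntp.conf, one variant per role. The upstream host name ntp.fudan.edu.cn is an assumption about the Fudan University server mentioned above, "master" is assumed to resolve through /etc/hosts, and ntp_server_lines is a hypothetical helper name.

```shell
#!/bin/bash
# Print the ntp.conf server line for the given node role.
ntp_server_lines() {
    case "$1" in
        master)
            # Assumed host name of the Fudan University NTP server.
            echo "server ntp.fudan.edu.cn iburst"
            ;;
        slave)
            # Slaves synchronize with the node named "master".
            echo "server master iburst"
            ;;
        *)
            echo "usage: ntp_server_lines master|slave" >&2
            return 1
            ;;
    esac
}

ntp_server_lines master
ntp_server_lines slave
```

The printed line would replace the default server entries in /etc/ntp.conf on the corresponding node before ntpd is started.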

Fourth, Cloudera Manager installation

1. Download the RPM installation packages
RPM packages (the JDK package does not need to be downloaded if you have already installed a JDK yourself):

http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5.7/RPMS/x86_64/

Packages included:

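2. Master node installation
Put the downloaded RPM packages into a folder (any name will do), change into that folder, and install them manually:
    yum localinstall --nogpgcheck *.rpm
Installing with yum also installs the required dependencies, which is very convenient. To uninstall a package, use:
    yum --setopt=tsflags=noscripts remove xxxx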
3. Slave node installation

The slaves do not need the server package; install only cloudera-manager-agent.rpm and cloudera-manager-daemons.rpm. First copy these two RPM packages to each slave node, then install them the same way as on the master.
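
The slave step can be sketched as a dry run that picks the agent and daemons packages out of the download folder and prints the copy and install commands per host. The host names, the /root/ target directory, and the placeholder file names in the demo are all hypothetical; real package file names carry a version suffix.

```shell
#!/bin/bash
# List the agent and daemons RPMs found in a directory; the server RPM is skipped.
slave_rpms() {
    for f in "$1"/cloudera-manager-agent*.rpm "$1"/cloudera-manager-daemons*.rpm; do
        [ -e "$f" ] && echo "$f"
    done
    return 0
}

# Demo folder with empty placeholder files standing in for the downloaded RPMs.
demo_rpms="$(mktemp -d)"
touch "$demo_rpms/cloudera-manager-agent-x.rpm" \
      "$demo_rpms/cloudera-manager-daemons-x.rpm" \
      "$demo_rpms/cloudera-manager-server-x.rpm"

# Print (do not run) the copy and install commands for each hypothetical slave.
for host in slave1 slave2; do
    echo "scp $(slave_rpms "$demo_rpms" | tr '\n' ' ')root@${host}:/root/"
    echo "ssh root@${host} 'yum localinstall -y --nogpgcheck /root/cloudera-manager-*.rpm'"
done
```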

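4. Install the Cloudera Manager binary installer
1) wget http://archive.cloudera.com/cm5/installer/latest/cloudera-manager-installer.bin
2) chmod u+x cloudera-manager-installer.bin
3) ./cloudera-manager-installer.bin
4) Follow the installation wizard, clicking next at every step. Note that if the RPM packages were not installed manually on the master beforehand, they are downloaded from the network at this point; the download is usually slow and wastes a lot of time.
5) When the installation finishes, you are prompted to log in on port 7180.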
Fifth, CDH service installation

1. Making a local parcel repository
1) Download the CDH packages from:

http://archive.cloudera.com/cdh5/parcels/5.6/

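Download the CDH that matches your version: the three resources marked in red in the figure (el6 stands for CentOS 6).

After the CM installation above, the master node has a cloudera folder under /opt. Move the three downloaded files into the parcel-repo folder and rename CDH-5.6.0-1.cdh5.6.0.p0.45-el6.parcel.sha1 to CDH-5.6.0-1.cdh5.6.0.p0.45-el6.parcel.sha. Without the rename, the parcel is downloaded again online.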
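
The rename can be scripted so that it covers every hash file in the repository folder. A minimal sketch, assuming the repository lives at /opt/cloudera/parcel-repo as described above; rename_parcel_hashes is a hypothetical helper, and the demo works on a scratch directory with an empty stand-in file.

```shell
#!/bin/bash
# Rename every *.parcel.sha1 file in the repository directory to *.parcel.sha.
rename_parcel_hashes() {
    repo="${1:-/opt/cloudera/parcel-repo}"
    for f in "$repo"/*.parcel.sha1; do
        [ -e "$f" ] || continue
        mv "$f" "${f%.sha1}.sha"
    done
}

# Demo in a scratch directory with an empty stand-in for the downloaded hash file.
demo_repo="$(mktemp -d)"
touch "$demo_repo/CDH-5.6.0-1.cdh5.6.0.p0.45-el6.parcel.sha1"
rename_parcel_hashes "$demo_repo"
ls "$demo_repo"
```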
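2. Configure the software
1) Log in on port 7180: http://master:7180

The initial username and password are both admin.
2) Accept the license agreement and click continue through the following pages.

Enter the hostnames or IP addresses of the hosts in the cluster (they can be separated by spaces), click search, and then continue.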

3) Select the parcel version. Since we downloaded CDH5.6, choose CDH5.6. The CDH5.6 option may not appear here, because we placed the three CDH5.6 files in the parcel-repo folder after cloudera-scm-server had already started. In that case, restart cloudera-scm-server:
/etc/init.d/cloudera-scm-server restart

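4) Install the JDK. We already installed the JDK on every node, so this step can be skipped.
5) Set up the SSH login: choose to use the same SSH password for all hosts, enter the password, and click continue.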

6) Install the cloudera-manager-agent software. Because we already installed the corresponding RPM packages on every node, this step finishes quickly. If the nodes were not installed manually beforehand, this step downloads the RPM packages online, which is very slow and fails if the download is interrupted. I strongly recommend not relying on the online download.
The installation takes about 10 minutes (only 7 nodes appear here because I installed just 7 nodes the first time; the remaining two were added to the cluster later). Then click continue.

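7) Host inspection:

The inspection reports errors. To fix them temporarily (until the next reboot), run:
    echo 0 >/proc/sys/vm/swappiness
    echo never >/sys/kernel/mm/redhat_transparent_hugepage/defrag
To make the fix permanent across reboots, edit /etc/sysctl.conf with vim:

and edit /etc/rc.local with vim: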
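
One way to make both settings permanent is sketched below, assuming the usual approach of a vm.swappiness entry in /etc/sysctl.conf and the echo command repeated in /etc/rc.local. The helper names are hypothetical, and the demo writes to scratch files instead of the real system files.

```shell
#!/bin/bash
# Set "key = value" in a sysctl-style file, replacing any existing entry for the key.
set_sysctl() {
    file="$1"; key="$2"; value="$3"
    tmp="$(mktemp)"
    # Drop previous settings of the key (grep exits non-zero if nothing is left).
    grep -v "^[[:space:]]*${key}[[:space:]]*=" "$file" > "$tmp" || true
    echo "${key} = ${value}" >> "$tmp"
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Append a line to a file only if the exact line is not already there.
add_line_once() {
    grep -qxF -e "$2" "$1" 2>/dev/null || echo "$2" >> "$1"
}

# Demo on scratch files standing in for /etc/sysctl.conf and /etc/rc.local.
demo_sysctl="$(mktemp)"
echo "vm.swappiness = 60" > "$demo_sysctl"
set_sysctl "$demo_sysctl" vm.swappiness 0
cat "$demo_sysctl"

demo_rc="$(mktemp)"
thp_line='echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag'
add_line_once "$demo_rc" "$thp_line"
add_line_once "$demo_rc" "$thp_line"   # second call adds nothing
cat "$demo_rc"
```

Run as root against the real files, this would correspond to set_sysctl /etc/sysctl.conf vm.swappiness 0 and add_line_once /etc/rc.local with the same echo line.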

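8) Install the parcels. Next, CM installs the parcels. The figure shows the message "host is in bad health"; this can be ignored, and the hosts return to normal after a short wait.

9) Install the services. Unless you have special requirements, the defaults are fine.

10) Click continue through the remaining pages to complete the installation.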

At this point the CDH cluster is complete. It took two weekends, and I ran into a number of pitfalls along the way. Most of the blog tutorials currently available seem dated; with the offline installation method in particular, I hit the same problem across several repeated installs. After many attempts I finally arrived at this convenient and fast installation method, and I hope it is helpful.
