Data warehouse practice based on the Hadoop ecosystem: environment setup (II)

Source: Internet
Author: User
Tags sha1 unpack free ssh hadoop ecosystem sqoop

II. Installing Hadoop and the services it needs
1. CDH Installation Overview

CDH stands for Cloudera's Distribution Including Apache Hadoop; it is the Hadoop distribution published by Cloudera. CDH can be installed in three ways:
. Path A - Automated installation by Cloudera Manager
. Path B - Installation using Cloudera Manager parcels or packages
. Path C - Manual installation using Cloudera Manager tarballs
The installation steps for each path are summarized as follows:

Steps

Step 1: Install the JDK

Cloudera Manager Server, the Cloudera Management Service, and CDH all require a JDK.

There are two options:

. Use the Cloudera Manager installer to install a supported version of the Oracle JDK under /usr/java on all hosts in the cluster.

. Use the command line to install a supported version of the Oracle JDK on all hosts, and set the JAVA_HOME environment variable to the JDK's installation directory.

Step 2: Set up the database

Cloudera Manager Server, the Cloudera Management Service, and some optional CDH services require a database that is installed, configured, and running.

There are two options:

. Use the Cloudera Manager installer to install, configure, and start an embedded PostgreSQL database.

. Install, configure, and start the database using a command-line package installation tool such as yum.

Steps 3 to 6 differ by installation path (Path A, Path B, Path C):

Step 3: Install the Cloudera Manager Server

Install and start the Cloudera Manager Server on a single host.

. Path A: Install the server using the Cloudera Manager installer. This requires sudo access to the host and access to the Internet.
. Path B: Install the Cloudera Manager Server using the Linux package installation commands (such as yum), modify the database properties, and use the service command to start the server.
. Path C: Unpack the tarball using Linux commands, and start the server with the service command.

Step 4: Install the Cloudera Manager Agent

Install and start the Cloudera Manager Agent on all hosts.

. Path A: Use the Cloudera Manager installation wizard to install the agent on all hosts.
. Path B: There are two options. Either install the agent on all hosts using the Linux package installation commands (such as yum), or use the Cloudera Manager installation wizard to install the agent on all hosts.
. Path C: Unpack the tarball and start the agent on all hosts using Linux commands.

Step 5: Install CDH and services

Install CDH and its services on all hosts.

. Path A: Use the Cloudera Manager installation wizard to install CDH and its services.
. Path B: There are two options. Either use the Cloudera Manager installation wizard to install CDH and its services, or install them on all hosts using the Linux package installation commands (such as yum).
. Path C: Unpack the tarballs on all hosts using Linux commands, and use the service command to start CDH and its services.

Step 6: Create, configure, and start CDH and services

Configure and start CDH and its services on all hosts.

. Paths A and B: Use the Cloudera Manager installation wizard to assign roles to hosts and configure the cluster. Many of the configuration settings are applied automatically.
. Path C: Use the Cloudera Manager installation wizard in the same way. You can also use the Cloudera Manager API to manage a cluster, which is useful for scripting preconfigured deployments.

2. Experimental environment

Host information:

Host Name    IP Address
cdh1         172.16.1.101
cdh2         172.16.1.102
cdh3         172.16.1.103
cdh4         172.16.1.104


Hardware configuration:
Each host: 4 CPU cores, 8 GB memory, 100 GB hard disk

Software versions:

Name                Version
Operating system    CentOS release 6.4 (Final) 64-bit
JDK                 1.7.0_80
Database            MySQL 5.6.14
JDBC                MySQL Connector Java 5.1.38
Cloudera Manager    5.7.0
CDH                 5.7.0


3. Installation and configuration
(1) Pre-installation preparation (performed as the root user on all four hosts in the cluster)
    • Download the required installation files from the following addresses
http://archive.cloudera.com/cm5/cm/5/cloudera-manager-el6-cm5.7.0_x86_64.tar.gz
http://archive.cloudera.com/cdh5/parcels/5.7/CDH-5.7.0-1.cdh5.7.0.p0.45-el6.parcel
http://archive.cloudera.com/cdh5/parcels/5.7/CDH-5.7.0-1.cdh5.7.0.p0.45-el6.parcel.sha1
http://archive.cloudera.com/cdh5/parcels/5.7/manifest.json
    • Use the following command to check the OS dependency packages, replacing xxxx with the package name
rpm -qa | grep xxxx
The following packages must be installed:
chkconfig
python (2.6 required for CDH 5)
bind-utils
psmisc
libxslt
zlib
sqlite
cyrus-sasl-plain
cyrus-sasl-gssapi
fuse
portmap (rpcbind)
fuse-libs
redhat-lsb
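Checking the packages one at a time is tedious, so the check can be looped. The following is a minimal sketch, not part of the original procedure: the function names missing_packages and rpm_installed are hypothetical, and the query command is passed in as a parameter so that it can be replaced by a stub.

```shell
# missing_packages CHECKER PKG...
# Prints every package for which "CHECKER PKG" fails.
# CHECKER is a command or function name (hypothetical design choice),
# so the real query can be swapped for a stub.
missing_packages() {
    checker=$1
    shift
    for pkg in "$@"; do
        if ! "$checker" "$pkg" >/dev/null 2>&1; then
            echo "$pkg"
        fi
    done
}

# Hypothetical wrapper around the real query: rpm -q exits non-zero
# when the named package is not installed.
rpm_installed() {
    rpm -q "$1"
}

# Example call on a cluster host (commented out, it only reads the RPM database):
# missing_packages rpm_installed chkconfig python bind-utils psmisc libxslt zlib \
#     sqlite cyrus-sasl-plain cyrus-sasl-gssapi fuse fuse-libs redhat-lsb
```

Any names printed can then be installed with yum. portmap and rpcbind are alternatives, so check for whichever one your OS release ships.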
    • Configure host name resolution
vi /etc/hosts
# Add the following 4 lines
172.16.1.101 cdh1
172.16.1.102 cdh2
172.16.1.103 cdh3
172.16.1.104 cdh4
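If the hosts file is edited by a script rather than by hand, it helps to make the edit repeatable. A minimal sketch, assuming a hypothetical helper named add_host_entry that appends an entry only when that IP and name pair is not already present:

```shell
# add_host_entry FILE IP NAME
# Appends "IP NAME" to FILE unless FILE already has a line whose first
# field is IP and which lists NAME among its host names.
add_host_entry() {
    file=$1 ip=$2 name=$3
    if ! awk -v ip="$ip" -v name="$name" '
            $1 == ip { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
            END { exit !found }' "$file" 2>/dev/null; then
        printf '%s %s\n' "$ip" "$name" >> "$file"
    fi
}

# Example call as root on each host (commented out because it modifies /etc/hosts):
# add_host_entry /etc/hosts 172.16.1.101 cdh1
# add_host_entry /etc/hosts 172.16.1.102 cdh2
# add_host_entry /etc/hosts 172.16.1.103 cdh3
# add_host_entry /etc/hosts 172.16.1.104 cdh4
```

Running the calls a second time leaves the file unchanged, which makes the step safe to include in a setup script that may be re-run.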
    • Install the JDK
The JDK versions recommended for CDH 5 are 1.7.0_67, 1.7.0_75, and 1.7.0_80; 1.7.0_80 is installed here.
Note: all hosts must have the same JDK version installed; the installation directory is /usr/java/jdk-version
mkdir /usr/java/
mv jdk-7u80-linux-x64.tar.gz /usr/java/
cd /usr/java/
tar -zxvf jdk-7u80-linux-x64.tar.gz
chown -R root:root jdk1.7.0_80/
vi /etc/profile.d/java.sh
# Add the following 3 lines
export JAVA_HOME=/usr/java/jdk1.7.0_80
export CLASSPATH=.:$JAVA_HOME/jre/lib/*:$JAVA_HOME/lib/*
export PATH=$PATH:$JAVA_HOME/bin
# Make the environment variables take effect
source /etc/profile.d/java.sh
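Before moving on, it may be worth confirming that the JDK directory is usable on every host. A minimal sketch; check_java_home is a hypothetical helper name, not something provided by the JDK or by Cloudera:

```shell
# check_java_home DIR
# Succeeds only if DIR is non-empty and contains an executable bin/java.
check_java_home() {
    [ -n "$1" ] && [ -x "$1/bin/java" ]
}

# Example call on a cluster host (commented out; java -version prints
# the installed version, which should be 1.7.0_80 in this setup):
# check_java_home "$JAVA_HOME" && "$JAVA_HOME/bin/java" -version
```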
    • Install, configure, and start the NTP service
yum install ntp
chkconfig ntpd on
ntpdate -u 202.112.29.82
vi /etc/ntp.conf
# Add the following 8 lines
driftfile /var/lib/ntp/drift
restrict default kod nomodify notrap nopeer noquery
restrict -6 default kod nomodify notrap nopeer noquery
restrict 127.0.0.1
restrict -6 ::1
server 202.112.29.82
includefile /etc/ntp/crypto/pw
keys /etc/ntp/keys
# Start the NTP service
service ntpd start
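Cloudera Manager reports clock offset problems if the hosts drift apart, so it is useful to confirm that ntpd has actually selected a time source. A minimal sketch: has_sync_peer is a hypothetical helper that reads the output of ntpq -p and looks for a peer line beginning with an asterisk, which is how ntpq marks the peer currently used for synchronization.

```shell
# has_sync_peer
# Reads "ntpq -p" output on stdin and succeeds if any peer line starts
# with "*", the tally code for the currently selected peer.
has_sync_peer() {
    grep -q '^\*'
}

# Example call on a cluster host (commented out, requires a running ntpd):
# ntpq -p | has_sync_peer && echo "ntpd is synchronized"
```

It can take several minutes after ntpd starts before a peer is selected, so a failing check right after startup is not necessarily an error.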
    • Create the CM user
useradd --system --home=/opt/cm-5.7.0/run/cloudera-scm-server --no-create-home --shell=/bin/false --comment "Cloudera SCM User" cloudera-scm
usermod -a -G root cloudera-scm
echo USER=\"cloudera-scm\" >> /etc/default/cloudera-scm-agent
echo "Defaults secure_path = /sbin:/bin:/usr/sbin:/usr/bin" >> /etc/sudoers
    • Install and configure the MySQL database (to keep configuration simple, it is installed on every host)
rpm -ivh mysql-5.6.14-1.el6.x86_64.rpm
vi /etc/profile.d/mysql.sh
# Add the following 2 lines
export MYSQL_HOME=/home/mysql/mysql-5.6.14
export PATH=$PATH:$MYSQL_HOME/bin
# Make the environment variables take effect
source /etc/profile.d/mysql.sh
# Change the root password
mysqladmin -u root password
# Edit the configuration file
vi /etc/my.cnf
# The contents are as follows
[mysqld]
transaction-isolation = READ-COMMITTED
log_bin=/data/mysql_binary_log
binlog_format = mixed
innodb_flush_log_at_trx_commit = 2
innodb_flush_method = O_DIRECT
key_buffer = 16M
key_buffer_size = 32M
max_allowed_packet = 32M
thread_stack = 256K
thread_cache_size = 64
query_cache_limit = 8M
query_cache_size = 64M
query_cache_type = 1
max_connections = 550
read_buffer_size = 2M
read_rnd_buffer_size = 16M
sort_buffer_size = 8M
join_buffer_size = 8M
innodb_flush_log_at_trx_commit = 2
innodb_log_buffer_size = 64M
innodb_buffer_pool_size = 4G
innodb_thread_concurrency = 8
innodb_log_file_size = 512M
[mysqld_safe]
log-error=/data/mysqld.err
pid-file=/data/mysqld.pid
sql_mode=STRICT_ALL_TABLES

# Enable start at boot
chkconfig mysql on
# Start MySQL
service mysql restart
# Create the metadata databases as needed
mysql -u root -p -e "create database hive default character set utf8; create database rman default character set utf8; create database oozie default character set utf8; grant all on *.* to 'root'@'%' identified by 'mypassword';"
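If more service databases are needed later, the CREATE DATABASE statements can be generated from a list of names instead of being typed by hand. A minimal sketch, assuming a hypothetical helper named create_db_sql; it adds IF NOT EXISTS so the statements can be re-run, which is a choice made here and not part of the original command.

```shell
# create_db_sql NAME...
# Prints one CREATE DATABASE statement per name, using the utf8
# default character set used for the metadata databases in this setup.
create_db_sql() {
    for db in "$@"; do
        printf 'CREATE DATABASE IF NOT EXISTS %s DEFAULT CHARACTER SET utf8;\n' "$db"
    done
}

# Example call (commented out because it changes the MySQL server):
# create_db_sql hive rman oozie | mysql -u root -p
```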
    • Install the MySQL JDBC driver
tar -zxvf mysql-connector-java-5.1.38.tar.gz
cp ./mysql-connector-java-5.1.38/mysql-connector-java-5.1.38-bin.jar /usr/share/java/mysql-connector-java.jar
    • Configure passwordless SSH (here every host can log in to every other host without a password)
# Generate a key pair on each of the four machines:
cd ~
ssh-keygen -t rsa
# Then press Enter at each prompt
# On cdh1:
cd ~/.ssh/
ssh-copy-id cdh1
scp /root/.ssh/authorized_keys cdh2:/root/.ssh/
# On cdh2:
cd ~/.ssh/
ssh-copy-id cdh2
scp /root/.ssh/authorized_keys cdh3:/root/.ssh/
# On cdh3:
cd ~/.ssh/
ssh-copy-id cdh3
scp /root/.ssh/authorized_keys cdh4:/root/.ssh/
# On cdh4:
cd ~/.ssh/
ssh-copy-id cdh4
scp /root/.ssh/authorized_keys cdh1:/root/.ssh/
scp /root/.ssh/authorized_keys cdh2:/root/.ssh/
scp /root/.ssh/authorized_keys cdh3:/root/.ssh/
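Because the authorized_keys file is passed around the four hosts in a chain, a mistake in any one step leaves some host pairs without access. A minimal sketch for checking the result from any host; check_ssh and the SSH_CMD variable are hypothetical names introduced here. The BatchMode=yes option makes ssh fail instead of prompting for a password.

```shell
# SSH_CMD can be overridden, for example with a stub, when testing.
SSH_CMD=${SSH_CMD:-ssh}

# check_ssh HOST...
# Prints each host that cannot be reached without a password prompt
# and returns non-zero if there is at least one.
check_ssh() {
    failed=0
    for host in "$@"; do
        # "true" is the remote command; only the exit status matters.
        if ! "$SSH_CMD" -o BatchMode=yes -o ConnectTimeout=5 "$host" true \
                </dev/null >/dev/null 2>&1; then
            echo "$host"
            failed=1
        fi
    done
    return "$failed"
}

# Example call, to be repeated on each of the four hosts (commented out):
# check_ssh cdh1 cdh2 cdh3 cdh4 && echo "passwordless SSH works"
```

On a first connection ssh may also stop at the host key confirmation, which BatchMode reports as a failure; connecting once by hand to accept the key resolves that.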

(2) Install Cloudera Manager on cdh1
tar -xzvf cloudera-manager*.tar.gz -C /opt/
# Create the CM database
/opt/cm-5.7.0/share/cmf/schema/scm_prepare_database.sh mysql cm -hlocalhost -uroot -pmypassword --scm-host localhost scm scm scm
# Configure the CM agent
vi /opt/cm-5.7.0/etc/cloudera-scm-agent/config.ini
# Change the CM server host name to cdh1
server_host=cdh1
# Copy the three parcel-related files to /opt/cloudera/parcel-repo
cp CDH-5.7.0-1.cdh5.7.0.p0.45-el6.parcel /opt/cloudera/parcel-repo/
cp CDH-5.7.0-1.cdh5.7.0.p0.45-el6.parcel.sha1 /opt/cloudera/parcel-repo/
cp manifest.json /opt/cloudera/parcel-repo/
# Rename the checksum file from .sha1 to .sha
mv /opt/cloudera/parcel-repo/CDH-5.7.0-1.cdh5.7.0.p0.45-el6.parcel.sha1 /opt/cloudera/parcel-repo/CDH-5.7.0-1.cdh5.7.0.p0.45-el6.parcel.sha
# Change the owner
chown -R cloudera-scm:cloudera-scm /opt/cloudera/
chown -R cloudera-scm:cloudera-scm /opt/cm-5.7.0/
# Copy the /opt/cm-5.7.0 directory to the other three hosts
scp -r -p /opt/cm-5.7.0 cdh2:/opt/
scp -r -p /opt/cm-5.7.0 cdh3:/opt/
scp -r -p /opt/cm-5.7.0 cdh4:/opt/
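The parcel is a large download, and a truncated or corrupted file only shows up later as a failure in the wizard. The checksum file can be used to verify it up front. A minimal sketch, assuming the checksum file holds the hexadecimal SHA-1 digest as its first field; verify_parcel is a hypothetical helper name.

```shell
# verify_parcel PARCEL SHA_FILE
# Succeeds if the SHA-1 digest of PARCEL equals the first field on the
# first line of SHA_FILE.
verify_parcel() {
    expected=$(awk 'NR == 1 { print $1 }' "$2") || return 1
    actual=$(sha1sum "$1" | awk '{ print $1 }') || return 1
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Example call on cdh1 (commented out, reads the files copied above):
# cd /opt/cloudera/parcel-repo
# verify_parcel CDH-5.7.0-1.cdh5.7.0.p0.45-el6.parcel \
#     CDH-5.7.0-1.cdh5.7.0.p0.45-el6.parcel.sha && echo "parcel checksum OK"
```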

(3) Create the /opt/cloudera/parcels directory on each host and change its owner
mkdir -p /opt/cloudera/parcels
chown cloudera-scm:cloudera-scm /opt/cloudera/parcels

(4) Start the CM server on cdh1
/opt/cm-5.7.0/etc/init.d/cloudera-scm-server start
# This step takes some time; follow the startup progress with the following command
tail -f /opt/cm-5.7.0/log/cloudera-scm-server/cloudera-scm-server.log
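Instead of watching the log by eye, a script can test whether the server has finished starting. A minimal sketch that rests on an assumption: the server writes a line containing "Started Jetty server" to its log once the web console is ready. Confirm the exact wording in your own log before relying on it. server_started is a hypothetical helper name.

```shell
# server_started LOGFILE
# Succeeds if LOGFILE contains the line assumed to mark the end of
# startup ("Started Jetty server"; verify this text against your log).
server_started() {
    grep -q 'Started Jetty server' "$1" 2>/dev/null
}

# Example call on cdh1 (commented out, polls every 10 seconds until ready):
# until server_started /opt/cm-5.7.0/log/cloudera-scm-server/cloudera-scm-server.log; do
#     sleep 10
# done
```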

(5) Start the CM agent on all hosts
mkdir /opt/cm-5.7.0/run/cloudera-scm-agent
chown cloudera-scm:cloudera-scm /opt/cm-5.7.0/run/cloudera-scm-agent
/opt/cm-5.7.0/etc/init.d/cloudera-scm-agent start

(6) Log in to the CM console, then install and configure CDH 5 and its services
Open the console:
http://172.16.1.101:7180/
The default user name and password are both admin. After logging in you reach the Welcome page. Accept the license agreement and click Continue.
Go to the release notes page and click Continue.
Go to the service description page and click Continue.
Go to the host selection page, select all four hosts, and click Continue.
Go to the repository selection page and click Continue.
Go to the cluster installation page and click Continue.
Go to the verification page and click Finish.
Go to the cluster setup page, select the services you need, and click Continue.
Go to the custom role assignment page, keep the defaults, and click Continue.
Go to the database setup page, fill in the connection details, click Test Connection, and then click Continue.
Go to the review changes page, keep the defaults, and click Continue.
Go to the first run page, wait for the run to finish, and click Continue.
Go to the installation success page and click Finish.
At this point the CDH installation is complete. The roles are assigned to hosts as shown in the following table.

Service    Role                     Host
HDFS       DataNode                 cdh1, cdh3, cdh4
           NameNode                 cdh2
           SecondaryNameNode        cdh2
Hive       Hive Metastore Server    cdh2
           HiveServer2              cdh2
Hue        Hue Server               cdh2
Impala     Impala Catalog Server    cdh2
           Impala Daemon            cdh1, cdh3, cdh4
           Impala StateStore        cdh2
Oozie      Oozie Server             cdh2
Sqoop 2    Sqoop 2 Server           cdh2
YARN       JobHistory Server        cdh2
           NodeManager              cdh1, cdh3, cdh4
           ResourceManager          cdh2

The official CDH installation documentation is available at:

http://www.cloudera.com/documentation/enterprise/latest/topics/installation.html
