CDH5.3.2 installation documentation and troubleshooting

Source: Internet
Author: User

Version conventions

  • Operating system: CentOS 6.5, 64-bit
  • JDK: 1.6 (1.7 is also supported)
  • Python: 2.6 or 2.7
  • Cluster version: CDH 5.3.2
  • Cloudera Manager: 5.3
  • MySQL: 5.0 or later

CM installation prerequisites

  • Disable the firewall:

service iptables stop

chkconfig iptables off

  • Disable SELinux by setting SELINUX=disabled in /etc/selinux/config:

vi /etc/selinux/config

  • Check that port 7180 is not in use.
  • Install MySQL before installing Cloudera Manager, edit the database configuration file, and create the required databases.
  • Change the default storage engine of the MySQL database to InnoDB.
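
The SELinux step can also be done without an interactive editor. The following is a minimal sketch, assuming the stock CentOS layout of /etc/selinux/config; the function name set_selinux_disabled is hypothetical and not part of any tool used in this article.

```shell
#!/bin/sh
# Sketch: rewrite the active SELINUX= line of a SELinux config file.
# set_selinux_disabled is an illustrative helper name; the optional
# argument lets you point it at a copy of the file for a dry run.
set_selinux_disabled() {
    config_file="${1:-/etc/selinux/config}"
    tmp_file="$(mktemp)" || return 1
    # Only lines that start with "SELINUX=" are changed, so comment
    # lines and the SELINUXTYPE= setting are left untouched.
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$config_file" > "$tmp_file" &&
        cat "$tmp_file" > "$config_file"
    status=$?
    rm -f "$tmp_file"
    return "$status"
}
```

Calling set_selinux_disabled with no argument as root edits /etc/selinux/config in place. The change to the file takes effect after a reboot.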

CM Installation Method

Cloudera Manager is installed with yum. Because the intranet machines cannot reach the Internet, we must build a local yum repository. The repository machine is 10.100.3.17.

Build a local yum repository

The RPM packages required for the yum installation are the Cloudera Manager 5 packages:

http://archive-primary.cloudera.com/cm5/redhat/6/x86_64/cm/5.3.2/RPMS/x86_64

and the CDH 5.3.2 packages:

http://archive-primary.cloudera.com/cdh5/redhat/6/x86_64/cdh/5.3.2/RPMS/x86_64/

http://archive-primary.cloudera.com/cdh5/redhat/6/x86_64/cdh/5.3.2/RPMS/noarch/

Upload the packages to the /var/ftp/pub/Packages directory on 10.100.3.17, and then run the createrepo command:

createrepo -g /var/ftp/pub/repodata/repomd.xml /var/ftp/pub/

Configure the local yum repository on each CDH cluster machine:

cd /etc/yum.repos.d/

rm -rf * # delete the yum repository files that are no longer needed

vi ftp-server.repo # create ftp-server.repo with the following configuration:

[base]

name=ftp-server

baseurl=ftp://10.100.3.17/pub/

gpgcheck=0
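
For several cluster machines it may be easier to script this step. The sketch below writes the same ftp-server.repo file; as a deliberate difference from the rm -rf * shown above, it moves the existing .repo files into a backup subdirectory instead of deleting them. The function name write_ftp_repo and the backup directory are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: install the local repository definition on a cluster machine.
# write_ftp_repo and the "backup" subdirectory are illustrative choices.
write_ftp_repo() {
    repo_dir="${1:-/etc/yum.repos.d}"
    backup_dir="$repo_dir/backup"
    mkdir -p "$backup_dir" || return 1
    # Move existing repository files aside; yum only reads *.repo files
    # directly inside the repository directory, not in subdirectories.
    for repo_file in "$repo_dir"/*.repo; do
        [ -e "$repo_file" ] || continue
        mv "$repo_file" "$backup_dir/"
    done
    # Same content as the ftp-server.repo file shown above.
    cat > "$repo_dir/ftp-server.repo" <<'EOF'
[base]
name=ftp-server
baseurl=ftp://10.100.3.17/pub/
gpgcheck=0
EOF
}
```

After running write_ftp_repo as root, yum clean all followed by yum repolist should show only the ftp-server repository.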

Installing the CDH cluster with Cloudera Manager

There are five CDH cluster machines: 10.100.3.95, 10.100.3.96, 10.100.3.97, 10.100.3.98, and 10.100.3.99.

Deploy the Cloudera Manager agent on all five machines.

Deploy the Cloudera Manager server and MySQL on 10.100.3.95.

Install jdk

First, check whether OpenJDK is already installed on the cluster machines. If it is, uninstall it with the following commands:

rpm -qa | grep jdk

rpm -e xxx # xxx is the rpm package name printed by the previous command

Install the JDK on all machines and configure JAVA_HOME with the following commands:

yum install jdk

vi /etc/profile # add the following configuration

export JAVA_HOME=/usr/java/jdk1.6.0_31

export PATH=$JAVA_HOME/bin:$JAVA_HOME/lib:$PATH

# Make the configuration take effect

source /etc/profile

Configure NTP service

We need to configure NTP time synchronization for the cluster. After the cluster is installed, Cloudera Manager checks the time synchronization of the hosts. If a host is not synchronized, the following alarm is raised:

Bad Health -- Clock Offset

The host's NTP service did not respond to a request for the clock offset.

We use 10.100.3.95 as the time master. Install the ntp service on all machines; every machine synchronizes its time with the master.

yum install ntp

Configure the NTP server on machine 95: edit the /etc/ntp.conf file and add the following configuration:

restrict 0.0.0.0 mask 0.0.0.0 nomodify notrap # allow hosts on all network segments to synchronize time from this machine

server 127.127.1.0

fudge 127.127.1.0 stratum 8

Start the NTP service

/etc/init.d/ntpd start

chkconfig ntpd on

The other machines synchronize their time with this machine and must also run the ntpd service. If ntpd is not running on a machine, Cloudera Manager raises an alarm, because it uses the ntpdc -c loopinfo command to determine the clock offset of each host. The command to synchronize the time is:

ntpdate 10.100.3.95

# Add the command to crontab

crontab -e

*/15 * * * * ntpdate 10.100.3.95

Install MySQL

Cloudera Manager stores service information and cluster configuration information in a database. You can use the embedded PostgreSQL database or an external database system. MySQL, Oracle, and external PostgreSQL databases are currently supported. Here we install an external MySQL database.

$ yum install mysql mysql-devel mysql-server

# Start MySQL after installation

$ service mysqld start

# Edit the MySQL configuration file (/etc/my.cnf by default), add the following content, and restart MySQL. If no error is reported, the configuration is successful.

[mysqld]

transaction-isolation = READ-COMMITTED

# Disabling symbolic-links is recommended to prevent assorted security risks;

# to do so, uncomment this line:

# symbolic-links = 0

key_buffer = 16M

key_buffer_size = 32M

max_allowed_packet = 32M

thread_stack = 256K

thread_cache_size = 64

query_cache_limit = 8M

query_cache_size = 64M

query_cache_type = 1

max_connections = 550

#log_bin should be on a disk with enough free space. Replace '/var/lib/mysql/mysql_binary_log' with an appropriate path for your system and chown the specified folder to the mysql user.

#log_bin=/var/lib/mysql/mysql_binary_log

#expire_logs_days = 10

#max_binlog_size = 100M

# For MySQL version 5.1.8 or later. Comment out binlog_format for older versions.

binlog_format = mixed

read_buffer_size = 2M

read_rnd_buffer_size = 16M

sort_buffer_size = 8M

join_buffer_size = 8M

# InnoDB settings

innodb_file_per_table = 1

innodb_flush_log_at_trx_commit = 2

innodb_log_buffer_size = 64M

innodb_buffer_pool_size = 4G

innodb_thread_concurrency = 8

innodb_flush_method = O_DIRECT

innodb_log_file_size = 512M

[mysqld_safe]

log-error=/var/log/mysqld.log

pid-file=/var/run/mysqld/mysqld.pid

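Because innodb_log_file_size has changed, remove the two old InnoDB log files, /var/lib/mysql/ib_logfile0 and /var/lib/mysql/ib_logfile1, and then restart the MySQL service.

Install the MySQL JDBC connector

The JDBC driver is the middleware through which Cloudera Manager and the CDH services connect to MySQL.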

$ mkdir -p /usr/share/java/

$ cp mysql-connector-java-5.1.17.jar /usr/share/java/mysql-connector-java.jar

Configure Mysql

Set the password of the root account:

$ sudo /usr/bin/mysql_secure_installation

[...]

Enter current password for root (enter for none):

OK, successfully used password, moving on...

[...]

Set root password? [Y/n] y

New password:

Re-enter new password:

Remove anonymous users? [Y/n] Y

[...]

Disallow root login remotely? [Y/n] N

[...]

Remove test database and access to it [Y/n] Y

[...]

Reload privilege tables now? [Y/n] Y

All done!

Create the MySQL databases

Create a database for each of the following services to store its data: Activity Monitor, Reports Manager, Hive Metastore Server, Sentry Server, Cloudera Navigator Audit Server, and Cloudera Navigator Metadata Server.

Log in to MySQL as the root user:

$ mysql -u root -p

Enter password:

Create each database and grant privileges on it, replacing <database>, <user>, and <password> with the values for the service:

mysql> create database <database> DEFAULT CHARACTER SET utf8;

Query OK, 1 row affected (0.00 sec)

mysql> grant all on <database>.* TO '<user>'@'%' IDENTIFIED BY '<password>';

Query OK, 0 rows affected (0.00 sec)

The following table lists the database, user, and password for each role.

Role                                  Database    User     Password
Activity Monitor                      amon        amon     amon_password
Reports Manager                       rman        rman     rman_password
Hive Metastore Server                 metastore   hive     hive_password
Sentry Server                         sentry      sentry   sentry_password
Cloudera Navigator Audit Server       nav         nav      nav_password
Cloudera Navigator Metadata Server    navms       navms    navms_password
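
Typing the two statements six times is error-prone, so the databases listed above can be generated in one pass. This is a minimal sketch: the function name print_cm_database_sql is hypothetical, the names are written in lower case, and the passwords are the placeholder values from the table, which should be replaced with real ones.

```shell
#!/bin/sh
# Sketch: print the CREATE DATABASE and GRANT statements for every
# role in the table. Each input line is: database user password.
# print_cm_database_sql is an illustrative helper name.
print_cm_database_sql() {
    while read -r db user password; do
        printf "CREATE DATABASE %s DEFAULT CHARACTER SET utf8;\n" "$db"
        printf "GRANT ALL ON %s.* TO '%s'@'%%' IDENTIFIED BY '%s';\n" \
            "$db" "$user" "$password"
    done <<'EOF'
amon amon amon_password
rman rman rman_password
metastore hive hive_password
sentry sentry sentry_password
nav nav nav_password
navms navms navms_password
EOF
}
```

The function only prints SQL, so the output can be reviewed first and then piped into the client, for example print_cm_database_sql | mysql -u root -p.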

Install Cloudera Manager Server

$ yum install cloudera-manager-daemons

$ yum install cloudera-manager-server

Configure the Cloudera Manager Server Database

Cloudera Manager Server and MySQL are installed on the same machine, so run the following command on that machine:

/usr/share/cmf/schema/scm_prepare_database.sh mysql -uroot -p --scm-host localhost scm scm scm

If the output ends with a success message, the database was prepared correctly.

Start the Cloudera Manager Server Service

service cloudera-scm-server start

You do not need to install the Cloudera Manager agents manually. The CM web interface installs and configures them automatically.

Web login

URL format: http://<host>:<port>

Open a browser and enter the URL http://10.100.3.95:7180

Default user name: admin. Default password: admin.

Install Cloudera Manager Agent and CDH Components

1. Following the CM setup wizard, select the free Cloudera Express edition. Click Continue to specify the hosts for the CDH cluster.

2. Enter the IP addresses of the machines on which the cluster is to be installed, including the Cloudera Manager Server host.

3. Select the cluster installation method: choose "Use Packages", select "Custom" for the CDH version and enter the yum repository address ftp://10.100.3.17/pub/, then select "Custom" for the Cloudera Manager Agent and enter the same address, ftp://10.100.3.17/pub/. Click Continue.

4. Cluster installation status. The installation progress of each host is displayed. If every host installs normally, proceed to the next step.

5. Select the CDH components to install: HBase, HDFS, Hive, Hue, Key-Value Store Indexer, Oozie, Solr, Spark, Sqoop 2, YARN, and ZooKeeper. Click Continue.

6. CM inspects the installation environment and shows a warning: Cloudera recommends setting /proc/sys/vm/swappiness to 0, and the current value is 60. Run the following command on each machine in the cluster:

echo 0 > /proc/sys/vm/swappiness

7. Assign roles to the cluster machines. For most roles the default choice, the master machine (10.100.3.95), can be kept. The Secondary NameNode can be placed on a machine other than the NameNode. Note that the Cloudera Management Service must be assigned to the master (10.100.3.95), the host on which MySQL is installed, because MySQL is not installed on the other hosts. Click Continue.

8. Database configuration. Fill in the database settings for each service according to the databases created earlier.

9. Cluster settings. Keep the default values and start the cluster installation.
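
Writing to /proc/sys/vm/swappiness in step 6 changes the value only until the next reboot. One way to make it permanent is to record it in /etc/sysctl.conf as well. The sketch below assumes that file is where the machine keeps its kernel settings; the function name persist_swappiness is hypothetical.

```shell
#!/bin/sh
# Sketch: store vm.swappiness in a sysctl configuration file so the
# setting survives a reboot. persist_swappiness is an illustrative name.
persist_swappiness() {
    value="$1"
    conf_file="${2:-/etc/sysctl.conf}"
    tmp_file="$(mktemp)" || return 1
    if [ -f "$conf_file" ]; then
        # Copy every line except an existing vm.swappiness entry.
        grep -v '^[[:space:]]*vm\.swappiness[[:space:]]*=' "$conf_file" > "$tmp_file" || true
    fi
    # Append the new value so the key appears exactly once.
    printf 'vm.swappiness = %s\n' "$value" >> "$tmp_file"
    cat "$tmp_file" > "$conf_file"
    status=$?
    rm -f "$tmp_file"
    return "$status"
}
```

Run persist_swappiness 0 as root on each machine, then sysctl -p to load the file immediately.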

Oozie Configuration

After Oozie is installed, you need to configure the following:

1. Install the Oozie shared library as follows:

  • Select the Oozie service
  • Click Operation -> Stop
  • Click Install Oozie shared library
  • Click Start

2. Configure the Ext JS library as follows:

  • Download the ext-2.2.zip file from http://dev.sencha.com/deploy/ext-2.2.zip
  • Put the file in the /var/lib/oozie/ directory of the host running the Oozie server.
  • Decompress the file
  • Restart the Oozie service

3. Configure the external database, here MySQL, as follows:

  • Select the Oozie service and click the Configuration panel
  • Select Oozie Server Default Group -> Database
  • Set Oozie Server Database Type to mysql, leave Oozie Server Database Name at its default value, oozie, and fill in Oozie Server Database Host, Oozie Server Database User, and Oozie Server Database Password. Then save the configuration
  • Select Operation -> Stop
  • Select Operation -> Create a database
  • Select Operation -> Start

Hue Configuration

For the basic procedure, see the "Add the Hue service" section of the Cloudera Manager installation documentation. In addition, because Hue depends on the configuration of other services, the following settings must be changed before the Hue service will start:

  • Open the CM Hosts panel
  • Select Configuration -> Resource Management and set the Enable Cgroup-based Resource Management property to true (default: false)
  • Open the YARN service panel
  • Select Configuration -> Service Scope, set the Use CGroups for Resource Management property to true (default: false), and set Always Use Linux Container Executor to true (default: false)
  • Open the Impala service panel
  • Select Configuration -> Service Scope -> Admission Control and set the Enable Dynamic Resource Pools property to true (default: false)

Problems encountered

1. During the first installation, machines 98 and 99 shut down abnormally because of a memory fault, so the installation was interrupted halfway, and a mistaken operation returned the wizard to the first installation step. When we then entered the hosts to install, the wizard showed the three remaining machines as already added to a cluster, which made them unavailable. Solution: uninstall and reinstall on the three machines. The procedure is as follows:

service cloudera-scm-agent stop

service cloudera-scm-agent hard_stop_confirmed

yum remove 'cloudera-manager-*' avro-tools crunch flume-ng hadoop-hdfs-fuse hadoop-hdfs-nfs3 hadoop-httpfs hbase-solr hive-hbase hive-webhcat hue-beeswax hue-hbase hue-impala hue-pig hue-plugins hue-rdbms hue-search hue-spark hue-sqoop hue-zookeeper impala impala-shell kite llama mahout oozie pig pig-udf-datafu search sentry solr-mapreduce spark-python sqoop sqoop2 whirr

yum clean all

rm -rf /tmp/.scm_prepare_node.lock

rm -rf /var/lib/flume-ng /var/lib/hadoop* /var/lib/hue /var/lib/solr* /var/lib/zookeeper* /var/lib/spark/

After that, start over from the Cloudera Manager Server installation.

2. On 10.100.3.98 and 10.100.3.99, downloading and installing the CDH components fails with the message: network_interfaces info nic iface eth0 doesn't support ETHTOOL (95). The cause is that the IP address is lost when service network restart runs on machines 98 and 99, because the machines were installed with dynamic IP addresses. Solution: change them to static IP addresses.

3. 10.100.3.98 and 10.100.3.99 hang at 'getting installation lock' while downloading the CDH packages. The details show: Begin flock 4. The cause is that the cloudera-manager-agent package was installed several times and the cloudera-scm-agent service was started, which left a lock file behind. Delete the file with the following command:

rm -rf /tmp/.scm_prepare_node.lock

4. 10.100.3.98 and 10.100.3.99 fail to download and install the CDH components. The details show:

MainThread agent ERROR Heartbeating to 10.100.3.95:7182 failed

...

...

AttributeError: 'NoneType' object has no attribute 'type'

Solution: log in to the affected machines and restart the cloudera-scm-agent service:

service cloudera-scm-agent restart

5. After the installation completes, HDFS raises an Under-Replicated Blocks alarm: the cluster contains 293 under-replicated blocks out of 296 blocks in total, a percentage of 98.99% against a critical threshold of 40%. The cause is that machines 98 and 99 had failed at the beginning, so only three machines were installed, with only two DataNodes, while the default configuration used during installation sets dfs.replication to 3. Checking the HDFS blocks with the hadoop fsck / command reports "Target Replicas is 3 but found 2 replica(s)" for the blocks written during the HBase installation. Solution: set dfs.replication to 2 and run the following commands:

su hdfs

hadoop fs -setrep 2 /

6. After Hue is installed, the Hue web UI cannot be started. The home page reports an error:

Traceback (most recent call last):

...

...

ImportError: No module named useradmin

Looking in the /usr/lib/hue/ directory shows that the symbolic link to the app.reg file is broken because the file was never created. Solution: the /usr/lib/hue/tools/app_reg/ directory contains an app_reg.py script that generates the registration file. After checking its usage, run tools/app_reg/app_reg.py --install apps/xxx/ from /usr/lib/hue, where xxx is the name of each directory under apps. The problem is solved once every app has been registered.
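
Registering each app by hand is repetitive, so the commands can be generated with a loop. This is a sketch under the assumption that every subdirectory of /usr/lib/hue/apps is an app to register; the function name print_app_reg_commands is hypothetical, and it only prints the commands so they can be checked before anything is run.

```shell
#!/bin/sh
# Sketch: print one app_reg.py command per directory under apps/.
# print_app_reg_commands is an illustrative helper name; the optional
# argument is the Hue home directory.
print_app_reg_commands() {
    hue_home="${1:-/usr/lib/hue}"
    # The trailing slash in the pattern matches directories only.
    for app_dir in "$hue_home"/apps/*/; do
        [ -d "$app_dir" ] || continue
        app_name="$(basename "$app_dir")"
        printf 'tools/app_reg/app_reg.py --install apps/%s\n' "$app_name"
    done
}
```

To execute the printed commands, run cd /usr/lib/hue and then print_app_reg_commands | sh.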

These are all the problems encountered during the installation. The next step is to upgrade the original cluster from CDH3U5 to CDH 5.3.2.
