Version conventions
Operating system: CentOS 6.5, 64-bit
JDK: 1.6 (1.7 is also supported)
Python: 2.6 or 2.7
Cluster version: CDH 5.3.2
Cloudera Manager: 5.3
MySQL: 5.0 or later
CM installation instructions
Disable the firewall: service iptables stop; chkconfig iptables off
Disable SELinux: set SELINUX=disabled in /etc/selinux/config
- Install mysql before installing cloudera manager, edit its configuration file, and create the required databases.
- Change the default storage engine of the mysql database to InnoDB.
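As a rough sketch of that last step (the option below is standard MySQL configuration, but it is not part of the my.cnf listing given later in this guide), the default engine can be set in /etc/my.cnf and verified from the mysql client:
# /etc/my.cnf -- add under the [mysqld] section
default-storage-engine = InnoDB
# verify after restarting mysqld:
mysql> SHOW ENGINES;   -- InnoDB should show "DEFAULT" in the Support column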
CM Installation Method
The installation method is yum. Because the intranet machines cannot access the Internet, we must build a local yum repository; the yum repository machine is 10.100.3.17.
Build a local YUM Repository
The rpm packages required for the yum installation include the Cloudera Manager 5 packages: http://archive-primary.cloudera.com/cm5/redhat/6/x86_64/cm/5.3.2/RPMS/x86_64,
and the CDH 5.3.2 packages: http://archive-primary.cloudera.com/cdh5/redhat/6/x86_64/cdh/5.3.2/RPMS/x86_64/ and http://archive-primary.cloudera.com/cdh5/redhat/6/x86_64/cdh/5.3.2/RPMS/noarch/
Upload the relevant packages to the /var/ftp/pub/Packages directory on machine 10.100.3.17, and then run the createrepo command:
createrepo -g /var/ftp/pub/repodata/repomd.xml /var/ftp/pub/
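If the repository machine does not yet serve /var/ftp over FTP, a minimal sketch of preparing it might look like the following (the use of vsftpd and the anonymous /var/ftp root are assumptions; the original text only says where to upload the packages):
yum install -y createrepo vsftpd   # createrepo builds the repodata metadata used above
service vsftpd start
chkconfig vsftpd on                # start vsftpd on boot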
Configure the local yum source on the CDH cluster machines:
cd /etc/yum.repos.d/
rm -rf *            # delete the existing yum source files
vi ftp-server.repo  # create the ftp-server.repo file and add the following configuration:
[base]
name=ftp-server
baseurl=ftp://10.100.3.17/pub/
gpgcheck=0
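To check that a cluster machine can actually see the new repository (a quick verification step, not in the original text):
yum clean all
yum repolist    # the "base" repository pointing at ftp://10.100.3.17/pub/ should be listed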
How to install the CDH cluster with Cloudera Manager
There are five CDH cluster machines: 10.100.3.95, 10.100.3.96, 10.100.3.97, 10.100.3.98, and 10.100.3.99.
Deploy the cloudera manager agent on each of these five machines,
and deploy the cloudera manager server and mysql on 10.100.3.95.
Install jdk
First, check whether openJDK is already installed on the cluster machines. If it is, uninstall it with the following commands:
rpm -qa | grep jdk
rpm -e xxx    # xxx is the rpm package name output by the previous command
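A hypothetical one-liner that combines the two steps above (the loop and the --nodeps flag are assumptions, not from the original text):
for pkg in $(rpm -qa | grep jdk); do rpm -e --nodeps "$pkg"; done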
Install the jdk on all machines and configure JAVA_HOME by running the following commands:
yum install jdk
vi /etc/profile    # add the following configuration
export JAVA_HOME=/usr/java/jdk1.6.0_31
export PATH=$JAVA_HOME/bin:$JAVA_HOME/lib:$PATH
# Make the configuration take effect
source /etc/profile
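A quick check that the JDK and JAVA_HOME are picked up on each machine (not part of the original steps):
java -version      # should report the installed JDK version
echo $JAVA_HOME    # should print the path configured in /etc/profile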
Configure NTP service
We need to configure ntp time synchronization for the cluster. After the cluster is installed, Cloudera Manager checks the time synchronization of the cluster; if the clocks are not synchronized, an alarm is triggered:
Bad Health -- Clock Offset
The host's NTP service did not respond to a request for the clock offset.
We use 10.100.3.95 as the master machine. All machines synchronize their time with it, and the ntp service is installed on every machine.
Configure the NTP server on the 95 machine: modify the /etc/ntp.conf file and add the following configuration:
restrict 0.0.0.0 mask 0.0.0.0 nomodify notrap   # allow all network segments to synchronize time with this machine
server 127.127.1.0
fudge 127.127.1.0 stratum 8
Start the NTP service
/etc/init.d/ntpd start
chkconfig ntpd on
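To confirm that ntpd is serving time on the master (an optional check; the second command is the same one Cloudera Manager uses to measure clock offset, as noted below):
ntpq -p              # lists peers; the local clock source 127.127.1.0 should appear
ntpdc -c loopinfo    # shows the current clock offset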
The other machines synchronize their time with this machine and must also have the ntpd service enabled. If the ntpd service is not enabled on a machine, Cloudera Manager will also raise an alarm; Cloudera Manager uses the ntpdc -c loopinfo command to determine the clock offset. The command to synchronize the cluster time is:
ntpdate 10.100.3.95
# Add the command to crontab
crontab -e
*/15 * * * * ntpdate 10.100.3.95
Install Mysql
Cloudera Manager stores service information and cluster configuration in a database. You can use the embedded PostgreSQL database or an external database system; MySQL, Oracle, and external PostgreSQL are currently supported. Here we install an external MySQL database.
$ yum install mysql mysql-devel mysql-server
# Start mysql after installation;
$ service mysqld start
# Configure the mysql database: add the following content to the mysql configuration file (/etc/my.cnf by default on CentOS) and restart mysql. If no error is reported, the configuration is successful;
[mysqld]
transaction-isolation = READ-COMMITTED
# Disabling symbolic-links is recommended to prevent assorted security risks;
# to do so, uncomment this line:
# symbolic-links = 0
key_buffer = 16M
key_buffer_size = 32M
max_allowed_packet = 32M
thread_stack = 256K
thread_cache_size = 64
query_cache_limit = 8M
query_cache_size = 64M
query_cache_type = 1
max_connections = 550
#log_bin should be on a disk with enough free space. Replace '/var/lib/mysql/mysql_binary_log' with an appropriate path for your system and chown the specified folder to the mysql user.
#log_bin=/var/lib/mysql/mysql_binary_log
#expire_logs_days = 10
#max_binlog_size = 100M
# For MySQL version 5.1.8 or later. Comment out binlog_format for older versions.
binlog_format = mixed
read_buffer_size = 2M
read_rnd_buffer_size = 16M
sort_buffer_size = 8M
join_buffer_size = 8M
# InnoDB settings
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 2
innodb_log_buffer_size = 64M
innodb_buffer_pool_size = 4G
innodb_thread_concurrency = 8
innodb_flush_method = O_DIRECT
innodb_log_file_size = 512M
[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
Remove these two files: /var/lib/mysql/ib_logfile0 and /var/lib/mysql/ib_logfile1, and restart the mysql service.
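A minimal sketch of that step (stopping mysqld before deleting the InnoDB log files is an assumption; the original only says to remove them and restart):
service mysqld stop
rm -f /var/lib/mysql/ib_logfile0 /var/lib/mysql/ib_logfile1
service mysqld start    # new log files are recreated at the size set by innodb_log_file_size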
Install the MySQL JDBC Connector driver
$ mkdir -p /usr/share/java/
$ cp mysql-connector-java-5.1.17.jar /usr/share/java/mysql-connector-java.jar
Configure Mysql
Set the password of the root account:
$ sudo /usr/bin/mysql_secure_installation
[...]
Enter current password for root (enter for none):
OK, successfully used password, moving on...
[...]
Set root password? [Y/n] y
New password:
Re-enter new password:
Remove anonymous users? [Y/n] Y
[...]
Disallow root login remotely? [Y/n] N
[...]
Remove test database and access to it [Y/n] Y
[...]
Reload privilege tables now? [Y/n] Y
All done!
Create a Mysql database
Create a database to save the service configurations of Activity Monitor, Report Manager, Hive MetaStore Server, Sentry Server, Cloudera Navigator Audit Server, and Cloudera Navigator Metadata Server.
Log on to the Mysql database as the root user:
$ mysql -u root -p
Enter password:
Create a database
mysql> create database database DEFAULT CHARACTER SET utf8;
Query OK, 1 row affected (0.00 sec)
mysql> grant all on database.* TO 'user'@'%' IDENTIFIED BY 'password';
Query OK, 0 rows affected (0.00 sec)
The following table lists the databases, users, and passwords.
Role | Database | User | Password
Activity Monitor | amon | amon | amon_password
Reports Manager | rman | rman | rman_password
Hive Metastore Server | metastore | hive | hive_password
Sentry Server | sentry | sentry | sentry_password
Cloudera Navigator Audit Server | nav | nav | nav_password
Cloudera Navigator Metadata Server | navms | navms | navms_password
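As a worked instance of the template above, the Activity Monitor row of the table would be created like this (the values are simply the placeholders from the table):
mysql> create database amon DEFAULT CHARACTER SET utf8;
mysql> grant all on amon.* TO 'amon'@'%' IDENTIFIED BY 'amon_password';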
Install Cloudera Manager Server
$ yum install cloudera-manager-daemons
$ yum install cloudera-manager-server
Configure the Cloudera Manager Server Database
Cloudera Manager Server and mysql are installed on the same machine; run the following command:
/usr/share/cmf/schema/scm_prepare_database.sh mysql -uroot -p --scm-host localhost scm scm scm
If "Successful" is displayed, the script executed successfully.
Start the Cloudera Manager Server Service
service cloudera-scm-server start
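To confirm the server started (an optional check, not in the original text; the log path is the package default):
service cloudera-scm-server status
tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log    # the web UI on port 7180 is ready once the Jetty server has started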
You do not need to install the Cloudera Manager agent manually; go directly to the CM web interface, which installs and configures it automatically.
Web login
URL format: http://<Cloudera Manager Server host>:<port>
Open your browser and enter the URL: http://10.100.3.95:7180
Default user name: admin, password: admin
Install Cloudera Manager Agent and CDH Components
1. On the CM welcome screen, select the free Cloudera Express edition. Click Next to specify the hosts on which to install the CDH cluster.
2. Enter the IP addresses of the machines on which the cluster is to be installed, including the Cloudera Manager Server host.
3. Select the cluster installation method: choose "Use Packages", select "Custom" for the CDH version and enter the yum source address ftp://10.100.3.17/pub/, then select "Custom" for the Cloudera Manager Agent and enter the same yum source address ftp://10.100.3.17/pub/. Click Continue.
4. Cluster installation status: the installation progress of each host is displayed. If everything is normal, proceed to the next step.
5. Select the CDH components to install: HBase, HDFS, Hive, Hue, Key-Value Store Indexer, Oozie, Solr, Spark, Sqoop 2, YARN, and ZooKeeper. Click Continue.
6. CM checks the installation environment and shows a warning: Cloudera recommends setting /proc/sys/vm/swappiness to 0, while the current value is 60. Run the following command on each machine in the cluster:
echo 0 > /proc/sys/vm/swappiness
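The echo command only changes the value until the next reboot; making it persistent via /etc/sysctl.conf is an extra step not mentioned in the original guide:
echo "vm.swappiness = 0" >> /etc/sysctl.conf
sysctl -p    # reload kernel parameters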
7. Assign roles to the cluster machines. The defaults are acceptable; you can place the Secondary NameNode on a machine other than the NameNode (the Master, 10.100.3.95). Note that the Cloudera Management Service roles must be assigned to the Master (10.100.3.95), i.e. the host where mysql is installed, because mysql is not installed on the other hosts. Click Continue.
8. Database configuration: select the corresponding database for each service, based on the databases created earlier.
9. Cluster settings: keep the default values and start the cluster installation.
Oozie Configuration
After oozie is installed, you need to configure the following:
1. Install the Oozie shared library as follows:
- Select the oozie service
- Click Operation -> Stop
- Click Install Oozie shared library
- Click Start
2. Configure the Ext JS library as follows (a shell sketch of these steps follows item 3 below):
- Download the ext-2.2.zip file: http://dev.sencha.com/deploy/ext-2.2.zip
- Put the file in the /var/lib/oozie/ directory on the host running the oozie Server
- Decompress the file
- Restart the oozie service
3. Configure the external database here to configure mysql as follows:
- Select the oozie service and click the Configuration panel
- Select Oozie Server Default Group -> Database
- Set Oozie Server Database Type to mysql, keep the Oozie Server Database Name default value oozie, and fill in the Oozie Server Database Host, Oozie Server Database User, and Oozie Server Database Password. Save the configuration
- Select Operation -> Stop
- Select Operation -> Create a database
- Select Operation -> Start
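A minimal sketch of the Ext JS steps from item 2 as shell commands run on the Oozie Server host (the use of wget/unzip and the chown are assumptions, not from the original text):
cd /var/lib/oozie/
wget http://dev.sencha.com/deploy/ext-2.2.zip
unzip ext-2.2.zip               # creates /var/lib/oozie/ext-2.2
chown -R oozie:oozie ext-2.2    # assumed ownership; adjust for your environment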
HUE Configuration
For the configuration process, see the Cloudera Manager installation document, section "Add the hue service". The difference is that to start the HUE service, and because HUE depends on other services' configurations, you need to modify the following configuration:
- Enter the CM Host panel
- Select Configuration -> Resource Management and modify the Enable Cgroup-based Resource Management property: set it to true (the default is false);
- Enter the YARN service panel
- Select Configuration -> Service Scope and modify the Use CGroups for Resource Management property: set it to true (the default is false); also set Always Use Linux Container Executor to true (the default is false);
- Enter the Impala service panel
- Select Configuration -> Service Scope -> Admission Control and modify the Enable Dynamic Resource Pools property: set it to true (the default is false).
Problems encountered
1. During the first installation, machines 98 and 99 shut down abnormally due to memory problems, so the installation was aborted halfway, and a wrong operation returned the wizard to the first step of the installation. However, when selecting the hosts on which to install the cluster, the three machines already added to the cluster appeared in the wizard interface, which made those three machines unavailable. Solution: uninstall and reinstall on the three machines. The procedure is as follows:
service cloudera-scm-agent stop
service cloudera-scm-agent hard_stop_confirmed
yum remove 'cloudera-manager-*' avro-tools crunch flume-ng hadoop-hdfs-fuse hadoop-hdfs-nfs3 hadoop-httpfs hbase-solr hive-hbase hive-webhcat hue-beeswax hue-hbase hue-impala hue-pig hue-plugins hue-rdbms hue-search hue-spark hue-sqoop hue-zookeeper impala impala-shell kite llama mahout oozie pig pig-udf-datafu search sentry solr-mapreduce spark-python sqoop sqoop2 whirr
yum clean all
rm -rf /tmp/.scm_prepare_node.lock
rm -rf /var/lib/flume-ng /var/lib/hadoop* /var/lib/hue /var/lib/solr* /var/lib/zookeeper* /var/lib/spark/
Then you can start by installing the Cloudera Manager Server again.
2. The 10.100.3.98 and 10.100.3.99 servers fail to download and install the CDH-related components, and the following message is displayed: network_interfaces info nic iface eth0 doesn't support ETHTOOL (95). In addition, the IP address was lost whenever service network restart was run on machines 98 and 99, because a dynamic IP address had been used when the machines were installed; it was changed to a static IP address.
3. 10.100.3.98 and 10.100.3.99 get stuck at 'getting installation lock' while downloading the CDH packages. Clicking the details shows: Begin flock 4. The reason is that the cloudera-manager-agent package had been installed several times and the cloudera-scm-agent service had already been started, leaving a lock file behind. Delete the file by running the following command:
rm -rf /tmp/.scm_prepare_node.lock
4. 10.100.3.98 and 10.100.3.99 fail to download and install the CDH components. Clicking the details shows:
MainThread agent ERROR Heartbeating to 10.100.3.95:7182 failed
...
...
AttributeError: 'NoneType' object has no attribute 'type'
Solution: log on to the affected machine and restart the cloudera-scm-agent service by running:
service cloudera-scm-agent restart
5. After the installation is complete, HDFS raises an alarm: 'the cluster contains 293 blocks with insufficient copies, out of 296 blocks in total; percentage of under-replicated blocks: 98.99%; critical threshold: 40% (Under-Replicated Blocks)'. The cause is that machines 98 and 99 failed at the beginning, so only three machines were installed and only two DataNodes were running, while the default configuration used during installation sets dfs.replication to 3, which triggers the alarm. Checking the block information with hadoop fsck / shows 'Target replica is 3 but found 2 replica(s)' for the blocks written during the hbase installation. Solution: set dfs.replication to 2 and run the following commands:
su hdfs
hadoop fs -setrep 2 /
6. After HUE is installed, the hue web UI cannot be started, and the home page reports an error:
Traceback (most recent call last ):
...
...
ImportError: No module named useradmin
Looking at the /usr/lib/hue/ directory, the symbolic links to the app.reg registration files are broken and the files were never created. The solution: the /usr/lib/hue/tools/app_reg/ directory contains an app_reg.py script that generates the registration files; after checking its syntax, run tools/app_reg/app_reg.py --install apps/xxx/, where xxx is the name of each directory under apps. The problem was resolved after running it.
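A minimal sketch of running that command over every app directory (the loop is an assumption; the original describes running the command per app name):
cd /usr/lib/hue
for app in apps/*/; do
    tools/app_reg/app_reg.py --install "$app"
done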
These are all the problems encountered during the installation. Next, the original cluster will be upgraded from CDH3u5 to CDH5.3.2.