CDH5 installation guide suited to users in China
0. Cluster Planning
Note: CDH allows you to conveniently add and delete hosts and dynamically change the services on the hosts. Therefore, you can allocate the services that run on each machine later.
Three machines in total
Operating system: CentOS 6.5
Machine names: work01, work02, and work03
work03 runs Cloudera Manager
1. Disable firewall and SELinux
Note: If the firewall and SELinux are left enabled, communication between cluster nodes may fail and services may not work. Take extra care with this step if the cluster is a production environment used as an online service.
1.1 Disable the firewall:
service iptables stop (stops the firewall for the current session)
chkconfig iptables off (keeps it disabled after a restart)
1.2 Disable SELinux:
setenforce 0 (takes effect temporarily; this did not work in this test)
Set SELINUX=disabled in /etc/selinux/config (takes effect permanently after a restart).
View the SELinux status: /usr/sbin/sestatus -v
Note: Perform the same operations on all three machines.
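The permanent SELinux change from 1.2 can also be scripted. The following is a minimal sketch rather than a tested procedure: the function name disable_selinux_in is made up for illustration, and it assumes the configuration file contains the usual SELINUX=<mode> line. The path is passed as an argument so that the function can be tried on a copy first.

```shell
# Hypothetical helper: set SELINUX=disabled in the given config file.
# A backup copy is written next to the file with a .bak suffix.
disable_selinux_in() {
    conf=$1
    cp "$conf" "$conf.bak" || return 1
    # Replace only the SELINUX= line; SELINUXTYPE= and comments are untouched.
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$conf"
}

# Example (as root, on each of the three machines):
#   disable_selinux_in /etc/selinux/config
```

The change still needs a restart to take effect, as described in 1.2.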
2. Set the host name (FQDN)
Note:
A. Perform the same operations on all three machines.
B. Configure the host name of each machine in /etc/sysconfig/network.
C. Use the same /etc/hosts content on all three machines so that they can reach each other by host name.
D. If there are many machines, you can configure a DNS server to resolve the host names instead.
1) Modify the /etc/sysconfig/network file (use work02 and work03 on the other two machines)
NETWORKING=yes
HOSTNAME=work01
2) Modify the /etc/hosts file
192.168.1.185 work01 work01
192.168.1.141 work02 work02
192.168.1.198 work03 work03
3) Restart the network service for the change to take effect: service network restart
In this test, restarting the network service dropped the network connection and it did not reconnect automatically; the connection had to be re-established by clicking the connection icon. Proceed with caution.
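Since all three machines need the same /etc/hosts entries, a quick check before continuing may save debugging later. This is a sketch under the assumption that the file follows the usual "IP name alias" layout; missing_hosts is a hypothetical helper name, not an existing tool.

```shell
# Hypothetical helper: print every given host name that has no entry in
# a hosts-format file. Returns non-zero if at least one name is missing.
missing_hosts() {
    hosts_file=$1
    shift
    missing=0
    for name in "$@"; do
        # Skip comment lines, then look for the name in any column after the IP.
        if ! awk -v n="$name" '
                /^[[:space:]]*#/ { next }
                { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                END { exit !found }' "$hosts_file"; then
            echo "$name"
            missing=1
        fi
    done
    return $missing
}

# Example: missing_hosts /etc/hosts work01 work02 work03
```

Running it on each machine and getting no output suggests that all three names are at least present in the file.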
3. Password-less SSH login between cluster machines
Note:
A. Files are copied between machines over SSH, and service start commands are sent over SSH. Setting up password-less SSH login between the machines avoids typing passwords every time a service is started.
B. Cloudera Manager appears to manage the login passwords itself, so this step can probably be skipped. Try it if you are interested.
C. Password-less SSH login works by generating a key pair (a public key and a private key) and giving the public key to the other side. For example, if A gives its public key to B, A can then log in to B without a password.
D. The generated public key is id_rsa.pub. The public keys of the machines allowed to log in are stored in the authorized_keys file on the machine being accessed.
E. To store the public keys of several machines, append each of them to authorized_keys.
1) Switch to the root account on work01
su
2) Generate the private and public key of the root account on work01
ssh-keygen -t rsa
Press Enter at each prompt to generate the public key id_rsa.pub and the private key id_rsa.
3) Generate the private and public key of the root account on work02 and work03 in the same way
4) Copy the public key files on work02 and work03 to work01.
[root@work02 ~]# scp ~/.ssh/id_rsa.pub root@work01:~/.ssh/work02.pub
[root@work03 ~]# scp ~/.ssh/id_rsa.pub root@work01:~/.ssh/work03.pub
Use a different file name for each copy so that the files do not overwrite each other.
5) Append the public keys of work01, work02, and work03 to the authorized_keys file on work01 (run in ~/.ssh; >> appends, whereas > would overwrite the file).
cat id_rsa.pub >> authorized_keys
cat work02.pub >> authorized_keys
cat work03.pub >> authorized_keys
6) Copy the authorized_keys file on work01 to work02 and work03.
[root@work01 ~]# scp ~/.ssh/authorized_keys root@work02:~/.ssh/
[root@work01 ~]# scp ~/.ssh/authorized_keys root@work03:~/.ssh/
Note: Password-less login works only for the account that generated the keys, so generate the keys with the same account that will start services remotely.
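Step 5 can be repeated safely if the keys are appended only when they are not already present. The following sketch illustrates that idea; append_keys is a hypothetical helper name, and the file names in the example are the ones used in step 4.

```shell
# Hypothetical helper: append each public key file to an authorized_keys
# file, skipping keys that are already listed, then restrict permissions.
append_keys() {
    auth=$1
    shift
    touch "$auth" || return 1
    for pub in "$@"; do
        # The "|| [ -n ... ]" part also handles a last line without a newline.
        while IFS= read -r key || [ -n "$key" ]; do
            [ -n "$key" ] || continue
            # -x: whole-line match, -F: fixed string, -q: quiet
            grep -qxF -e "$key" "$auth" || printf '%s\n' "$key" >> "$auth"
        done < "$pub"
    done
    chmod 600 "$auth"
}

# Example on work01:
#   cd ~/.ssh && append_keys authorized_keys id_rsa.pub work02.pub work03.pub
```

Because existing keys are skipped, running the function twice leaves authorized_keys unchanged the second time.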
4. yum source configuration
Note: The default yum sources shipped with the system are hosted abroad, so software installation is slow. Configuring a yum source in China speeds up installation.
1) Go to the yum source configuration directory.
cd /etc/yum.repos.d
2) Back up the yum source provided by the system
mv CentOS-Base.repo CentOS-Base.repo.bk
3) Download the 163 yum source:
wget http://mirrors.163.com/.help/CentOS6-Base-163.repo
mv CentOS6-Base-163.repo CentOS-Base.repo
4) After replacing the yum source, run the following commands to clear the old cache and rebuild it so that the change takes effect immediately.
yum clean all
yum makecache
5. Download the CDH parcel package
Note:
A. CentOS 6.x uses the CDH-xxxx-el6.parcel build; CentOS 5.x uses the CDH-xxxx-el5.parcel build.
B. Cloudera Manager can download the parcel automatically, but because of slow network speed the download may take several hours, and if an error occurs it starts again from the beginning. Downloading the parcel in advance speeds up the installation. The configuration method is described in step 7.
Download link: http://archive.cloudera.com/cdh5/parcels/latest/
Download CDH-5.1.0-1.cdh5.1.0.p0.53-el6.parcel and manifest.json
6. Install Cloudera Manager
Note:
A. The Cloudera Manager installer downloads the rpm files it needs automatically. Because the yum source for these files is hosted abroad, this is slow; downloading the rpm files manually is faster.
B. Run the Cloudera Manager installer once to obtain the address of the required rpm files.
6.1 Download the Cloudera Manager installer
http://archive.cloudera.com/cm5/installer/latest/cloudera-manager-installer.bin
6.2 Run the Cloudera Manager installer
chmod u+x cloudera-manager-installer.bin
./cloudera-manager-installer.bin
6.3 Obtain the address of the rpm files to be installed
1) Enter the yum source directory
cd /etc/yum.repos.d
2) Check whether the cloudera-manager yum source file has been created.
There should be an additional cloudera-manager.repo file.
3) Get the rpm address
cat cloudera-manager.repo
The rpm address is: baseurl=http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5/
6.4 Close the Cloudera Manager installation wizard
1) Close cloudera-manager-installer.bin
2) Kill the yum process started by the Cloudera Manager installation wizard.
ps aux | grep yum (find the ID of the yum process started by the installation wizard)
kill xxxx (kill that process by its ID)
6.5 Manually download the corresponding rpm files (7 files in total)
Download them from the RPMS directory under the address obtained in 6.3: http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5/RPMS/x86_64/
Name | Last modified | Size
cloudera-manager-agent-5.0.2-1.cm502.p0.297.el6.x86_64.rpm | 11-Jun-2014 | 3.7M
cloudera-manager-daemons-5.0.2-1.cm502.p0.297.el6.x86_64.rpm | 11-Jun-2014 | 315M
cloudera-manager-server-5.0.2-1.cm502.p0.297.el6.x86_64.rpm | 11-Jun-2014 | 8.0K
cloudera-manager-server-db-2-5.0.2-1.cm502.p0.297.el6.x86_64.rpm | 11-Jun-2014 | 9.6K
enterprise-debuginfo-5.0.2-1.cm502.p0.297.el6.x86_64.rpm | 11-Jun-2014 | 669K
jdk-6u31-linux-amd64.rpm | 11-Jun-2014 | 68M
oracle-j2sdk1.7-1.7.0+update45-1.x86_64.rpm | 11-Jun-2014 | 131M
6.6 Manually install the downloaded rpm files
yum localinstall --nogpgcheck *.rpm
6.7 Run the Cloudera Manager installer again
Two errors occurred during this run:
1) Problem description: fatal error
Solution: rm -rf /usr/share/cmf/
2) Problem description: Installation failed. Failed to start Embedded Service and Configuration Database. See /var/log/cloudera-manager-installer/5.start-embedded-db.log for details.
Opening that log with vim shows:
bash: /usr/share/cmf/bin/initialize_embedded_db.sh: No such file or directory
Solution: Reboot and run the installation wizard again; the error did not reappear.
7. Configure the CDH parcel package
Note:
A. Cloudera Manager can install CDH in two ways: from rpm packages or from parcels. This test uses parcels.
B. Cloudera Manager can download the required parcel automatically, but the connection is slow because it connects to a site abroad.
C. Configure the CDH parcel file downloaded in step 5 so that Cloudera Manager reads the local parcel file directly.
7.1 Put the previously downloaded CDH parcel file in the /opt/cloudera/parcel-repo directory
7.2 Generate the corresponding sha file
1) In the manifest.json file downloaded in step 5, find the hash value for the parcel "CDH-5.1.0-1.cdh5.1.0.p0.53-el6.parcel"
"hash": "67fc4c86b260eeba15c339f1ec6be3b59b4ebe30"
2) Store the hash value in the sha file.
echo '67fc4c86b260eeba15c339f1ec6be3b59b4ebe30' > CDH-5.1.0-1.cdh5.1.0.p0.53-el6.parcel.sha
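Before Cloudera Manager reads the local parcel, it may be worth confirming that the downloaded file matches the hash stored in 7.2, since an interrupted download would otherwise only show up later. The sketch below assumes that the manifest hash is a SHA-1 digest of the parcel file (its 40 hexadecimal characters are consistent with that) and that sha1sum is installed; verify_parcel_sha is a hypothetical name.

```shell
# Hypothetical helper: compare the SHA-1 digest of a parcel file with the
# value stored in the matching .sha file (assumes the hash is SHA-1).
verify_parcel_sha() {
    parcel=$1
    # Strip spaces and line endings from the stored value.
    expected=$(tr -d ' \n\r' < "$parcel.sha") || return 1
    actual=$(sha1sum "$parcel" | awk '{ print $1 }')
    if [ "$expected" = "$actual" ]; then
        echo "OK: $parcel"
    else
        echo "MISMATCH: $parcel (expected $expected, got $actual)"
        return 1
    fi
}

# Example, run in /opt/cloudera/parcel-repo:
#   verify_parcel_sha CDH-5.1.0-1.cdh5.1.0.p0.53-el6.parcel
```

A mismatch most likely means the parcel should be downloaded again.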
8. Start Cloudera Manager
Note:
A. Follow the prompts of the Cloudera Manager installation wizard to open Cloudera Manager.
B. The CDH installation wizard starts on first use. Configure the cluster by following the wizard.
The following problem occurred during installation. For the solution, see Problem 1 in the problem list:
python -c 'import socket; import sys; s = socket.socket(socket.AF_INET); s.settimeout(5.0); s.connect(("localhost", int(7182))); s.close();'
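The Python one-liner above tests whether the Cloudera Manager Server port accepts TCP connections. A comparable check can be sketched in shell, assuming bash is built with /dev/tcp support and the timeout command from coreutils is available; port_open is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if a TCP connection to host:port can be
# opened within 5 seconds. Relies on bash's /dev/tcp pseudo-device.
port_open() {
    # Inside the inner bash, $0 is the host and $1 is the port.
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Example, from work01 or work02:
#   port_open work03 7182 && echo "reachable" || echo "not reachable"
```

Testing against the host name and against the IP address separately can help tell a name resolution problem from a firewall problem.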
9. Add a service
Note:
A. Only HDFS and HBase were installed in this test.
B. Cloudera Manager lets you add and remove services quickly.
C. When you add a service, the system indicates whether the services it depends on have been installed.
References:
Note: The reference documents are listed as completely as possible. Please point out any omissions.
Cloudera Manager and CDH 4 ultimate installation: http://www.tuicool.com/articles/AnuiUra
Cloudera Manager and CDH 4 installation: http://wenku.baidu.com/link?url=SOOI3r56NN7Un55Z3jsNprQp9PpOc-F8_ByXPJ7v4GJmAioEMLM6vL0Hkc2c0HSxztlWWvPOA13Grs1vf2-0wJdbueQfbEAvuNbGIldxxou
Semi-manual installation of the CDH suite: http://www.douban.com/note/352772895/
Install a CDH Hadoop cluster with yum (CDH5; disabling IPv6, hostname settings, yum source, clock sync): http://blog.javachen.com/2013/04/06/install-cloudera-cdh-by-yum/
View SELinux status and disable SELinux: http://bguncle.blog.51cto.com/3184079/957315
Modify the yum source in CentOS 6.5: http://www.cnblogs.com/liuling/p/2014-4-14-001.html
Problem list:
Problem 1 PTR localhost:
Description:
DNS reverse resolution error. The Cloudera Manager Server host name cannot be resolved correctly.
Logs:
Detecting Cloudera Manager Server...
Detecting Cloudera Manager Server...
BEGIN host -t PTR 192.168.1.198
198.1.168.192.in-addr.arpa domain name pointer localhost.
END (0)
using localhost as scm server hostname
BEGIN which python
/usr/bin/python
END (0)
BEGIN python -c 'import socket; import sys; s = socket.socket(socket.AF_INET); s.settimeout(5.0); s.connect((sys.argv[1], int(sys.argv[2]))); s.close();' localhost 7182
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<string>", line 1, in connect
socket.error: [Errno 111] Connection refused
END (1)
could not contact scm server at localhost:7182, giving up
waiting for rollback request
Inelegant workaround:
On the host that cannot connect, delete the /usr/bin/host file. The log then looks like this:
BEGIN host -t PTR 192.168.1.198
/tmp/scm_prepare_node.8OX5y7is/scm_prepare_node.sh: line 100: /usr/bin/host: Permission denied
END (126)
BEGIN which python
/usr/bin/python
END (0)
BEGIN python -c 'import socket; import sys; s = socket.socket(socket.AF_INET); s.settimeout(5.0); s.connect((sys.argv[1], int(sys.argv[2]))); s.close();' 192.168.1.198 7182
END (0)
BEGIN which wget
/usr/bin/wget
END (0)
BEGIN wget -qO- -T 1 -t 1 http://169.254.169.254/latest/meta-data/public-hostname && /bin/echo
END (4)
Note:
It is unclear why Cloudera designed it this way: the script already has the IP address of the Cloudera Manager Server, yet it resolves the IP address back to a host name before connecting.
Because DNS reverse resolution is not configured properly, resolving the IP address of the Cloudera Manager Server returns localhost, which causes the subsequent connection error.
The workaround here is to delete /usr/bin/host, so that Cloudera Manager connects by IP address directly and the error does not occur.
Reference:
Cloudera Manager 4.8
http://www.reader8.cn/jiaocheng/20140419/2307406.html
Problem 2 NTP:
Problem 2.1
Problem description:
Bad Health -- Clock Offset
The host's NTP service did not respond to a request for the clock offset.
Solution:
Configure the NTP service
Step references:
Configure an NTP server on CentOS:
http://www.hailiangchen.com/centos-ntp/
Common NTP server addresses and IP addresses in China:
http://www.douban.com/note/171309770/
Modify the configuration file:
[root@work03 ~]# vim /etc/ntp.conf
# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
server s1a.time.edu.cn prefer
server s1b.time.edu.cn
server s1c.time.edu.cn
restrict 172.16.1.0 mask 255.255.255.0 nomodify <=== allow access from LAN sources
Start ntp:
# service ntpd restart <=== start the ntp service
Synchronize the time on the clients (work02, work03):
ntpdate work01
Note: The NTP service takes about five minutes to become ready after it starts. If a client tries to synchronize before then, the error "no server suitable for synchronization found" appears.
Scheduled time synchronization:
Configure a crontab entry on work02 and work03 to synchronize the time every day at 12:00
crontab -e
0 12 * * * /usr/sbin/ntpdate 192.168.56.121 > /root/ntpdate.log 2>&1
(A crontab edited with crontab -e has no user field, so root is not listed. Replace the address with that of your own NTP server.)
Problem 2.2
Description:
Clock Offset
Ensure that the host's hostname is configured properly. Ensure that port 7182 is accessible on the Cloudera Manager Server (check firewall rules). Ensure that ports 9000 and 9001 are free on the host being added. Check agent logs in /var/log/cloudera-scm-agent/ on the host being added (some of the logs can be found in the installation details).
Locating the problem:
Run 'ntpdc -c loopinfo' on the affected hosts (work02, work03)
[root@work03 work]# ntpdc -c loopinfo
ntpdc: read: Connection refused
Solution:
Enable the ntp service:
Start the ntp service on all three machines and enable it at boot
service ntpd start
chkconfig ntpd on
Problem 3 heartbeat:
Error message:
Installation failed. Failed to receive heartbeat from agent.
Solution: Disable the firewall (see step 1).
Problem 4 Unknown Health:
Unknown Health
After restart: Request to Host Monitor failed.
service --status-all | grep clo
Checking the status of scm-agent on the machine shows: cloudera-scm-agent dead but pid file exists
Solution: Restart the services.
service cloudera-scm-agent restart
service cloudera-scm-server restart
Problem 5 canonical name and hostname consistency:
Bad Health
The hostname and canonical name for this host are not consistent when checked from a Java process.
Canonical name:
4092 Monitor-HostMonitor throttling_logger WARNING (29 skipped) hostname work02 differs from the canonical name work02.xinzhitang.com
Solution: Modify /etc/hosts so that the FQDN and the hostname are the same.
PS: The host name and the host alias must be the same.
/etc/hosts
192.168.1.185 work01 work01
192.168.1.141 work02 work02
192.168.1.198 work03 work03
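To see which canonical name a hosts file assigns to a host, the first name after the IP address can be extracted and compared with the host name. This is only a sketch: it assumes that name resolution is answered from /etc/hosts (a DNS server could answer differently), and canonical_name is a hypothetical helper name.

```shell
# Hypothetical helper: print the canonical name (first name after the IP)
# that a hosts-format file assigns to the given host name.
# Usage: canonical_name <hosts_file> <host_name>
canonical_name() {
    awk -v n="$2" '
        /^[[:space:]]*#/ { next }
        { for (i = 2; i <= NF; i++) if ($i == n) { print $2; exit } }' "$1"
}

# Example check on a cluster machine:
#   [ "$(canonical_name /etc/hosts "$(hostname)")" = "$(hostname)" ] || echo "inconsistent"
```

With an entry such as "192.168.1.141 work02.xinzhitang.com work02", the function prints work02.xinzhitang.com for work02, which corresponds to the mismatch reported in the warning.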
Problem 6 Concerning Health:
Concerning Health Issue
-- Network Interface Speed --
Description: The host has 2 network interface(s) that appear to be operating at less than full speed. Warning threshold: any.
Details:
This is a host health test that checks for network interfaces that appear to be operating at less than full speed.
A failure of this health test may indicate that network interface(s) may be configured incorrectly and may be causing performance problems. Use the ethtool command to check and configure the host's network interfaces to use the fastest available link speed and duplex mode.
Solution:
In this test, the Cloudera Manager configuration was changed instead, which is not a real solution.