Installing CDH with Cloudera Manager 5.6

Source: Internet
Author: User
Tags sha1

A brief introduction to CDH

CDH stands for Cloudera's Distribution including Apache Hadoop. Put simply, it is Cloudera's Hadoop platform: the native Apache Hadoop components, packaged and hardened by Cloudera. CDH bundles a range of Hadoop ecosystem components, such as the Hive, Oozie, Spark, and YARN services that appear later in this article.

So how is CDH installed? Cloudera provides a tool for installing CDH and for managing and maintaining its components, called Cloudera Manager (CM below). CM has a master-slave architecture made up of a CM server and CM agents, so installing CM means installing the CM server on one host and a CM agent on every host.
This article uses CM 5.6 to install CDH 5.6.
Cloudera's official documentation for installing CDH with CM describes three installation paths: A, B, and C. For production environments, B and C are the options. With path B you install CM manually and then let CM install the other components automatically. With path C you install CM and all other components manually from tarballs. Here we install CM from a tarball and then use CM to install the other components.

Unless otherwise noted, all commands below are run as the root user.

Environment preparation
    • Turn off the firewall on all hosts. On SUSE Linux:
SuSEfirewall2 stop
    • Edit /etc/hosts on every host so that it lists the host name and IP address of all hosts.
    • Configure password-free SSH login between all hosts, including from each host to itself.
ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

Then append the contents of each machine's ~/.ssh/authorized_keys file to the end of the ~/.ssh/authorized_keys file on every other machine.
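The key exchange described above amounts to building the union of every host's authorized_keys file and placing that union on each host. The following is a minimal Python sketch of the merge step only, assuming the file contents have already been collected as strings; the function name merge_authorized_keys is hypothetical and not part of any Cloudera or OpenSSH tool.

```python
def merge_authorized_keys(key_files):
    """Return the union of all key lines, in first-seen order, without duplicates.

    key_files is a list of strings, each holding the text of one host's
    ~/.ssh/authorized_keys file (how they are collected is out of scope here).
    """
    seen = set()
    merged = []
    for content in key_files:
        for line in content.splitlines():
            key = line.strip()
            # Skip blank lines and comments.
            if not key or key.startswith("#"):
                continue
            if key not in seen:
                seen.add(key)
                merged.append(key)
    return "\n".join(merged) + "\n"


if __name__ == "__main__":
    host1 = "ssh-rsa AAA root@host1\n"
    host2 = "ssh-rsa BBB root@host2\nssh-rsa AAA root@host1\n"
    print(merge_authorized_keys([host1, host2]), end="")
```

Writing the merged text back to every host gives each machine the same set of keys, which avoids the duplicate entries that repeated appending would produce.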

    • Install Oracle JDK 1.7 and set the JAVA_HOME and PATH environment variables.
      Set them in .profile or .bash_profile:
export JAVA_HOME=<JDK installation path>
export PATH=.:$JAVA_HOME/bin:$PATH

Apply the change:

source .bash_profile
    • Make sure Python is installed and that the version is 2.6 or 2.7.
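One of the preparation steps above is that /etc/hosts on every host must list all hosts. A small check like the following can confirm that before continuing. This is a sketch under assumptions: the host names and IP addresses are hypothetical placeholders, and the function works on the file's text so it can be tried without touching a real system.

```python
def missing_hosts(hosts_text, expected):
    """Return the names in `expected` (name -> IP) that hosts_text does not map to that IP."""
    found = {}
    for line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        for name in names:
            # The first mapping for a name wins.
            found.setdefault(name, ip)
    return sorted(name for name, ip in expected.items() if found.get(name) != ip)


if __name__ == "__main__":
    # Hypothetical cluster layout, for illustration only.
    expected = {"cm-server": "192.168.1.10", "node1": "192.168.1.11"}
    with open("/etc/hosts") as fh:
        print(missing_hosts(fh.read(), expected))
```

An empty result means every expected host name is present with the expected address.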
MySQL Installation
    • Download a MySQL RPM package. We use MySQL-server-5.5.28-1.linux2.6.x86_64.rpm; version 5.5 or 5.6 is the best choice.
      If the system already has an older version installed, uninstall it first:
rpm -e mysql --nodeps

Then install:

rpm -ivh MySQL-server-5.5.28-1.linux2.6.x86_64.rpm
    • Configure my.cnf
      If /etc/my.cnf does not exist, create it with the following command:
touch /etc/my.cnf

For its contents, you can use the configuration values recommended in the Cloudera documentation:

[mysqld]
transaction-isolation = READ-COMMITTED
# Disabling symbolic-links is recommended to prevent assorted security risks;
# to do so, uncomment this line:
# symbolic-links = 0

key_buffer =  M
key_buffer_size =  M
max_allowed_packet =  M
thread_stack =  K
thread_cache_size =
query_cache_limit = 8M
query_cache_size =  M
query_cache_type = 1

max_connections = 550
#expire_logs_days = 10
#max_binlog_size = 100M

# log_bin should be on a disk with enough free space. Replace '/var/lib/mysql/mysql_binary_log' with an appropriate path for your system
# and chown the specified folder to the mysql user.
log_bin=/var/lib/mysql/mysql_binary_log

# For MySQL version 5.1.8 or later. Comment out binlog_format for older versions.
binlog_format = mixed

read_buffer_size = 2M
read_rnd_buffer_size =  M
sort_buffer_size = 8M
join_buffer_size = 8M

# InnoDB settings
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 2
innodb_log_buffer_size =  M
innodb_buffer_pool_size = 4G
innodb_thread_concurrency = 8
innodb_flush_method = O_DIRECT
innodb_log_file_size =  M

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
sql_mode=STRICT_ALL_TABLES
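Before starting MySQL, it can help to confirm that a few of the settings above actually made it into the file. The sketch below reads my.cnf text with Python's configparser and compares a handful of keys against the values shown in this article. The EXPECTED table and the function names are illustrative assumptions, and a real my.cnf may use syntax (for example include directives) that configparser does not understand.

```python
import configparser

# A few settings taken from the configuration listed in this article.
EXPECTED = {
    "transaction-isolation": "READ-COMMITTED",
    "binlog_format": "mixed",
    "max_connections": "550",
    "innodb_flush_method": "O_DIRECT",
}


def read_mysqld_settings(cnf_text):
    """Parse my.cnf text and return the [mysqld] section as a dict."""
    parser = configparser.ConfigParser(
        allow_no_value=True, strict=False, interpolation=None
    )
    parser.read_string(cnf_text)
    return dict(parser["mysqld"]) if parser.has_section("mysqld") else {}


def settings_mismatches(cnf_text):
    """Return {key: actual_value} for every expected key that is missing or different."""
    settings = read_mysqld_settings(cnf_text)
    mismatches = {}
    for key, want in EXPECTED.items():
        have = settings.get(key)
        if have is None or have.strip().lower() != want.lower():
            mismatches[key] = have
    return mismatches


if __name__ == "__main__":
    with open("/etc/my.cnf") as fh:
        print(settings_mismatches(fh.read()))
```

An empty dictionary means the checked keys match; a value of None means the key is absent from the [mysqld] section.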
    • Enable MySQL to start at boot
chkconfig --add mysql
    • Start MySQL
service mysql start

If startup fails, see the "Problems encountered" section below.

    • Security configuration
      Run mysql_secure_installation and answer the prompts as follows:

$ sudo /usr/bin/mysql_secure_installation
[...]
Enter current password for root (enter for none):
OK, successfully used password, moving on...
[...]
Set root password? [Y/n] Y
New password:
Re-enter new password:
Remove anonymous users? [Y/n] Y
[...]
Disallow root login remotely? [Y/n] N
[...]
Remove test database and access to it? [Y/n] Y
[...]
Reload privilege tables now? [Y/n] Y
All done!

    • Install the JDBC driver
      For SUSE Linux, download mysql-connector-java-5.1.38.tar.gz and install it:
tar zxvf mysql-connector-java-5.1.38.tar.gz
cp mysql-connector-java-5.1.38/mysql-connector-java-5.1.38-bin.jar /usr/share/java/mysql-connector-java.jar
    • Create databases for the other components
      We only need to create the hive and activity databases.
      Taking hive as an example, log in to MySQL as the root user and execute:
'root'@'%' 'root'; flush privileges; use hive;
    • If you need databases for additional components, refer to the following three sections of the documentation:
      • Creating Databases for Activity Monitor, Reports Manager, Hive Metastore Server, Sentry Server, Cloudera Navigator Audit Server, and Cloudera Navigator Metadata Server
      • Configuring the Hue Server to Store Data in MySQL
      • Configuring MySQL for Oozie
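Creating a database for each component follows the same pattern every time, so the statements can be generated rather than typed. The sketch below is an assumption, not the article's exact SQL: it emits the CREATE DATABASE and GRANT ... IDENTIFIED BY form commonly used with MySQL 5.5 and 5.6, the function name is hypothetical, and the root/root credentials only mirror the 'root'@'%' fragment shown above.

```python
import re

_IDENTIFIER = re.compile(r"[A-Za-z0-9_]+")


def database_setup_sql(db_name, user, password, host="%"):
    """Return the SQL statements that create one component database (MySQL 5.5/5.6 syntax)."""
    for value in (db_name, user):
        if not _IDENTIFIER.fullmatch(value):
            raise ValueError("unsupported identifier: %r" % value)
    if "'" in password or "\\" in password:
        raise ValueError("password must not contain quotes or backslashes")
    return [
        "CREATE DATABASE {0} DEFAULT CHARACTER SET utf8;".format(db_name),
        "GRANT ALL ON {0}.* TO '{1}'@'{2}' IDENTIFIED BY '{3}';".format(
            db_name, user, host, password
        ),
        "FLUSH PRIVILEGES;",
    ]


if __name__ == "__main__":
    for name in ("hive", "activity"):
        print("\n".join(database_setup_sql(name, "root", "root")))
```

The printed statements can be pasted into a mysql session opened as root. A dedicated user per component is usually preferable to reusing root.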
Installation of Cloudera Manager

Because CM is installed from a tarball here, you can follow the corresponding section of the official documentation.

    • Extract the installation file
      Put the downloaded CM file into the /opt directory of the CM server host (download page: http://www.cloudera.com/documentation/enterprise/latest/topics/cm_vd.html#concept_mb3_sfz_3q_unique_1). For SUSE Linux, download the SLES (zypper/YaST) build.
tar -xzf cloudera-manager*.tar.gz
    • Create a user on the CM server host
useradd --system --home=/opt/cm-5.6.0/run/cloudera-scm-server --shell=/bin/false --comment "Cloudera SCM User" cloudera-scm
    • Create the local storage directories for the CM server
mkdir /var/lib/cloudera-scm-server
mkdir /var/log/cloudera-scm-server
chown cloudera-scm:cloudera-scm /var/log/cloudera-scm-server
    • Configure the CM agent
      Edit /opt/cm-5.6.0/etc/cloudera-scm-agent/config.ini on the CM server host and set server_host to the host name of the CM server.

    • Copy the entire extracted folder to every other host with scp

scp -r /opt/cm-5.6.0 <host>:/opt/
    • Create the parcel directories
      What is a parcel? It can be thought of as a CDH installation file, which CM reads when installing CDH.
      • Execute on the CM server host first:

        mkdir -p /opt/cloudera/parcel-repo
        chown cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo
      • Then execute on each CM agent host:

        mkdir -p /opt/cloudera/parcels
        chown cloudera-scm:cloudera-scm /opt/cloudera/parcels
    • Create a database for the CM server
      Run the following command; see the installation documentation for the meaning of each argument.
/opt/cm-5.6.0/share/cmf/schema/scm_prepare_database.sh mysql scm  -hlocalhost -uroot -proot  --scm-host localhost scm scm scm
    • Start the CM server and enable it at boot
/opt/cm-5.6.0/etc/init.d/cloudera-scm-server start
cp /opt/cm-5.6.0/etc/init.d/cloudera-scm-server /etc/init.d/cloudera-scm-server
chkconfig cloudera-scm-server on

Edit /etc/init.d/cloudera-scm-server and change the value of CMF_DEFAULTS from ${CMF_DEFAULTS:-/etc/default} to /opt/cm-5.6.0/etc/default.

    • Start the CM agent and enable it at boot
/opt/cm-5.6.0/etc/init.d/cloudera-scm-agent start
cp /opt/cm-5.6.0/etc/init.d/cloudera-scm-agent /etc/init.d/cloudera-scm-agent
chkconfig cloudera-scm-agent on

Edit /etc/init.d/cloudera-scm-agent and change the value of CMF_DEFAULTS from ${CMF_DEFAULTS:-/etc/default} to /opt/cm-5.6.0/etc/default.

Note: to run a CM agent on the CM server host as well, run the commands above on that host too.
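The same CMF_DEFAULTS edit has to be made in two init scripts on every host, so it is a natural candidate for scripting. The following sketch rewrites the assignment in a script's text; the function name is hypothetical, and it assumes the script contains a line of the form CMF_DEFAULTS=... as described above.

```python
import re

# Matches a shell assignment to CMF_DEFAULTS at the start of a line.
_ASSIGNMENT = re.compile(r"^([ \t]*CMF_DEFAULTS=).*$", re.MULTILINE)


def set_cmf_defaults(script_text, new_dir):
    """Return script_text with the first CMF_DEFAULTS assignment pointing at new_dir."""
    new_text, count = _ASSIGNMENT.subn(
        lambda match: match.group(1) + new_dir, script_text, count=1
    )
    if count == 0:
        raise ValueError("no CMF_DEFAULTS assignment found")
    return new_text


if __name__ == "__main__":
    for name in ("cloudera-scm-server", "cloudera-scm-agent"):
        path = "/etc/init.d/" + name
        with open(path) as fh:
            patched = set_cmf_defaults(fh.read(), "/opt/cm-5.6.0/etc/default")
        with open(path, "w") as fh:
            fh.write(patched)
```

Raising an error when no assignment is found is deliberate: a silent no-op would leave the service reading /etc/default without any warning.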

Installation of CDH

Once the CM server and the CM agents have all started successfully, we can install CDH.

    • Prepare the parcel installation package
      Download the parcel package. For SUSE Linux there are three files in total:
      CDH-5.6.0-1.cdh5.6.0.p0.45-sles11.parcel
      CDH-5.6.0-1.cdh5.6.0.p0.45-sles11.parcel.sha1
      manifest.json
      After downloading, place the three files in the /opt/cloudera/parcel-repo directory of the CM server host and execute the following command:
mv CDH-5.6.0-1.cdh5.6.0.p0.45-sles11.parcel.sha1 CDH-5.6.0-1.cdh5.6.0.p0.45-sles11.parcel.sha
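A corrupted download is a common reason for CM not to accept a parcel, so it can be worth verifying the checksum while renaming the file. The sketch below does both. It is not part of CM: the function names are hypothetical, and it assumes the .sha1 file holds the hex SHA-1 digest of the parcel as its first whitespace-separated token.

```python
import hashlib
from pathlib import Path


def sha1_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def prepare_parcel(parcel_path):
    """Rename <parcel>.sha1 to <parcel>.sha and return True if the digest matches."""
    parcel = Path(parcel_path)
    sha1_file = parcel.with_name(parcel.name + ".sha1")
    sha_file = parcel.with_name(parcel.name + ".sha")
    if sha1_file.exists():
        sha1_file.rename(sha_file)
    # Assumption: the first token in the file is the expected hex digest.
    expected = sha_file.read_text().split()[0].lower()
    return sha1_of(parcel) == expected


if __name__ == "__main__":
    repo = Path("/opt/cloudera/parcel-repo")
    for parcel in repo.glob("*.parcel"):
        print(parcel.name, "OK" if prepare_parcel(parcel) else "CHECKSUM MISMATCH")
```

Run it on the CM server host after copying the three files into the parcel repository directory.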
    • In a browser, open the CM server at http://CM-Server-host:7180. The default login user name and password are admin and admin.
    • On a first installation you are asked to choose the CM edition; we choose Cloudera Express.
    • Select the hosts to add to the cluster
      The hosts running a CM agent process are listed automatically. If a host is missing, check that server_host in /opt/cm-5.6.0/etc/cloudera-scm-agent/config.ini is set to the address of the CM server.
    • Select the parcel version to install
      If the parcel version you downloaded is not shown, recheck the parcel preparation step above.
    • Select the Java installation
      We have already installed the JDK manually, so leave the check box cleared and continue.
    • Enter the root login password for the hosts
    • Run the installation
    • Host inspection

      The environment of each host is then checked. The following errors are commonly reported:
      • Time is out of sync
        The "Problems encountered" section has a workaround, so this can be ignored for now. It is still best to set the clocks on all machines to the same time; installing the NTP service is not required.
      • Swappiness
        Follow the prompt and run the following command:
        sysctl vm.swappiness=0
      • Host missing user error
        See the "Problems encountered" section for a workaround.

After you resolve these issues, you can rerun the check.

    • Select the components to install
      Here we select all services.
    • Customize role assignments
      This step decides how the individual components are distributed across the hosts.
    • Set up the databases for the components
      This applies to Hive and Oozie.
    • Set up the directories for Hadoop
    • Run the installation
      Problems may come up here and have to be resolved case by case.
Problems encountered
    • MySQL startup problem
      Can't open and lock privilege tables: Table 'mysql.host' doesn't exist
      Perform the following command to resolve:
    • Host clocks out of sync
      On the Hosts > Configuration page, type "clock" in the search field and set both the warning and the critical threshold to Never.

    • Host missing user error
      The agent log showed that the agent had failed to create the user. Searching the code shows that the user is created in the following file:
      /opt/cm-5.6.0/lib64/cmf/agent/src/cmf/parcel.py
      On SUSE this code passes a -u option to the useradd command, which the useradd on our SUSE Linux system does not accept. I do not know whether useradd on other operating systems supports it.
      The fix is simple: comment out line 499:

      #umask_arg, umask_param,

487 for user, data in users.items():
488   try:
489     if self.is_suse:
490       umask_arg = '-u'
491       umask_param = '022'
492     else:
493       umask_arg = '-K'
494       umask_param = 'UMASK=022'
495
496     useradd_args = ["/usr/sbin/useradd",
497                     "-r", "-m",
498                     "-g", user,
499                     umask_arg, umask_param,
500                     "--home", data['home'],
501                     "--comment", data['longname'],
502                     "--shell", data['shell']]

    • YARN fails to start
      The error message resembles the following:

Traceback (most recent call last):
  File "/opt/cm-5.6.0/lib64/cmf/agent/src/cmf/util.py", line 370, in source
    return dict((line.split("=", 1) for line in data.splitlines()))
ValueError: dictionary update sequence element #103 has length 1; 2 is required

Someone on the web posted the following solution:

This error is a bug in CM, and the workaround is to modify the /opt/cm-5.3.0/lib64/cmf/agent/src/cmf/util.py file. Change this code:
pipe = subprocess.Popen(['/bin/bash', '-c', ". %s; %s; env" % (path, command)],
                        stdout=subprocess.PIPE, env=caller_env)
to:
pipe = subprocess.Popen(['/bin/bash', '-c', ". %s; %s; env | grep -v { | grep -v }" % (path, command)],
                        stdout=subprocess.PIPE, env=caller_env)

This approach filters the env output, but it did not help in my environment. What the code actually does is store the env output in a dictionary, one key=value pair per line. If a line of the env output has a key but no equals sign, building the dictionary fails. In the agent's log I found the printed env output, which contained a line like this:

CLASSPATH=/usr/java/java^M/lib

The ^M is a control character, a carriage return (the \r of a \r\n line ending). It is treated as a line break, which leaves /lib on a line of its own with no equals sign. So the real fix is:
Correct the value of the malformed environment variable.
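The failure can be reproduced outside the agent, which also gives a quick way to find the offending variable. The sketch below is illustrative rather than CM's own code: parse_env_strict reuses the expression from the traceback, malformed_env_lines is a hypothetical helper, and the sample data is made up. It relies on the fact that Python's splitlines() treats a lone carriage return as a line boundary.

```python
def parse_env_strict(data):
    """Build a dict from env output using the same expression as in the traceback."""
    return dict(line.split("=", 1) for line in data.splitlines())


def malformed_env_lines(data):
    """Return (index, line) for every line that has no '=' and would break the dict."""
    return [
        (index, line)
        for index, line in enumerate(data.splitlines())
        if "=" not in line
    ]


if __name__ == "__main__":
    # Made-up env output with a carriage return inside CLASSPATH.
    data = "HOME=/root\nCLASSPATH=/usr/java/java\r/lib\nLANG=C\n"
    print(malformed_env_lines(data))
    try:
        parse_env_strict(data)
    except ValueError as error:
        print("ValueError:", error)
```

The line just before each reported index shows which variable carries the stray character, and that is the value to correct in the environment.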

Location of important files in CM
    • Main code file of the CM agent: /opt/cm-5.6.0/lib64/cmf/agent/src/cmf/agent.py
    • Logs of the CM server and CM agent: /opt/cm-5.6.0/log/
    • Scripts the CM agent uses to start each component: /opt/cm-5.6.0/lib64/cmf/service/
    • Installation location of each component: /opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/xxx, where xxx is the name of the component. For example, Spark is in the /opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/spark directory.
    • Configuration files of each component: /etc/xxx/conf, where xxx is the name of the component. For example, the Spark configuration is in /etc/spark/conf; conf is a symbolic link that actually points to /etc/alternatives/spark-conf.
    • Runtime logs of each component: /var/log/xxx, where xxx is the name of the component. For example, the Oozie logs are in the /var/log/oozie directory.
Reference documents
    1. Cloudera Manager and CDH 4 Ultimate Installation
    2. CDH Use of CDH 5.3.x installation
    3. Cloudera Installation and Upgrade
