1, Shortcomings of Apache Hadoop
• Confusing version management
• Cumbersome deployment and complex upgrades
• Poor compatibility
• Weak security
2, Hadoop distributions
• Apache Hadoop
• Cloudera's Distribution Including Apache Hadoop (CDH)
• Hortonworks Data Platform (HDP)
• MapR
• Amazon EMR
• ...
3, What problems can CDH solve?
• For a cluster of 1000 servers, how long does it take, at minimum, to build a Hadoop cluster that includes Hive, HBase, Flume, Kafka, Spark, and more?
• What if you are given only one day to complete that work?
• When upgrading Hadoop on such a cluster, which upgrade approach do you choose, and how long will it take at minimum?
• Is the new Hadoop version compatible with Hive, HBase, Flume, Kafka, Spark, and the rest?
4, Introduction to CDH
Cloudera's Distribution Including Apache Hadoop
• One of the many Hadoop distributions, maintained by Cloudera and built on a stable release of Apache Hadoop
• Provides the core capabilities of Hadoop
– Scalable storage
– Distributed computing
• Provides a web-based user interface
5, Advantages of CDH
• Clear version scheme
• Faster version updates
• Support for Kerberos security authentication
• Clear documentation
• Supports multiple installation methods (including Cloudera Manager)
6, CDH installation methods
• Cloudera Manager
• Yum
• RPM
• Tarball
7, CDH versions and downloads
CDH 5.4:
http://archive.cloudera.com/cdh5/
Cloudera Manager 5.4.3:
http://www.cloudera.com/downloads/manager/5-4-3.html
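When pairing a CDH release with a Cloudera Manager release, as with CDH 5.4 and Cloudera Manager 5.4.3 above, it helps to check the version numbers before downloading. The following is a minimal sketch, not an official tool. It assumes the rule that Cloudera Manager should be at the same or a higher major.minor version than the CDH it manages; confirm this against Cloudera's own compatibility documentation. The function names parse_version and manager_covers_cdh are hypothetical.

```python
import re
from typing import Tuple


def parse_version(label: str) -> Tuple[int, ...]:
    """Extract the numeric version from a label such as 'cdh5.4' or '5.4.3'."""
    match = re.search(r"\d+(?:\.\d+)*", label)
    if match is None:
        raise ValueError(f"no version number found in {label!r}")
    return tuple(int(part) for part in match.group().split("."))


def manager_covers_cdh(manager: str, cdh: str) -> bool:
    """Return True if the manager's major.minor is >= the CDH major.minor.

    Assumption: this compatibility rule is illustrative; verify it against
    the vendor's documentation before relying on it.
    """
    return parse_version(manager)[:2] >= parse_version(cdh)[:2]


if __name__ == "__main__":
    print(manager_covers_cdh("Cloudera Manager5.4.3", "cdh5.4"))
    print(manager_covers_cdh("Cloudera Manager5.3.0", "cdh5.4"))
```

Only the major and minor components are compared, so the patch level of Cloudera Manager (the final 3 in 5.4.3) does not affect the result.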