How to choose the right Hadoop version for your business

Source: Internet
Author: User

Because Hadoop is still developing rapidly and is open source, its versioning has been very confusing. Some of the main features that differ between Hadoop versions are:

Append: Supports appending to existing files. If you want to use HBase, you need this feature.

RAID: Introduces parity (erasure) codes so that fewer copies of each data block are needed while data reliability is preserved. Details: https://issues.apache.org/jira/browse/HDFS/component/12313080

Symlink: Supports symbolic links to HDFS files. Details: https://issues.apache.org/jira/browse/HDFS-245

Security: Hadoop security. Details: https://issues.apache.org/jira/browse/HADOOP-4487

NameNode HA: High availability for the NameNode. Details: https://issues.apache.org/jira/browse/HDFS-1064

HDFS Federation and YARN

The following is the version evolution of Hadoop:

Apache version download:

Version descriptions: http://hadoop.apache.org/releases.html

Stable version: find a mirror and download the version in the stable folder.

The most complete set of Hadoop versions: http://svn.apache.org/repos/asf/hadoop/common/branches/, which can be imported directly into Eclipse.

Cloudera Release:

As shown above, Apache's version management is rather chaotic. New versions keep appearing, and many beginners are overwhelmed. In contrast, Cloudera manages its Hadoop versions much more clearly.

Hadoop is released under the Apache open source license, so users can freely use and modify it. As a result, many Hadoop distributions exist. One of the best known is Cloudera's, which is called CDH (Cloudera Distribution Hadoop). So far there have been 4 versions of CDH. The first two are no longer updated. The most recent two are CDH3 (evolved from Apache Hadoop 0.20.2) and CDH4 (based on Apache Hadoop 2.0), which correspond to Apache Hadoop 1.0 and Hadoop 2.0 respectively. Both are updated periodically.

Cloudera distinguishes its releases by patch level. For example, a patch level of 923.142 means that 1065 patches have been added on top of the original Apache Hadoop 0.20.2. These patches are contributed by companies or individuals and are recorded in the Hadoop JIRA. Of the 1065, 923 were added in the last beta release and 142 were added after the stable release. Thus, the higher the patch level, the more complete the functionality and the more bugs fixed.
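The patch-level arithmetic above can be sketched in a few lines of Python. This is only an illustration of how the two numbers in a value such as 923.142 add up. The function name `parse_patch_level` and the dictionary keys are hypothetical and are not part of any Cloudera tool.

```python
def parse_patch_level(level: str) -> dict:
    """Split a patch level such as '923.142' into its two patch counts.

    Assumption: the part before the dot counts patches added in the last
    beta release, and the part after it counts patches added after the
    stable release, as described in the text.
    """
    beta_part, stable_part = level.split(".")
    beta = int(beta_part)
    post_stable = int(stable_part)
    return {
        "beta": beta,
        "post_stable": post_stable,
        "total": beta + post_stable,  # patches on top of Apache Hadoop 0.20.2
    }


if __name__ == "__main__":
    info = parse_patch_level("923.142")
    print(info["total"])  # 923 + 142 = 1065
```

Comparing the `total` of two patch levels then gives a rough indication of which release carries more fixes.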

Cloudera's versioning is clearer, and Cloudera provides Hadoop installation packages for a variety of operating systems that can be installed directly with the apt or yum commands.
