Hadoop Version Changes

Source: Internet
Author: User

As of May 2012, Apache Hadoop had four main branches, which make up the four series of Hadoop versions.

1. 0.20.X Series

The 0.20.X series is the most confusing for users: its versions have some features that are not on the trunk, while the trunk has some features that the 0.20.X series lacks.

2. 0.21.0/0.22.X Series

In this series, the entire Hadoop project is split into three separate modules: Common, HDFS, and MapReduce.

Both HDFS and MapReduce depend on the Common module, but MapReduce has no dependency on HDFS. As a result, MapReduce can run on other distributed file systems more easily, and the modules can be developed independently.
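The dependency relationship described above can be written down as a small graph. The following is only an illustrative sketch in Python, not part of Hadoop itself: the `DEPENDS_ON` table and the helper names `build_order` and `can_build_without` are hypothetical, and the module names are simply the three named in the text.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Module -> modules it depends on, as described in the text.
# (Hypothetical table for illustration only.)
DEPENDS_ON = {
    "common": set(),
    "hdfs": {"common"},
    "mapreduce": {"common"},  # note: no dependency on "hdfs"
}


def build_order(depends_on):
    """Return one valid order in which the modules could be built."""
    return list(TopologicalSorter(depends_on).static_order())


def can_build_without(module, excluded, depends_on):
    """True if `module` has no direct or transitive dependency on `excluded`."""
    seen, stack = set(), [module]
    while stack:
        current = stack.pop()
        if current == excluded:
            return False
        if current not in seen:
            seen.add(current)
            stack.extend(depends_on[current])
    return True


if __name__ == "__main__":
    print(build_order(DEPENDS_ON))
    print(can_build_without("mapreduce", "hdfs", DEPENDS_ON))
```

Because "mapreduce" never reaches "hdfs" in this graph, the sketch reflects why MapReduce could, in principle, be paired with a different distributed file system.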

Common module: The biggest new feature is the addition of a large-scale automated test framework and a fault injection framework for testing.

HDFS module: The main new features include support for append operations and symbolic links, Secondary NameNode improvements (the Secondary NameNode is removed and replaced by a Checkpoint node, and a new Backup node role is added to serve as a cold standby for the NameNode), user-customizable block placement algorithms, and so on.

MapReduce module: The job API starts to adopt the new MapReduce API, while the old API remains compatible.

3. 0.23.X Series

The 0.23.X series was designed to overcome Hadoop's shortcomings in scalability and framework generality. It is effectively a completely new platform, consisting of the distributed file system HDFS Federation and the resource management framework YARN, which provides unified management for the various computing frameworks (such as MapReduce and Spark) that plug into it. The release ships with a MapReduce library that integrates all MapReduce features developed to date.

4. 2.X Series

Like the 0.23.X series, the 2.X series belongs to the next generation of Hadoop. Compared to the 0.23.X series, the 2.X series adds new features such as NameNode HA and wire-compatibility.
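To make the four series easier to tell apart in practice, here is a minimal sketch that maps a version string to the series named in this article. The function name `hadoop_series` is hypothetical, and the mapping covers only the four series described here; any other version returns `None`.

```python
import re


def hadoop_series(version):
    """Map a version string such as '0.20.2' to one of the four series above.

    Returns None for versions this article does not cover.
    """
    match = re.match(r"^(\d+)\.(\d+)", version.strip())
    if match is None:
        return None
    major, minor = int(match.group(1)), int(match.group(2))
    if major == 0 and minor == 20:
        return "0.20.X"
    if major == 0 and minor in (21, 22):
        return "0.21.0/0.22.X"
    if major == 0 and minor == 23:
        return "0.23.X"
    if major == 2:
        return "2.X"
    return None


if __name__ == "__main__":
    for v in ("0.20.2", "0.21.0", "0.22.0", "0.23.1", "2.0.0-alpha"):
        print(v, "->", hadoop_series(v))
```

Only the leading major and minor numbers are inspected, so suffixes such as `-alpha` do not affect the result.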
