New breakthrough! Hadoop 2.0 official release

Source: Internet
Author: User

Hadoop has long been synonymous with big data. As big data applications have matured, however, people have increasingly come to regard it mainly as a storage tool for big data.

But this is not necessarily a bad thing. As cheap and effective storage, Hadoop is the perfect starting point for the next phase of its own evolution. Hadoop 2.0, to be unveiled this summer, is meant to make information in data warehouses and unstructured data pools easier to access than ever.

Hadoop vat

Hadoop has been a great data storage system ever since it became a big data tool, but MapReduce, which requires Java applications to access the data, can be difficult to learn.
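To give a feel for the programming model behind that learning curve, here is a minimal sketch of the map, shuffle and reduce steps for a word count, written in plain Python. It is only an illustration of the model, not Hadoop's Java API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for this example.

```python
from collections import defaultdict


def map_phase(record):
    """Emit a (word, 1) pair for every word in one line of text."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group values by key, the step a framework performs between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values collected for one key into a single result."""
    return key, sum(values)


def word_count(records):
    """Run the three phases in sequence on an in-memory list of lines."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "big Hadoop data", "Hadoop"]))
```

Even this simple count has to be split into separate map and reduce functions, which hints at why writing real MapReduce jobs takes some getting used to.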

Of course, there are other ways to get information out of Hadoop. HBase, a database that is part of the Hadoop ecosystem, lets users work with data according to the database paradigm. The Hive data warehouse lets you write queries in HiveQL, its SQL-like query language, and converts them into MapReduce tasks. However, Hadoop can still effectively do only one thing at a time: MapReduce tasks, Hive queries, HBase operations and so on all have to take turns.

That is why many big data vendors tend to treat Hadoop purely as a data container and build their own tools on top of it to capture or analyze the data. Although Hadoop is portrayed here as a vat, many Hadoop users already see it as a great lake or an ocean of data. However large the scale, though, these restrictions undermine Hadoop's selling point.

Hadoop's development community is also aware of this issue. With the upcoming move to a new version of Hadoop, the restrictions above will soon be largely lifted.

YARN solution

According to Arun Murthy, the Hadoop 2.0 release manager, the most important change is the upgrade of the MapReduce framework to Apache YARN, which will extend the range of software and applications that can run in Hadoop. Murthy, himself a YARN project manager, points out that the difference between Hadoop 1.0 and 2.0 is that the former is entirely batch-oriented, while the latter allows multiple applications to access the data internally at the same time.

Separating these capabilities gives Hadoop more powerful cluster resource management than the current MapReduce system can offer. Resources are managed much as an operating system schedules tasks, so the cluster is no longer limited to one operation at a time.
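The contrast between taking turns and sharing a cluster can be sketched with a toy model. The code below is not YARN's scheduler or API; the job tuples, the `run_in_turns` and `run_shared` functions, and the notion of "ticks" are all assumptions invented for illustration. Each job asks for a number of containers and runs for a number of ticks.

```python
from collections import deque


def run_in_turns(jobs):
    """Toy 'one at a time' model: each job has the cluster to itself in turn.

    jobs is a list of (name, containers_needed, ticks). Returns total ticks.
    """
    return sum(ticks for _name, _need, ticks in jobs)


def run_shared(jobs, capacity):
    """Toy resource manager: admit waiting jobs whenever they fit in free capacity."""
    waiting = deque(jobs)
    running = []  # each entry is [name, containers_needed, remaining_ticks]
    clock = 0
    while waiting or running:
        free = capacity - sum(need for _name, need, _left in running)
        # Admit jobs in arrival order for as long as the next one fits.
        while waiting and waiting[0][1] <= free:
            name, need, ticks = waiting.popleft()
            running.append([name, need, ticks])
            free -= need
        if not running:
            raise ValueError("next job needs more containers than the cluster has")
        clock += 1
        for job in running:
            job[2] -= 1
        running = [job for job in running if job[2] > 0]
    return clock


if __name__ == "__main__":
    jobs = [("mapreduce", 4, 3), ("hive", 3, 2), ("hbase", 3, 2)]
    print(run_in_turns(jobs))      # jobs run back to back
    print(run_shared(jobs, 10))    # jobs share a 10-container cluster
```

In this toy setup the three jobs need 7 ticks when they take turns and 3 ticks when they share the cluster, which is the kind of gain the operating-system analogy points at.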

With YARN, developers can build applications directly within Hadoop, rather than pulling the data out first as many third-party tools do.

Murthy said that vendors are already interested in developing applications within the YARN framework. He estimates that a beta of Hadoop 2.0 is likely to be released in June or July of this year, and that the official version may follow in August.

If YARN does deliver on its promise, developers will have easy access to the vast sea of data on native Hadoop platforms, making it smoother and easier to find useful information. By then, big data will become more useful and more popular.
