Hortonworks Releases a Preview of the Next Generation of Apache Hadoop
Hortonworks has released a preview of the next generation of Apache Hadoop. The new version promises to expand the types of analysis that can be run on the data-processing platform.
The new Apache YARN scheduler replaces the MapReduce job scheduler with a more general resource management framework.
"Hadoop 2.0 is a fundamental architectural change that makes Hadoop more than just a batch platform," says Arun Murthy, a founder of Hortonworks and one of Hadoop's core engineers. The updated software will drive a new round of technological innovation.
The Hortonworks Data Platform (HDP) 2.0 Community Preview contains a number of new components for the Hadoop environment, most notably YARN (Yet Another Resource Negotiator). YARN is the successor to Hadoop's MapReduce job scheduler.
Murthy says Hadoop started as a "single-application platform," used mainly for crawling and indexing website content. Organizations now want to use it for other tasks, such as interactive queries and analysis of real-time data streams.
YARN improves on MapReduce by expanding the types of work that can be done on the Hadoop platform. MapReduce can only manage batch jobs: it runs a data analysis across any number of nodes and returns the results when the job completes.
YARN, by contrast, is a general-purpose resource management framework. It provides a basis for running non-batch workloads, such as real-time data streams that run indefinitely and interactive queries that return answers while the user waits. Murthy says users can now run MapReduce batch jobs and interactive SQL queries side by side on YARN.
"Using YARN, you have a cluster that is aware of all the different types of workloads and job requirements," said Shaun Connolly, Hortonworks' vice president for corporate strategy. The workloads can therefore coexist, and no single job should be able to dominate or take over all of the cluster's resources. Previously, organizations had to run separate clusters for different styles of tasks.
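The idea of several workload types sharing one cluster without any of them taking it over can be illustrated with a toy model. The sketch below is not YARN's API: the class `ToyResourceManager`, the queue names, and the percentage caps are all hypothetical, chosen only to show how a per-workload cap keeps one job type from consuming every container.

```python
from dataclasses import dataclass, field


@dataclass
class Queue:
    """A hypothetical workload class (batch, interactive, streaming)."""
    name: str
    max_percent: int   # largest share of the cluster this queue may hold
    used: int = 0      # containers currently held by this queue


@dataclass
class ToyResourceManager:
    """Toy stand-in for a cluster-wide resource manager; not YARN's real API."""
    total_containers: int
    queues: dict = field(default_factory=dict)

    def add_queue(self, name, max_percent):
        self.queues[name] = Queue(name, max_percent)

    def free(self):
        # Containers not currently held by any queue.
        return self.total_containers - sum(q.used for q in self.queues.values())

    def request(self, queue_name, wanted):
        """Grant up to `wanted` containers, limited by the queue cap and free capacity."""
        q = self.queues[queue_name]
        cap = self.total_containers * q.max_percent // 100
        granted = max(0, min(wanted, cap - q.used, self.free()))
        q.used += granted
        return granted

    def release(self, queue_name, count):
        # Return containers to the shared pool, never below zero.
        q = self.queues[queue_name]
        q.used -= min(count, q.used)


rm = ToyResourceManager(total_containers=100)
rm.add_queue("batch", 60)
rm.add_queue("interactive", 30)
rm.add_queue("streaming", 20)

# The batch job asks for the whole cluster but is held to its 60-container cap.
batch_granted = rm.request("batch", 100)
interactive_granted = rm.request("interactive", 20)
# Streaming is limited both by its own cap and by what is left in the cluster.
streaming_granted = rm.request("streaming", 50)

print(batch_granted, interactive_granted, streaming_granted, rm.free())
```

In this toy run the batch workload cannot starve the interactive and streaming workloads, which is the coexistence property described above. A real scheduler also has to deal with memory and CPU dimensions, preemption, and node locality, none of which are modeled here.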
HDP 2.0 also includes a number of other new components, including Apache Tez, a YARN-based framework that speeds up large-scale and interactive workloads, and a set of technologies for running SQL queries against data stored in Hadoop.
The HDP 2.0 preview is a complete Hadoop distribution that can be run in Oracle VirtualBox or VMware virtual environments.
Hortonworks released the HDP 2.0 preview this week at the 2013 Hadoop Summit in San Jose, Calif. At the conference, Rackspace announced that it would offer Hadoop services, with analysis tools provided by Pentaho. Splunk released Hunk, a tool for querying data stored in Hadoop. Data warehouse system provider Teradata released a new Hadoop appliance. VMware upgraded its vSphere virtualization management software to support Hadoop clusters.