Concurrent Upgrades Its Big Data Application Framework and Adds a SQL Interface
Source: Internet
Author: User
Keywords: big data application
Building on Apache Hadoop 2.2, released last month, Concurrent, a big data application platform company, today unveiled a new version of Cascading, its big data application framework.
Concurrent also launched Cascading Lingual 1.0, an open-source project that provides a full ANSI SQL interface.
Cascading is a standalone open-source Java application framework designed as an alternative API to MapReduce. It allows Java developers to build big data applications on Hadoop using their existing skills.
Chris Wensel, founder and chief technology officer of Concurrent, the company behind Cascading, said: "I created Cascading entirely out of anger. After I used MapReduce, I swore I would never use it again."
The latest version, Cascading 2.5, adds support for Hadoop 2.2, including the new YARN architecture introduced in that release. Apache Hadoop YARN (Yet Another Resource Negotiator) serves as an operating system for Hadoop, taking it from a single-purpose data platform for batch processing to a multipurpose platform for batch, interactive, online, and stream processing.
Acting as the primary resource manager and the mediator of access to data stored on the Hadoop Distributed File System (HDFS), YARN allows organizations to store data in one place and then interact with that data in a variety of ways, with consistent service levels.
Organizations can now use Cascading to combine their investments in Java, traditional SQL, and predictive modeling in a single big data processing application.
The migration path to Hadoop 2
Gary Nakamura, chief executive of Concurrent, said Cascading is not built specifically for YARN, but it allows users to migrate their applications seamlessly to Hadoop 2 and take advantage of YARN. Domain-specific languages (DSLs) built on Cascading, such as Scalding, Cascalog, and PyCascading, can also be migrated seamlessly to Hadoop 2. Similarly, Cascading running on the Hadoop stack supports Apache Tez.
Concurrent has also improved performance for complex join operations, optimized dynamic partitioning, and made storage of processed data on HDFS more efficient.
In addition to the Cascading release, Concurrent announced that Cascading Lingual 1.0 is now available. The product helps businesses that have invested heavily in business intelligence (BI) tools (such as Pentaho, Jaspersoft, and Cognos) and in trained staff to quickly access data stored on Hadoop. Lingual allows users to leverage their existing SQL skills and systems to create and run applications on Hadoop.
Concurrent's Wensel says Lingual allows anyone familiar with SQL to instantly access data stored on Hadoop using their JDBC-compliant BI tools or preferred desktop tools.
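To make the JDBC access path concrete, here is a minimal Java sketch of how a client might query Hadoop data through Lingual. It rests on assumptions that are not in the article: the driver class name `cascading.lingual.jdbc.Driver` and the `jdbc:lingual:<platform>;schemas=<path>` URL form follow the Lingual documentation as best recalled and should be checked against the release in use, while the `/data/sales` path, the `orders` table, and the helper names are purely hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

final class LingualJdbcSketch {
    // Assumption: driver class name as given in the Lingual documentation.
    static final String DRIVER_CLASS = "cascading.lingual.jdbc.Driver";

    // Builds a connection URL of the assumed form jdbc:lingual:<platform>;schemas=<path>.
    static String buildUrl(String platform, String schemaPath) {
        return "jdbc:lingual:" + platform + ";schemas=" + schemaPath;
    }

    // Reports whether the (assumed) Lingual driver is on the classpath.
    static boolean driverAvailable() {
        try {
            Class.forName(DRIVER_CLASS);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    // Runs a query with plain java.sql calls, prints each row, and returns the row count.
    static int printRows(String url, String sql) throws SQLException {
        int rows = 0;
        try (Connection connection = DriverManager.getConnection(url);
             Statement statement = connection.createStatement();
             ResultSet results = statement.executeQuery(sql)) {
            int columns = results.getMetaData().getColumnCount();
            while (results.next()) {
                StringBuilder line = new StringBuilder();
                for (int i = 1; i <= columns; i++) {
                    if (i > 1) {
                        line.append('\t');
                    }
                    line.append(results.getString(i));
                }
                System.out.println(line);
                rows++;
            }
        }
        return rows;
    }

    public static void main(String[] args) throws SQLException {
        // Hypothetical HDFS directory and table, used only for illustration.
        String url = buildUrl("hadoop", "/data/sales");
        String sql = "SELECT region, SUM(amount) FROM \"sales\".\"orders\" GROUP BY region";
        if (!driverAvailable()) {
            System.out.println("Lingual driver not on classpath; would connect to " + url);
            return;
        }
        System.out.println(printRows(url, sql) + " rows returned");
    }
}
```

Because everything after the URL is standard `java.sql`, the same pattern is what lets a JDBC-compliant BI tool talk to Lingual without Hadoop-specific code on the client side.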
"Cascading is an important component of the big data application development ecosystem, and Lingual is another important step toward making it easier to build big data applications," said Steve McPherson, general manager of Amazon Elastic MapReduce (EMR) at Amazon Web Services.
"Now, Amazon Elastic MapReduce customers can use Lingual to integrate different data stores from Amazon Web Services, such as Amazon S3 and Amazon Redshift, and they can process and store this data in Amazon EMR using standard ANSI SQL commands," McPherson said. "This makes it easier for customers to query data using their favorite BI tools."