Today's Hadoop market lacks unified standards and a development vision

Source: Internet
Author: User
Keywords: Hadoop, development vision

As the most widely used new big data technology, Hadoop is critical to modern business strategy, but it needs stronger structured industry governance to develop sustainably. The technology's maturation depends on a concerted effort by all parties: internally, to figure out how to coordinate a growing number of subprojects; externally, to align with other big data specifications and communities.

Standards are an essential foundation for a mature Hadoop industry

If the Hadoop industry does not begin to focus on developing a truly standardized core, the Hadoop market will never fully mature and may face growing obstacles to development and adoption. A Hadoop standards framework would provide an ambitious development vision for the technology within the broader big data industry. Standards are critical: they give Hadoop solution vendors and users a common basis for ensuring cross-platform interoperability.
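To make the interoperability argument concrete, consider a minimal sketch (my own illustration, not taken from the article or any vendor's documentation) of client code written solely against the standard Apache Hadoop FileSystem API; the NameNode address and file path are placeholders. Code confined to the standardized interfaces should, in principle, run unchanged on any distribution that faithfully implements them, which is precisely what a cross-platform certification program could verify.

    // Minimal sketch: a client that touches HDFS only through the standard
    // org.apache.hadoop.fs.FileSystem abstraction. If every distribution
    // implemented this interface identically, the same jar would run
    // unmodified against any vendor's Hadoop offering.
    // The cluster address and path below are placeholders.
    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.nio.charset.StandardCharsets;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class PortableHdfsClient {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Placeholder NameNode URI; normally supplied by core-site.xml.
            conf.set("fs.defaultFS", "hdfs://namenode.example.com:8020");

            try (FileSystem fs = FileSystem.get(conf)) {
                Path input = new Path("/data/sample.txt");
                // Read the first line through the vendor-neutral API.
                try (BufferedReader reader = new BufferedReader(
                        new InputStreamReader(fs.open(input), StandardCharsets.UTF_8))) {
                    System.out.println("First line: " + reader.readLine());
                }
            }
        }
    }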

At the moment, silos dominate the Hadoop world, and the situation will intensify if open interoperability standards are lacking. Most Hadoop users have built their deployments on the core Apache Hadoop open source code or on a specific vendor's distribution of that code base. IBM, for example, incorporates the core Apache Hadoop open source stack in its IBM® InfoSphere® BigInsights™ product. However, vendors differ in how they build on the core Apache Hadoop code base within their solution portfolios, developing proprietary extensions, tools, and other components. Some, but not all, Hadoop vendors contribute code back to the open source Apache Hadoop community.

The industry's significant deviation from a strictly open-source-based Hadoop core has prompted solution providers to launch enterprise-class products that fill the functional gaps in the core Apache Hadoop subprojects. However, if forking and proprietary extensions continue to occur outside clearly standardized reference implementations and interfaces, a silo war is likely: vendors with market influence will try to entrench their proprietary Hadoop variants by leveraging a first-mover advantage.

Let's explore this issue from a historical perspective. In the early 2000s, service-oriented architecture did not begin to mature until industry organizations such as OASIS and WS-I had steadily developed a set of core specifications (WSDL, SOAP, and so on). Today's big data world desperately needs a similar driver to standardize the Hadoop core around a new set of key open standards. For now, Hadoop remains in the "de facto standard" camp.

Today's Hadoop market lacks standards and a unified vision

The latest version of the open source Apache code base, referred to as Hadoop 2.0, introduces a number of valuable enhancements, including high availability and federation for the Hadoop Distributed File System (HDFS) and support for programming frameworks beyond MapReduce. To my disappointment, however, these enhancements were not introduced under any unifying vision.
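For readers unfamiliar with these features, the following rough sketch (my own illustration; the "mycluster" nameservice ID and host names are placeholders) shows how the HDFS high-availability settings introduced in Hadoop 2.0 surface as ordinary configuration properties, here set programmatically for brevity rather than in hdfs-site.xml.

    // Rough sketch of the HDFS HA-related properties introduced in Hadoop 2.0.
    // In practice these live in hdfs-site.xml; they are set in code here only
    // for illustration. Host names and the nameservice ID are placeholders.
    import org.apache.hadoop.conf.Configuration;

    public class HdfsHaConfigSketch {
        public static Configuration build() {
            Configuration conf = new Configuration();

            // A logical nameservice that clients address instead of a single NameNode.
            conf.set("dfs.nameservices", "mycluster");

            // Two NameNodes (active/standby) backing the nameservice.
            conf.set("dfs.ha.namenodes.mycluster", "nn1,nn2");
            conf.set("dfs.namenode.rpc-address.mycluster.nn1", "nn1.example.com:8020");
            conf.set("dfs.namenode.rpc-address.mycluster.nn2", "nn2.example.com:8020");

            // Client-side failover proxy so callers transparently follow the active NameNode.
            conf.set("dfs.client.failover.proxy.provider.mycluster",
                     "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");

            // Clients then use the logical URI rather than a physical host.
            conf.set("fs.defaultFS", "hdfs://mycluster");
            return conf;
        }
    }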

No organization has taken the lead in presenting a clear vision for the sustainable development of Hadoop. When will the various components of Hadoop (MapReduce, HDFS, Pig, Hive, and so on) be substantially complete? What reference architecture should guide the development of other Hadoop services under Apache sponsorship?
In addition, no organization has defined where Hadoop fits within the growing family of data technologies. Where does Hadoop development end, and where do the various NoSQL technologies begin? Which requirements and features legitimately belong to other development projects and should be excluded from the Apache Hadoop community's roadmap?

Nor has any organization called for formal standardization of the various Hadoop technologies within a core reference architecture. The Hadoop industry urgently needs standards that can support cross-platform interoperability certification. Such standardization, combined with all solution providers continually integrating the core Apache Hadoop code base, is the only way to ensure that orphaned proprietary implementations do not slow broader adoption across the industry.

In Part 2 of this article, I will explore the functional areas that a Hadoop standards reference framework should focus on.
