The five pitfalls of Hadoop

Source: Internet
Author: User
Keywords: myths, solutions, data warehousing, truth

Apache Hadoop helps companies cope with one of their toughest challenges: creating value from massive amounts of data. Users generally deploy the Hadoop framework because it helps businesses extract value from many different types of big data. "The Forrester Wave: Big Data Hadoop Solutions, Q1 2014", published by the independent analyst firm Forrester Research, finds that Hadoop's open-source architecture is increasingly well suited to enterprise environments and that its momentum is unstoppable. Its new approach to data management is helping companies transform the way big data is stored, processed, analyzed, and shared.

The evolving Hadoop technology

Hadoop has earned considerable praise for its technical strengths, but it is also plagued by misinformation and exaggerated promises that misrepresent what the technology can actually do. Deploying Hadoop with unrealistic expectations or a mistaken understanding of the technology leads to wasted time, rising costs, and lackluster performance.

Understand the technical capabilities and limitations of Hadoop, and develop a deployment plan that makes full use of those capabilities going forward. Knowing the truth about Hadoop and avoiding the following common myths will help you deploy it successfully:

Myth One: Hadoop can replace a data warehouse

Truth: The Hadoop framework by itself is not a complete data or analytics solution, nor is it a platform meant to replace a data warehouse. Rather, Hadoop serves as the basis for cost-effective big data platforms that share information with other databases, which makes it an ideal complement to a data warehouse. With Hadoop, enterprises can make full use of many types of massive data in new ways.

Myth Two: Hadoop technology is short-lived

Truth: Hadoop is widely favored and its momentum seems unstoppable, so it will not be a flash in the pan. "The Forrester Wave: Big Data Hadoop Solutions, Q1 2014" reports that the Hadoop framework is a must-have data platform for large enterprises and the most important component of any flexible data management platform of the future. To take full advantage of Hadoop's technical strengths, the next generation of data warehouses will integrate more deeply with Hadoop and manage larger, more complex datasets.

Myth Three: Hadoop technology is free

Truth: Hadoop is indeed open-source software that anyone can download for free. Using it, however, is not free and can be quite costly. Using Hadoop efficiently requires highly trained professionals, and long-term data storage is expensive. Once analytics and multi-user workloads are taken into account, the cost of Hadoop can actually exceed that of a data warehouse. Beyond the open-source technology, vendors also sell proprietary applications that add features and extend the range of uses for Hadoop in the enterprise.

Myth Four: Hadoop is a data integration tool

Truth: At its core, Hadoop is a distributed file system and processing framework designed for specific data types and workloads; it lacks data integration capabilities. If a Hadoop solution is not used as part of a broader big data management ecosystem, it becomes yet another data island, cut off from the rest of the organization's information. Once Hadoop is deployed alongside the data warehouse, users can query the information in both the data warehouse and Hadoop.

Myth Five: Hadoop is a single open source product

Truth: Hadoop is a collection of products and technologies, including the Hadoop Distributed File System (HDFS), MapReduce, Pig, Hive, Falcon, Knox, and others. Several vendors develop Hadoop products and add features that differentiate their offerings. For example, the Hortonworks Data Platform helps businesses collect, process, and share data in any format and at any size. Not all Hadoop products are open source. The Forrester report says that demand for Hadoop products has created a cutthroat market in which vendors must seize every opportunity to sell their own Hadoop solutions.
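To make one of these components a little more concrete, here is a minimal sketch of the map, shuffle, and reduce steps that the MapReduce programming model is built around, using a word count as the job. It runs in memory in plain Python and does not use any Hadoop API; the function names (map_phase, shuffle, reduce_phase, run_job) and the sample records are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key.

    In a real cluster the framework does this between the map and
    reduce steps; here it is just an in-memory dictionary.
    """
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def run_job(records):
    """Run the three steps in order over a list of text records."""
    pairs = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, not taken from any real dataset.
    sample = ["Hadoop stores data", "Hadoop processes data"]
    print(run_job(sample))
```

In an actual Hadoop deployment the map and reduce steps would run in parallel on many machines against files in HDFS; the sketch only shows how the work is divided.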

Unleash the full potential of Hadoop technology

Hadoop provides a reliable solution for storing and processing large datasets, helping enterprises overcome the high cost of using data and the complexity of its structure so that they can make efficient use of all kinds of massive data. Although Hadoop is widely used and has many advantages, it cannot replace a data warehouse or data integration tools. Integrating Hadoop with other data and analytics solutions increases the value it delivers.
