Is Open-Source Hadoop Really Cheap? Work Out Your IT Costs

Speaking at the 2014 summit of TDWI (The Data Warehousing Institute) in the United States, Richard Winter, a consultant with extensive experience in data lifecycle management, pointed out that anyone adopting an open-source Hadoop architecture needs to calculate what the data will actually cost. Many hidden costs lurk in the seemingly free architecture, and they are often overlooked. Hardware is only a small part of the total.

"Much of Hadoop's cost does not come from the system itself, but from things such as developing for and managing the system," Winter says.

Winter points out that developing applications for Hadoop clusters and building out the surrounding toolset are still the most significant parts of Hadoop work. In general, Hadoop is relatively inexpensive compared with other data architectures.

But Winter suggests that data managers look at the specific type of application when judging whether Hadoop is a suitable choice.

Calculate IT costs

Hadoop is based on Java, and Winter recommends weighing storage, management, analysis, development, and system costs when evaluating Hadoop. In his study he also drew on general data, such as the salary of a typical Java developer as reported by a website that tracks pay, and added a 50% general overhead to that staff cost. Winter lists more information on his website.
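The arithmetic behind this kind of estimate can be sketched in a few lines of Python. This is only an illustration of the approach described above, not Winter's actual model: the 50% overhead rate and the five cost categories come from the article, while the salary, head count, and every other figure are made-up placeholders.

```python
from dataclasses import dataclass

OVERHEAD_RATE = 0.50  # the 50% general overhead mentioned in the article


def loaded_staff_cost(base_salary: float, overhead_rate: float = OVERHEAD_RATE) -> float:
    """Annual cost of one employee: base salary plus general overhead."""
    return base_salary * (1 + overhead_rate)


@dataclass
class PlatformCosts:
    """Annual cost categories named in the article (all amounts hypothetical)."""
    storage: float
    management: float
    analysis: float
    development: float
    system: float

    def total(self) -> float:
        return (self.storage + self.management + self.analysis
                + self.development + self.system)


# Hypothetical inputs: 4 Java developers at a placeholder salary of 100,000.
developers = 4
dev_cost = developers * loaded_staff_cost(100_000)

hadoop = PlatformCosts(
    storage=50_000,
    management=120_000,
    analysis=80_000,
    development=dev_cost,
    system=60_000,
)

print(f"Loaded cost per developer: {loaded_staff_cost(100_000):,.0f}")
print(f"Total annual cost: {hadoop.total():,.0f}")
```

With these placeholder numbers, development staff dominate the total, which is the pattern the article describes: the system line item is a small share of what the platform costs to run.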

Winter also considers the cost of developing queries in Hadoop, which only highly skilled developers can handle. He compares the amount of code, and its cost, needed to write simple and complex queries in a data warehouse and in a Hadoop environment. He found that creating queries in a Hadoop environment was much more complicated, and that working with the Hadoop file system, MapReduce, Java, and SQL alternatives (such as Hive) required more code, which is a problem for businesses.
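To give a feel for the difference in code volume, here is a small sketch that computes the same per-region total two ways. It is not taken from Winter's study and does not use Hadoop's Java API: an in-memory SQLite database stands in for a SQL data warehouse, and the map, shuffle, and reduce steps are plain Python functions that only imitate the phases of a MapReduce job. The `sales` data is hypothetical.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (region, sale amount).
sales = [("east", 100), ("west", 250), ("east", 50), ("west", 25)]

# Data warehouse style: the whole query is one declarative SQL statement.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", sales)
sql_totals = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
)
conn.close()


# MapReduce style: each phase has to be written out by the developer.
def map_phase(records):
    """Emit one (key, value) pair per input record."""
    for region, amount in records:
        yield region, amount


def shuffle(pairs):
    """Group all values that share a key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Collapse each key's values into a single total."""
    return {key: sum(values) for key, values in groups.items()}


mr_totals = reduce_phase(shuffle(map_phase(sales)))

print(sql_totals)
print(mr_totals)
```

Both approaches produce the same totals, but the second one needs several functions for what SQL expresses in one line. A real MapReduce job in Java adds job configuration, serialization types, and cluster deployment on top of that.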

"Hadoop is very widely used in only a small number of companies, because they have very strong Java teams," says Winter. "In most companies, Hadoop is a limited application."

Make good use of technical advantages

At the summit, Winter asked a number of attendees about the cost of their data warehouse and Hadoop projects, and different users gave very different answers.

If all costs are taken into account, it is much more expensive to recreate an enterprise-class data warehouse with Hadoop than to build it on a traditional SQL-based data warehouse. But if you need a data-processing system or a data-lake-style application that supports data analysis, then Hadoop has a cost advantage, though it still costs a lot.

Winter points out that Hadoop can watch for outliers in large volumes of data and surface even minor changes for staff to examine. This is an important application in the Internet of Things (IoT). Take an airline's engine data analysis as an example: the data needs attention only when it deviates from the normal range.
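The idea of paying attention only to readings that leave the normal range can be sketched as follows. This is a generic standard-deviation check in plain Python, offered as an assumption about what such monitoring might look like rather than anything the article or Winter specifies. The function name `find_outliers`, the threshold of 3, and the temperature values are all hypothetical.

```python
from statistics import mean, stdev


def find_outliers(baseline, readings, threshold=3.0):
    """Return (index, value) for readings far from the baseline mean.

    A reading is flagged when it lies more than `threshold` sample
    standard deviations away from the mean of the baseline data.
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    return [
        (i, value)
        for i, value in enumerate(readings)
        if abs(value - mu) > threshold * sigma
    ]


# Hypothetical engine temperature readings.
baseline = [600, 602, 598, 601, 599]   # known-normal operation
readings = [600, 603, 597, 640, 601]   # new data to screen

alerts = find_outliers(baseline, readings)
print(alerts)
```

Only the reading of 640 is flagged here; the others stay within the band set by the baseline and are ignored. On a Hadoop cluster the same test would run in parallel across far larger volumes of sensor data.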

Many factors, including the use case, affect the choice of technology. For example, when a system has more data sources, more users, and more queries to serve, time-tested data warehousing technology shows clear technical advantages. If the reverse is the case, you might want to choose Hadoop.

Looking further ahead, Hadoop and traditional data warehouses are likely to be integrated. Data managers need to do more than pick the right platform for an application; they also need to understand the different technologies and use each where it fits.
