Why do so many companies build their large data businesses on Hadoop?

Source: Internet
Author: User

There are three main reasons to choose Hadoop: it solves the problem, it keeps costs low, and it has a mature ecosystem.

First, Hadoop solves the problem

Large companies, both domestic and foreign, have an insatiable appetite for data and will do everything they can to collect it, because information asymmetry can be monetized again and again, and a great deal of information can be extracted through data analysis.

Data comes from a great many sources, its formats are increasingly complex, and its volume keeps growing over time.

As a result, traditional databases quickly become bottlenecks for both data storage and the computation built on that data.

Hadoop was created to solve exactly these problems. Its underlying distributed file system is highly scalable, uses data redundancy both to ensure that data is not lost and to improve computational efficiency, and can store data in a variety of formats.
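The redundancy idea can be illustrated with a small simulation. This is only a sketch, not HDFS code: the functions `place_blocks` and `surviving_blocks` and the node names are hypothetical, and the round-robin placement is a simplification (real HDFS placement is rack-aware). The replication factor of 3 matches the HDFS default.

```python
def place_blocks(num_blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes, round-robin.

    Hypothetical placement policy for illustration only.
    """
    if replication > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    placement = {}
    for block in range(num_blocks):
        # Consecutive positions modulo the node count are always distinct
        # as long as replication <= len(nodes).
        placement[block] = [
            nodes[(block + r) % len(nodes)] for r in range(replication)
        ]
    return placement


def surviving_blocks(placement, failed_nodes):
    """Return the blocks that still have at least one replica on a live node."""
    failed = set(failed_nodes)
    return {
        block
        for block, replicas in placement.items()
        if any(node not in failed for node in replicas)
    }


nodes = ["node1", "node2", "node3", "node4", "node5"]
placement = place_blocks(num_blocks=10, nodes=nodes, replication=3)

# With three replicas on distinct nodes, any two node failures
# leave at least one copy of every block.
alive = surviving_blocks(placement, failed_nodes=["node2", "node4"])
print(len(alive), "of", len(placement), "blocks still readable")
```

With three copies of every block, losing any two machines leaves all data readable. Only when three nodes holding the same block fail together is that block lost, which is why redundancy lets a cluster run on inexpensive, failure-prone hardware.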

Hadoop also supports a variety of computing frameworks, so data can be processed either offline or online.
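As one illustration of the offline (batch) style, the MapReduce model can be sketched in a few lines of single-process Python. This is not the Hadoop API: the function names `map_phase`, `shuffle`, and `reduce_phase` and the sample input are assumptions made for the example, and a real job would run these phases in parallel across the cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


# Hypothetical input; in Hadoop this would be file splits stored in HDFS.
lines = ["Hadoop stores data", "Hadoop processes data"]

pairs = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
print(counts)
```

Because each call to `map_phase` touches only one line and each call to `reduce_phase` touches only one key, both phases can be spread over many machines, which is what makes the model suit very large data sets.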

Second, why the cost can be kept low

Once we are sure that Hadoop can solve our problem, we have to consider the cost.

1. Hardware costs

Hadoop is designed to run on inexpensive commodity servers and does not need very expensive hardware to support it.

2. Software costs

Hadoop is an open source product. It is free to use and, under its open source license, free to modify, which makes it more controllable.

3. Development costs

Because the work is secondary development on top of an existing platform, and because the community is very active, the demands on developers' skills are relatively low and the cost for engineers to learn it is not high.

4. Maintenance costs

When the cluster is very large, development and maintenance costs become significant. Even so, they are much lower than for a self-developed system.

One organization's self-developed system of the same kind has absorbed nearly four years of work by hundreds of engineers and burned billions of dollars, and it still has not replaced Hadoop.

5. Other costs

These include hidden costs such as system security, and the overhead introduced when the community edition is upgraded frequently but production systems cannot be upgraded in step.

Third, what are the benefits of a mature ecosystem?

A mature ecosystem represents the future direction of development, represents good market prospects, and represents better-paying jobs (OK, the "Three Represents").

See the figure (source: Hadoop ecosystem map, mynosql)

A partial classification of the systems:

Deployment, configuration, and monitoring: Ambari, Whirr

Monitoring and management tools: Hue, Karmasphere, Eclipse plugin, Cacti, Ganglia

Data serialization and task scheduling: Avro, ZooKeeper

Data collection: FUSE, WebDAV, Chukwa, Flume, Scribe, Nutch

Data storage: HDFS

SQL-like query data warehouse: Hive

Data flow processing: Pig

Parallel computing frameworks: MapReduce, Tez

Data mining and machine learning: Mahout

Column-oriented online database: HBase

Metadata center: HCatalog (can be used together with Pig, Hive, MapReduce, etc.)

Workflow control: Oozie, Cascading

Data import and export with relational databases: Sqoop, Flume, HIHO

Data visualization: Drilldown, Intellicus

Many companies use Hadoop.

(Source: A New Version of the Hadoop Ecosystem Map)
