Big Data: The Road Ahead Is Long

Source: Internet
Author: User
Tags: Hortonworks

Analyzing big data markets with big data

Today, the hottest technology of the big data revolution is Hadoop (note: a distributed system infrastructure). Hadoop is an ecosystem of many different technologies. Many companies build Hadoop-related products, and there are many options and variants: Cloudera, Hortonworks, Amazon EMR, Storm, and Spark are all part of it. Taken as a whole, Hadoop is still the most talked-about big data technology.

However, our data analysis found that only a small number of 500,000 companies worldwide actually use Hadoop. Some would say we are still at the earliest stage of public acceptance of the technology. Assuming that practical use of Hadoop is representative of big data as a whole, our analysis reveals some interesting market realities.

The raw data shows that the big data market has a lot of potential. But real users are few while vendors are many, which means big data technology companies will consolidate. In short, the big data market will slowly mature.

Status at a glance

We analyzed billions of pieces of publicly available online information, including press releases, forum posts, job postings, tweets, patents, and more. We applied machine learning to this large body of documents to obtain accurate information about technology adoption at large companies.

What trends do we want to understand through this analysis? For example, by counting the skills of a company's employees, we can see which technologies the company uses, which companies are looking for Spark skills, and, from how many data scientists they recruit, which companies are making serious moves. Focusing on Hadoop, we can find out whether a company or organization is discussing Hadoop-related issues, whether it is recruiting for Hadoop positions, who has attended local Hadoop interest groups, and who is asking Hadoop technical questions online. We even used every microblog post, blog post, and presentation about Hadoop.

Overall, we found that only 2,680 companies were using Hadoop to some extent. Of these, 1,636 were at a very low adoption level: they had just started experimenting with the technology, attending interest groups and technical meetups to learn about big data, or trying introductory exploratory projects. Another 552 were at a higher level: they had started using Hadoop for smaller internal projects (a departmental project, or the company itself is a start-up). Only 492 were at an advanced level: these companies have relatively large projects in production and employees with some Hadoop experience.
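To put the three adoption levels in proportion, the counts above can be turned into percentages. The following is only a small illustrative sketch: the tier labels (`low`, `higher`, `advanced`) and the function name `tier_shares` are our own, and only the company counts come from the article.

```python
# Company counts per adoption level, as reported in the article.
# The tier labels are our own shorthand, not the article's terminology.
tiers = {"low": 1636, "higher": 552, "advanced": 492}

# The three tiers should add up to the 2,680 companies cited.
total = sum(tiers.values())


def tier_shares(counts):
    """Return each tier's share of the total as a percentage (1 decimal)."""
    overall = sum(counts.values())
    return {name: round(100 * n / overall, 1) for name, n in counts.items()}


shares = tier_shares(tiers)
for name, pct in shares.items():
    print(f"{name}: {tiers[name]} companies ({pct}%)")
```

Under these figures, roughly three in five Hadoop-using companies are still at the experimental stage, and fewer than one in five run it at an advanced level.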

Big companies love Big Data more

We were surprised to find that large companies (more than 5,000 employees) adopt big data technology much faster than small companies. Most people would guess that small or young companies are more willing to adopt new technology, but for big data the reality is just the opposite. We found that 300 of the largest companies have already invested in Hadoop, compared with only 300 Hadoop users among companies with fewer than 5,000 employees. Since small and medium-sized companies outnumber large companies by 10 to 1, Hadoop's penetration among large companies is 10 times that among small and medium-sized companies.

Most companies that use Hadoop are themselves high-tech, data-driven companies. But we don't know why small companies have been slow to catch up. Is it because they can't afford big data software support? Because they can't afford well-paid data scientists and engineers? Or do they simply not have much data?
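The tenfold figure follows from simple arithmetic: equal user counts spread over populations that differ by a factor of 10. The sketch below works it through. The article does not give the absolute number of large companies, so the populations used here are hypothetical placeholders, and the function names are our own; the point is that the ratio does not depend on them.

```python
from fractions import Fraction


def penetration(users, population):
    """Share of companies in a segment that use Hadoop, as an exact fraction."""
    return Fraction(users, population)


def penetration_ratio(large_population, size_multiple=10,
                      large_users=300, small_users=300):
    """Ratio of large-company penetration to small/medium-company penetration.

    large_population is a hypothetical input; size_multiple is the article's
    statement that smaller companies are 10 times as numerous.
    """
    small_population = size_multiple * large_population
    return (penetration(large_users, large_population)
            / penetration(small_users, small_population))


# Try several hypothetical totals for the number of large companies.
ratios = [penetration_ratio(n) for n in (1_000, 5_000, 20_000)]
print(ratios)
```

Whatever total is assumed for large companies, the ratio comes out the same, because the population size cancels.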

Oil and pharmaceuticals lag behind finance

Oil and gas companies and pharmaceutical companies generally have very large data sets, but our analysis shows that they make little use of Hadoop. By contrast, the financial industry, which is not traditionally quick to adopt new technologies, has taken up big data technologies quickly.

This may be because the financial sector has been influenced by early adopters such as American Express, or because financial firms are leaping directly from IBM mainframes to Hadoop, skipping the generations of technology in between. There are even vendors (such as Paxata and Syncsort) that specialize in this kind of technology upgrade.

Real-time analytics needs can't stop Hadoop's advance

Puzzlingly, some industries that need real-time analysis have adopted Hadoop faster than others. These industries include retail, IT security, telecommunications, and insurance. This is confusing because Hadoop's original MapReduce model is batch-oriented, which makes it inefficient for real-time data analysis and processing. To address this, several vendors (such as DataTorrent, VoltDB, and Splice Machine) have brought real-time capabilities for Hadoop to market.
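To see why a batch model sits awkwardly with real-time needs, consider a toy, in-memory imitation of the map, shuffle, and reduce phases. This is only a sketch in plain Python with function names of our own choosing; it is not Hadoop's API and it ignores distribution entirely. What it does show is that the reduce step cannot produce an answer until the whole input batch has been mapped and grouped.

```python
from collections import defaultdict


def map_phase(records):
    """Emit a (word, 1) pair for every word in every input record."""
    for record in records:
        for word in record.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Group values by key. This step consumes the complete map output."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the counts collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(records):
    """Run the three phases in order over one finite batch of records."""
    return reduce_phase(shuffle_phase(map_phase(records)))


batch = ["Hadoop runs batch jobs", "batch jobs take time"]
counts = word_count(batch)
print(counts)
```

A record that arrives after the batch has started is not reflected in `counts` until the whole job is run again, which is the gap that real-time add-ons aim to close.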

Future prospects

Even companies that are ready to adopt Hadoop face a talent shortage. At the time of writing, there are 16,000 job openings in the United States that require Hadoop experience. For the Hadoop market to mature, the industry needs a way to make use of people who have no Hadoop experience. People who know SQL outnumber those who know Hadoop by 100 to 1. Solutions that provide a way to query big data with SQL, such as Splice Machine, Presto, IBM Big Data, and Oracle Big Data SQL, will be more attractive because of the size of that talent pool.

Even if the talent problem is solved, the technology itself remains expensive to operate and maintain. Even with a free, open source Hadoop system, you still need to hire system administrators, who are very scarce and therefore costly. In addition, although there are more and more solutions for backup, recovery, and high availability, managing a Hadoop system is still much more complex than managing a SQL database.

Today's Hadoop market is relatively small and cannot sustain so many competing startups. Our analysis shows that the companies that really pay for big data are a small number of large enterprises, so the eventual winners should be those vendors that already have a foothold in the market. The recent stock market performance of Hortonworks points the same way (note: the stock price is $11, and its market value has fallen by more than half since it was listed).
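The appeal of SQL is easy to illustrate: a grouped count, the kind of aggregation a MapReduce job is often written for, is a single declarative statement. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in so that it runs anywhere. It is not any of the products named above, and the `mentions` table and its rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for a SQL query layer.
conn = sqlite3.connect(":memory:")

# Hypothetical table: one row per company observed discussing Hadoop.
conn.execute("CREATE TABLE mentions (industry TEXT, company TEXT)")
conn.executemany(
    "INSERT INTO mentions VALUES (?, ?)",
    [("finance", "a"), ("finance", "b"), ("retail", "c")],
)

# One declarative statement replaces hand-written map and reduce functions.
rows = conn.execute(
    "SELECT industry, COUNT(*) AS n "
    "FROM mentions GROUP BY industry ORDER BY industry"
).fetchall()
print(rows)

conn.close()
```

Anyone who already writes queries like this can be productive without learning a new programming model, which is the argument for SQL layers on top of big data systems.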

This will lead directly to acquisitions or mergers among Hadoop companies. Eventually, if a vendor can support not only basic MapReduce but also public cloud pricing models, transactions, in-memory processing, real-time analytics, SQL, and so on, customers will no longer have to worry about maintaining many different single-purpose systems. In the end, just as the earlier relational database companies were overtaken by application companies (such as Oracle), these companies will directly provide big-data-driven solutions that can be applied to the Internet of Things, customer relationship management, supply chain, and even industry-specific applications such as logistics management and financial fraud detection.

The road is long

As you can see, there is a lot of room for growth and change in the big data market. Our analysis suggests that the following can help make this growth a reality. First, Hadoop can move into more verticals and into midsize companies. Second, the number of people with Hadoop skills needs to grow. Third, analytics systems need to improve so that the many people who already know SQL can use Hadoop tools more easily. Finally, after consolidation, companies that can transform themselves into application-focused vendors should be the ones smiling at the end.
