A New Platform for Big Data Analytics

Last Update: 2014-12-09 Source: Internet

Author: User

On Gartner's Hype Cycle, big data is climbing fast, and a new category of big data service providers has emerged. The newest of them is the little-known MetaScale. The company entered the public eye this April and is a wholly owned subsidiary of Sears Holdings.

Located in Hoffman Estates, Illinois, MetaScale is a managed big data service provider that operates on a cloud-based model. In other words, MetaScale can provide varying levels of support to customers who want to adopt big data analytics but lack the necessary architecture or capabilities.

In this article, our editors interview Phil Shelley, MetaScale's founder and CEO and also CTO of Sears, about the challenges of big data and market trends.

Journalist: What big data challenges are companies facing today? Can you address big data management and big data analysis separately?

Shelley: First, from the perspective of big data management, we are at a new threshold. As any IT veteran knows, the Holy Grail has always been to bring all the data into one place, which is very demanding on the systems involved. That goal was never achieved, and the result was replicating data with ETL. The duplication is enormous: different systems serve different purposes, and the data sits in different places. As a result, data management has always been a headache. But that is changing. It is now possible to put the data model in a single place, with all of the enterprise's transaction information and history alongside it. That way you can actually manage data across the enterprise, build the model, design the data architecture, and genuinely improve how efficiently the data is used. Data reuse is very important, and with these technologies it can finally be realized.

Once you have the data in one place, new ways of using it open up, because Hadoop can hold an enormous amount of history. It does not just store the data; the data can also be analyzed without being moved. When your business involves petabytes of data, you simply have no practical way to move it for analysis. The old-fashioned approach of using ETL to move data to an analysis platform no longer works. So, compared with the past, having one platform that can both store data and analyze it is a big step forward.

Journalist: So you are bringing the tools to the data instead of moving the data to the tools?

Shelley: Among today's big data technologies, a number of tools are emerging that provide graphical and analytical front ends, so that you can query and analyze the data directly where it is stored. Rather than copying it, you extract only the small part you actually want, that is, the result set. This is a new and disruptive way of thinking, and it will take people some time to adapt to it.

Journalist: I've heard a lot of terms, such as "logical data warehouses" and "hybrid data ecosystems", all of which emphasize putting data in the right place. Is that the same thing you are describing?

Shelley: Yes, but I would be more specific on some points. Some people say to put the data in the right place, but if you do that you end up with too many systems, each holding fragments of the data. I don't support that approach because of the time and cost of ETL. I do, however, absolutely embrace an ecosystem of tools. If you need high-speed SQL analysis, Hadoop is certainly not the right fit. How much data to place where, when, and how: these questions need careful planning; otherwise some places end up with too much data and others with too much spare capacity. In that case you are back to the problem of using ETL to move data. It is critical to think through the enterprise data architecture in particular and to integrate other systems properly with Hadoop. But then again, I don't really believe in having many additional operational data stores and logical data marts, because that only adds complexity. As the data gets bigger, you can't do that, and you don't need to.

(Responsible editor: The good of the Legacy)
