Big data is not a bubble but an inevitable result of historical development

Source: Internet
Author: User

On November 7, 2013, the "Discovery" 2013 Big Data and Mobile Application Summit Forum, sponsored by IT Business News Network, IT Time Weekly, and the National CIO/CTO Club, was held at the Hotel Nikko New Century Beijing.

The conference aimed to uncover the business value of big data applications by exploring IT solutions built on big data. It invited CIOs of enterprises and institutions from across the country, CTOs of well-known Internet companies, representatives of national high-tech parks, third-party market research agencies, the collaborative office field, and app developers, along with information experts and other industry figures, to join the discussion and explore together how to build the big data and mobile Internet ecosystem.

Liu, General Manager of SAS Software China

Liu, General Manager of SAS Software China, delivered a speech entitled "Big Data and Big Data Analysis Technology". The text of the speech follows:

Good afternoon, everyone, and thank you to the organizers for the invitation. My topic today is big data and big data analysis technology. Big data has been very hot over the past two years, but most of the discussion has been about its business applications at a macro level, with very little about the specific technology. So today I will talk about big data from a technical point of view.

Big data is not, as you may have heard or as some people say, a bubble. It is in fact an inevitable result of historical development. Look at the overall development of IT: first came the PC and basic software, then the Internet, databases, ERP, and management automation. All of these technologies kept developing, and the end result we see is a huge accumulation of data. Every part of that chain has now matured, and that is the stage we are at today.

In addition, our technology is now able to handle large volumes of data. That is why it should be said that big data is real, not a slogan.

Earlier there was some controversy about this, with some people saying it was just a concept put forward by Google. No matter who put the concept forward, big data is now real. The amount of data around the world is now about 1.8BP, and it touches every industry. When we studied political economy, we all learned that the factors of production include the means of production, labor, and capital. Now we can say that data has the same value as the means of production as a factor of production. One well-known investment guru never used to invest in the technology industry. He invested in traditional industries he understood better, such as McDonald's and Coca-Cola, but recently he invested 1 billion dollars in technology. He has a saying to the effect that one should take care to understand the data technology of these geeks.

When I went to the U.S. for a meeting in the first half of this year, one speaker said that 20 companies in the United States were applying big data, and that big data analysis was more mature in America. On November 24 last year, it was reported that IBM had analyzed data from 500 retail businesses in the United States. Whatever the purpose of the analysis, it shows the demand: being able to bring that much data together for analysis means the sample size is much larger than in the past.

Another example is the PRISM program, which we have all heard a lot about. A more detailed account describes how it follows certain key points and, based on them, marks items in different colors to indicate the alert level. The dot display is similar to Facebook's visualizations and lets us see where the hot spots are. Visualization is very important in big data.

The exposure of this program is bound to lead to competition over data. You can see that data analysis is really valuable, including for national security. The level of a country's data analysis technology and of its data analysts will, to some extent, determine its competitive advantage, which makes this a strategic issue for the country.

At the same time, we see that this opportunity will create a lot of jobs, just as the Internet and the computer age did.

On the technical side, the first aspect is the analysis model. We have to build very complex models and run complex calculations to obtain simulation results for analysis. Sample sizes keep growing. They are still not the full population, but the samples are significantly larger.

Analysis speed: analysis software used to be really slow. Running 10 million rows of data could take several hours. Now 1 billion rows can be analyzed in seconds, which is the result of new technology. As times change and the ability to process data improves, big data turns out to be a relative concept. Today, analysis of structured data works comparatively well, while analysis of unstructured text is poorer and not as accurate. Our analysis of audio and video is still at an early stage and relatively weak. In the future there will be more kinds of data samples to analyze, and how to combine them for analysis is a big challenge.

Another aspect is legal provisions. For example, is it illegal to derive private information, or information bearing on national security, by analyzing public data? There is also the issue of data ownership. Many of these questions should be settled by legislation.

Data security: if the data has been tampered with, the results of the analysis will certainly be biased and problematic. When the U.S. auxiliary Island nuclear power plant was built, the assessment found no earthquakes in the area. In fact, it had been known some years earlier that there were earthquakes, but that key data was not included.

Internet of Things data analysis: data is analyzed in real time as it is collected, so the results can give users timely help.

Data relevance: there is now a great deal of big data, but relatively little of it is related and useful. Another question is whether big data can give rise to a new probability theory. Can big data help us solve scientific problems that were difficult to solve in the past? We know, for example, that a planet in a solar system can be found through data.

10 trends: on the technical side, the first is high-performance computing, then visual analysis, then the combination with cloud computing. The future model of the data age is that your software is also in the cloud; the cloud software you use and your data may well be in different places, which makes it more convenient to use. Others include the wide application of management science and changes in the way governments, armed forces, and enterprises make decisions. Big data also changes the way people think; there is a book by a professor at Cambridge about the impact of big data on people's thinking. Another trend is the combination of business models and retrieval: we combine search engines with data analysis to find what we really need.

Another is the transformation of the military. The speed and accuracy of future decision-making depend on data analysis and need software to achieve. Automated, quantified judgment is the direction in which the military will develop. Governments can better follow trends, understand public opinion, and guide the public through information. So far we have talked about big data; now let us turn to big data technology.

What are the difficulties of traditional analysis techniques? In the past, large data volumes caused many problems. In the new big data era, our analysis software, hardware platforms, and data have all changed, so data analysis is no longer done the way it was. In fact, the old business model is also outdated, and we can expect a new one in the future. What will that business model be? You can think about it. Data is no longer the limited, structured data of the past. It is large volumes of structured, semi-structured, and unstructured data stored in different places, and how to integrate and distribute that data has to be considered.

Another aspect is the software, which must be able to analyze large volumes of data and must also support in-memory computing and grid computing.

The evolution of data analysis: from basic files to flash files, and on to today's data analysis and cloud computing; from single-threaded processing at the very beginning to multithreading and grid computing.

Big data analysis requires high-performance computing; only high-performance analysis can support this work.

Overall architecture: big data analysis software must be able to support all of these. With high-performance analysis, look at how much the speed increases: analyzing 1 billion rows of data used to take 10 to 20 hours, depending on the hardware, and can now be completed in 4 seconds.

Analysis model: traditionally, data is extracted from the database, analyzed, and the results are then displayed. Moving the data is limited by network bandwidth. If, instead of taking the data out, we push the analysis into the database, then once it is launched the database can analyze the data in place, which is much faster than transmitting the data.
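
The contrast between extracting data and analyzing it inside the database can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions, not the SAS implementation: the `sales` table, its columns, and the helper functions are hypothetical. The first function pulls every row out and sums in the application; the second sends only an aggregate query, so just the summary rows leave the database.

```python
import sqlite3

def build_demo_db(n_rows=1000):
    """Create an in-memory database with a hypothetical `sales` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    rows = [("north" if i % 2 == 0 else "south", i) for i in range(n_rows)]
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()
    return conn

def total_by_extraction(conn):
    """Extract every row, then aggregate in the application."""
    totals = {}
    rows_moved = 0
    for region, amount in conn.execute("SELECT region, amount FROM sales"):
        totals[region] = totals.get(region, 0) + amount
        rows_moved += 1
    return totals, rows_moved

def total_in_database(conn):
    """Push the aggregation into the database; only summary rows come back."""
    result = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region"
    ).fetchall()
    return dict(result), len(result)

conn = build_demo_db()
extracted, moved_a = total_by_extraction(conn)
in_db, moved_b = total_in_database(conn)
print(extracted == in_db)  # both approaches give the same totals
print(moved_a, moved_b)    # rows that had to leave the database in each case
```

Both approaches return the same totals, but the second moves two summary rows instead of a thousand detail rows. Over a real network, with a billion rows, that difference in transferred data is what the speaker is pointing at.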

Suppose one job takes about 1 minute and we have 96 jobs. Done one after another, they take about 96 minutes. If the 96 jobs are spread across 48 places and run at the same time, all 96 finish in only 2 or 3 minutes. Distributed computing therefore greatly improves computing speed.
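
The arithmetic behind this example can be worked through in a few lines of Python. This is a simplified sketch: it assumes the "48" in the talk refers to 48 parallel workers, that the jobs are independent and of equal length, and that scheduling and data-transfer overhead are zero. The function name `elapsed_minutes` is made up for illustration.

```python
import math

def elapsed_minutes(jobs, workers, minutes_per_job=1.0):
    """Wall-clock time when equal-length, independent jobs are spread evenly
    over `workers`; scheduling and data-transfer overhead are ignored."""
    rounds = math.ceil(jobs / workers)  # each worker handles this many jobs
    return rounds * minutes_per_job

sequential = elapsed_minutes(jobs=96, workers=1)
distributed = elapsed_minutes(jobs=96, workers=48)
print(sequential, distributed, sequential / distributed)
```

Under these assumptions the sequential run takes 96 minutes and the distributed run takes 2, which matches the lower end of the "2 or 3 minutes" quoted in the talk; real overhead would account for the rest.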

When you open a Word document, the data has to be read from the hard drive, which is slow. But once the document is in memory, you can keep editing it without noticing any computation, because everything happens in memory. When we combine distributed grid computing with in-memory analysis, the speed of data analysis improves greatly.

Visualization tools can help you understand more complex data. Visualization can now be extended so that visual analysis serves as a core platform on which data mining, data analysis, and other solutions are all built.

Visual analysis can now also prepare the data: we load all the data from the hard drive into memory at very high speed, and computations then complete in seconds.

The result can be designed as a report, which can be viewed through a web page anywhere in the world and is also supported on mobile devices.

Data analysis technology must be accompanied by data management technology, and we have to combine data analysis with cloud computing; this is the future model. The current technical approaches have certain limitations. In the future, analytics software can run in the cloud, and this is the pattern that should be developed.

At a meeting in the United States, someone put forward the idea of versions of data analysis: 1.0, 2.0, and 3.0. The structured analysis we do now is 1.0, big data is what we consider 2.0, and the mixed analysis of all kinds of data in the future can be seen as 3.0.

A final thought: we should seize this opportunity and use big data to serve business and every other area. Thank you!
