Super AI: The Future of Big Data?

Source: Internet
Author: User
Keywords: big data, algorithms, artificial intelligence

At Baidu's big data open conference, the speech by President Huai, a computer scientist steeped in academic theory, hit the audience like a blow to the head. His academic talk left most people foggy and dizzy; only a small minority in the room could follow it, and the president, absorbed in his own speech, may have come across a bit like an alien. But as someone who once wanted to do artificial intelligence research yet missed out on computer science graduate study, I found myself growing more and more excited, as if glimpsing how far artificial intelligence might actually go. So let me now try to translate Professor Huai's speech into language the rest of us can understand.

One, understanding big data

1, the four characteristics of today's big data: large scale, rapid change, great variety, and low value density.

This is actually easy to understand. Look at the big data behind Sina Weibo and why it has proved so hard to turn into value: Weibo has an enormous volume of user data, yet it struggles to put that behavioral data to use, because the data generated on the platform is not vertical enough, it covers too wide a range, and business-relevant value is harder to dig out.

2, achievements in industry

President Huai cited three examples. Baidu and Google are familiar with users' browsing behavior and can therefore provide personalized search. Taobao and Amazon are familiar with users' shopping habits and can offer them accurate recommendations. Weibo and Twitter understand users' thinking habits and social attitudes, and can provide governments and enterprises with data such as public sentiment.

Two, the change of thinking in practice

Big data brings a shift in how we think about research and practical strategy.

1, from sampling to the full sample. The all-inclusive character of big data means that the sampling methods the industrial era taught us for statistics (systematic sampling, stratified sampling, quota sampling, and so on) will gradually disappear in the big data age. With big data we can compute over every record we want to count, and the statistical shortcuts of the industrial era are eliminated.

We will use technology to get all the data we want to count.
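As a toy illustration of this shift (my own sketch, not from the speech, with invented data), compare estimating a statistic from a small sample with computing it over the full dataset; given enough compute, the sampling step simply disappears:

```python
import random

# Hypothetical population: one "engagement" value per user (illustrative data only).
population = [random.gauss(50, 15) for _ in range(1_000_000)]

# Industrial-era approach: estimate the mean from a small random sample.
sample = random.sample(population, 1_000)
sample_mean = sum(sample) / len(sample)

# Big-data approach: compute the mean over every record.
full_mean = sum(population) / len(population)

print(f"sample estimate: {sample_mean:.2f}")
print(f"full-data value:  {full_mean:.2f}")
```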

2, from accurate to imprecise. This is also easy to understand if we look at search. In the traditional era, when we queried for a piece of information we expected to get all of the matching data; the search engine completely changed that understanding. It returns only the first few items, and those few items fully satisfy our information need.

A search engine is really running a set of fuzzy algorithms: after a series of computations it brings the best results to the user, and this way of presenting results also overturns the traditional definition of a goal. In the big data era we no longer pursue an absolute target, but a vague, imprecise, unknown target inferred from macroscopic trends.

We will pursue infinite approximations rather than absolute correctness.
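A minimal sketch of that idea (my own, with made-up documents and a toy scoring function): rather than returning every matching record, rank by an approximate relevance score and return only the top few items.

```python
def score(query: str, document: str) -> int:
    """Toy relevance score: count query words that appear in the document."""
    return sum(word in document.lower() for word in query.lower().split())

documents = [
    "big data and approximate algorithms",
    "a history of industrial statistics",
    "search engines rank results by relevance",
    "fuzzy matching for large data sets",
]

def search(query: str, docs: list[str], k: int = 3) -> list[str]:
    # Return only the k best-scoring documents, not every possible match.
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

print(search("approximate big data search", documents))
```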

3, from causation to correlation. This shift has directly produced a startling claim in the West: "theory is dead", the latest in a line of bold pronouncements after "God is dead", "man is dead", "the author is dead", "the end of history", and "philosophy is dead". In the past, a decision maker had to consult various theories to establish cause and effect before deciding anything; the big data era makes decisions easier. A supermarket's data may tell you, in a clear chart, that cakes sell better when it rains. At that point the decision maker does not need to know any theory or any causal mechanism; if rain is forecast for tomorrow, just prepare more cake.

This reliance on correlation rather than causation in decision making is slowly penetrating every industry that holds big data: the internet, retail, tourism, finance, and more.
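The rain-and-cake decision can be written out as a tiny correlational rule (my own sketch with invented numbers); note that nothing in it models why rain and cake sales move together:

```python
# Hypothetical daily records: (rained?, cakes sold). Invented numbers for illustration.
history = [(1, 130), (0, 80), (1, 125), (0, 90), (1, 140), (0, 85), (1, 120), (0, 95)]

rainy = [sales for rained, sales in history if rained]
dry = [sales for rained, sales in history if not rained]

avg_rainy = sum(rainy) / len(rainy)
avg_dry = sum(dry) / len(dry)

# Decision rule built purely on the observed association, with no causal theory:
forecast_rain_tomorrow = True
stock = avg_rainy if forecast_rain_tomorrow else avg_dry
print(f"rainy-day average: {avg_rainy:.0f}, dry-day average: {avg_dry:.0f}")
print(f"cakes to prepare tomorrow: {stock:.0f}")
```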

Three, from big data to big data computation

1, as data expands, how do we solve the search problem? Traditional algorithms have no trouble searching data when the volume is small, but once data grows massively the problem stands out: the original algorithms simply cannot cope. At the fastest current hard-disk retrieval speeds (the figure cited works out to roughly 6 GB/s), a linear scan of 1PB (10^15 bytes) of data would take about 1.9 days, so when data expands to this scale we must redesign the algorithmic strategy for processing it. Baidu currently processes 10PB of web page data per day, including both computation and reads, which is only possible with the best algorithms.
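The back-of-the-envelope arithmetic behind that figure (my own restatement; the ~6 GB/s read rate is inferred from the numbers cited) is simply:

```python
petabyte = 10 ** 15           # 1 PB in bytes
scan_rate = 6 * 10 ** 9       # assumed sequential read rate of ~6 GB/s, per the figure cited

seconds = petabyte / scan_rate
days = seconds / 86_400
print(f"linear scan of 1 PB: {days:.1f} days")   # roughly 1.9 days
```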

2, as data expands, how do we handle the relationship between algorithms and data? The previous point was about changing the algorithm so that the data can still be traversed, but in real processing that alone cannot reach the required efficiency; the CPU bottleneck is still sitting there. What algorithm engineers essentially do is design, under the existing computing conditions, the solution that yields the best obtainable result.

And the president told us that the challenge, once data has expanded, is not only to replace the original algorithm with an approximate algorithm, but also to replace the data with approximate data; only by combining both changes can we reach the best result within the machine's existing computing power.
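As one hedged illustration of what "approximate algorithm over approximate data" can mean in practice (this is my example, not the president's), estimate an aggregate from a small random slice of the data instead of scanning everything, and accept a bounded error in exchange for a huge drop in work:

```python
import random

# Stand-in for a dataset far too large to scan exhaustively.
page_sizes = [random.randint(1, 500) for _ in range(2_000_000)]

# Exact answer: touch every record (the expensive baseline).
exact_total = sum(page_sizes)

# Approximate answer: scan only a 1% sample (approximate data)
# and scale the partial sum back up (approximate algorithm).
sample = random.sample(page_sizes, len(page_sizes) // 100)
approx_total = sum(sample) * 100

error = abs(approx_total - exact_total) / exact_total
print(f"exact: {exact_total}, approximate: {approx_total}, relative error: {error:.3%}")
```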

This is easier said than done. When both the algorithm and the data are made approximate, how close can the result stay to what the original algorithm would have produced? In the computer world a tiny change in the wrong place can produce enormous errors in the result; anyone with a little programming experience knows that a few lines of bad code can defeat a machine no matter how powerful its CPU, and a search engine is an even larger trial-and-error project.

Finally, the president presented two frontier research directions. The first is to define the class of problems that are easy to solve: find these easy search problems in practical applications, classify them, and apply them in other practice. The second is to process big data as small data by finding precision metrics for the transformation, which is the approximation of the search data he described earlier.

After writing this paragraph I cannot help reflecting that algorithm engineers are really playing a game of satisfying public demand with insufficient CPU power: machine configurations can never keep up with human needs, so to meet those needs algorithm engineers must rack their brains to design something that produces the best possible answer under existing conditions, rather than chasing the standard answer. That reminds me of Deep Blue, the computer that beat the chess masters. In fact, anyone who understands a bit of programming, even I, could write an algorithm that beats any chess master, but finishing a single game with it might take longer than a player's lifetime, because CPU speed cannot keep up with the idea. So Deep Blue's success was not a victory for artificial intelligence, but a victory for the engineers' strategy of designing an optimal algorithm.
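To make the brute-force point concrete, here is my own rough count, using the commonly quoted average of about 35 legal moves per chess position and Deep Blue's reported search speed of roughly 200 million positions per second:

```python
branching_factor = 35                 # commonly quoted average legal moves per position
depth = 80                            # plies in a typical full game
positions_per_second = 200_000_000    # Deep Blue's reported search speed (approximate)

positions = branching_factor ** depth
seconds = positions / positions_per_second
years = seconds / (365 * 24 * 3600)
print(f"positions to search exhaustively: {positions:.2e}")
print(f"time at 200M positions/s: {years:.2e} years")   # vastly longer than any lifetime
```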

In addition, under big data computation the president also discussed the three major foundations of big data operations: representation, measurement, and understanding. This is so specialized that each term would need an article of its own to explain, and not necessarily clearly, so I will skip it.

Four, big data software engineering

As someone with a background in software engineering, I felt a kind of sadness after reading the president's remarks on big data software, because I can foresee that the software engineering I learned will be restructured by big data. The vast majority of software may move toward big data software, just as the rise of the web meant web software far outnumbered PC software, and the emergence of the smartphone meant apps began to far outnumber web software; once the hardware matures, the future of software development will belong to big data. Tracing software engineering to its origins, it is an engineering approach to software development that arose once computer hardware had stabilized, designed to solve the problem of efficiency: a clear division of labor, a clear schedule, indistinguishable from industrial production. But from the president's next remarks we can see that software engineering is likely to move to a different model.

1, how do we solve the problem of computational support for big data? Put simply, big data processing is not something one or a few servers can handle; it requires enormous hardware support, and that hardware support must be a distributed design. How, then, do we design the top level of the system so that big data is processed efficiently? How do we satisfy the three "I" characteristics: approximation (inexact), incrementality (incremental), and generalization (inductive)?

How software and hardware cooperate in a distributed big data system, how to avoid losses when scaling out, how to handle failures, and how to keep energy consumption under control are all major problems and challenges in system design.
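As one minimal illustration of the "incremental" and "inexact" flavor of such processing (my own sketch, not Baidu's design), a running statistic can absorb new partitions of data as they arrive instead of recomputing over everything from scratch:

```python
class IncrementalMean:
    """Running mean that absorbs new data partitions as they arrive (incremental),
    without ever re-reading old data, so late or lost partitions make it inexact."""

    def __init__(self) -> None:
        self.count = 0
        self.total = 0.0

    def absorb(self, partition: list[float]) -> None:
        self.count += len(partition)
        self.total += sum(partition)

    @property
    def value(self) -> float:
        return self.total / self.count if self.count else 0.0

stream = IncrementalMean()
for partition in ([3.0, 4.0, 5.0], [10.0, 2.0], [6.0]):   # partitions arriving over time
    stream.absorb(partition)
    print(f"records seen: {stream.count}, running mean: {stream.value:.2f}")
```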

2, can public, pooled big data develop software? This is actually a rather crazy idea, or at least this is how I read the president's thought. Suppose big-data-driven software development were possible; the picture would look like this. Crawlers read data from Sina Weibo, the Baidu Index, Baidu Tieba, Taobao transactions and so on, and discover the curves of users' various emotions and needs. Software developers then build a software model based on how that data presents itself and hand it to operators to place in the cloud. Users participate in the various pieces of software generated in the cloud and produce all kinds of behavior, and the machines use that behavior to model and plan the software again.

This would be a highly precise, interactive form of data mining; provided the computation and storage problems are solved, it is all possible. Future big data software would not have a fixed form, but would be a super-ecology that keeps changing automatically on the basis of the data. It might no longer be driven by product managers but by algorithm engineers, letting users' needs surface naturally and then implementing functions for them.
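Rendered schematically, the loop being described might look like the sketch below; every function name here is invented and stands in for a large subsystem:

```python
# A schematic of the feedback loop described above; all functions are hypothetical.
def crawl_behavior_data():        # gather Weibo, search-index, forum, transaction data...
    return {"demand_curves": []}

def build_software_model(data):   # derive a candidate feature set from observed needs
    return {"features": data["demand_curves"]}

def deploy_to_cloud(model):       # expose the generated software to real users
    return model

def observe_usage(deployment):    # users interact; their behavior becomes new data
    return {"demand_curves": []}

data = crawl_behavior_data()
for _ in range(3):                # in the vision described, this loop never really ends
    model = build_software_model(data)
    deployment = deploy_to_cloud(model)
    data = observe_usage(deployment)
```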

Looking at such big data software from a higher, more philosophical level: if we regard the collective behavior of the entire human race as data that is constantly at work, from which some of us come to understand something and then produce various products, and then come back to this big data software architecture, then in the end such big data software is really more like a restoration of our world, only faster and more complete than people could ever make it.

If such a big data software construct were realized, a certain definition of big data would be completely overturned: the idea that big data is merely a tool assisting human decisions about fixed patterns of human information behavior would lapse. Big data might, at some point in the future, be defined as the true restoration of the human world and the constant satisfaction of our every desire. Once we relied on it to make decisions about something; now we would rely on it to get directly to what we want to do, and all our actions would have become part of its decision making.

This is actually super artificial intelligence.

Conclusion: The opening part of the president's speech set out how big data challenges our traditional thinking in this era, and some of those conclusions are already established. But what followed was mostly not finished results; it was question marks, problems that have not been solved, attempts that have failed, and assumptions that have not yet been tried. On the surface this seems slightly off-key for the subject of the speech, but think about it and it makes sense: CPU computing power never reaches the heights these top researchers want, and computer scientists can only strive to do the best possible under the conditions available in their era. That is the mission they pursue throughout their lives.

Thanks to the president for a wonderful speech that let us glimpse scattered sparks of the future. They are very beautiful.

Original link: http://www.huxiu.com/article/32717/1.html?f=wangzhan
