A few years ago I led a team working on a project called "Information Housekeeper". The idea was simple: information is exploding, the volume of news is enormous, and modern people are overwhelmed. The project set out to deliver, in the shortest time and in the most efficient and targeted way, an information service that meets each individual's needs. The astute reader will have recognized it as a personal information push service: the medium is the mobile terminal (such as a smartphone), the application is the mobile internet, and the selling points are customization, accuracy and personalization. The technology is complicated in the details but simple to describe. Everyone follows a fixed routine in which newspapers and magazines they read, which professions and fields they follow, and even what they go on to read next and what they never want to see. People are not aware of these routines themselves, but with enough data collection, mining and analysis a clear "model" emerges (we also called it a "roadmap"; the name is not important). Based on that model, the system automatically gathers the content each individual cares about and pushes it to that subscriber. It is narrower in focus than a mobile newspaper and far more personalized than an RSS subscription.
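The "model" idea described above can be illustrated with a minimal sketch. It is not the project's actual implementation: the topic labels, the sample data, the function names `build_profile`, `score` and `select_for_push`, and the simple frequency-based weighting are all assumptions made for illustration.

```python
from collections import Counter

def build_profile(history):
    """Turn a reading history (a list of topic labels) into topic weights that sum to 1."""
    counts = Counter(history)
    total = sum(counts.values())
    return {topic: n / total for topic, n in counts.items()}

def score(profile, item_topics):
    """Score an item as the summed profile weight of its topics (unknown topics count as 0)."""
    return sum(profile.get(topic, 0.0) for topic in item_topics)

def select_for_push(profile, candidates, k=2):
    """Return the titles of the k highest-scoring candidates, skipping items that score 0."""
    ranked = sorted(candidates, key=lambda c: score(profile, c["topics"]), reverse=True)
    return [c["title"] for c in ranked if score(profile, c["topics"]) > 0][:k]

# Hypothetical reading history and candidate news items.
history = ["finance", "finance", "technology", "finance", "sports"]
profile = build_profile(history)
candidates = [
    {"title": "Markets close higher", "topics": ["finance"]},
    {"title": "New phone released", "topics": ["technology"]},
    {"title": "Celebrity gossip roundup", "topics": ["entertainment"]},
    {"title": "Fintech startup raises funds", "topics": ["finance", "technology"]},
]
print(select_for_push(profile, candidates))
```

A real service would also need to learn what a reader does not want to see and how interests drift over time; this sketch only covers the "collect, model, push" loop.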
I remember that while we were doing it we felt we had a great and far-reaching mission. We believed it would change not only the form of the media but also overturn the traditional system of information production and dissemination. Of course, given the technical conditions of the time, we never really accomplished it: we got lost in, and stalled at, the data analysis needed to "quantify everything", which was simply too difficult. Later, after reading a number of books and articles, getting a taste of frontier thinking from abroad, and seeing the relevant pioneering cases, it finally dawned on me that what we had been thinking about and doing was "big data."
Big data and "cloud computing" have become the most popular pair of concepts in the Internet and IT industry. Everyone is talking about them, and it looks as if everyone were already living in the middle of them. Yet people differ, in substance or in emphasis, on what big data is and how to understand it. For example, some time ago I read Tu Zipei's "Big Data: The Coming Data Revolution, and How It Is Changing Government, Business and Our Lives". Although its title also says big data, it is really about information disclosure, data fairness, government management and social governance, and it is devoted to American examples and experience. There is also "Personalization: The Future of Business" by Su Meng, Bai Linsen and Zhou Tao, which introduces personalized business services supported by Internet technology, and the related models, from concept to application. Then there is Rajaraman and Ullman's "Big Data: Large-Scale Internet Data Mining and Distributed Processing". It too is about "big data", but its two authors are concerned with mining very large data sets. Its contents include distributed file systems, similarity search, search engine technology, frequent itemset mining, clustering algorithms, advertising management and recommendation systems. It is a typical technical textbook. In short, each of these books gives a systematic and in-depth introduction to one part of "big data", but none offers a grand view of the whole picture. That had to wait for the later work by Viktor Mayer-Schönberger and Kenneth Cukier, "The Big Data Age: The Great Transformation of Life, Work and Thinking".
The great contribution of this book, written by Viktor Mayer-Schönberger, "the prophet of the big data age", is that it further clarifies the basic concepts and characteristics of big data, which is a useful corrective for the many people who think big data simply means "data that is big". Xie Wen, the former general manager of Yahoo China and a prominent IT commentator, pointed out bluntly in a speech titled "Big data concepts are confused and may be heading for a melee" that there are several misconceptions about big data. First, looking only at quantity and at how fast data is growing, it is impossible to tell ordinary massive data from big data. Big data is by no means the same as a large amount of data. What most existing equipment and technical methods can process is massive data, not big data. Second, data mining, fine-grained operations, precision advertising, personalized service and promotion are not the main part of the future business model of big data services. Third, discussing big data in isolation, apart from the background of industrial development and social progress, cannot explain its importance.
In "The Big Data Age", however, Viktor Mayer-Schönberger states plainly that "big data is not an exact concept". Initially the concept meant that the volume of information to be processed had grown so large that it exceeded the memory computers normally use for processing data, so engineers had to improve their data-processing tools. This led to new processing technologies such as Google's MapReduce and the open-source Hadoop platform. These technologies dramatically increased the amount of data people can handle. More importantly, the data no longer needs to be neatly arranged in traditional database tables. At the same time, because Internet companies can collect large amounts of valuable data and have a strong financial incentive to exploit it, they have naturally become the leading practitioners of the latest technology. Viktor's "big data", however, refers to "things people can do on the basis of large-scale data": it is "a way for people to gain new insights, create new sources of value, and change markets, organizations, and the relationship between governments and citizens."
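For readers unfamiliar with the MapReduce model mentioned above, the following is a minimal single-process sketch of its map, shuffle and reduce steps, using word counting as the task. It is only an illustration: it does not use Google's or Hadoop's actual APIs, and the function names and sample documents are assumptions.

```python
from collections import defaultdict

def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1

def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce step: combine the values for one key into a single count."""
    return key, sum(values)

def word_count(documents):
    """Run map, shuffle and reduce over a collection of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())

# Hypothetical input documents.
docs = ["big data is not large data", "data about data"]
counts = word_count(docs)
print(counts)
```

In a real cluster the map and reduce steps run in parallel on many machines and the shuffle moves data across the network; the point here is only that the input does not have to sit in a tidy database table first.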
As an important participant in the development of the world's Internet and a leading standard-bearer of the "big data" wave, Viktor also keeps a sober view of it. As he says, "Big data, like any other new technology, is bound to go through Silicon Valley's notorious hype cycle: after being hyped by the news media and at academic conferences, the new trend will fall from favor and many data start-ups will become precarious." For this reason he is extremely candid: "This book aims to convey truthfully what big data means, without overhyping it. The real revolution, of course, is not in the machines that analyze the data, but in the data itself and how we use it."
Indeed, the book consistently tries to make people aware of the potential and trends of big data while keeping the necessary prudence, without exaggeration or deliberate embellishment. In my opinion it is a pioneering work in the field of big data, one that founds a school and lays the groundwork for what follows.
First, in terms of mindset, Viktor reminds people to prepare for "three shifts". First, in the age of big data we can analyze far more data, even all of it, and no longer need to rely on random sampling. Second, there is so much data that we can give up the pursuit of exactness and tolerate messiness. Third, with the data to support us, it is no longer necessary to know why, only what: a shift from causation to correlation. These three conclusions are earth-shattering. They imply a radical change in the way people understand and shape society, and they signal that the legitimacy of certain disciplines will face the toughest questioning in their history. In Viktor's view, under the full-data model "sample = all", and the social sciences may be the discipline most shaken. "This discipline used to rely heavily on sample analysis, research and questionnaires. When people are recorded in their normal state, there is no need to worry about the bias introduced by research and questionnaires. Now we can collect information that we could not collect in the past, whether relationships revealed through mobile phones or through Twitter messages. More importantly, we no longer depend on sample surveys." Viktor is not alone in this view. A similar argument appears in Albert-László Barabási's "Bursts", where it is put even more starkly: "93% of human behavior is predictable through the analysis of big data and power-law distributions."
Beyond the change in thinking, the age of big data also triggers "business change" and "management change". In both parts Viktor cites many cases to support his arguments. Everything can be "quantified": text can become data, location can become data, communication can become data, everything can become data. Today's big data applications are only the tip of the iceberg, with most of the value still hidden beneath the surface. Data innovation includes reuse, recombination, extension, depreciation, data exhaust and openness. In addition, big data will determine the future competitiveness of enterprises, so data intermediaries and data scientists will emerge in response. People are already seeing this happen.
Amid the optimism, however, Viktor also calmly senses the fragility and unease on the eve of the big data era, including problems with the industry's ecosystem, data security and privacy, and fair and open access to information. He therefore warns the world to be wary of the ubiquitous "Third Eye" (another metaphor for "Big Brother") and of the existence of data dictators. On this basis he proposes a framework of information management that combines responsibility with freedom to meet the arrival of the big data era. Its methods include: protecting personal privacy by moving from individual consent to making data users bear responsibility; individual agency versus predictive analysis; breaking open the black box and the rise of big data programmers; and opposing data monopolists.
In this easy-to-read big data classic, Viktor gives us a panoramic picture of a future in which big data reshapes life, work and ways of thinking. Big data touches everything, and it is bound to restructure both the physical world and the perspective from which we see the world. Once we are immersed in the torrent of data, everything can be quantified, analyzed and predicted without asking about causes, and that marks the point at which the "information society" finally deserves its name. Thinking back to "Information Housekeeper", the project itself forced me to pay attention to data and to let the data "speak". In the era of big data this is not only what individual entrepreneurs need but also what every industry in society needs. Small signs reveal large trends: this is the future, and it is already happening!
(Responsible editor: Schpeppen)