Big data in the United States is in full swing. Government departments, IT enterprises, retailers, traditional industries such as health care, and Internet, software, and hardware companies are all showing what big data can bring. Even though this is considered an "early stage" in the US, the big data age has already shaken every aspect of American society, from business and technology to health care, government, education, economics, and the humanities.
Given this importance and this explosive backdrop, scientists and academics have even predicted that big data, as a technology and a concept, is likely to avoid falling victim to Silicon Valley's notorious "technology maturity curve." According to this curve, a new technology is born and hyped by the news media and academic conferences, then sinks into a trough in which many startups become precarious, and only takes off again once development reaches a certain stage. Cloud computing has gone through this curve and is still on it, but big data may well "escape."
One reason is that in the United States today the concept of "big data" means far more than large volumes of data (terabytes) and the technology to handle them, or the simple notion of the so-called "4 Vs." It covers what people can do on the basis of large-scale data that cannot be achieved with small-scale data.
So change is unavoidable. Mastery of big data can be turned into a source of economic value.
Viktor Mayer-Schönberger, author of "The Big Data Age," came to China late last month to explain to industry and media figures the basis for his judgments about the value shifts the big data age may bring, and the conclusions he draws. Answering a reporter's question, he said that the business models we discuss today are patterns from before the big data age, so looking for patterns in existing models is problematic. We need new thinking to measure everything, including new business models and the relationships among enterprises, society, government, and business.
The scientific and social value of big data lies here.
Politics and government transformed
One story told in the media, slightly exaggerated but largely true, is that a data analysis and mining team of several dozen people was critical to Obama's successful re-election as US president this year.
The team already existed and played a role in Obama's 2008 campaign. This time it was more than five times its previous size and carried out broader and deeper data mining, which helped the campaign lock in likely voters, place advertising, and raise funds. As it turned out, Obama raised nearly twice as much from ordinary citizens as his rival Mitt Romney. According to one survey, 98% of the first $100 million the Obama team raised came from small donations of less than $250; for the same amount raised by the Romney team, the proportion was only 31%.
In the words of campaign spokesman Ben LaBolt, the Obama team had the "nuclear codes": data was the most fundamental advantage that could defeat Mitt Romney. A more extreme argument holds that Obama won not on the economy, foreign policy, or women's issues, but on big data.
Of course, this argument is far-fetched, but it does show how much science and technology now influence American politics and politicians. Xu Zixuan, author of "The Coming Data Revolution," told a recent salon that many US politicians attach great importance to social networks and hope to benefit from data mining and data analysis. The challenges and changes that data innovation brings to citizens, government, and society have already taken deep root in the public mind.
But social media analysis is only the tip of the big data iceberg.
In the currently accepted view, the core areas of competition in the big data value chain are the data itself, skills, and mindset. Social media analysis can be regarded as a subfield at the skills level of data analysis, and as a new variant of traditional data mining.
The United States, a leader in the data field, has made great strides on all three fronts. The changes at government level are so evident that the value of data has been raised to the level of national strategy: in March 2012 the Obama administration announced the "Big Data Research and Development Initiative." The White House statement said that by improving the ability to extract knowledge and insights from large and complex digital data sets, the initiative promises to help accelerate the pace of science and engineering, strengthen national security, and transform teaching and research.
Under this initiative, six federal departments and agencies, including the National Science Foundation, the National Institutes of Health, the Department of Energy, the Department of Defense, the Defense Advanced Research Projects Agency, and the U.S. Geological Survey, announced a $200 million investment to improve the ability to access, organize, and glean discoveries from large volumes of digital data. The federal government also set out its ongoing plans for addressing the opportunities and challenges of big data, and said it would work with industry, university researchers, non-profit organizations, and managers to make the most of the opportunities big data creates.
Among these efforts, the international 1000 Genomes Project, launched with the National Institutes of Health, will create a data set on human genetic variation that researchers can access and use for free. The National Science Foundation and the National Institutes of Health will issue a joint solicitation on big data to advance core scientific and technological tools, improving the ability to extract important information from a wide range of large data sets and to manage, analyze, and visualize it effectively. The Pentagon plans to invest around $250 million a year in a series of research projects across the military departments, aimed at using massive data in innovative ways and strengthening data-driven decision making by combining perception, cognition, and decision support. The U.S. Department of Energy will spend $25 million to establish the Scalable Data Management, Analysis, and Visualization (SDAV) Institute to help scientists manage data effectively and advance its biological and environmental research programs, the U.S. nuclear data program, and other research ...
As a result of Obama's commitment to open government, some 400,000 raw federal data sets have been opened to the public through Data.gov since 2009. Data.gov has also announced a new open-source "open government platform" to manage data, with the code available to developers in every country. Seen this way, big data has become the point where national innovation strategy, national security strategy, national ICT industry development strategy, and national information and network security strategy intersect, and a core area of each.
Of course, for now it is something of an exaggeration to say that big data has changed American politics or government, but on another level the US government's shift to open data services is at the forefront of the world.
Industries and sectors transformed
The value of big data can be recognized and mined only on one premise: datafication. Datafication is not the same as digitization. Digitization simply converts analog data into binary code so that computers can store and analyze it, whereas datafication is the process of turning the phenomena of daily life, production, and commerce into a quantified form that can be tabulated and analyzed.
It is this process that gives big data its power to transform every industry, because it brings a capability unique to the big data age: analyzing massive amounts of data to obtain valuable products, services, or insights in ways that were not possible before.
Sun Huihui, director of the CAS Institute of Computing Technology, said: "Big data is likely to become a new industry in the future, and it extends beyond the Internet industry. It is not only on the network; biological genes are big data too, and the genetic data of every species will yield a great deal of academic and commercial value." Such a statement is not without basis.
Judging from cases in the US market, the Internet industry, business intelligence and consulting services, and retail have benefited the most, but medicine, health care, transportation, logistics, and even biotechnology and astronomy have begun to recognize the value of big data. Big data applications have sprung up across industries in the United States.
In the Internet industry, Yahoo began using big data technology in early 2008 and analyzes more than 200 PB of data every day, making its services more user-friendly and bringing them closer to users and customers. The technology works with every part of Yahoo's IT systems, including search, advertising, user experience, and fraud detection. Amazon, to gain a deeper understanding of each user, not only draws information from purchase behavior but also records everything users do on its site. Effective analysis of these data gives Amazon a full picture of customers' purchasing behavior and preferences, which greatly benefits its product selection, inventory, warehousing, logistics, and advertising businesses.
Health-care applications are also booming. Steve Jobs drew on big data to support his cancer treatment; smartphone apps are used to monitor patients' tremors; the Danish Cancer Society has used big data to study whether cell phone use causes cancer; and companies like Microsoft analyze patient hospitalization rates. The most famous case comes from Google. A few weeks before the 2009 swine flu outbreak, engineers at the Internet giant published a striking paper in the journal Nature. It surprised public health officials and computer scientists by showing that Google, like the CDC, could tell where flu was spreading, and could do so almost immediately rather than a week or two after an outbreak, as the CDC did. To identify from people's online searches whether flu was spreading, Google compared the 50 million most frequently searched terms with U.S. CDC data on seasonal flu transmission from 2003 to 2008. After the data were run through mathematical models, its predictions correlated with the official figures by as much as 97%. So when swine flu broke out in 2009, Google proved a more effective and timely indicator than the habitually lagging official data, and public health officials obtained very valuable information.
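The core idea in the flu example, scoring each search term by how closely its weekly frequency tracks the official flu rate, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `pearson` and `rank_terms`, the three search terms, and all the numbers are made up for the example, and plain Pearson correlation stands in for whatever mathematical models Google actually used.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no signal
    return cov / sqrt(var_x * var_y)


def rank_terms(term_series, official_rate):
    """Sort search terms by correlation with the official rate, best first."""
    scores = {term: pearson(series, official_rate)
              for term, series in term_series.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical weekly data (NOT real CDC or Google figures).
official_rate = [1.0, 1.5, 2.5, 4.0, 3.0, 1.5]
term_series = {
    "flu symptoms": [10, 15, 25, 40, 30, 15],   # tracks the official rate exactly
    "cough medicine": [8, 9, 14, 20, 18, 9],    # tracks it loosely
    "movie tickets": [30, 28, 31, 29, 30, 32],  # unrelated
}

ranked = rank_terms(term_series, official_rate)
for term, score in ranked:
    print(f"{term}: {score:.3f}")
```

In this toy data the term that mirrors the official series ranks first and the unrelated term ranks last; a real system would then combine the best-scoring terms into a predictive model and validate it against held-out seasons.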
The retail industry also deserves mention. Giants such as Wal-Mart and the UK's Tesco have reaped huge benefits from data and consolidated their long-standing positions in the industry. Take Tesco, which has become as much a textbook case as the "teenage pregnancy" story: the world's second-largest retailer knows exactly which "category" each customer belongs to, such as fast-food eaters, singles, or families with school-age children, and bases a series of business activities on these categories. Promotions sent to customers by email or letter can be highly personalized, and store shelves and promotions can be tailored to the preferences and shopping times of people nearby, which speeds up the flow of goods. The reward for Tesco is lucrative: the approach saves it £350 million a year on marketing promotions alone.
In the energy industry, the SaaS company Opower uses data to improve the efficiency of household electricity use and has achieved notable success. Working with a number of power companies, Opower analyzes US households' electricity costs and compares them with those of surrounding neighbors. Participating households receive a report each month showing how their electricity use compares with that of the whole neighborhood or of similar households, which encourages them to save electricity. Opower's services reportedly cover millions of US households and are expected to save US consumers $500 million a year on electricity.
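The neighbor comparison at the heart of such a report is simple arithmetic, and a minimal Python sketch makes that concrete. Everything here is an assumption made for illustration: the function name `comparison_report`, the report fields, and the kilowatt-hour figures are hypothetical and do not describe Opower's actual system.

```python
from statistics import mean


def comparison_report(household_kwh, neighbor_kwh):
    """Compare one household's monthly usage with the neighborhood average."""
    average = mean(neighbor_kwh)
    difference_pct = (household_kwh - average) / average * 100
    if difference_pct > 0:
        verdict = "above"
    elif difference_pct < 0:
        verdict = "below"
    else:
        verdict = "equal to"
    return {
        "neighbor_average_kwh": average,
        "difference_pct": round(difference_pct, 1),
        "verdict": verdict,
    }


# Hypothetical monthly usage in kWh for one household and three neighbors.
report = comparison_report(600, [450, 500, 550])
print(f"Your usage is {abs(report['difference_pct'])}% "
      f"{report['verdict']} the neighborhood average.")
```

The design point is that the report gives households a relative benchmark (their neighbors) rather than an absolute number, which is what the monthly comparison relies on to encourage savings.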
Most important of all is the bioinformatics industry. Bioinformatics is the fastest-growing industry after the Internet, and the data it produces will far exceed what the Internet generates. Humans created the virtual world out of 0 and 1; the Creator made living things out of four letters, A, C, T, and G, and the mysteries of life, development, and extinction all lie within them. As sequencing technology has advanced, the price of whole-genome sequencing has fallen from hundreds of millions of dollars 10 years ago to a few thousand dollars today, making it possible for more people and species to have their DNA read. Access to an individual's whole genome makes personalized diagnosis and treatment possible. In the big data age everything is possible, and all of these changes are already being lived through.
Values and thinking reshaped
In fact, the information revolution has been under way since the end of the 20th century, but its focus has been on the technology; the big data age lets us begin to focus on the information itself.
Data has always been labeled "accurate," but Viktor Mayer-Schönberger argues: "The obsession with exactness is a product of the information-deprived analog era. Only 5% of data is structured and fits a traditional database. If we do not accept messiness, the remaining 95% of unstructured data cannot be used; only by accepting inexactness can we open a window onto a world we have never set foot in."
In other words, in an era when the full data set stands in for the sample, a simple algorithm on big data works better than a complex algorithm on small data. Google's translation system is well regarded, yet it does not rely on three million carefully translated sentences, as IBM's heavily funded Candide system did. Instead it draws on tens of billions of pages of translated documents of uneven quality in many languages. It treats language as data from which probabilities can be judged, not as language itself. The example shows that we no longer need to worry about the harm a single data point might do to the whole analysis; we can accept messy data and benefit from it, rather than eliminate every uncertainty at high cost.
The scientific and social value of big data lies here. On the one hand, mastery of big data can be turned into a source of economic value. On the other hand, the big data age has shaken every aspect of the world, from business and technology to health care, government, education, economics, the humanities, and other areas of society. To take the simplest examples, Amazon can recommend the books we want, Google can rank relevant websites, Facebook knows our preferences, and LinkedIn can guess whom we know. The same techniques can of course also be used to diagnose diseases, recommend treatments, and even identify potential criminals.
A better analogy is this: "If the 20th century was the era in which oil was king, the 21st century is the era in which data is king; data may be worth to the 21st century what oil was to the 20th." It is worth noting that current Internet-oriented technologies and services for processing and mining big data are still far from sufficient. In the future, more valuable data will be unearthed from the vast volumes available, giving rise to many new forms of business, new companies, and new services.
But the transformative power of big data goes further. Its core proposition is that big data provides only a reference answer, not a final one. Because it gives up the search for causation and focuses only on correlation (knowing what, without knowing why), it overturns habits of thought that date back to ancient times, so our understanding of reality and the basis of our decisions will be fundamentally challenged. In this sense, big data resembles the invention of the Internet: it is not only a revolution in information technology but also a powerful tool for opening up government, accelerating innovation in business, and leading social change.
From this perspective, new ways of thinking and changes in management are inevitable, and data-driven businesses and governments are becoming possible.
This is the panorama of the big data boom in the United States: a major change in how people live, work, and think is under way.