Data was once a counting tool with which humans made sense of their environment, and concern for its accuracy seemed limited to scientific research. In the information age, each of us clearly senses that data are everywhere. It can be said that while we constantly produce all kinds of data, those data also greatly affect us.
Two data revolutions: the fusion of data and scientific research
There have been two revolutions in the development of data. The first data revolution came with the birth of modern science, which integrated data with scientific research and established the fundamental position of data in that research. Demanding precision of the research process and its results is one of the basic characteristics of modern science. In the data-based research paradigm, the reliability and accuracy of the data stand for the accuracy of the study, and whether research is empirically grounded in data is even used as a criterion for distinguishing "science" from "pseudoscience".
With the development of science and technology, the form and meaning of data keep changing. In addition to observational, experimental, theoretical, and statistical data, simulation data, drawings, tables, and text are now counted as data, so that data takes diverse forms, both structured and unstructured. The development of information technology has shifted research from a shortage of data toward a "data-rich, theory-poor" condition. The speed and scale of data production are growing rapidly, and the information contained in data goes far beyond its role as an instrument and as evidence, forming big data from which new knowledge can be mined. Compared with statistical data, big data emphasizes the full population rather than a sample; compared with the precision demanded of scientific data, big data tolerates a certain range of inaccuracy; compared with the causal relationships sought by the scientific paradigm, big data seeks the laws of nature and society through correlation. Big data has therefore triggered the second data revolution. It not only changes the paradigm of scientific research and makes social science research quantitative, but also drives great changes in the economy, society, the military, and other fields.
Big data promotes quantitative research in the social sciences
In scientific research, data-intensive material is obtained through remote sensing devices, sensors, computerized data collection, or simulation, and is processed by software; the resulting information and knowledge are stored on computers. Scientists then work only in the background, using data management and statistical methods to process and analyze the data and to obtain knowledge. This data-intensive science built on big data has become the fourth scientific paradigm, the data-driven science proposed by Gray. As the journal EPJ Data Science points out, the data-driven science of the 21st century has become a complement to the traditional hypothesis-driven scientific approach, and it accompanies the shift of the scientific paradigm from reductionism (simplification) to the science of complex systems.
Big data may bring about a revolution in social science research and deepen quantitative research. Big data breaks through the boundary between the natural and social sciences, makes data accessible, and connects the resources of different disciplines through data. Dr. Watts of Columbia University has found that big data plays an extremely important role in the sociological study of highly complex human behavior: the activity of large numbers of individuals and small organizations is recorded as data, which provides extremely rich and reliable information for research on human behavior and lets researchers avoid cognitive bias, errors of perception, and ambiguity of framing.
The impact of big data on the economy, society, and daily life is not limited to the technical level; it also strongly affects management concepts and modes of operation. "Data-driven social management" is a new model of social management. For governments and organizations alike, collecting and analyzing data has become a basic requirement of grass-roots management, and policies and regulations are formulated according to the results of data analysis. Social management thereby shifts from after-the-fact punishment to advance prevention, and this model plays an important role in health care, homeland security, the construction of smart cities, preventing and combating terrorist activities, maintaining public order, and curbing corruption. CompStat, a policing model introduced in the United States in the 20th century, is a successful example of using big data to manage public security with good results. By using big data collected from local sensors and from keyword searches on the internet, disease control departments can predict and identify the outbreak of an epidemic in a given area. Business intelligence has made the leap from data to knowledge, and "decision support systems" take data and information as their main source.
Whether for "data-driven social management" or for "decision support systems", data acquisition and data mining are of vital importance: data are collected, analyzed in the background, and modeled, and cloud computing and other computing methods provide technical support for formulating policies, laws, and decisions. Countries have come to see big data as a resource as important as energy. On March 29, 2012, the White House Office of Science and Technology Policy, on behalf of the United States government, released the Big Data Research and Development Initiative and set up a "Big Data Senior Steering Group", raising the opportunities and challenges of the big data technology revolution to the level of national strategy.
The pressing need for big data sharing and standardized management
Compared with traditional data, big data stands out especially as a resource, and this is the basis for developing it. In the evolution of knowledge, data is not only the foundation of information, knowledge, and wisdom, but also runs through all three. In the information age, the biggest problem is not a lack of information but isolated "information islands"; only by sharing big data and managing it to common standards can this problem be solved.
Supported by mobile networks, cloud computing, and other technologies, and driven by the rapid development of big data and innovation in data analysis, the second data revolution has arrived quietly. Like any new technology, big data has also created social risks, such as threats to personal privacy, doubts about the objectivity and accuracy of data, and the misuse of big data, across scientific research, social management, health care, business intelligence, and other fields.
Big data differs from other technologies in its virtual nature: its concealment and pervasiveness are more pronounced. This can have a negative impact on individuals, organizations, countries, and even the world at large. Deeper ethical and philosophical reflection on big data is therefore very important.
(Author's affiliation: College of Humanities, Chinese Academy of Sciences)
(Responsible editor: The good of the Legacy)