Never in human history has any era seen such an explosion of data. Like the birth of the Internet before it, the wave of big data is now surging and has become a powerful tool for opening up government, accelerating corporate innovation, and driving social change around the world.
Earlier this year, U.S. President Barack Obama announced a 200 million dollar investment in big data, and the U.S. government described data as "the new oil of the future".
Big data technology is the ability to quickly extract valuable information from many types of data. Mastering this technology has become a new competitive advantage and a new kind of economic asset. In business, it helps enterprises sail into blue oceans, but it is not the exclusive preserve of enterprises: in government, the application of big data is the key to building an efficient, service-oriented government.
"Statistics has developed very quickly in recent years and will be used more widely in all areas of society in the future. The era of big data is coming," Hu Shanqing told a Fulcrum reporter. Hu Shanqing, a visiting professor at George Washington University in the United States, served as senior advisor to the U.S. Bureau of Statistics and the Department of Commerce from 2004 to 2012.
Hu Shanqing, who immigrated to the United States as a child, had a natural interest in statistics. After earning his PhD in mathematical statistics at George Washington University, he went to work for the U.S. government. He was appointed the first National Ombudsman for the Department of Energy in 2000, having previously served as deputy director of the Department of Agriculture's Civil Rights Division, where he managed information technology and complaint matters. Today, Hu Shanqing, a prominent statistician, is also chairman of the research committee of the Committee of 100.
As China has become a fast-growing economic power, its statistics have become increasingly valued and carry great influence in the world. "I hope to have more opportunities to visit China and broaden my horizons," Hu Shanqing said. His interest lies in using his academic experience to support innovation opportunities in the academic community. "China has a very broad market for big data applications," he said.
The following is the dialogue between a Fulcrum journalist and Dr. Hu Shanqing.
The end of the traditional statistical model
Fulcrum: What are the limitations of traditional statistical methods in the age of data explosion?
Hu Shanqing: In the last century, countries measured and drew inferences about their populations and economies mainly through the traditional census and random sampling, methods that have been very important to each country's policy making and dissemination of information.
But as far as the census is concerned, although it has proved its importance over many centuries, it does have some well-known practical weaknesses. Because human activity is continuous and dynamic, the census can only provide a fairly comprehensive snapshot of a given census day or a short period, and still more time is then spent on processing the data, analyzing it and reporting the results. Usually, by the time the census results are announced, they are already out of date.
The complexity of the census in China is hard to imagine. Collecting the sample data requires interviewing 1.5 million people across 31 provinces, 4,800 villages, 4,420 townships and 2,133 urban districts.
At the same time, most countries, even developed ones, face stringent budget constraints. The high cost and low response rates of current censuses and surveys rule out introducing new ones or expanding existing practice. The problem has been compounded by the worldwide decline in census and survey response rates. In the United States, for example, the participation rate in the 2010 census was only 74%, comparable to that of 2000, despite many plans and efforts. Once personal interviews were needed, the average cost of the census rose to 56 dollars per household, more than 100 times the initial mailing cost.
In the era of the data explosion, the real challenge facing national statistical offices is daunting: the 20th-century statistical system does not meet 21st-century demand. Internet users who rely on government statistics are rapidly growing in number and diversity. They need broader, more dynamic and more timely data that is easy to access and understand, but the resources and time that existing methodologies require are either unavailable or unaffordable.
Fulcrum: How has the statistical system changed in the 21st century compared with earlier ones? What changes has "big data" brought to government work and production?
Hu Shanqing: According to a study by the University of Southern California, the volume of information stored electronically worldwide surpassed the volume stored non-electronically for the first time in 2002. By 2007, at least 94% of all information on earth was stored electronically. As a result, data can be fed entirely electronically into machine processing and computation, without having to resort to sampling.
The rapid development of electronic storage has also brought changes to statistical systems and methods in the 21st century, and has made the study of longitudinal data possible. Longitudinal data consists of repeated observations of the same unit (e.g., a worker, a student, a family, a business, a school or a hospital) over time. It can provide unique baseline and change measurements at the individual level.
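The idea of longitudinal data can be illustrated with a minimal Python sketch. It is not taken from the interview: the worker names, years, income figures and the function name `baseline_and_change` are all hypothetical, chosen only to show how repeated observations of the same unit yield a baseline and a measure of change for each individual.

```python
from collections import defaultdict

# Hypothetical repeated observations: (unit_id, year, annual_income).
# All identifiers and figures here are invented for illustration.
observations = [
    ("worker_a", 2010, 30000),
    ("worker_b", 2010, 42000),
    ("worker_a", 2011, 31500),
    ("worker_b", 2011, 41000),
    ("worker_a", 2012, 33000),
]


def baseline_and_change(records):
    """Group records by unit; return each unit's first observed value
    (the baseline) and the change from the first to the last period."""
    by_unit = defaultdict(list)
    for unit_id, period, value in records:
        by_unit[unit_id].append((period, value))

    summary = {}
    for unit_id, series in by_unit.items():
        series.sort()  # order each unit's observations by period
        baseline = series[0][1]
        latest = series[-1][1]
        summary[unit_id] = {"baseline": baseline, "change": latest - baseline}
    return summary


summary = baseline_and_change(observations)
for unit_id, stats in sorted(summary.items()):
    print(unit_id, stats)
```

A one-off cross-section would show only that the two workers earn different amounts in a given year. Following the same workers over time shows that one worker's income rose while the other's fell.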
Big data is a new term for very large volumes of electronic data, which are often not collected according to the structural and probabilistic principles of traditional statistical systems. Administrative records, social media, barcode and radio-frequency scanners, transport sensors, energy and environment monitors, online transactions, streaming video and satellite imagery are all sources of big data and drivers of its explosive growth.
The private sector has taken the lead in producing big data, combining it with government statistics and developing data mining techniques and methods to identify potential consumers, expand markets, test new products, and extract new information for other market and customer research. In some cases, it can even challenge traditional government functions. For example, some social media search terms have been used to track the spread of colds and flu, with results that are no worse than those of the public health authorities and more timely.
While government statistics are shrinking in relative size within the vast ocean of data, they remain uniquely important for supporting a globalized economic system and addressing expanding social needs. However, at a time when a web search returns millions of results within seconds and international stock markets report transaction data day and night, government statistics still take months or even years to collect, process and publish static results that are limited by geography, industry and population.
(Responsible editor: The good of the Legacy)