Big data moves from "concept" to "value"; recommendation and forecasting based on big data gradually become popular; data science rises; security and privacy become important issues; the big data industry becomes a strategic industry. These are among the ten major trends for big data in 2014 predicted by the Big Data Expert Committee of the China Computer Federation (CCF). The forecast also covers data commercialization and data sharing alliances, the development of the big data ecosystem, and more. The committee further predicts that in 2014 compelling applications will appear in the Internet and e-commerce, finance (stock market forecasting, financial analysis), health care (epidemic monitoring and forecasting), bioinformatics, pharmaceuticals and other fields. In 2013, big data was already in use in medicine, finance, e-commerce and urban management.
"Big data" warmed up in 2012 and by 2013 was being talked about in every sector, amid mixed opinions: some see it as an opportunity, while others think it will turn out to be a "bubble". What problems will big data face in 2014?
Open data is still a big problem
The premise of data application is that the data is open, and on this there is already a consensus. Wu Hequan, an academician of the Chinese Academy of Engineering and chairman of the Internet Society of China, pointed out that although China has the world's largest population, the new data it stored in 2010 amounted to 250 PB, only 60% of Japan's total and 7% of North America's. At present, some departments and institutions in China hold large amounts of data but would rather keep it to themselves than share it with other relevant departments, which results in incomplete information or duplicated investment. China's data storage reached 64 EB in 2012; 55% of that data needs some level of protection, but less than half of the data is currently protected.
On December 14 last year, at the China Computer Federation Young Computer Scientists and Engineers Forum (YOCSEF), Sun Jiulin, a researcher at the Institute of Geographic Sciences and Natural Resources Research of the Chinese Academy of Sciences and an academician of the Chinese Academy of Engineering, reviewed the course of China's opening and sharing of scientific data. In 2003, the Ministry of Science and Technology, with the support of the Ministry of Finance, set up a special program for building science and technology infrastructure platforms, and the scientific data sharing project was included as an important part of it. In 2008, the Ministry of Science and Technology issued management measures for data exchange in the resources and environment field of the 973 Program, and the work entered the operational service stage. In 2011, the first batch of national science and technology infrastructure platforms was accredited.
Sun Jiulin described the U.S. approach to data openness. The U.S. government provides policy and funding guarantees so that its cluster of data and information centers becomes the national base for information production and services, ensures a continuous supply of data, and uses the network to deliver data and information promptly to the desks and homes of all citizens, including scientists, government employees, company staff, and school teachers and students. The aim is to bring the whole of society into the information age.
"Let every citizen apply their talents at each link in the chain of data, information, knowledge, theory, decision-making and results, so that the many kinds of value in data are fully tapped as it flows and is applied; the state, for its part, lays good roads, provides services and creates a good environment for tapping that talent and value." Sun Jiulin said that this is the "cycle" path the U.S. government has chosen for sharing data and information. In terms of how benefits are distributed, the basic point of this idea is to benefit the whole of society and the entire country.
At present, China has no national-level law that specifically addresses data sharing, only related rules, regulations, charters, opinions and the like.
Data sharing sits at the front end of big data utilization. Sun Jiulin said that more than ten years of work on data sharing has achieved a great deal, and in particular that the idea of sharing is now accepted across society. The remaining problems are nonetheless prominent: there is no national-level policy, and the scattered opinions that do exist are not binding enough; senior managers' understanding of the deeper significance of open data sharing needs to improve; the existing national data sharing platforms struggle to meet the demand for data resources from national development and from scientific and technological innovation; there are no dedicated teams for open data sharing, nor the corresponding data professionals and managers; and there is no reasonable evaluation mechanism or standard for full-time data sharing service personnel.
A macro-level "national big data strategy" is urgently needed
"Do not be misled by the 'big' in big data; it is more about data mining than about sheer size." Speaking at the tenth National Forum of Information Technology Experts, Wu Hequan pointed out that big data calls for greater emphasis on data mining and utilization, and that the key is to have a national big data strategy.
Wu Hequan proposed that China formulate a national big data development strategy. Big data is a strongly application-driven service whose standards and industrial structure have not yet taken shape, which gives China an opportunity for leapfrog development. But China should not rush to build big data centers everywhere without a clear purpose, turning them into "data real estate". It must attach importance to the development and utilization of big data and treat it as an effective way to change the mode of economic growth. At the same time, China needs to enact an "Information Protection Law" and an "Information Disclosure Law" as soon as possible, so as both to encourage data mining that serves society and to prevent infringement of privacy, and both to promote data sharing and to prevent data abuse.
The expert committee of the China Computer Federation pointed out that two factors in the big data era strongly favor the development of China's information industry. First, big data technology is largely open source, and no technology monopoly has yet formed. Second, the size of China's population and economy means that its data assets will be the largest in the world. The government, academia, industry and capital markets should therefore work together to unlock as much of the value of big data assets as possible, on the premise of ensuring national data security.
A number of enterprises have already begun building businesses on data. Abroad there are many companies that provide data services, data analysis and visualization; some have achieved good results, and some have prospects good enough that they have turned down acquisition offers from large companies. Some predict that a domestic Internet entrepreneur who can sniff out clues in the vast amount of "garbage" information and find a point of entry may become an industry leader. It is still hard to find a decent "big data" startup in China, but some see that very gap as the opportunity.
Big data talent is in short supply in every country
Big data talent is without doubt in short supply. The consultancy Gartner predicts that big data will bring 4.4 million new IT jobs and thousands of non-IT jobs worldwide. McKinsey predicts that by 2018 the United States will face a shortfall of 140,000 to 190,000 people with deep data analysis skills, and a shortfall of 1.5 million technical and managerial staff who can analyze data to help companies gain economic benefits. In China, too, innovative talent able to understand and apply big data is a scarce resource.
IDC (International Data Corporation) has released a forecast that the market for big data technology and services will grow to US$32.4 billion by 2017, a compound annual growth rate of 27%. It also predicts that decision-making solutions based on big data will begin to replace or influence the role of knowledge workers, which is bound to trigger a shift in the talent market.
Faced with this shortage of big data professionals, how are countries training data scientists and data engineers? The "2013 White Paper on the Development of China's Big Data Technology and Industry", which the Big Data Expert Committee of the China Computer Federation spent more than half a year compiling, specifically reviews the state of big data talent training.
In China, the Chinese University of Hong Kong has offered a Master of Science in Data Science and Business Statistics since 2008. Fudan University has run a data science seminar since 2007, began recruiting Ph.D. students in data science in 2010, and has offered data science courses since 2013. Beijing University of Aeronautics and Astronautics established a master's degree in data engineering in 2012.
In the United States, the University of California, Berkeley, has offered "Introduction to Data Science" since 2011, and the University of Illinois at Urbana-Champaign has held a "Summer Institute of Data Science" since 2011. There are also plans to establish a master's degree from 2014 and a doctorate from 2015, and a master's degree in "Data Science" has been offered since the fall of 2013. In the UK, the University of Dundee has offered a Master of Science in Data Science since 2013.
The Big Data Expert Committee believes that, judging from current talent training in various countries, data scientists should master mathematics, statistics, data analysis, business analysis, natural language processing and other disciplines, have a broad base of knowledge, and be able to acquire information independently. The curriculum at Fudan University emphasizes that a data scientist is a scientist who studies data, not merely a data engineer or data analyst.