The age of big data has arrived. In fiscal 2014, the US government launched the Big Data to Knowledge program, a new round of basic research on how to make full use of biomedical big data, following the national Big Data program implemented in 2012. At present, developed countries are far ahead in biological big data technology and its applications. In China, biological big data is still at an early stage of development. How to catch up with this trend as quickly as possible, how to protect and manage biological big data effectively at the level of national sovereignty, and how to keep pace with the world in basic research and in technology and market applications have become unavoidable questions that deserve deep thought.
Who is the leader in biological big data technology?
The core motivation for the development of big data comes from the desire to record, measure and analyze the world. At present, the rapid development of high-throughput sequencing technology has given life science research a powerful capacity to generate data.
Professor Wang Yadong, Dean of the School of Computer Science and Technology at Harbin Institute of Technology, told the ScienceDaily reporter that in the 1990s scientists spent 10 years and nearly 3 billion dollars to obtain the first human genome map; today, sequencing a human genome takes less than a day and costs less than 1,000 dollars.
Since the completion of the Human Genome Project, the world's major developed countries, led by the United States, have launched a series of basic research projects in the life sciences, such as international human genome sequencing projects, the DNA encyclopedia program, and the United Kingdom's 100,000 Genomes Project. These projects have led to an explosion of biological data: annual global production of biological data has now reached the exabyte (EB) level. The life sciences are undergoing a data revolution and have, to some extent, become a big data science.
"This is just the beginning," Wang Yadong stressed. "With the wide application of sequencing technology in medicine, health, pharmaceuticals, the environment, energy and other related fields, human beings will face an ocean of biological data, which will become the source of innovation in these fields. Managing and applying these data will bring a new revolution to the life sciences and related industries."
Compared with the booming wave of biological big data innovation worldwide, China's development and application have fallen behind. "We are at least at the 30 level relative to the international technological frontier; the gap lies mainly in data analysis, data management and links to clinical application," said Li Yicho, a researcher and director of the Shanghai Institute of Biotechnology Research, who is deeply worried about this.
In Li Yicho's analysis, China is seriously deficient in four major respects. First, although China's existing capacity for analyzing biological big data is not inferior to that of the United States and Europe, its data analysis frameworks, software systems and advanced IT still need upgrading. Second, the leading talent in biological big data is abroad; although Chinese researchers also publish papers and results in top international journals, China has relatively few high-level teams overall. Third, Europe and the United States emphasize the application of results, producing an endless stream of analysis software that can be used in laboratory, clinical and industrial settings. Fourth, in theoretical research, standard setting and widespread application of biological big data, China urgently needs to catch up across the board.
Who will lead the market and resources for biological big data applications?
Developed countries began competing early over the effective management and utilization of biological big data. As early as the 1980s and 1990s, the United States, Europe and Japan established the world's three major biological data centers: the US National Center for Biotechnology Information (NCBI), the European Bioinformatics Institute (EBI) and the DNA Data Bank of Japan (DDBJ).
Wang Yadong stressed, "These three biological data centers hold and manage the world's biological data and knowledge resources, and are in a monopoly position."
The US National Institutes of Health (NIH) has established 8 national biological data technology research centers, which aim to develop biological data analysis technologies over the long term, improve the ability to use and translate biological data, and maintain a leading position. Wang Yadong further pointed out that the United States government has launched big data research programs twice in the past two years. The purpose is to focus on core technologies urgently needed in the biological field, such as the management, analysis and sharing of biological big data, to fundamentally improve the United States' ability to use biological big data, and to promote research and industrial development in biology.
In the commercial field, a market for biological big data applications is also emerging, and some companies have already begun to provide biological big data services. Google, for example, invested in DNAnexus, which provides biological data management and analytics services and took over NCBI data in 2011. As early as 2006, 23andMe started providing personal genomic data analysis services, which have served more than 500,000 people in total. The UK Department of Health established the company Genomics England (GeL) in 2013 to manage and analyze genomic data from the UK's 100,000 Genomes Project.
The BCC report states: "By 2018, the total size of the biological data market will grow to 7.6 billion US dollars, a compound annual growth rate of 71.6%." "If the health care industry in the United States uses big data effectively, it can reduce costs by about 8 percent, creating more than 300 billion dollars in value a year," the McKinsey report said.
Who will take control of China's biological big data sovereignty?
For a country, big data in important fields has become a strategic resource, and the ability to hold data and to use it will become an important mark of overall national strength.
China has the world's largest population and rich biological sample resources. This will soon make it a major producer of biological data, but at present it is not yet a power in the use of biological big data.
In fact, international biological data resources are concentrated in a few major data centers in Europe and the United States. Much of the biological data produced in China has to be submitted to these data centers, which leads to a serious loss of data generated with China's own investment and manpower.
In the field of biological big data, China lacks the system, mechanisms and environment needed to manage and use such data effectively at the national level. "This has seriously threatened China's digital sovereignty over biological data," Li Yicho said.
Wang Yadong also stressed, "The three major international biological data centers are now established in Europe and the United States, and are open to the world free of charge. Scientific research and market application development in our country benefit from these data centers, but are also heavily dependent on them and constrained by them."
Industry insiders pointed out that China has not yet set up a technology research center dedicated to biological big data. Technical research and development lacks macro-level planning and guidance, technical output is low, and it is difficult to build a complete biological big data technology system, so the data management and service needs arising from the development of biological big data cannot be met. At the same time, only a small number of universities and research institutes, such as Harbin Institute of Technology and the Shanghai Biological Information Technology Center, have set up professional research teams for biological big data, and the talent gap is large.
The use of big data has become a key element in improving productivity, innovation and competitiveness in all sectors of the country.
Experts noted that biological big data is a national strategic resource, and that its management and utilization should be elevated to the level of national will. They suggested considering the following measures: establishing a national biological big data center to safeguard China's digital sovereignty and to manage and make rational use of the country's strategic biological big data resources as a whole; achieving breakthroughs in the core technologies of biological big data, forming independent key technologies and system products, and breaking the technical restrictions imposed by the United States and Europe; building on existing academic and technical strengths to establish national biological big data research institutions, raising China's biological big data technology and service levels, and training professionals in the field; and emphasizing demand-driven application and policy support in order to accelerate the overall development of the biological big data industry.