The age of big data is profoundly affecting biomedical research: massive amounts of data need to be shared and analyzed across different systems and institutions, but the lack of a unified standard has left researchers at a loss.
In the face of this flood of data, how can information technology and biomedicine work together more effectively to meet their common challenges?
Big Data Age
In 2012, the United States government launched the Big Data Research and Development Initiative, which aims to use large, complex data sets to acquire knowledge and enhance insight, with funding of $200 million.
So-called big data, or massive data, refers to data sets so large that current mainstream software tools cannot capture, manage, process, and organize them within a reasonable time into information that supports more proactive decision-making.
Not long ago, at the second "Walking with Masters" academic exchange event, themed "Information Technology and Future Medicine", internationally renowned scholars from Yale University, the Broad Institute of MIT and Harvard, Lawrence Berkeley National Laboratory, and the Chinese Academy of Engineering discussed the impact of big data on biomedicine, the problems of standardization in biomedical research in the big data age, and the shortage of interdisciplinary talent.
Wei Yu, an academician of the Chinese Academy of Engineering, said: "Biomedicine is entering the age of big data. Many studies now involve big data research, big data storage, and mining new information from big data."
For example, a doctor may need to retrieve a patient's genetic data, a large number of medical records, and so on.
Recently, Wang Jian, director of the Shenzhen Huada Genomics Institute, said that big data and big science are the core of future bio-economic development. "To solve the problems of life science, we need to interpret them in terms of space and time, which requires big data. Big data reveals big science, which in turn gives rise to big industries."
At the national gene bank in Shenzhen, for example, the sample count has reached 1.3 million, including 1.15 million human samples and 150,000 animal, plant, and microbial samples. It is expected to store 10 million traceable biological samples by the end of 2013 and 30 million by the end of 2015.
And this is just the tip of the iceberg of exploding big data.
The standardization dilemma surfaces
Many scientists find it difficult to achieve standardized data sharing and analysis across different systems and research institutions.
Wu Huihua, director of the Center for Bioinformatics and Computational Biology at the University of Delaware in the United States, said that this is the key problem in combining biomedicine with information science. Access to massive amounts of data is becoming more convenient, but institutions differ from one another in many ways, and a common standard is needed to bring this information together.
Take hospitals, which have the most urgent need for big data, as an example. Rubin, director of genome science at Lawrence Berkeley National Laboratory, said that the ideal goal is a unified electronic medical record system built on a unified standard. The reality is different: hospitals store data according to different standards, and different systems do not store the same information.
According to Wu Huihua's observations, institutions and databases in the United States and other countries currently produce and store data under different standards, and the industry has not yet reached a consensus on standardization.
As for why standardization is difficult, Rubin explained that the sheer volume of data is not the key issue; rather, the diversity of data types makes it hard to unify standards.
Genetic sequencing data, for example, is large in volume but uniform in type, so it is relatively easy to analyze under a single standard. Biomedical data is much harder: it covers blood pressure, heart rate, and many other kinds of clinical and numerical information, and some of these data are difficult to correlate, which creates the standardization challenge. Countries have now begun to pay attention to this issue, and scholars in information science and biomedicine need to cooperate more closely.
In Wu Huihua's view, Chinese scientists should actively join the discussion, design, and formulation of international standards, and take part in worldwide biomedical information sharing.
Interdisciplinary talent is exceedingly rare
Although standardization is difficult, participants generally agreed that the most urgent task is to address the shortage of interdisciplinary talent able to combine biomedicine with information science. Standardization and the other problems that arise in combining the two fields can only be resolved by researchers with deep expertise in both.
According to the experts, few colleges and universities have so far set up interdisciplinary departments spanning biomedicine and information science. Most people with expertise in both fields got there on their own initiative or under the guidance of a mentor.
Lin Haifan, director of stem cell research at Yale Medical School, was impressed by one of his students. The student had chosen of his own accord to focus on bioinformatics at a time when many teachers thought he was neglecting his proper work. In the end he chose to study information science, and he is now a rare talent in both biomedicine and information science.
"I have found that some students, although they chose to major in biology, actually have great mathematical talent. The director of our informatics department was trained in just this way," Lin Haifan said.
Wu Huihua is a typical example of this kind of interdisciplinary talent. Her education spans both biology and computer science: a Bachelor of Science degree from Taiwan University, a master's degree in plant pathology from Purdue University, and a second master's degree, in computer science, from the University of Texas at Tyler.
To promote multidisciplinary research and education, she founded the Center for Bioinformatics and Computational Biology (CBCB) at the University of Delaware in 2009, which comprises more than 60 faculty members from 5 colleges, and she has created or led a number of bioinformatics education projects.
The United States government is promoting interdisciplinary education in computer science and biology and, at the level of the National Science Center, is encouraging students to begin learning interdisciplinary knowledge as early as high school, said Mesirov, deputy director and chief information officer of the Broad Institute of MIT and Harvard.
This may offer lessons for China.
(Responsible editor: Fumingli)