Storing data in DNA is no longer a far-fetched idea. We previously covered Harvard's work on DNA storage, in which researchers stored 700 TB of data in a single gram of DNA. Research in the field keeps advancing: researchers at the European Molecular Biology Laboratory (EMBL) recently developed a new DNA storage method that overcomes the error-prone nature of reading and writing DNA and could preserve data for hundreds of years. According to the paper, published on January 23 in the journal Nature, storing more than 100 million hours of high-definition video this way would take only about a cup of DNA.
The world's digital information already totals about 3 ZB (3 zettabytes, roughly 3×10^21 bytes), and new data keeps pouring in, which makes long-term storage a hard problem. Hard drives are expensive and need a constant power supply, while even the best archival media that need no power, such as magnetic tape, degrade within a few decades. The problem is growing in the life sciences too, where large volumes of data, including DNA sequences, form an important part of the scientific record.
"We already know that DNA is a very stable medium for storing information, because we can extract it from mammoth bones that are tens of thousands of years old and still make sense of it," explains Nick Goldman of the European Molecular Biology Laboratory. "It is also incredibly small and dense, and it needs no power for storage, so it is easy to transport and keep."
Nick Goldman with the synthetic DNA. Image: EMBL
Reading DNA is relatively straightforward, but writing data into it accurately has remained a major obstacle to practical DNA storage. Researchers face two main difficulties: first, current methods can only manufacture short fragments of DNA; second, both writing and reading DNA are error-prone, especially when the same letter is repeated many times in a row. Nick Goldman and his colleague Ewan Birney came up with a way to overcome both problems.
"We knew we could only use short strings of DNA for the code, and that we had to avoid producing runs of the same letter," says Ewan Birney. "So we thought: let's break the code up into many overlapping fragments running in both directions, each carrying indexing information that shows where the fragment belongs in the overall code, and design a coding scheme that does not allow repeats. That way, a read fails only if the same error occurs on four different fragments, and that would be very rare."
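The idea Birney describes can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the researchers' actual implementation: the function names are hypothetical, the fixed six-trit-per-byte mapping is a simplification chosen here, and the fragment length of 100 bases with a step of 25 is an assumed setting picked so that each interior base appears in four fragments. Data is first turned into base-3 digits, and each digit then selects one of the three bases that differ from the previous base, so the same letter can never appear twice in a row.

```python
from collections import Counter

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def bytes_to_trits(data: bytes) -> list[int]:
    """Write each byte as six base-3 digits (3**6 = 729 >= 256). Assumed mapping."""
    trits = []
    for byte in data:
        digits = []
        for _ in range(6):
            digits.append(byte % 3)
            byte //= 3
        trits.extend(reversed(digits))
    return trits


def trits_to_bytes(trits: list[int]) -> bytes:
    """Inverse of bytes_to_trits."""
    out = bytearray()
    for i in range(0, len(trits), 6):
        value = 0
        for t in trits[i:i + 6]:
            value = value * 3 + t
        out.append(value)
    return bytes(out)


def trits_to_dna(trits: list[int], prev: str = "A") -> str:
    """Each trit picks one of the three bases that differ from the previous base."""
    out = []
    for t in trits:
        choices = [b for b in BASES if b != prev]
        prev = choices[t]
        out.append(prev)
    return "".join(out)


def dna_to_trits(dna: str, prev: str = "A") -> list[int]:
    """Inverse of trits_to_dna."""
    trits = []
    for base in dna:
        choices = [b for b in BASES if b != prev]
        trits.append(choices.index(base))
        prev = base
    return trits


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def make_fragments(dna: str, length: int = 100, step: int = 25):
    """Cut overlapping windows; every other one is stored reverse-complemented.

    Each fragment is (start index, reversed flag, sequence). The start index
    stands in for the indexing information mentioned in the article. The last
    fragment may be shorter than `length`.
    """
    frags = []
    starts = range(0, max(len(dna) - length, 0) + step, step)
    for k, start in enumerate(starts):
        piece = dna[start:start + length]
        reverse = k % 2 == 1
        frags.append((start, reverse, reverse_complement(piece) if reverse else piece))
    return frags


def reassemble(frags, total_len: int) -> str:
    """Rebuild the full sequence by majority vote at every position."""
    votes = [Counter() for _ in range(total_len)]
    for start, reverse, seq in frags:
        piece = reverse_complement(seq) if reverse else seq
        for offset, base in enumerate(piece):
            votes[start + offset][base] += 1
    return "".join(v.most_common(1)[0][0] for v in votes)
```

With a step equal to a quarter of the fragment length, a base in the interior of the sequence is covered by four different fragments, so a single misread fragment is outvoted by the other three during reassembly. Bases near the two ends are covered fewer times in this sketch.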
The new method requires synthesizing DNA from the encoded information, and Agilent Technologies, a California-based company, carried out the synthesis for the researchers. Birney and Goldman sent Agilent encoded versions of several files: an MP3 of Martin Luther King's "I Have a Dream" speech, a JPG photo of the European Molecular Biology Laboratory, a PDF of the groundbreaking paper "Molecular Structure of Nucleic Acids", a TXT file of Shakespeare's sonnets, and a file describing the encoding.
"We downloaded the files from the Internet and used them to synthesize hundreds of thousands of pieces of DNA. The result looks like a tiny speck of dust," says Dr. Emily Leproust of Agilent. She sent the sample back to the lab, where the researchers sequenced the DNA and decoded the files.
"We have created a highly fault-tolerant code in molecular form, DNA, that could last 10,000 years or more under the right conditions," says Goldman. "As long as someone knows what the code is, and has a machine that can read DNA, they will be able to read it back."
Although many practical problems remain to be solved, the high density and durability of DNA make it a very attractive storage medium. Next, the researchers plan to refine the coding scheme and work through the practical issues, paving the way for commercial DNA storage.