Introduction to Nanopore Sequencing Technology

Source: Internet
Author: User

Nanopore sequencing has arrived.

Nanopore sequencing (also known as fourth-generation sequencing) is a sequencing technology that has emerged in recent years. Current read lengths can reach 150 kb. The technology dates from the 1990s and has gone through three major technical milestones: first, passing a single DNA molecule through a nanopore; second, using an enzyme at the nanopore to control the movement of the molecule being sequenced with single-nucleotide precision; third, controlling sequencing accuracy at the level of the single nucleotide. At present, the most widely accepted nanopore sequencing platform on the market is the MinION nanopore sequencer from Oxford Nanopore Technologies (ONT). Its characteristics are single-molecule sequencing, long reads (over 150 kb), fast sequencing, real-time monitoring of sequencing data, and portability. This review highlights the technical features and application areas of the MinION sequencer.


I. Brief introduction to MinION sequencing technology

The core of the MinION nanopore sequencer is a flow cell with 2,048 nanopores, divided into 512 groups and controlled by a dedicated integrated circuit. The sequencing principle is shown in Figure 1a. First, the double-stranded DNA is ligated to a lead adaptor (blue), a hairpin adaptor (red), and a trailing adaptor (brown). When sequencing begins, the lead adaptor guides the molecule into an enzyme-controlled nanopore. The lead adaptor is followed through the nanopore by the template read (the DNA strand to be sequenced); then the hairpin adaptor, which ensures that both strands of the duplex are sequenced; then the complement read (the complementary strand of the molecule being sequenced); and finally the trailing adaptor. In this method, the template read and the complement read pass through the nanopore one after the other and are combined by pairwise alignment into a 2D read. In the other sequencing method, no hairpin adaptor is used and only the template read is sequenced, which yields a 1D read. The 1D method has higher throughput, but its accuracy is lower than that of 2D reads. Each adaptor sequence produces a different change in current as it passes through the nanopore (Figure 1c), and these differences in current can be used to identify bases.
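The pairwise-alignment step that combines the two strands can be illustrated with a minimal sketch. It assumes both strands have already been base-called into plain strings and uses a textbook global alignment (Needleman-Wunsch) with made-up scoring values; the real 2D pipeline works on the current signal, so this is an illustration of the idea, not ONT's method. The sequences are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[b] for b in reversed(seq))


def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment; returns (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


def disagreements(aln_a, aln_b):
    """Alignment columns where the two strands disagree (candidate errors)."""
    return [i for i, (x, y) in enumerate(zip(aln_a, aln_b)) if x != y]


# Hypothetical data: the complement read is missing one base of the duplex.
template = "GATTACAGG"
complement = reverse_complement("GATACAGG")
aln_t, aln_c, aln_score = global_align(template, reverse_complement(complement))
print(aln_t)
print(aln_c)
print(disagreements(aln_t, aln_c))
```

Columns where the two strands agree support each other; columns where they disagree are the positions a consensus step would have to resolve, which is why a 2D read can be more accurate than either strand alone.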

II. Advantages of MinION compared with NGS sequencing platforms

1. Detection of base modifications

Nanopore sequencing can detect four cytosine base modifications: 5-methylcytosine, 5-hydroxymethylcytosine, 5-formylcytosine, and 5-carboxylcytosine. The detection accuracy is 92%-98%.

2. Real-time sequencing monitoring

In clinical practice, it is important to obtain and analyze DNA/RNA sequences in real time. This is very difficult with traditional NGS sequencing but relatively easy with MinION. The reason is not only that the MinION is small and easy to operate, but also that the current changes can be detected and identified while a single molecule is passing through the nanopore. This design lets users make decisions during the run based on real-time results.

An important application of real-time monitoring is MinION sequencing of specific target sequences (Figure 2). As a DNA fragment passes through a nanopore, if its current trace shows the same trend as that of the target sequence, the fragment is allowed to continue through the nanopore. If the fragment and the target sequence show different current trends, the fragment is not allowed to pass through the nanopore. This enriches for the target sequence, which significantly shortens sequencing time and matters for field use and point-of-care diagnosis.
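The accept-or-reject decision described above can be sketched as a comparison between the first part of a read's current trace and the trace expected for the target. The sketch below uses a plain Pearson correlation with an arbitrary threshold; the function names, the threshold, and the current values are all hypothetical, and published selective-sequencing methods use more robust signal matching than this.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant traces."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def decide(read_trace, target_trace, threshold=0.8):
    """Return 'sequence' if the read's trend matches the target, else 'eject'."""
    k = min(len(read_trace), len(target_trace))
    r = pearson(read_trace[:k], target_trace[:k])
    return "sequence" if r >= threshold else "eject"


# Hypothetical current levels (arbitrary units) for the start of each molecule.
target = [80, 95, 70, 110, 85, 100]
on_target_read = [83, 98, 73, 113, 88, 103]   # same trend, shifted baseline
off_target_read = [40, 25, 50, 10, 35, 20]    # opposite trend

print(decide(on_target_read, target))
print(decide(off_target_read, target))
```

Correlating trends rather than absolute values makes the decision insensitive to a shifted baseline, which is one reason to compare traces this way; the cost is that a short prefix can match by chance, so the threshold and prefix length would need tuning on real data.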

3. Longer reads

With the MinION sequencer, 1D reads can reach 300 kb and 2D reads can reach 60 kb. Using MinION long reads, researchers closed a 50 kb gap in the human reference genome at Xq24. This region contains multiple copies of the CT47 gene, and the researchers used the long reads to determine that it most likely contains 8 copies of CT47 (Figure 3).

4. Detection of structural variation

Because NGS reads are short, structural variation detected with them is often inaccurate. The problem is particularly severe in cancer testing, because many kinds of structural variation are widespread in cancer tissue. Researchers found that structural variation results obtained from a few hundred MinION long reads were more reliable than results from millions of reads generated on an NGS platform.

5. RNA expression analysis

For RNA expression analysis, the problem with short NGS reads is that they must be assembled to reconstruct transcripts. This is a problem for the study of alternative splicing, because NGS reads usually do not carry enough information to distinguish between alternative splice forms. The long reads produced by the MinION sequencer address this better. Taking the Drosophila Dscam1 gene, which has 18,612 possible alternative splice forms, as an example, researchers detected more than 7,000 splice forms with the MinION sequencer, a result that cannot be obtained with short-read NGS sequencing.

6. Development of supporting bioinformatics software

In recent years, as bioinformatics analysis methods have developed, the proportion of MinION sequencing reads that align to the reference genome has risen from 66% to 92%. The article describes the application scenarios of the various tools separately. An overview of the tools is given in Table 1.

1. Base-calling tools

Metrichor is base-calling software from ONT based on a hidden Markov model. It requires a network connection. Registered MinION users need a developer account to obtain the source code of the software. In early 2016, two laboratories developed Nanocall and DeepNano. Both can be run locally without a network connection. Nanocall is based on a hidden Markov model and can base-call 1D reads locally, while DeepNano is based on a recurrent neural network framework and can achieve more accurate base calls than hidden Markov model methods.
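To show what hidden Markov model base calling means in principle, here is a toy Viterbi decoder. It assumes one current event per base, one hidden state per base, Gaussian emissions around invented mean current levels, and a small probability of staying on the same base. Real base callers model k-mers and are far more elaborate, so every number and name here is a hypothetical illustration, not the Metrichor or Nanocall model.

```python
from math import log


def viterbi_basecall(events, means, sd=2.0, stay=0.1):
    """Most likely base sequence for a list of current events (toy model)."""
    states = list(means)
    move = (1 - stay) / (len(states) - 1)

    def emit(x, s):
        # Log Gaussian density up to an additive constant shared by all states.
        return -((x - means[s]) ** 2) / (2 * sd ** 2)

    def trans(a, b):
        return log(stay if a == b else move)

    best = {s: log(1 / len(states)) + emit(events[0], s) for s in states}
    back = []
    for x in events[1:]:
        prev, best, ptr = best, {}, {}
        for s in states:
            p = max(states, key=lambda q: prev[q] + trans(q, s))
            best[s] = prev[p] + trans(p, s) + emit(x, s)
            ptr[s] = p
        back.append(ptr)
    # Follow the back-pointers from the best final state.
    path = [max(states, key=best.get)]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return "".join(reversed(path))


# Hypothetical mean current level for each base (arbitrary units).
toy_means = {"A": 60.0, "C": 80.0, "G": 100.0, "T": 120.0}
toy_events = [61.0, 99.0, 119.0, 81.0, 60.5]
print(viterbi_basecall(toy_events, toy_means))
```

The decoder picks the state path that best explains the whole series of events rather than classifying each event in isolation, which is the property that makes HMMs useful when neighboring signal levels overlap.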

2. Sequence alignment tools

Traditional NGS alignment software cannot meet the requirements of MinION sequence alignment. MinION data has a relatively high error rate and long reads, and even tuning the parameters does not give good results. The tools below are designed for MinION sequencing data.

marginAlign improves alignment to the reference genome by better estimating the sources of sequencing error in MinION reads. Assessment of the variants detected showed that it significantly improves alignment accuracy. Because marginAlign refines alignments produced by LAST or BWA-MEM, the accuracy of the final result depends on the initial alignment.

GraphMap is another tool for aligning MinION sequencing data. It uses heuristics to handle reads that are both long and error-prone. One study showed that the sensitivity of GraphMap is comparable to that of BLAST, and that its estimates of read error rate are comparable to those of marginAlign.

3. De novo assembly tools

MinION sequencing data is not suited to the de Bruijn graph methods used to assemble NGS data, for two main reasons. First, de Bruijn methods rely on splitting sequencing reads into accurate k-mers, and the high error rate of MinION reads cannot guarantee this. Second, the structure of a de Bruijn graph is not well suited to long reads.
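The first reason can be made concrete with a little arithmetic. The sketch below assumes per-base errors are independent and uses an 8% error rate, taken from the 92% accuracy quoted elsewhere in this review; the value k = 31 and the example sequence are illustrative choices only.

```python
def kmers(seq, k):
    """All k-mers of a sequence, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def affected_kmers(true_seq, read, k):
    """Number of k-mer positions at which the read differs from the truth."""
    return sum(a != b for a, b in zip(kmers(true_seq, k), kmers(read, k)))


def p_error_free_kmer(k, error_rate):
    """Chance that a k-mer contains no error, assuming independent errors."""
    return (1 - error_rate) ** k


# One substitution in the middle of a hypothetical read spoils k k-mers.
true_seq = "ACGTTGCAAGCTTAGGCATC"
noisy_read = true_seq[:10] + "G" + true_seq[11:]
print(affected_kmers(true_seq, noisy_read, k=5))

# With 8% errors, only a small fraction of 31-mers are expected to be clean.
print(round(p_error_free_kmer(31, 0.08), 3))
```

A single wrong base corrupts every k-mer that overlaps it, so at this error rate most k-mers in a read are wrong and the graph fills with spurious nodes.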

The long reads in MinION sequencing data are better suited to the overlap-consensus assembly methods of the Sanger sequencing era. These require error correction of the reads before assembly. The first team to assemble on this principle used MinION data to assemble a complete E. coli K-12 MG1655 genome with a sequence accuracy of 99.5%. Their pipeline, called Nanocorrect, first corrects errors using a graph-based, greedy partial order aligner, then assembles the corrected reads with Celera Assembler, and finally polishes the assembly with Nanopolish.
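The overlap step at the heart of overlap-consensus assembly can be sketched as follows. This version finds exact suffix-prefix overlaps between two already corrected reads and merges them; real long-read assemblers tolerate mismatches and indels in the overlap, so the exact-match rule, the function names, and the sequences are simplifying assumptions.

```python
def suffix_prefix_overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for length in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:length]):
            return length
    return 0


def merge(a, b, min_len=3):
    """Join two reads on their overlap, or return None if they do not overlap."""
    n = suffix_prefix_overlap(a, b, min_len)
    return a + b[n:] if n else None


# Hypothetical error-corrected reads that share the bases TGCA.
read_1 = "ACGTTGCA"
read_2 = "TGCAGGTC"
print(suffix_prefix_overlap(read_1, read_2))
print(merge(read_1, read_2))
```

Long reads make this approach attractive because a long overlap is unlikely to occur by chance, but the overlaps are only reliable once the reads have been error-corrected, which is why correction comes first in the pipeline.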

4. Single-nucleotide variant detection tools

Reference allele bias is the tendency of variant detection to favor the reference allele, so that true variants are under-called. The problem is especially serious when the read error rate is high.

The marginCaller module in marginAlign is variant detection software developed for MinION sequencing data. marginCaller uses maximum-likelihood parameter estimation and multiple alignments of the sequencing reads to detect single-nucleotide variants. On computer-simulated data with a sequencing error of 1% and a sequencing depth of 60X, marginCaller detects SNVs with 97% precision and recall. In another study, researchers used GraphMap to detect heterozygous variants in the human genome with 96% precision. On computer-simulated data, GraphMap can also detect structural variation with high precision and recall.
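The idea of maximum-likelihood variant calling can be illustrated on a single genome position. The sketch below scores three diploid genotypes against the bases observed in a pileup column, assuming independent reads and a uniform 8% error rate spread evenly over the three wrong bases. It is a deliberately simplified model with hypothetical names and data, not the marginCaller algorithm, which also accounts for alignment uncertainty.

```python
from math import log


def genotype_log_likelihoods(bases, ref, alt, error=0.08):
    """Log-likelihood of each diploid genotype given the observed bases."""
    def p_base(obs, allele):
        return 1 - error if obs == allele else error / 3

    ll = {"ref/ref": 0.0, "ref/alt": 0.0, "alt/alt": 0.0}
    for b in bases:
        p_ref, p_alt = p_base(b, ref), p_base(b, alt)
        ll["ref/ref"] += log(p_ref)
        ll["ref/alt"] += log(0.5 * p_ref + 0.5 * p_alt)  # read from either copy
        ll["alt/alt"] += log(p_alt)
    return ll


def call_genotype(bases, ref, alt, error=0.08):
    """Genotype with the highest likelihood."""
    ll = genotype_log_likelihoods(bases, ref, alt, error)
    return max(ll, key=ll.get)


# Hypothetical pileup columns at 60X depth.
print(call_genotype("A" * 30 + "G" * 30, ref="A", alt="G"))
print(call_genotype("A" * 58 + "G" * 2, ref="A", alt="G"))
print(call_genotype("G" * 60, ref="A", alt="G"))
```

With enough depth, a couple of stray G bases are better explained as errors than as a variant, while an even split is best explained by a heterozygous site; this is the trade-off between depth and error rate that the figures above describe.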

Nanopolish can also be used to detect variants. It uses an event-level alignment algorithm. Starting from the reference genome sequence, the method evaluates how similar the electrical signal expected from that sequence is to the signal of the sequencing reads, then modifies the sequence to generate a consensus read. This is repeated until the consensus read is sufficiently similar to the electrical signal of the sequencing reads, and the consensus read is then compared with the reference genome sequence to obtain the variants. This method achieved approximately 80% accuracy in a study of Ebola virus.

PoreSeq uses an algorithm similar to that of Nanopolish. It can achieve high-precision, high-recall SNV detection from lower-depth sequencing data. In one study, PoreSeq reached 99% precision and recall in SNV detection at 16X sequencing depth, a much lower depth than marginAlign requires.

5. Consensus sequencing methods

MinION sequencing data is currently only 92% accurate. At low sequencing depth, this cannot meet the requirements of haplotype phasing and SNV detection in human samples. The solution mentioned in the article is rolling circle amplification, which amplifies a fragment many times to produce multiple copies on a single DNA molecule, so that the accuracy of the resulting consensus sequence can reach 97%.
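Why multiple copies of the same fragment raise accuracy can be seen from a simple majority vote. The sketch assumes the copies have already been split out of the long read and aligned to equal length, which is the hard part in practice; the function name and the sequences are hypothetical.

```python
from collections import Counter


def majority_consensus(copies):
    """Per-position majority vote over pre-aligned, equal-length copies."""
    if len({len(c) for c in copies}) != 1:
        raise ValueError("copies must be pre-aligned to the same length")
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*copies))


# Three hypothetical copies of one fragment, each with a different error.
copies = ["ACGTAC", "ACGTTC", "ACCTAC"]
print(majority_consensus(copies))
```

Errors that fall at different positions in different copies are outvoted, whereas an error that recurs at the same position in most copies survives, so this kind of consensus helps with random errors much more than with systematic ones.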


III. Current application areas of MinION

1. Rapid detection of the source of infection

NGS sequencing can be used to detect sources of infection and other pathogens in the hospital environment, and MinION sequencing offers a new way of doing so. Compared with NGS, MinION has advantages in read length, portability, and time to result. It takes only 6 hours from sample preparation to identification of the pathogenic bacteria, and the pathogen itself can be identified from the sample within 4 minutes. The article also describes in detail the important role that the MinION sequencer played in virus detection during the Ebola outbreak in West Africa.

2. Aneuploidy detection

MinION can play an important role in prenatal testing for fetal aneuploidy. With NGS platforms, it usually takes 1-3 weeks to get results. With MinION sequencing, the literature reports that it takes only 4 hours.

3. Space applications

Detecting bacteria and viruses during space flight is very difficult. Most studies bring samples back to Earth for sequencing and identification. NASA is currently preparing to use the MinION sequencer for real-time sequencing of microbes on the International Space Station.


IV. Outlook

1. PromethION

To meet researchers' need for high-throughput sequencing, ONT has developed a benchtop nanopore sequencer, the PromethION. The PromethION has 48 flow cells that can be run separately or in parallel. Each flow cell includes 3,000 channels, producing 6 Tb of sequencing data per day.
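A quick calculation puts these figures in proportion. It assumes that the daily output is 6 x 10^12 bases for the whole instrument and that every channel is active for the full day; both are assumptions made for illustration, since the text does not spell them out.

```python
flow_cells = 48
channels_per_flow_cell = 3_000
bases_per_day = 6e12  # assumption: the daily figure read as 6 x 10^12 bases in total

total_channels = flow_cells * channels_per_flow_cell
bases_per_channel_per_day = bases_per_day / total_channels
bases_per_channel_per_second = bases_per_channel_per_day / 86_400  # seconds per day

print(total_channels)
print(round(bases_per_channel_per_second))
```

Under these assumptions the instrument has 144,000 channels, and each would need to sustain roughly 480 bases per second around the clock to reach the quoted output.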

2. Sequencing read accuracy

At present, the read accuracy of the MinION sequencer is about 92%. This is sufficient for distinguishing closely related pathogens and for studying alternative splicing. For clinical testing, however, read accuracy usually needs to reach 99.99%. The article therefore notes that ONT needs to optimize its sequencing chemistry and base-calling software.

In addition, the article mentions that MinION sequencing has non-random sequencing errors. For example, MinION cannot resolve homopolymers longer than 6 nucleotides, and it lacks internal training for detecting base modifications. If these two problems can be solved, the accuracy of the consensus sequence can exceed 99.99%.
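A simple way to see where the first limitation would bite is to scan a sequence for the affected regions. The sketch assumes the limitation concerns runs of a single repeated base (homopolymers) longer than 6 nucleotides; the function name and the example sequence are hypothetical.

```python
import re


def long_homopolymers(seq, max_resolved=6):
    """Return (start, length, base) for each run longer than max_resolved."""
    # A base followed by at least max_resolved repeats of itself.
    pattern = re.compile(r"([ACGT])\1{%d,}" % max_resolved)
    return [(m.start(), m.end() - m.start(), m.group(1))
            for m in pattern.finditer(seq)]


# Hypothetical sequence with an 8-base A run and a 6-base T run.
example = "ACGT" + "A" * 8 + "CGCG" + "T" * 6 + "GG"
print(long_homopolymers(example))
```

Only the run of 8 is reported, because a run of exactly 6 is still within the stated limit. A scan like this could flag the regions of a reference where consensus errors are most likely to be systematic rather than random.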

3. Sequencing read length

At present, MinION read lengths reach 150 kb. Substantial further increases in read length can be expected in the near future.

4. Direct RNA sequencing

Reverse transcription and PCR amplification can lose much of the information carried by the RNA itself, so ONT and several research institutes are currently experimenting with nanopore technology for direct RNA sequencing. Earlier studies have laid the groundwork: one showed that tRNA can be detected in single-channel and solid-state nanopores, and that nanopores can detect base modifications in DNA and tRNA.

5. Single-molecule protein sequencing

At present, mass spectrometry is the preferred technique for proteomic analysis, but it has limitations in sensitivity, accuracy, and resolution. A 2013 study reported enzyme-mediated translocation of proteins through a single-channel nanopore. The study suggests that sequence-dependent features of proteins can be detected. These findings lay a good foundation for nanopore protein sequencing.

V. References

The Oxford Nanopore MinION: delivery of nanopore sequencing to the genomics community
