NCBI SRA Database

Source: Internet
Author: User

Brief introduction

The SRA (Sequence Read Archive) is a major archive of high-throughput sequencing data maintained by NCBI at the National Institutes of Health (NIH). It is part of the International Nucleotide Sequence Database Collaboration (INSDC), which comprises the NCBI Sequence Read Archive (SRA), the European Bioinformatics Institute (EBI), and the DNA Data Bank of Japan (DDBJ). Data submitted to any one of the three organizations is shared with the other two.

The SRA stores raw sequencing data and alignment information from high-throughput sequencing platforms (Roche 454 GS System®, Illumina Genome Analyzer®, Applied Biosystems SOLiD System®, Helicos HeliScope®, Complete Genomics®, and Pacific Biosciences SMRT®). Stored sequencing data can be reused by other research groups, and comparing datasets can lead to new discoveries.

Figure: A typical next-generation sequencing workflow

SRA and other NCBI databases

NCBI develops and maintains more than 35 biomedical databases, grouped into six major categories: scientific literature, health, genomics, genetics, proteins, and chemicals.

Each database has its own smallest unit of publication. In PubMed it is an article, whereas in SRA the smallest unit of release is an experiment (recorded under an SRX# accession). An SRA experiment consists of the sequence data plus metadata describing how the biological sample was sequenced.
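As an illustration of how accessions encode the record type, here is a minimal sketch that classifies an accession by its prefix. Only the SRX experiment prefix appears in the text above. The other prefixes (study, sample, run, and the E/D first letters used by EBI and DDBJ) follow the usual INSDC accession convention and are included here as an assumption. The function name describe_accession is hypothetical.

```shell
# Describe an SRA-style accession from its prefix.
# First letter: archive that issued it. Third letter: record type.
# (describe_accession is an illustrative helper, not an NCBI tool.)
describe_accession() {
    acc=$1
    case $acc in
        S*) archive="NCBI" ;;
        E*) archive="EBI" ;;
        D*) archive="DDBJ" ;;
        *)  echo "$acc: not an SRA-style accession"; return 1 ;;
    esac
    case $acc in
        ?RP[0-9]*) kind="study" ;;
        ?RS[0-9]*) kind="sample" ;;
        ?RX[0-9]*) kind="experiment" ;;
        ?RR[0-9]*) kind="run" ;;
        *) echo "$acc: not an SRA-style accession"; return 1 ;;
    esac
    echo "$acc: $kind ($archive)"
}

describe_accession SRX000001
describe_accession ERR000001
```

The two calls print one line each, naming the record type and the issuing archive.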

The SRA database interacts with other databases

All NCBI databases are interconnected. These links allow for powerful searches. For example:

Find articles in PubMed that reference SRA studies: "pubmed sra"[Filter]

Find SRA experiments published in PubMed: "sra pubmed"[Filter]

Similarly, you can find links from SRA to any other NCBI database, and vice versa.
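The same filters can also be used from the command line. The sketch below builds a search URL for NCBI's E-utilities esearch endpoint. The helper name esearch_url is hypothetical, and it percent-encodes only the characters that occur in the filters above, so it is not a general URL encoder. Nothing is sent over the network unless you pass the URL to a client such as curl.

```shell
# Build an E-utilities esearch URL for a database and a query term.
# Encodes only %, space, double quote, and square brackets.
esearch_url() {
    db=$1
    term=$2
    encoded=$(printf '%s' "$term" | sed \
        -e 's/%/%25/g' \
        -e 's/ /+/g' \
        -e 's/"/%22/g' \
        -e 's/\[/%5B/g' \
        -e 's/]/%5D/g')
    printf 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=%s&term=%s\n' "$db" "$encoded"
}

# Articles in PubMed that reference SRA studies
esearch_url pubmed '"pubmed sra"[Filter]'
# SRA experiments published in PubMed
esearch_url sra '"sra pubmed"[Filter]'

# To actually run a search (requires network access):
#   curl -s "$(esearch_url sra '"sra pubmed"[Filter]')"
```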

More examples are available in the SRA documentation on the NCBI website.

SRA data

SRA accepts data from a wide variety of sequencing projects, including clinically important studies that involve human subjects or their genomes and may therefore contain human sequences. Such data are typically placed under controlled access through dbGaP (the Database of Genotypes and Phenotypes).

Downloading SRA data

1. Download and install the SRA Toolkit
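Before moving on, it can help to confirm that the tools used in the next step are on your PATH. The following is a minimal sketch. The function name check_tools is hypothetical, and the tool names in the usage line are the ones this article relies on.

```shell
# Check that each named command is available on PATH.
# Prints one line per tool and returns non-zero if any is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return $missing
}

# Usage:
#   check_tools wget fastq-dump && fastq-dump --version
```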

2. Download the data

First, search NCBI for the data you want and collect the SRA download addresses in a text file, one per line (sra_ftp.txt below). Then use a short script to download them in bulk:

$ while read -r line; do wget "$line"; done < sra_ftp.txt
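For long lists, a slightly more defensive variant of this loop may be useful. The sketch below skips blank lines and comment lines, resumes partial downloads with wget -c, and reports failures without stopping. The function name download_list and the DOWNLOAD_CMD override variable are hypothetical, introduced only so the loop can be tried without network access.

```shell
# Download every URL listed in a file, one per line.
# Blank lines and lines starting with '#' are skipped.
# DOWNLOAD_CMD is an illustrative override hook; it defaults to "wget -c",
# where -c resumes a partially downloaded file.
download_list() {
    list=$1
    while IFS= read -r url || [ -n "$url" ]; do
        case $url in
            ''|'#'*) continue ;;
        esac
        # Left unquoted on purpose so "wget -c" splits into command and flag.
        ${DOWNLOAD_CMD:-wget -c} "$url" || echo "failed: $url" >&2
    done < "$list"
}

# Usage:
#   download_list sra_ftp.txt
# Dry run that only prints the URLs:
#   DOWNLOAD_CMD=echo download_list sra_ftp.txt
```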

Then convert the *.sra files to FASTQ:

$ for i in *.sra; do echo "$i"; fastq-dump --split-3 "$i"; done
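With --split-3, paired-end runs are written to two files ending in _1.fastq and _2.fastq, while single-end runs (and any unpaired reads) go to a plain .fastq file. The sketch below checks which of these files exist for a given run, assuming the output was written next to the .sra file. The function name layout_of is hypothetical.

```shell
# Report which layout fastq-dump --split-3 produced for one run,
# judged only by which output files exist next to the .sra file.
layout_of() {
    base=${1%.sra}
    if [ -e "${base}_1.fastq" ] && [ -e "${base}_2.fastq" ]; then
        echo "$base: paired"
    elif [ -e "${base}.fastq" ]; then
        echo "$base: single"
    else
        echo "$base: no FASTQ output found"
    fi
}

# Usage:
#   for i in *.sra; do layout_of "$i"; done
```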

View the FASTQ files
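A quick way to inspect the result is to print the first record and count the reads. This sketch assumes the standard four-lines-per-record FASTQ layout, which is what fastq-dump writes. The function name count_reads and the file name in the usage lines are hypothetical.

```shell
# Count reads in an uncompressed FASTQ file,
# assuming four lines per record.
count_reads() {
    lines=$(wc -l < "$1")
    echo $((lines / 4))
}

# Usage (example.fastq is a placeholder file name):
#   head -n 4 example.fastq    # show the first record
#   count_reads example.fastq  # number of reads
```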

