Brief introduction
The Sequence Read Archive (SRA) is the primary archive of high-throughput sequencing data at the National Institutes of Health (NIH). It is part of the International Nucleotide Sequence Database Collaboration (INSDC), whose members are the NCBI Sequence Read Archive (SRA), the European Bioinformatics Institute (EBI), and the DNA Data Bank of Japan (DDBJ). Data submitted to any one of the three organizations is shared with the others.
SRA stores raw sequencing data and alignment information from high-throughput sequencing platforms, including the Roche 454 GS System®, Illumina Genome Analyzer®, Applied Biosystems SOLiD System®, Helicos Heliscope®, Complete Genomics®, and Pacific Biosciences SMRT®. Archived sequencing data can be reused across research groups, and comparing datasets can lead to new discoveries.
A typical next-generation sequencing workflow
SRA and other NCBI databases
NCBI develops and maintains more than 35 biological databases, grouped into six major categories: scientific literature, health, genomes, genetics, proteins, and chemicals.
Each database has its own smallest unit of publication. In PubMed, for example, the unit is an article, whereas in SRA the smallest published unit is an experiment (accessioned as SRX#). An SRA experiment consists of the sequence data together with metadata describing how the biological sample was sequenced.
The SRA database interacts with other databases
All NCBI databases are interconnected, and these links enable powerful searches. For example:
Find PubMed articles that cite SRA studies (search in PubMed): "pubmed sra"[Filter]
Find SRA experiments that have been published in PubMed (search in SRA): "sra pubmed"[Filter]
Similarly, you can find links from SRA to any other NCBI database, and vice versa.
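The two filter searches above can also be expressed as request URLs for NCBI's E-utilities ESearch endpoint. The following is only a minimal sketch: `esearch_url` is a hypothetical helper name, and its encoder handles just the characters that appear in these filter terms (space, double quote, square brackets, percent sign). It prints the URL and does not contact NCBI.

```shell
# esearch_url DB TERM: print an E-utilities ESearch URL (hypothetical helper).
# Only the characters used by the filter terms are percent-encoded here;
# this is not a general-purpose URL encoder.
esearch_url() {
    db=$1
    term=$2
    encoded=$(printf '%s' "$term" | sed \
        -e 's/%/%25/g' \
        -e 's/ /+/g' \
        -e 's/"/%22/g' \
        -e 's/\[/%5B/g' \
        -e 's/]/%5D/g')
    printf 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=%s&term=%s\n' "$db" "$encoded"
}

# SRA experiments published in PubMed, and PubMed articles citing SRA studies.
esearch_url sra '"sra pubmed"[Filter]'
esearch_url pubmed '"pubmed sra"[Filter]'
```

The printed URL can be opened in a browser or passed to a download tool to retrieve the matching record IDs.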
Click on SRA to find more examples
SRA data
SRA accepts data from a wide variety of sequencing projects, including clinically important studies that involve human subjects or their genomes and may therefore contain human sequences. Such data are typically placed under controlled access through dbGaP (the Database of Genotypes and Phenotypes).
Downloading SRA data
1. Download the SRA Toolkit
Download and install the toolkit.
2. Download the data
First, search NCBI for the data you want and collect the SRA download addresses (here saved one per line in sra_ftp.txt), then write a script to download them in bulk.
$ while read -r line; do wget "$line"; done < sra_ftp.txt
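For longer lists, a slightly more defensive version of the download loop may be useful. The sketch below assumes the same one-URL-per-line list file. `download_list` and the `DRY_RUN` variable are hypothetical names introduced here, and the example.org addresses are placeholders, not real SRA locations.

```shell
# download_list FILE: fetch every URL listed in FILE, one per line.
# If DRY_RUN is set to 1, the wget commands are printed instead of run.
download_list() {
    list=$1
    while IFS= read -r url || [ -n "$url" ]; do
        # Skip blank lines and comment lines starting with '#'.
        case $url in
            ''|'#'*) continue ;;
        esac
        if [ "${DRY_RUN:-0}" = 1 ]; then
            printf 'wget -c %s\n' "$url"
        else
            # -c resumes a partially downloaded file instead of restarting.
            wget -c "$url" || printf 'failed: %s\n' "$url" >&2
        fi
    done < "$list"
}

# Offline demonstration with placeholder URLs.
example_list=$(mktemp)
printf '%s\n' '# example list' \
    'https://example.org/SRR000001.sra' \
    '' \
    'https://example.org/SRR000002.sra' > "$example_list"
DRY_RUN=1
download_list "$example_list"
rm -f "$example_list"
```

Setting `DRY_RUN=0` and passing sra_ftp.txt would perform the actual downloads. Failed URLs are reported on standard error so they can be retried.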
Then convert the *.sra files to FASTQ format:
$ for i in *.sra; do echo "$i"; fastq-dump --split-3 "$i"; done
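With the --split-3 option, fastq-dump writes paired reads to files ending in _1.fastq and _2.fastq and any unpaired reads to a plain .fastq file, so a single-end run yields only the .fastq file. As a rough check after conversion, the sketch below guesses the layout from the files present. `fastq_layout` is a hypothetical helper, and the demonstration uses empty placeholder files rather than real data.

```shell
# fastq_layout ACC [DIR]: guess the library layout from the files that
# fastq-dump --split-3 left behind for accession ACC in DIR (default: .).
fastq_layout() {
    acc=$1
    dir=${2:-.}
    if [ -f "$dir/${acc}_1.fastq" ] && [ -f "$dir/${acc}_2.fastq" ]; then
        echo paired
    elif [ -f "$dir/$acc.fastq" ]; then
        echo single
    else
        echo missing
    fi
}

# Offline demonstration with empty placeholder files (no real data).
demo=$(mktemp -d)
touch "$demo/SRR000001_1.fastq" "$demo/SRR000001_2.fastq" "$demo/SRR000002.fastq"
fastq_layout SRR000001 "$demo"
fastq_layout SRR000002 "$demo"
fastq_layout SRR000003 "$demo"
rm -r "$demo"
```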
View the FASTQ file
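A FASTQ record normally spans four lines: a header starting with @, the sequence, a separator line starting with +, and the quality string. A quick way to inspect a file is to print the first record and count the records. The sketch below runs on a tiny made-up sample file. `count_reads` is a hypothetical helper and assumes exactly four lines per record with no line wrapping.

```shell
# Write a tiny two-read FASTQ file so the commands below have input.
sample=$(mktemp)
cat > "$sample" <<'EOF'
@read1
ACGTACGT
+
IIIIIIII
@read2
TTGGCCAA
+
FFFFFFFF
EOF

# Show the first record (header, sequence, separator, qualities).
head -n 4 "$sample"

# count_reads FILE: number of records, assuming 4 lines per record.
count_reads() {
    echo $(( $(wc -l < "$1") / 4 ))
}

count_reads "$sample"
rm -f "$sample"
```

On a real file produced by fastq-dump, replace the sample path with the .fastq file name.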