Combined with GATK and Samtools and Picardtools call SNP

Source: Internet
Author: User
Tags perl script

Just started to learn bioinformatics, the teacher gave a SNP as a marker to draw a genetic map of the subject, studied for some time, began to call SNP with Bwa+samtools, the elder sister used this set before, she suggested I use another method to do, so prepare to learn to use GATK to do SNP calling.

Call SNP first to have a more accurate reference genome, then there are samples, my sample is a hybrid produced F2, below make some of their own use of process and experience,

This process is different from BWA and Samtool, the first need to filter the fasta of sample, I found the following mainly with NGSQC Toolkit under the Perl script filter, the command is:

$ illuqc_prll.pl-pe READ1.FQ READS2.FQ 2 5-c 8-o QC

This is the command that added the environment variable.

Then use BOWTIE2 to index the reference genome:

$ bowtie2-build genome.fasta Genome

Genome is the index file prefix, this index file generation directory defaults to the directory where the current directory is located rather than the ref genome

Then mapping the sample to ref genome on the following command:

Bowtie2--rg-id Sample--rg "Pl:illumina"--rg "Sm:samplename"-X genome-1 illuqc_filtered_files/sampler1.fastq.gz_ Filtered-2 illuqc_filtered_files/sampler2.fastq.gz_filtered-p 8-s Sample.sam

--rg-id is the header file that sets the Read group ID to-RG plus "@RG" on the sample, similar to the-R parameter in BWA Mem

The next step is to convert the resulting Sam file into a binary BAM file, which is not available to Samtools, as follows:

$ samtools view-bs Sample.sam > Sample.bam

Samtools view is dedicated to handling BAM files and Sam files, the-bs command is the input file format for the SAM output file format for BAM.

The next step is to mark out a large number of duplicate fragments produced by PCR amplification:

This step was used in the Sortsam.jar of Picardhome.

The command is as follows:

$ java-xmx10g-jar Picardhome/sortsam.jar Input=sample.bam Output=sample.sorted.bam SORT_ORDER=coordinate

Sort

$ java-xmx10g-jar Picardhome/marduplicates.jar Input=sample.sorted.bam Output=sample.dd.bam METRICS_FILE= Sample.dd.metrics max_file_handles_for_read_ends_map=1000

Index to Genome.fasta again

$ samtools Faidx Genome.fasta

$ Java-xmx10g-jar Picardhome/createsequencedictionary.jar R=genome.fasta o=genome.dict

$ samtools Index Sample.dd.bam

Here is a need to note that the Genome.fasta prefix genome must be the same as the genome.dict prefix genome, no one next error

Next, compare the Indel (realignment) around the area:

Started gatk the software with a knife.

Java-xmx10g-jar genomeanalysistk.jar-r genome.fasta-t realignertargetcreator-i sample.dd.bam-o sample.realn.int Ervals

Java-xmx10g-jar $GATKHome/genomeanalysistk.jar \ r genome.fasta-t indelrealigner \-targetintervals sample.realn.inte rvals \ I sample.dd.bam-o sample.realn.bam

Next is the first call SNP:

$ java-xmx10g-jar $GATKHome/genomeanalysistk.jar \ r genome.fasta-t unifiedgenotyper \ I sample.realn.bam-o sample.g ATK.RAW1.VCF \--read_filter badcigar-glm BOTH \-stand_call_conf 30.0-stand_emit_conf 0

$ samtools mpileup-q 20-q 25-ug-t ad-t dp-f genome.fasta Sample.realn.bam | Bcftool Call-c-V indels-v-> SAMPLE.SAMTOOLS.RAW1.VCF


Combined with GATK and Samtools and Picardtools call SNP

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.