This task is data mining of intestinal metagenome sequencing data from three groups of children. I am responsible for the 2-3-year-old control group: 10 paired-end sequencing datasets.
STEP1:
Create a new folder: 20161205
Using the human genome as the reference sequence, first build a Bowtie 2 index so the reads can be aligned against it to check data quality.
bowtie2-build /home/pxy7896/desktop/20161205/gcf_000001405.35_grch38.p9_genomic.fna Human
STEP2:
Check the quality of the data: how much of it aligns to the human genome?
Only two samples are shown here as examples. The data quality is good, so there is no need to remove human genome contamination.
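The check described in STEP2 can be sketched roughly as follows. This is only an illustration, not the exact commands used for this run: it assumes the index basename Human from STEP1 and the ${s}_pe_1 / ${s}_pe_2 file naming, and the function names and log file name are made up for the example. bowtie2 prints its alignment summary on stderr, ending with an "overall alignment rate" line, which is what gets parsed here.

```shell
#!/usr/bin/env bash
# Sketch only: function names and the log file name are hypothetical.

# Align one paired-end sample to the human index; the SAM output is
# discarded and only the summary that bowtie2 writes to stderr is kept.
align_to_human() {
    local s="$1"
    bowtie2 -p 4 -x Human \
        -1 "${s}_pe_1.fastq.gz" -2 "${s}_pe_2.fastq.gz" \
        -S /dev/null 2> "${s}.human.log"
}

# Print the percentage from the "NN.NN% overall alignment rate" line of a log.
overall_rate() {
    sed -n 's/^\([0-9.]*\)% overall alignment rate$/\1/p' "$1"
}

# Usage (not run automatically):
#   align_to_human G45084 && overall_rate G45084.human.log
```

A low rate suggests little human contamination, so the reads can go to profiling without a filtering step.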
STEP3:
Profile the species composition of each file:
for f in *.fastq.gz
do
metaphlan2.py $f --input_type fastq --nproc 4 > ${f%.fastq.gz}_profile.txt
done
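The output name in the loop comes from the ${f%.fastq.gz} expansion, which strips the trailing ".fastq.gz" before "_profile.txt" is appended. A small dry run makes this visible without calling metaphlan2.py; the helper name profile_name is only for illustration.

```shell
# Sketch only: profile_name is a hypothetical helper, not part of MetaPhlAn2.

# Derive the profile file name the loop would write for a given input file.
profile_name() {
    local f="$1"
    # ${f%.fastq.gz} removes the shortest trailing match of ".fastq.gz"
    printf '%s\n' "${f%.fastq.gz}_profile.txt"
}

# Dry run: list input -> output pairs for the files in the current folder.
for f in *.fastq.gz; do
    [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
    printf '%s -> %s\n' "$f" "$(profile_name "$f")"
done
```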
Merge the tables:
/home/pxy7896/downloads/metaphlan2/utils/merge_metaphlan_tables.py *_profile.txt > merged_abundance_table.txt
Draw a heat map:
/home/pxy7896/downloads/metaphlan2/utils/metaphlan_hclust_heatmap.py -c bbcry --top 25 --minv 0.1 -s log --in merged_abundance_table.txt --out Result/abundance_heatmap.png
It seems that pe_1 and pe_2 should be analyzed together, since the difference between their profiles is small. So try profiling each sample with its two paired files combined:
ids="G45084 G45072 G45071 G45109 G45125 G45124 G45049 G45054 G45121 G45099"
for s in ${ids}
do
metaphlan2.py ${s}_pe_1.fastq.gz,${s}_pe_2.fastq.gz --bowtie2out result1/${s}.bowtie2.bz2 --nproc 5 --input_type fastq > result1/profiled_${s}.txt
done
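Before starting a long run over all samples, it may help to confirm that both paired files exist for every ID. The following is a minimal sketch; the function name missing_pairs is made up, and it assumes the ${s}_pe_1.fastq.gz / ${s}_pe_2.fastq.gz naming used in the loop.

```shell
# Sketch only: missing_pairs is a hypothetical helper.

# Print every sample ID whose _pe_1 or _pe_2 file is missing from a directory.
# Usage: missing_pairs DIR ID [ID ...]
missing_pairs() {
    local dir="$1" s
    shift
    for s in "$@"; do
        if [ ! -f "${dir}/${s}_pe_1.fastq.gz" ] || [ ! -f "${dir}/${s}_pe_2.fastq.gz" ]; then
            printf '%s\n' "$s"
        fi
    done
}
```

With the ids variable set, calling missing_pairs . ${ids} (unquoted, so the list splits into separate IDs) prints nothing when every pair is complete.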
Check CPU usage while the jobs run (for example with top).
Then merge the tables:
/home/pxy7896/downloads/metaphlan2/utils/merge_metaphlan_tables.py profiled_*.txt > merged_abundance_table.txt
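The merged table contains every taxonomic level, so for a quick look it can be narrowed to species rows. This is a sketch along the lines of the MetaPhlAn2 tutorial, assuming the merged table's header line starts with "ID" and that clade names use the s__ (species) and t__ (strain) prefixes; the function name species_only is made up.

```shell
# Sketch only: species_only is a hypothetical helper.

# Keep the header plus species-level rows (s__), drop strain-level rows (t__),
# then shorten each clade name to the part after "s__".
species_only() {
    grep -E '(s__)|(^ID)' "$1" | grep -v 't__' | sed 's/^.*s__//'
}

# Usage (not run automatically):
#   species_only merged_abundance_table.txt > merged_abundance_table_species.txt
```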
Redraw the heat map:
/home/pxy7896/downloads/metaphlan2/utils/metaphlan_hclust_heatmap.py -c bbcry --top 25 --minv 0.1 -s log --in merged_abundance_table.txt --out abundance_heatmap.png
Modify the command to show all taxa rather than only the top 25, and lower the minimum value (--minv) to 0.01:
/home/pxy7896/downloads/metaphlan2/utils/metaphlan_hclust_heatmap.py -c bbcry --minv 0.01 -s log --in merged_abundance_table.txt --out abundance_heatmap_2.png
PS:
1. To view memory usage on its own: free -m. To view memory and CPU usage together: top, then press 1.
The htop tool can also be installed: sudo apt-get install htop
After installation, run it directly: htop
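To log memory use during a long run instead of watching it interactively, the "used" figure can be pulled out of the free -m output. This is a small sketch; the function name used_mb is made up, and it assumes the English "Mem:" row label with "used" as the third column.

```shell
# Sketch only: used_mb is a hypothetical helper.

# Read `free -m` output on stdin and print the "used" column of the Mem: row.
used_mb() {
    awk '$1 == "Mem:" { print $3 }'
}

# Usage (not run automatically):
#   free -m | used_mb
```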
2. Remote connection
http://www.linuxidc.com/Linux/2016-06/132442.htm
Follow the guide above to set up Ubuntu, and note the Ubuntu machine's IP address (ifconfig).
Then, on Windows 7, open RealVNC and enter the IP address and password.
https://www.realvnc.com/download/viewer/
Bioinformatics Exercise 1: integrated use of software