Cole Trapnell said:
There are three strategies:
1) Merge BAMs and assemble in a single run of Cufflinks
2) Assemble each BAM and cuffcompare them to get a combined.gtf
3) Assemble each BAM and cuffmerge them to get a merged.gtf
All three options work a little differently depending on whether you're also trying to integrate reference transcripts from UCSC or another annotation source.
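As a rough sketch of how the three strategies differ at the command line, the Python below only builds the command lists and does not run anything. The file and directory names (pooled.bam, sample_1, cuffcmp, merged_asm, assemblies.txt) and the helper function names are hypothetical, chosen for illustration. It assumes samtools is used for the BAM merge, which the post does not specify, and that the input BAMs are already coordinate-sorted.

```python
import shlex
from typing import List, Optional


def pooled_assembly_commands(bams: List[str],
                             reference_gtf: Optional[str] = None) -> List[List[str]]:
    """Strategy 1: merge all BAMs, then assemble once with Cufflinks."""
    merge = ["samtools", "merge", "pooled.bam", *bams]  # hypothetical output name
    assemble = ["cufflinks", "-o", "pooled_assembly"]
    if reference_gtf:
        # -g / --GTF-guide asks Cufflinks for reference-guided (RABT) assembly
        assemble += ["-g", reference_gtf]
    assemble.append("pooled.bam")
    return [merge, assemble]


def per_sample_assembly_commands(bams: List[str]) -> List[List[str]]:
    """Shared first step of strategies 2 and 3: one Cufflinks run per BAM."""
    return [["cufflinks", "-o", f"sample_{i}", bam]
            for i, bam in enumerate(bams, start=1)]


def cuffcompare_command(n_samples: int,
                        reference_gtf: Optional[str] = None) -> List[str]:
    """Strategy 2: cuffcompare the assemblies; writes cuffcmp.combined.gtf."""
    cmd = ["cuffcompare", "-o", "cuffcmp"]
    if reference_gtf:
        cmd += ["-r", reference_gtf]
    cmd += [f"sample_{i}/transcripts.gtf" for i in range(1, n_samples + 1)]
    return cmd


def cuffmerge_command(reference_gtf: Optional[str] = None) -> List[str]:
    """Strategy 3: cuffmerge the assemblies; writes merged_asm/merged.gtf.

    assemblies.txt is a plain-text manifest, one transcripts.gtf path per line.
    """
    cmd = ["cuffmerge", "-o", "merged_asm"]
    if reference_gtf:
        cmd += ["-g", reference_gtf]
    cmd.append("assemblies.txt")
    return cmd


bams = ["a.bam", "b.bam"]
for cmd in pooled_assembly_commands(bams) + per_sample_assembly_commands(bams):
    print(shlex.join(cmd))
print(shlex.join(cuffcompare_command(len(bams), "genes.gtf")))
print(shlex.join(cuffmerge_command("genes.gtf")))
```

Building the commands as lists keeps them easy to inspect before handing them to something like subprocess.run.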
#1 is quite different from #2 and #3, so I'll discuss its pros and cons first. The advantage is simplicity of workflow. It's one Cufflinks run, so there is no need to worry about the details of the other programs. As turnersd mentions, you might also think this maximizes the accuracy of the resulting assembly, and that might be the case, but it also might not (for technical reasons that I don't want to get into right now). The disadvantage of this approach is that your computer might not be powerful enough to run it. More data and more isoforms mean substantially more memory and running time. I haven't actually tried this on something like the human body map, but I would be very impressed and surprised if Cufflinks can deal with all of that on a machine owned by mere mortals.
#2 and #3 are very similar: both are designed to gracefully merge full-length and partial transcript assemblies without ever merging transfrags that disagree on splicing structure. Consider two transfrags, A and B, each with a couple of exons. If A and B overlap, and they don't disagree on splicing structure, we can (and according to Cufflinks' assembly philosophy, we should) merge them. The difference between Cuffcompare and Cuffmerge is that Cuffcompare will only merge them if A is "contained" in B, or vice versa. That is, only if one of the transfrags is essentially redundant. Otherwise, they both get included. Cuffmerge, on the other hand, will merge them if they overlap, agree on splicing, and are in the same orientation. As turnersd noted, this is done by converting the transfrags into SAM alignments and running Cufflinks on them.
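To make the A and B example concrete, here is a toy model of the two merge rules as described above. It is not how Cuffcompare or Cuffmerge are implemented (Cuffmerge, as noted, re-assembles the transfrags with Cufflinks). The Transfrag class, the function names, and the definition of "agree on splicing" (identical introns inside the shared region) are simplifying assumptions for illustration. Strand is checked only in the overlap rule, because that is the only place the description mentions orientation.

```python
from dataclasses import dataclass
from typing import List, Tuple

Interval = Tuple[int, int]  # inclusive start and end coordinates


@dataclass(frozen=True)
class Transfrag:
    """Hypothetical minimal transfrag: a strand plus sorted, non-overlapping exons."""
    strand: str
    exons: Tuple[Interval, ...]

    @property
    def start(self) -> int:
        return self.exons[0][0]

    @property
    def end(self) -> int:
        return self.exons[-1][1]

    def introns(self) -> List[Interval]:
        # An intron is the gap between two consecutive exons.
        return [(left[1] + 1, right[0] - 1)
                for left, right in zip(self.exons, self.exons[1:])]


def overlaps(a: Transfrag, b: Transfrag) -> bool:
    return a.start <= b.end and b.start <= a.end


def splicing_agrees(a: Transfrag, b: Transfrag) -> bool:
    """Assumed definition: both have the same introns within the shared region."""
    lo, hi = max(a.start, b.start), min(a.end, b.end)

    def shared_introns(t: Transfrag) -> set:
        return {i for i in t.introns() if i[0] <= hi and i[1] >= lo}

    return shared_introns(a) == shared_introns(b)


def contains(outer: Transfrag, inner: Transfrag) -> bool:
    return outer.start <= inner.start and inner.end <= outer.end


def containment_rule(a: Transfrag, b: Transfrag) -> bool:
    """Toy version of the Cuffcompare-style rule: collapse only if one is redundant."""
    return splicing_agrees(a, b) and (contains(a, b) or contains(b, a))


def overlap_rule(a: Transfrag, b: Transfrag) -> bool:
    """Toy version of the Cuffmerge-style rule: overlap, same splicing, same strand."""
    return a.strand == b.strand and overlaps(a, b) and splicing_agrees(a, b)


a = Transfrag("+", ((100, 200), (300, 400)))
staggered = Transfrag("+", ((350, 400), (500, 600)))  # extends past the end of a
nested = Transfrag("+", ((150, 200), (300, 350)))     # lies inside a, same intron

print(containment_rule(a, staggered), overlap_rule(a, staggered))  # False True
print(containment_rule(a, nested), overlap_rule(a, nested))        # True True
```

The staggered pair is the case where the two rules part ways: under the containment rule both transfrags are kept, while under the overlap rule they would be joined into one longer transcript.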
The other thing that distinguishes these two options is how they deal with a reference annotation. You can read on our website how the Cufflinks Reference Annotation Based Transcript assembler (RABT) works. Cuffcompare doesn't do any RABT assembly; it just includes the reference annotation in the combined.gtf and discards partial transfrags that are contained in and compatible with the reference. Cuffmerge actually runs RABT when you provide a reference, and this happens during the step where transfrags are converted into SAM alignments and assembled. We do this to improve quantification accuracy and reduce errors downstream. I should also say that Cuffmerge runs Cuffcompare in order to annotate the merged assembly with certain helpful features for use later on.
So we recommend #3, for a number of reasons: it is the closest in spirit to #1 while still being reasonably fast. For reasons that I don't want to get into here (pretty arcane details about the Cufflinks assembler), I also feel that option #3 is actually the most accurate in most experimental settings.
https://www.biostars.org/p/15693/
https://www.biostars.org/p/160808/
http://seqanswers.com/forums/showthread.php?t=16422
https://www.biostars.org/p/139186/
https://www.biostars.org/p/10219/
https://www.biostars.org/p/138521/
http://www.broadinstitute.org/cancer/software/genepattern/rna-seq-analysis
Use of TopHat, Cufflinks, Cuffcompare, and Cuffmerge