TranspoSeq
What is TranspoSeq? TranspoSeq identifies non-reference somatic retrotransposon insertions from tumor and normal BAM files. Email transposeq@gmail.com with any questions.
Source Code Download the source: TranspoSeq_v2.0.tar.gz
Reference Files TranspoSeq requires approximately 9 GB of reference files (compressed). These can be downloaded either in bulk or as several separate downloads via the categorized table below.
How does TranspoSeq work? TranspoSeq is a computational framework that takes paired-end sequencing data as input and produces a list of annotated putative somatic retrotransposon insertion sites. First, the input BAMs are parsed for discordant read-pairs, and these pairs are aligned to a consensus retrotransposon sequence. Pairs in which one read aligns to the retrotransposon database and the other aligns to the reference genome with little ambiguity are clustered separately in the forward and reverse directions. Overlapping forward and reverse clusters are then identified and annotated as support for a putative non-reference retrotransposon at the given genomic position. Finally, the read-pairs within each cluster are assembled de novo, and the resulting contig is aligned to both the reference genome and the retrotransposon database to annotate the inserted element. Events with strong evidence that pass the filtering criteria are retained and classified as somatic or germline.
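The clustering and overlap steps described above can be illustrated with a minimal Python sketch. This is not TranspoSeq's actual code: the `Anchor` and `Cluster` types, the function names, and the `max_gap`, `window`, and `min_reads` thresholds are all hypothetical, chosen only for illustration. It assumes the discordant pairs have already been screened, so each `Anchor` is the genome-aligned read of a pair whose mate aligned to a retrotransposon.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Anchor:
    """Genome-aligned read whose mate hit a retrotransposon (hypothetical type)."""
    chrom: str
    pos: int      # leftmost reference position of the anchor read
    strand: str   # '+' (forward) or '-' (reverse)


@dataclass
class Cluster:
    chrom: str
    strand: str
    start: int
    end: int
    n_reads: int


def cluster_anchors(anchors: List[Anchor], max_gap: int = 200) -> List[Cluster]:
    """Group same-strand anchors on one chromosome lying within max_gap bp.

    max_gap is an assumed value, not a TranspoSeq parameter.
    """
    clusters: List[Cluster] = []
    for a in sorted(anchors, key=lambda x: (x.strand, x.chrom, x.pos)):
        last = clusters[-1] if clusters else None
        if (last is not None and last.strand == a.strand
                and last.chrom == a.chrom and a.pos - last.end <= max_gap):
            # Extend the current cluster with this anchor.
            last.end = a.pos
            last.n_reads += 1
        else:
            clusters.append(Cluster(a.chrom, a.strand, a.pos, a.pos, 1))
    return clusters


def candidate_sites(clusters: List[Cluster], window: int = 500,
                    min_reads: int = 2) -> List[Tuple[str, int, int, int]]:
    """Pair each forward cluster with a reverse cluster just downstream of it.

    Returns (chrom, forward_end, reverse_start, total_supporting_reads).
    The pairing rule and thresholds are assumptions made for this sketch.
    """
    fwd = [c for c in clusters if c.strand == '+' and c.n_reads >= min_reads]
    rev = [c for c in clusters if c.strand == '-' and c.n_reads >= min_reads]
    sites = []
    for f in fwd:
        for r in rev:
            if (f.chrom == r.chrom and f.start <= r.start
                    and r.start <= f.end + window):
                sites.append((f.chrom, f.end, r.start, f.n_reads + r.n_reads))
    return sites


# Toy input: three forward and two reverse anchors flanking one site on chr1,
# plus a lone forward anchor on chr2 that is dropped by min_reads.
example = [
    Anchor("chr1", 1000, "+"), Anchor("chr1", 1050, "+"), Anchor("chr1", 1100, "+"),
    Anchor("chr1", 1300, "-"), Anchor("chr1", 1350, "-"),
    Anchor("chr2", 5000, "+"),
]
sites = candidate_sites(cluster_anchors(example))
print(sites)  # [('chr1', 1100, 1300, 5)]
```

In this sketch, a candidate site is reported only when clusters on both strands support it. The de novo assembly, contig alignment, filtering, and somatic versus germline classification steps are left out.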