Genome STRiP

 

News

We have created a public demonstration workspace on Terra that illustrates a custom detailed analysis of C4 using Genome STRiP: https://app.terra.bio/#workspaces/mccarroll-genomestrip-terra/C4AB_Analysis. This is an updated version of the analysis done in the preprint by Nolan Kamitaki from the McCarroll Lab: Complement component 4 genes contribute sex-specific vulnerability in diverse illnesses. Documentation on the workflow is available here: C4 A/B Analysis Workflow.

Overview

Genome STRiP (Genome STRucture In Populations) is a suite of tools for discovering and genotyping structural variations using sequencing data. The methods are designed to detect shared variation using data from multiple individuals.

Genome STRiP looks both across and within a set of sequenced genomes to detect variation. The methods are adaptive and support heterogeneous data sets, including variations in sequencing depth, read lengths and mixtures of paired and single-end reads. A minimum of 20 to 30 genomes is required to get acceptable results, but the method gains power across genomes, so processing more genomes provides better results.

To run discovery or genotyping on a single sequenced genome or a small set of genomes, you need to call your data against a background population, such as a set of genomes from the 1000 Genomes Project. The background population does not need to be matched to the target individuals.
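As a rough illustration of the cohort-size guidance above, the following Python sketch assembles a sample list by padding a small set of target genomes with background genomes until a minimum size is reached. It is not part of Genome STRiP: the function name `build_cohort`, the constant `MIN_COHORT_SIZE`, and the sample identifiers are all hypothetical, and the threshold of 20 is simply the lower end of the 20 to 30 genome guideline.

```python
# Hypothetical helper, not a Genome STRiP API: pad target samples with
# background samples until the cohort reaches a minimum size.

MIN_COHORT_SIZE = 20  # assumption: lower end of the "20 to 30 genomes" guideline


def build_cohort(target_samples, background_samples, min_size=MIN_COHORT_SIZE):
    """Return the targets followed by enough background samples to reach min_size."""
    # De-duplicate the targets while preserving their order.
    cohort = list(dict.fromkeys(target_samples))
    seen = set(cohort)

    for sample in background_samples:
        if len(cohort) >= min_size:
            break
        if sample not in seen:
            cohort.append(sample)
            seen.add(sample)

    if len(cohort) < min_size:
        raise ValueError(
            f"only {len(cohort)} genomes available; at least {min_size} are needed"
        )
    return cohort


# Example with placeholder sample IDs: two targets plus a background panel.
targets = ["TARGET_01", "TARGET_02"]
background = [f"BG_{i:03d}" for i in range(1, 31)]
cohort = build_cohort(targets, background)
print(len(cohort))
```

Because the method gains power as genomes are added, a real analysis would likely include the whole background panel rather than stopping at the minimum; the cutoff here only illustrates the lower bound.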

Genome STRiP can be used for discovery of novel structural variations or to genotype known variants in new samples.

Citing Genome STRiP

Please use the following references to cite Genome STRiP:

For Genome STRiP 2.0 (including duplication and mCNV analysis):

Handsaker RE, Van Doren V, Berman JR, Genovese G, Kashin S, Boettger LM, McCarroll SA
Large multiallelic copy number variations in humans
Nature Genetics 47, 296-303 (2015)
PMID: 25621458

Supplementary Data

For the original set of methods for deletion analysis:

Handsaker RE, Korn JM, Nemesh J, McCarroll SA
Discovery and genotyping of genome structural polymorphism by sequencing on a population scale
Nature Genetics 43, 269-276 (2011)
PMID: 21317889