VariantsPerSampleAnnotator documentation

VariantsPerSampleAnnotator

Annotator that reports on the distribution of variant calls across samples.

Category: Variant Annotators

The VariantsPerSample annotator is invoked through the SVAnnotator framework, which defines the arguments common to all annotators.

Introduction

The VariantsPerSample annotator reports the distribution of variant calls across samples. For each sample, it reports both the total number of variants and the number of singletons.

Input Formats

Population map files are tab-delimited files with two columns. The first column specifies the sample identifier and the second column specifies a population identifier. A header line is optional but, if present, the column names should be SAMPLE and POPULATION.
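
As an illustration of the format described above, the following is a minimal sketch of a population map reader. The function name read_population_map and the sample and population identifiers are hypothetical and are not part of SVToolkit; skipping blank lines is also an assumption, since the format description does not mention them.

```python
def read_population_map(lines):
    """Parse tab-delimited SAMPLE/POPULATION lines into a dict.

    Hypothetical helper for illustration; not part of SVToolkit.
    """
    populations = {}
    first_record = True
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # assumption: blank lines are ignored
        fields = line.split("\t")
        if len(fields) != 2:
            raise ValueError(
                f"expected 2 tab-delimited columns, got {len(fields)}: {line!r}"
            )
        sample, population = fields
        if first_record:
            first_record = False
            if (sample, population) == ("SAMPLE", "POPULATION"):
                continue  # optional header line
        populations[sample] = population
    return populations


# Illustrative input: a header line followed by two samples.
example_lines = [
    "SAMPLE\tPOPULATION\n",
    "NA12878\tCEU\n",
    "NA19238\tYRI\n",
]
pop_map = read_population_map(example_lines)
print(pop_map)
```

The same function accepts an open file handle, because it only iterates over lines.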

Output Formats

This annotator can produce one output: a report file.

The report file contains one line per sample, with the following columns:

SAMPLE
The sample identifier.
POPULATION
The population for this sample.
This column is emitted as a convenience, but the population map is not otherwise used by this annotator.
VARIANTS
The number of sites called variant for this sample.
Specifically, the number of sites where the sample carries at least one non-reference allele.
SINGLETONS
The number of singleton sites, that is, sites where this sample is the only variant sample in the evaluated cohort.
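
The VARIANTS and SINGLETONS definitions above can be sketched as follows. This is a simplified illustration, not the annotator's actual implementation: it assumes each genotype is a tuple of allele indices (0 for the reference allele, None for a missing call) and it does not model filtered variants, filtered genotypes or the genotype quality threshold. The function name variants_per_sample and the sample names are hypothetical.

```python
from collections import Counter


def variants_per_sample(sites, samples):
    """Return {sample: (variants, singletons)} for the given samples.

    sites: list of dicts mapping sample -> tuple of allele indices,
           where 0 is the reference allele and None is a missing call.
    Hypothetical helper for illustration; filtering is not modelled.
    """
    variants = Counter()
    singletons = Counter()
    for genotypes in sites:
        # A sample is variant at a site if it carries at least one
        # non-reference allele.
        carriers = [
            s for s in samples
            if any(a is not None and a > 0 for a in genotypes.get(s, ()))
        ]
        for s in carriers:
            variants[s] += 1
        # A singleton site has exactly one variant sample in the cohort.
        if len(carriers) == 1:
            singletons[carriers[0]] += 1
    return {s: (variants[s], singletons[s]) for s in samples}


sites = [
    {"S1": (0, 1), "S2": (0, 0), "S3": (0, 0)},        # only S1 is variant
    {"S1": (1, 1), "S2": (0, 1), "S3": (0, 0)},        # S1 and S2 are variant
    {"S1": (0, 0), "S2": (0, 0), "S3": (None, None)},  # no variant samples
]
counts = variants_per_sample(sites, ["S1", "S2", "S3"])
for sample, (n_variants, n_singletons) in counts.items():
    print(sample, n_variants, n_singletons)
```

Because singletons are defined relative to the evaluated cohort, restricting the sample list changes the result: a site shared by S1 and S2 becomes a singleton for S1 if S2 is excluded.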

Example

 java -Xmx4g -cp SVToolkit.jar \
     org.broadinstitute.sv.main.SVAnnotator \
     -A VariantsPerSample \
     -R human_g1k_v37.fasta \
     -vcf input.vcf \
     -writeReport true \
     -reportDirectory reportdir
 


VariantsPerSampleAnnotator specific arguments

Name                       Type          Default value  Summary
Optional Parameters
-filterGenotypes           String        NA             True to ignore genotypes that have been filtered (default true)
-filterVariants            String        NA             True to ignore variants that have been filtered (default true)
-genotypeQualityThreshold  Double        NA             Ignore genotypes below this genotype quality GQ/CNQ value (default no threshold)
-populationMapFile         List[File]    NA             Map file (or files) containing sample to population assignments
-sample                    List[String]  NA             Sample(s) or .list file of sample names. If specified, only the listed samples will be used to evaluate the variants.

Argument details

--filterGenotypes / -filterGenotypes ( String )

True to ignore genotypes that have been filtered (default true).

--filterVariants / -filterVariants ( String )

True to ignore variants that have been filtered (default true).

--genotypeQualityThreshold / -genotypeQualityThreshold ( Double )

Ignore genotypes below this genotype quality GQ/CNQ value (default no threshold).

--populationMapFile / -populationMapFile ( List[File] )

Map file (or files) containing sample to population assignments.

--sample / -sample ( List[String] )

Sample(s) or .list file of sample names. If specified, only the listed samples will be used to evaluate the variants.
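
To make the two forms of the -sample value concrete, the sketch below expands a mix of literal sample names and .list files into a single list of names. It assumes that a .list file contains one sample name per line, which this page does not state explicitly; the function name expand_sample_args and the sample names are hypothetical.

```python
import os
import tempfile


def expand_sample_args(values):
    """Expand -sample style values into a flat list of sample names.

    Assumption: a value ending in ".list" names a text file with one
    sample name per line; any other value is a literal sample name.
    Hypothetical helper for illustration; not part of SVToolkit.
    """
    samples = []
    for value in values:
        if value.endswith(".list"):
            with open(value) as handle:
                samples.extend(line.strip() for line in handle if line.strip())
        else:
            samples.append(value)
    return samples


# Illustrative usage with a temporary .list file.
with tempfile.TemporaryDirectory() as tmpdir:
    list_path = os.path.join(tmpdir, "cohort.list")
    with open(list_path, "w") as handle:
        handle.write("S2\nS3\n")
    expanded = expand_sample_args(["S1", list_path])

print(expanded)
```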