Spiderplot documentation
Spiderplot
The Spiderplot utility produces plots showing haplotype structure around a variant (typically a CNV), similar to Figure 6 in DOI 10.1038/ng.3200.
This is a new version of the Spiderplot code that is much faster and more scalable. This version does not currently support using the -population argument to produce multiple plots (one per population). To plot multiple populations, run Spiderplot once per population.
Input
The primary input is a phased VCF file that should contain the variant of interest along with phased markers flanking the variant.
The optional -plotGroupFile input allows you to specify additional decorations that are added to the plot.
The plot decoration group file should be a tab-delimited file with a header line and the following columns:
- LABEL
  - The legend label for the group.
- TYPE
  - Must be either POINT or BAR.
- COLOR
  - Any valid R color.
- SAMPLE
  - A sample or haplotype identifier/group, which can be the sample ID, a hap ID (e.g. SAMPLE-1) or an allele label.
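As an illustration of the column layout described above, the following Python sketch formats a small plot group file. The labels, colors and sample/haplotype IDs are hypothetical, and the sketch assumes one row per decorated sample or haplotype, which the text above does not spell out.

```python
import csv
import io

# Column names and allowed TYPE values are taken from the list above.
COLUMNS = ["LABEL", "TYPE", "COLOR", "SAMPLE"]
VALID_TYPES = {"POINT", "BAR"}


def format_plot_group_file(rows):
    """Return tab-delimited plot group file text for a list of row dicts."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)  # header line
    for row in rows:
        if row["TYPE"] not in VALID_TYPES:
            raise ValueError(f"TYPE must be POINT or BAR, got {row['TYPE']!r}")
        writer.writerow([row[col] for col in COLUMNS])
    return out.getvalue()


# Hypothetical decorations: these IDs and labels are made up for illustration.
example_rows = [
    {"LABEL": "Cases", "TYPE": "POINT", "COLOR": "red", "SAMPLE": "NA12878"},
    {"LABEL": "Reference panel", "TYPE": "BAR", "COLOR": "#1f77b4", "SAMPLE": "NA12891-1"},
]

if __name__ == "__main__":
    # Print the file contents; redirect to a file to pass it to -plotGroupFile.
    print(format_plot_group_file(example_rows), end="")
```

The resulting text can be saved to a file and supplied through the -plotGroupFile option.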
Fine control over plotting can be achieved by overriding the default plotting script (spiderplot.R) using the -plottingScript option.
Example
java -Xmx4g -cp SVToolkit.jar:GenomeAnalysisTK.jar \
    org.broadinstitute.sv.apps.Spiderplot \
    -R reference.fasta \
    -vcf phased_vcf_file.vcf.gz \
    -O output_plot.pdf \
    -site cnvSiteID \
    -flankWidth 50000 \
    -alleleFrequencyThreshold 0.01
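Because this version produces a single plot per run, plotting several populations means running the command once for each. The Python sketch below builds one invocation per population by passing a per-population sample list to the documented -sample argument (which accepts a .list file). The population names, .list file names and output file naming are hypothetical, and whether this is the intended way to restrict a plot to one population is an assumption; the sketch only builds and prints the commands and does not run them.

```python
import shlex


def spiderplot_command(site, vcf, reference, sample_list, output_pdf, flank_width=50000):
    """Build one Spiderplot invocation as an argument list (nothing is executed)."""
    return [
        "java", "-Xmx4g",
        "-cp", "SVToolkit.jar:GenomeAnalysisTK.jar",
        "org.broadinstitute.sv.apps.Spiderplot",
        "-R", reference,
        "-vcf", vcf,
        "-O", output_pdf,
        "-site", site,
        "-flankWidth", str(flank_width),
        "-sample", sample_list,  # .list file of samples for this population
    ]


def per_population_commands(population_lists, site, vcf, reference):
    """Return {population: command}, one Spiderplot run per population."""
    return {
        pop: spiderplot_command(site, vcf, reference, sample_list, f"spiderplot_{pop}.pdf")
        for pop, sample_list in population_lists.items()
    }


# Hypothetical population names and sample .list files.
population_lists = {"CEU": "ceu_samples.list", "YRI": "yri_samples.list"}

if __name__ == "__main__":
    commands = per_population_commands(
        population_lists, "cnvSiteID", "phased_vcf_file.vcf.gz", "reference.fasta"
    )
    for pop, cmd in commands.items():
        print(shlex.join(cmd))
```

To actually run a command, each argument list could be passed to `subprocess.run(cmd, check=True)`.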
Spiderplot-specific arguments
Name | Type | Default value | Summary |
---|---|---|---|
Required Inputs | |||
-vcf | File | NA | Input vcf file containing phased haplotypes |
Required Parameters | |||
-R | File | NA | Reference file (indexed fasta file) |
Optional Outputs | |||
-log | String | NA | Set the logging location |
-O | File | NA | Output file (pdf) |
Optional Parameters | |||
-alleleCountThreshold | Integer | NA | Minimum minor allele count for plotted markers |
-alleleFrequencyThreshold | Double | NA | Minimum minor allele frequency for plotted markers |
-alleleLabelMapFile | List[File] | NA | Map file or files containing a mapping from VCF alleles to allele labels |
-colorMapFile | File | NA | Tab-delimited file mapping alleles to colors (any color value recognized by R) |
-flankMarkerCount | Integer | NA | Size in markers of flanks to plot (default no limit) |
-flankWidth | Integer | 100000 | Size in base pairs of flanks to plot (default 100,000) |
-hapIdMapFile | List[File] | NA | Map file or files containing alternate haplotype IDs to use |
-hapLabelMapFile | List[File] | NA | Map file or files containing the allele to assign to each haplotype |
-L | String | NA | Specific interval to plot (overrides flankWidth) |
-l | String | INFO | Set the minimum level of logging |
-plotGroupFile | File | NA | Tab-delimited file describing plot decorations on haplotypes |
-plotHapIds | String | NA | Whether to plot haplotype IDs (e.g. sample-1, sample-2) on the plot (boolean, default false) |
-plotHeight | String | NA | Plot height in inches or "auto" to auto scale for large plots (default 8) |
-plotTitle | String | NA | Plot title |
-plotWidth | String | NA | Plot width in inches (default 10.5) |
-population | List[String] | NA | Population(s) or .list file of populations |
-populationMapFile | List[File] | NA | Map file or files containing sample to population assignments |
-sample | List[String] | NA | Sample or samples to plot (or .list file) |
-site | String | NA | Site ID to plot |
-siteInterval | String | NA | Explicitly set the start/end position for the target site |
-verbose | String | NA | Enable extra progress output |
Optional Flags | |||
-h | Flag | NA | Generate the help message |
-version | Flag | NA | Output version information |
Advanced Parameters | |||
-debug | String | NA | Enable verbose debugging output |
-hapFile | File | NA | Location of generated text file with input for plotting script |
-hapTreeFile | File | NA | Location of generated tree file with input for plotting script |
-P | List[String] | NA | Override individual configuration parameters |
-plottingScript | String | NA | Custom plotting script to use (instead of default script) |
Argument details
--alleleCountThreshold / -alleleCountThreshold ( Integer )
Minimum minor allele count for plotted markers.
--alleleFrequencyThreshold / -alleleFrequencyThreshold ( Double )
Minimum minor allele frequency for plotted markers.
--alleleLabelMapFile / -alleleLabelMapFile ( List[File] )
Map file or files containing a mapping from VCF alleles to allele labels.
--colorMapFile / -colorMapFile ( File )
Tab-delimited file mapping alleles to colors (any color value recognized by R).
--debug / -debug ( String )
Enable verbose debugging output.
--flankMarkerCount / -flankMarkerCount ( Integer )
Size in markers of flanks to plot (default no limit).
--flankWidth / -flankWidth ( Integer with default value 100000 )
Size in base pairs of flanks to plot (default 100,000).
--hapFile / -hapFile ( File )
Location of generated text file with input for plotting script.
--hapIdMapFile / -hapIdMapFile ( List[File] )
Map file or files containing alternate haplotype IDs to use.
--hapLabelMapFile / -hapLabelMapFile ( List[File] )
Map file or files containing the allele to assign to each haplotype.
--hapTreeFile / -hapTreeFile ( File )
Location of generated tree file with input for plotting script.
--help / -h ( Flag )
Generate the help message.
--interval / -L ( String )
Specific interval to plot (overrides flankWidth).
--log_to_file / -log ( String )
Set the logging location.
--logging_level / -l ( String with default value INFO )
Set the minimum level of logging.
--outputFile / -O ( File )
Output file (pdf).
--parameter / -P ( List[String] )
Override individual configuration parameters.
--plotGroupFile / -plotGroupFile ( File )
Tab-delimited file describing plot decorations on haplotypes.
--plotHapIds / -plotHapIds ( String )
Whether to plot haplotype IDs (e.g. sample-1, sample-2) on the plot (boolean, default false).
--plotHeight / -plotHeight ( String )
Plot height in inches or "auto" to auto scale for large plots (default 8).
--plottingScript / -plottingScript ( String )
Custom plotting script to use (instead of default script).
--plotTitle / -plotTitle ( String )
Plot title.
--plotWidth / -plotWidth ( String )
Plot width in inches (default 10.5).
--population / -population ( List[String] )
Population(s) or .list file of populations.
--populationMapFile / -populationMapFile ( List[File] )
Map file or files containing sample to population assignments.
--referenceFile / -R ( required File )
Reference file (indexed fasta file).
--sample / -sample ( List[String] )
Sample or samples to plot (or .list file).
--site / -site ( String )
Site ID to plot.
--siteInterval / -siteInterval ( String )
Explicitly set the start/end position for the target site.
--vcfFile / -vcf ( required File )
Input vcf file containing phased haplotypes.
--verbose / -verbose ( String )
Enable extra progress output.
--version / -version ( Flag )
Output version information.