Spiderplot documentation
Spiderplot
The Spiderplot utility produces plots showing haplotype structure around a variant (typically a CNV), similar to Figure 6 in DOI 10.1038/ng.3200.
This is a new version of the Spiderplot code that is much faster and more scalable. It does not currently support using the -population argument to produce multiple plots (one per population). To plot multiple populations, run the utility once per population.
Input
The primary input is a phased VCF file that should contain the variant of interest along with phased markers flanking the variant.
The optional -plotGroupFile input allows you to specify additional decorations that are added to the plot.
The plot decoration group file should be a tab-delimited file with a header line and the following columns:
- LABEL
- The legend label for the group.
- TYPE
- Must be either POINT or BAR.
- COLOR
- Any valid R color.
- SAMPLE
- A sample or haplotype identifier/group, which can be the sample ID, a hap ID (e.g. SAMPLE-1) or an allele label.
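As an illustration of the column layout described above, the following sketch writes a small plot decoration group file. The file name plot_groups.txt, the group labels and the sample and haplotype IDs (SAMPLE_A, SAMPLE_B-1) are hypothetical placeholders, not values taken from any real data set. The colors are ordinary R color names.

```shell
#!/bin/sh
# Sketch of a plot decoration group file (all labels and IDs are hypothetical).
# printf expands \t to a literal tab, so the columns are tab-delimited.
printf 'LABEL\tTYPE\tCOLOR\tSAMPLE\n'            >  plot_groups.txt
printf 'Cases\tPOINT\tred\tSAMPLE_A\n'           >> plot_groups.txt
printf 'Controls\tBAR\tsteelblue\tSAMPLE_B-1\n'  >> plot_groups.txt

# Sanity check: every line must have exactly four tab-separated fields.
awk -F'\t' 'NF != 4 { bad = 1 } END { exit bad }' plot_groups.txt \
    && echo "plot_groups.txt looks well-formed"
```

A file like this would then be passed to Spiderplot with -plotGroupFile plot_groups.txt.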
Fine control over plotting can be achieved by overriding the default plotting script (spiderplot.R) using the -plottingScript option.
Example
java -Xmx4g -cp SVToolkit.jar:GenomeAnalysisTK.jar \
org.broadinstitute.sv.apps.Spiderplot \
-R reference.fasta \
-vcf phased_vcf_file.vcf.gz \
-O output_plot.pdf \
-site cnvSiteID \
-flankWidth 50000 \
-alleleFrequencyThreshold 0.01
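Because this version produces one plot per run, plotting several populations means running the command once for each. The sketch below is one possible way to do that, under the assumption that each population's samples are listed in a hypothetical file named after the population (EUR.list, AFR.list) and selected with the -sample argument. Whether this gives the same result as the older -population behavior is not confirmed here. The script only writes the commands to a file (spiderplot_commands.sh, also a hypothetical name) so they can be reviewed before running.

```shell
#!/bin/sh
# Sketch: generate one Spiderplot command per population for later review.
# EUR/AFR and the <population>.list sample files are hypothetical placeholders.
out=spiderplot_commands.sh
: > "$out"    # start with an empty command file

for pop in EUR AFR; do
    # echo joins its arguments with single spaces, giving one command per line.
    echo "java -Xmx4g -cp SVToolkit.jar:GenomeAnalysisTK.jar" \
         "org.broadinstitute.sv.apps.Spiderplot" \
         "-R reference.fasta -vcf phased_vcf_file.vcf.gz" \
         "-site cnvSiteID -flankWidth 50000" \
         "-sample ${pop}.list -O spiderplot_${pop}.pdf" >> "$out"
done

cat "$out"
```

After checking the generated commands, they can be executed with sh spiderplot_commands.sh. Each run writes its own PDF, so the plots do not overwrite one another.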
Spiderplot specific arguments
| Name | Type | Default value | Summary |
|---|---|---|---|
| Required Inputs | |||
| -vcf | File | NA | Input vcf file containing phased haplotypes |
| Required Parameters | |||
| -R | File | NA | Reference file (indexed fasta file) |
| Optional Outputs | |||
| -log | String | NA | Set the logging location |
| -O | File | NA | Output file (pdf) |
| Optional Parameters | |||
| -alleleCountThreshold | Integer | NA | Minimum minor allele count for plotted markers |
| -alleleFrequencyThreshold | Double | NA | Minimum minor allele frequency for plotted markers |
| -alleleLabelMapFile | List[File] | NA | Map file or files containing a mapping from VCF alleles to allele labels |
| -colorMapFile | File | NA | Tab-delimited file mapping alleles to colors (any color value recognized by R). |
| -flankMarkerCount | Integer | NA | Size in markers of flanks to plot (default no limit) |
| -flankWidth | Integer | 100000 | Size in base pairs of flanks to plot (default 100,000) |
| -hapIdMapFile | List[File] | NA | Map file or files containing alternate haplotype IDs to use |
| -hapLabelMapFile | List[File] | NA | Map file or files containing the allele to assign to each haplotype |
| -L | String | NA | Specific interval to plot (overrides flankWidth) |
| -l | String | INFO | Set the minimum level of logging |
| -plotGroupFile | File | NA | Tab-delimited file describing plot decorations on haplotypes |
| -plotHapIds | String | NA | Whether to plot haplotype IDs (e.g. sample-1, sample-2) on the plot (boolean, default false) |
| -plotHeight | String | NA | Plot height in inches or "auto" to auto scale for large plots (default 8) |
| -plotTitle | String | NA | Plot title |
| -plotWidth | String | NA | Plot width in inches (default 10.5) |
| -population | List[String] | NA | Population(s) or .list file of populations |
| -populationMapFile | List[File] | NA | Map file or files containing sample to population assignments |
| -sample | List[String] | NA | Sample or samples to plot (or .list file) |
| -site | String | NA | Site ID to plot |
| -siteInterval | String | NA | Explicitly set the start/end position for the target site |
| -verbose | String | NA | Enable extra progress output |
| Optional Flags | |||
| -h | Flag | NA | Generate the help message |
| -version | Flag | NA | Output version information |
| Advanced Parameters | |||
| -debug | String | NA | Enable verbose debugging output |
| -hapFile | File | NA | Location of generated text file with input for plotting script |
| -hapTreeFile | File | NA | Location of generated tree file with input for plotting script |
| -P | List[String] | NA | Override individual configuration parameters |
| -plottingScript | String | NA | Custom plotting script to use (instead of default script) |
Argument details
--alleleCountThreshold / -alleleCountThreshold ( Integer )
Minimum minor allele count for plotted markers.
--alleleFrequencyThreshold / -alleleFrequencyThreshold ( Double )
Minimum minor allele frequency for plotted markers.
--alleleLabelMapFile / -alleleLabelMapFile ( List[File] )
Map file or files containing a mapping from VCF alleles to allele labels.
--colorMapFile / -colorMapFile ( File )
Tab-delimited file mapping alleles to colors (any color value recognized by R).
--debug / -debug ( String )
Enable verbose debugging output.
--flankMarkerCount / -flankMarkerCount ( Integer )
Size in markers of flanks to plot (default no limit).
--flankWidth / -flankWidth ( Integer with default value 100000 )
Size in base pairs of flanks to plot (default 100,000).
--hapFile / -hapFile ( File )
Location of generated text file with input for plotting script.
--hapIdMapFile / -hapIdMapFile ( List[File] )
Map file or files containing alternate haplotype IDs to use.
--hapLabelMapFile / -hapLabelMapFile ( List[File] )
Map file or files containing the allele to assign to each haplotype.
--hapTreeFile / -hapTreeFile ( File )
Location of generated tree file with input for plotting script.
--help / -h ( Flag )
Generate the help message.
--interval / -L ( String )
Specific interval to plot (overrides flankWidth).
--log_to_file / -log ( String )
Set the logging location.
--logging_level / -l ( String with default value INFO )
Set the minimum level of logging.
--outputFile / -O ( File )
Output file (pdf).
--parameter / -P ( List[String] )
Override individual configuration parameters.
--plotGroupFile / -plotGroupFile ( File )
Tab-delimited file describing plot decorations on haplotypes.
--plotHapIds / -plotHapIds ( String )
Whether to plot haplotype IDs (e.g. sample-1, sample-2) on the plot (boolean, default false).
--plotHeight / -plotHeight ( String )
Plot height in inches or "auto" to auto scale for large plots (default 8).
--plottingScript / -plottingScript ( String )
Custom plotting script to use (instead of default script).
--plotTitle / -plotTitle ( String )
Plot title.
--plotWidth / -plotWidth ( String )
Plot width in inches (default 10.5).
--population / -population ( List[String] )
Population(s) or .list file of populations.
--populationMapFile / -populationMapFile ( List[File] )
Map file or files containing sample to population assignments.
--referenceFile / -R ( required File )
Reference file (indexed fasta file).
--sample / -sample ( List[String] )
Sample or samples to plot (or .list file).
--site / -site ( String )
Site ID to plot.
--siteInterval / -siteInterval ( String )
Explicitly set the start/end position for the target site.
--vcfFile / -vcf ( required File )
Input vcf file containing phased haplotypes.
--verbose / -verbose ( String )
Enable extra progress output.
--version / -version ( Flag )
Output version information.
