Showing tool doc from version 4.0.3.0 | The latest version is 4.1.2.0

**BETA** CreateSomaticPanelOfNormals

Make a panel of normals for use with Mutect2

Category Variant Filtering


Overview

Create a panel of normals (PoN) containing germline and artifactual sites for use with Mutect2.

The tool takes multiple normal sample callsets produced by Mutect2's tumor-only mode and collates sites present in two or more samples (the default threshold, set by --min-sample-count) into a sites-only VCF. The PoN captures common artifactual and germline variant sites. Mutect2 then uses the PoN to filter variants at the site level.
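
The collation rule can be illustrated with a toy sketch. This is not the tool's implementation: the three made-up, uncompressed VCF bodies, the temporary directory, and the choice to match sites on chromosome and position only are all assumptions made for illustration.

```shell
# Toy illustration only; real inputs are Mutect2 callsets, not these made-up files.
workdir=$(mktemp -d)
printf '#CHROM\tPOS\tID\tREF\tALT\nchr1\t100\t.\tA\tT\nchr1\t200\t.\tG\tC\n' > "$workdir/n1.vcf"
printf '#CHROM\tPOS\tID\tREF\tALT\nchr1\t100\t.\tA\tT\nchr1\t300\t.\tC\tG\n' > "$workdir/n2.vcf"
printf '#CHROM\tPOS\tID\tREF\tALT\nchr1\t300\t.\tC\tG\nchr1\t400\t.\tT\tA\n' > "$workdir/n3.vcf"

min_sample_count=2

# Count the number of files in which each CHROM:POS occurs (once per file),
# then keep the sites seen in at least min_sample_count files.
pon_sites=$(awk -v min="$min_sample_count" '
  !/^#/ {
    key = $1 ":" $2
    if (!((FILENAME, key) in seen)) { seen[FILENAME, key] = 1; count[key]++ }
  }
  END { for (k in count) if (count[k] >= min) print k }
' "$workdir/n1.vcf" "$workdir/n2.vcf" "$workdir/n3.vcf" | sort)

echo "$pon_sites"   # prints chr1:100 and chr1:300, one per line
```

Sites chr1:200 and chr1:400 each occur in only one toy callset, so they fall below the threshold and are left out.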

This tool is featured in the Somatic Short Mutation calling Best Practice Workflow. See Tutorial#11136 for a step-by-step description of the workflow and Article#11127 for an overview of what traditional somatic calling entails. For the latest pipeline scripts, see the Mutect2 WDL scripts directory.

Example workflow

Step 1. Run Mutect2 in tumor-only mode for each normal sample.

 gatk Mutect2 \
   -R reference.fa \
   -I normal1.bam \
   -tumor normal1_sample_name \
   --germline-resource af-only-gnomad.vcf.gz \
   -O normal1_for_pon.vcf.gz
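
With several normals, the same command is repeated once per sample. The sketch below only prints the commands rather than running them; the BAM names, the sample names, and the colon-separated list format are hypothetical and would need to match your own data.

```shell
# Hypothetical list of normals: "BAM path:sample name", one per line.
samples="normal1.bam:normal1_sample_name
normal2.bam:normal2_sample_name
normal3.bam:normal3_sample_name"

# Build one Mutect2 tumor-only command per normal, mirroring the example above.
mutect2_cmds=$(printf '%s\n' "$samples" | while IFS=: read -r bam name; do
  out="${bam%.bam}_for_pon.vcf.gz"   # normal1.bam -> normal1_for_pon.vcf.gz
  printf 'gatk Mutect2 -R reference.fa -I %s -tumor %s --germline-resource af-only-gnomad.vcf.gz -O %s\n' \
    "$bam" "$name" "$out"
done)

# Review the commands first; pipe this output to sh to execute them.
printf '%s\n' "$mutect2_cmds"
```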
 

Step 2. Create a list file containing the paths to the VCFs from step 1, one per line. The file name must end in .args or .list; any other extension causes the run to fail.

This step is optional, because the VCFs can also be passed individually as shown in step 3.

     normal1_for_pon.vcf.gz
     normal2_for_pon.vcf.gz
     normal3_for_pon.vcf.gz
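
The list file can be generated rather than typed by hand. The helper below is a minimal sketch: the function name write_vcf_list is hypothetical, and it assumes the step 1 outputs all end in _for_pon.vcf.gz and sit in a single directory.

```shell
# write_vcf_list DIR OUT: list DIR's *_for_pon.vcf.gz files in OUT, one path per line.
write_vcf_list() {
  dir=$1
  out=$2
  # Refuse list-file names the tool would reject.
  case "$out" in
    *.args|*.list) ;;
    *) echo "list file must end in .args or .list: $out" >&2; return 1 ;;
  esac
  : > "$out"
  for vcf in "$dir"/*_for_pon.vcf.gz; do
    [ -e "$vcf" ] || continue   # skip the unexpanded glob when nothing matches
    printf '%s\n' "$vcf" >> "$out"
  done
}
```

For example, `write_vcf_list . normals_for_pon_vcf.args` would produce a file like the one shown above.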
 

Step 3. Combine the normal calls using CreateSomaticPanelOfNormals.

 gatk CreateSomaticPanelOfNormals \
   -vcfs normals_for_pon_vcf.args \
   -O pon.vcf.gz
 

The tool also accepts multiple .args files. Pass each in with the -vcfs option. Alternatively, provide each normal's VCF as separate arguments.

 gatk CreateSomaticPanelOfNormals \
   -vcfs normal1_for_pon.vcf.gz \
   -vcfs normal2_for_pon.vcf.gz \
   -vcfs normal3_for_pon.vcf.gz \
   -O pon.vcf.gz
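
When the number of normals varies, the repeated -vcfs options can be assembled programmatically. This is a sketch under simple assumptions: build_pon_command is a hypothetical helper, it only prints the command, and it does not handle paths containing spaces.

```shell
# build_pon_command OUT VCF...: print a CreateSomaticPanelOfNormals command
# with one -vcfs option per input VCF.
build_pon_command() {
  out=$1
  shift
  cmd="gatk CreateSomaticPanelOfNormals"
  for vcf in "$@"; do
    cmd="$cmd -vcfs $vcf"
  done
  printf '%s -O %s\n' "$cmd" "$out"
}

build_pon_command pon.vcf.gz \
  normal1_for_pon.vcf.gz normal2_for_pon.vcf.gz normal3_for_pon.vcf.gz
```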
 

The resulting VCF will be an eight-column sites-only VCF lacking annotations.
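
One way to confirm the sites-only layout is to count the columns on the #CHROM header line. The helper name count_vcf_columns is hypothetical; the sketch relies only on gzip and awk, and on block-compressed VCFs being readable by gzip.

```shell
# count_vcf_columns FILE: print the number of tab-separated columns
# on the #CHROM header line of a gzip- or bgzip-compressed VCF.
count_vcf_columns() {
  gzip -dc "$1" | awk -F'\t' '/^#CHROM/ { print NF; exit }'
}
```

Running `count_vcf_columns pon.vcf.gz` on a sites-only panel should report 8 (CHROM through INFO); a larger number would indicate FORMAT and sample columns.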

By default the tool fails if multiple VCFs share a sample name. Set the --duplicate-sample-strategy argument to ALLOW_ALL to allow duplicates, or to CHOOSE_FIRST to use only the first VCF with a given sample name.
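
Duplicate sample names can be spotted before running the tool by comparing the VCF headers. The helper below is a sketch: list_duplicate_samples is a hypothetical name, and it assumes sample names appear from column 10 onward on the #CHROM header line, as in standard VCFs.

```shell
# list_duplicate_samples VCF...: print each sample name that appears
# in more than one of the given compressed VCF headers.
list_duplicate_samples() {
  for vcf in "$@"; do
    gzip -dc "$vcf" |
      awk -F'\t' '/^#CHROM/ { for (i = 10; i <= NF; i++) print $i; exit }'
  done | sort | uniq -d
}
```

If the helper prints any names, either fix the inputs or pass an explicit strategy, for example `--duplicate-sample-strategy CHOOSE_FIRST`.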

CreateSomaticPanelOfNormals specific arguments

This table summarizes the command-line arguments that are specific to this tool. For more details on each argument, see the Argument details list below the table.

| Argument name(s) | Default value | Summary |
| --- | --- | --- |
| **Required Arguments** | | |
| --output, -O | null | Output vcf |
| --vcfs | [] | VCFs for samples to include. May be specified either one at a time, or as one or more .args files containing multiple VCFs, one per line. |
| **Optional Tool Arguments** | | |
| --arguments_file | [] | read one or more arguments files and add them to the command line |
| --duplicate-sample-strategy | THROW_ERROR | How to handle duplicate samples: THROW_ERROR to fail, CHOOSE_FIRST to use the first vcf with each sample name, ALLOW_ALL to use all samples regardless of duplicate sample names. |
| --gcs-max-retries, -gcs-retries | 20 | If the GCS bucket channel errors out, how many times it will attempt to re-initiate the connection |
| --help, -h | false | display the help message |
| --min-sample-count | 2 | Number of samples containing a variant site required to include it in the panel of normals. |
| --version | false | display the version number for this tool |
| **Optional Common Arguments** | | |
| --gatk-config-file | null | A configuration file to use with the GATK. |
| --QUIET | false | Whether to suppress job-summary info on System.err. |
| --TMP_DIR | [] | Undocumented option |
| --use-jdk-deflater, -jdk-deflater | false | Whether to use the JdkDeflater (as opposed to IntelDeflater) |
| --use-jdk-inflater, -jdk-inflater | false | Whether to use the JdkInflater (as opposed to IntelInflater) |
| --verbosity | INFO | Control verbosity of logging. |
| **Advanced Arguments** | | |
| --showHidden | false | display hidden arguments |

Argument details

Arguments in this list are specific to this tool. Keep in mind that other arguments are available that are shared with other tools (e.g. command-line GATK arguments); see Inherited arguments above.


--arguments_file / NA

read one or more arguments files and add them to the command line

List[File]  []
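
As a hedged illustration, the tool's arguments could be kept in a file and supplied through this option. The file name pon_arguments.txt and the one-argument-per-line layout are assumptions rather than documented requirements, and the sketch prints the command instead of running it.

```shell
# Write a hypothetical arguments file (name and layout are assumptions).
cat > pon_arguments.txt <<'EOF'
-vcfs normals_for_pon_vcf.args
-O pon.vcf.gz
EOF

# The command that would read those arguments; printed here, not executed.
echo "gatk CreateSomaticPanelOfNormals --arguments_file pon_arguments.txt"
```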


--duplicate-sample-strategy / NA

How to handle duplicate samples: THROW_ERROR to fail, CHOOSE_FIRST to use the first vcf with each sample name, ALLOW_ALL to use all samples regardless of duplicate sample names.

The --duplicate-sample-strategy argument is an enumerated type (DuplicateSampleStrategy), which can have one of the following values:

THROW_ERROR
CHOOSE_FIRST
ALLOW_ALL

DuplicateSampleStrategy  THROW_ERROR


--gatk-config-file / NA

A configuration file to use with the GATK.

String  null


--gcs-max-retries / -gcs-retries

If the GCS bucket channel errors out, how many times it will attempt to re-initiate the connection

int  20  [ [ -∞  ∞ ] ]


--help / -h

display the help message

boolean  false


--min-sample-count / NA

Number of samples containing a variant site required to include it in the panel of normals.

int  2  [ [ -∞  ∞ ] ]
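
Raising the threshold above the number of input normals would leave no site able to qualify, so a quick sanity check before the run may help. This is a sketch: check_min_sample_count is a hypothetical helper, and it assumes a list file with one single-sample VCF per non-empty line.

```shell
# check_min_sample_count LIST MIN: fail if MIN exceeds the number of
# VCFs (non-empty lines) listed in LIST.
check_min_sample_count() {
  n_vcfs=$(grep -c . "$1")
  if [ "$2" -gt "$n_vcfs" ]; then
    echo "--min-sample-count $2 exceeds the $n_vcfs VCFs in $1; no site could qualify" >&2
    return 1
  fi
}
```

A guarded run might then look like `check_min_sample_count normals_for_pon_vcf.args 3 && gatk CreateSomaticPanelOfNormals -vcfs normals_for_pon_vcf.args --min-sample-count 3 -O pon.vcf.gz`.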


--output / -O

Output vcf

R File  null


--QUIET / NA

Whether to suppress job-summary info on System.err.

Boolean  false


--showHidden / -showHidden

display hidden arguments

boolean  false


--TMP_DIR / NA

Undocumented option

List[File]  []


--use-jdk-deflater / -jdk-deflater

Whether to use the JdkDeflater (as opposed to IntelDeflater)

boolean  false


--use-jdk-inflater / -jdk-inflater

Whether to use the JdkInflater (as opposed to IntelInflater)

boolean  false


--vcfs / -vcfs

VCFs for samples to include. Specify them either one at a time on the command line, or in one or more .args files that each list one VCF per line.

R Set[File]  []


--verbosity / -verbosity

Control verbosity of logging.

The --verbosity argument is an enumerated type (LogLevel), which can have one of the following values:

ERROR
WARNING
INFO
DEBUG

LogLevel  INFO


--version / NA

display the version number for this tool

boolean  false



GATK version 4.0.3.0 built at 09-43-2018 09:43:10.