M2 error with canine germline resource and variants_for_contamination files
open | Created 2019-08-19 | Last updated 2019-09-11| Posted by schaluva | See in Github


Mutect Vanilla bug


As discussed here is the information regarding the OutOfBoundsException. Please find below the user report:

I am running the Mutect2 pipeline on canine tumor samples in Terra, using WDL version 2.5 and GATK version 4.1.2.0. I was able to run the pipeline successfully without entering a germline resource file or VCF of common variants for contamination, however, when I did add these files in, I got the following error:

java.lang.IndexOutOfBoundsException: Index: 5, Size: 5
	at java.util.ArrayList.rangeCheck(ArrayList.java:657)
	at java.util.ArrayList.get(ArrayList.java:433)
	at org.broadinstitute.hellbender.tools.walkers.mutect.SomaticGenotypingEngine.lambda$getGermlineAltAlleleFrequencies$31(SomaticGenotypingEngine.java:350)
	at java.util.stream.ReferencePipeline$6$1.accept(ReferencePipeline.java:244)
	at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:545)
	at java.util.stream.AbstractPipeline.evaluateToArrayNode(AbstractPipeline.java:260)
	at java.util.stream.DoublePipeline.toArray(DoublePipeline.java:506)
	at org.broadinstitute.hellbender.tools.walkers.mutect.SomaticGenotypingEngine.getGermlineAltAlleleFrequencies(SomaticGenotypingEngine.java:352)
	at org.broadinstitute.hellbender.tools.walkers.mutect.SomaticGenotypingEngine.getNegativeLogPopulationAFAnnotation(SomaticGenotypingEngine.java:335)
	at org.broadinstitute.hellbender.tools.walkers.mutect.SomaticGenotypingEngine.callMutations(SomaticGenotypingEngine.java:141)
	at org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2Engine.callRegion(Mutect2Engine.java:250)
	at org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2.apply(Mutect2.java:324)
	at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.processReadShard(AssemblyRegionWalker.java:308)
	at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.traverse(AssemblyRegionWalker.java:281)
	at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1039)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
	at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:162)
	at org.broadinstitute.hellbender.Main.mainEntry(Main.java:205)
	at org.broadinstitute.hellbender.Main.main(Main.java:291)
Using GATK jar /root/gatk.jar defined in environment variable GATK_LOCAL_JAR
Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx3000m -jar /root/gatk.jar Mutect2 -R gs://fc-0b0cb3ce-e2cb-4aef-a8b2-08e60d78e87c/Canis_lupus_familiaris_assembly3.fasta -I gs://fc-8268e82b-ed61-4e04-a8c9-a95a05c0952e/bda6f5ba-8928-45bf-a6b0-9fe67d8dd9a4/PreProcessingForVariantDiscovery_GATK4/cccdda67-56e1-4363-aa6c-46ce53ef8afd/call-GatherBamFiles/attempt-2/Abrams_cell.bam -tumor Abrams_1 --germline-resource gs://fc-0b0cb3ce-e2cb-4aef-a8b2-08e60d78e87c/canid_wgs_ref.1.0.no_samples.vcf.gz -pon gs://fc-afa03a31-404c-4a93-9f6a-31b673db5c69/b92f3c35-5813-455b-94dc-3de3b54f5f98/Mutect2_Panel/c9f21d8a-384e-4d17-a6f8-79a502698827/call-MergeVCFs/1-Mutect2_PON_2019-07-25T22-08-49.vcf -L gs://fc-afa03a31-404c-4a93-9f6a-31b673db5c69/f2138b33-3918-4f8a-9b87-1823a0084ac3/Mutect2/c4844164-ecad-4878-9e5d-cd134a7fb40d/call-SplitIntervals/glob-0fc990c5ca95eebc97c4c204e3e303e1/0000-scattered.interval_list -O output.vcf --f1r2-tar-gz f1r2.tar.gz --af-of-alleles-not-in-resource 0.0007 --downsampling-stride 20 --max-reads-per-alignment-start 6 --max-suspicious-reads-per-alignment-start 6`

The germline resource is a VCF of approximately 80 million SNPs and indels (including multi allelic sites) called from a large number of canine WGS. It is formatted as a VCF with no sample information:

chr1    240     .       TG      T       464.40  PASS    AC=4;AF=0.011;AN=332;BaseQRankSum=0.674;ClippingRankSum=0;DP=14798;ExcessHet=0.0026;FS=5.63;InbreedingCoeff=-0.005;MLEAC=14;MLEAF=0.017;MQ=7.49;MQRankSum=-0.967;QD=22.11;ReadPosRankSum=0.967;SOR=3.18

The VCF for variants for contamination is a subset of this VCF, with only biallelic SNPs with AF between 0.01 and 0.2. Initially, it was formatted the same as the above file.

As part of debugging, I tried removing everything from the INFO field of the variants for contamination file, except allele frequency, and I tried using that simplified VCF both for the germline resource and the variants for contamination file. This seemed to fix the index out of bounds error, but the job then failed at the filtering step, with the following error:

java.lang.IllegalArgumentException: log10p: Log10-probability must be 0 or less
	at org.broadinstitute.hellbender.utils.Utils.validateArg(Utils.java:724)
	at org.broadinstitute.hellbender.utils.MathUtils.log10BinomialProbability(MathUtils.java:934)
	at org.broadinstitute.hellbender.utils.MathUtils.binomialProbability(MathUtils.java:927)
	at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.ContaminationFilter.calculateErrorProbability(ContaminationFilter.java:56)
	at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.Mutect2VariantFilter.errorProbability(Mutect2VariantFilter.java:15)
	at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.ErrorProbabilities.lambda$new$1(ErrorProbabilities.java:19)
	at java.util.stream.Collectors.lambda$toMap$58(Collectors.java:1321)
	at java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169)
	at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
	at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.ErrorProbabilities.<init>(ErrorProbabilities.java:19)
	at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.Mutect2FilteringEngine.accumulateData(Mutect2FilteringEngine.java:141)
	at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.FilterMutectCalls.nthPassApply(FilterMutectCalls.java:146)
	at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.lambda$traverse$0(MultiplePassVariantWalker.java:40)
	at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.lambda$traverseVariants$1(MultiplePassVariantWalker.java:77)
	at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
	at java.util.Iterator.forEachRemaining(Iterator.java:116)
	at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
	at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
	at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
	at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.traverseVariants(MultiplePassVariantWalker.java:75)
	at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.traverse(MultiplePassVariantWalker.java:40)
	at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1048)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
	at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:162)
	at org.broadinstitute.hellbender.Main.mainEntry(Main.java:205)
	at org.broadinstitute.hellbender.Main.main(Main.java:291)
Using GATK jar /root/gatk.jar defined in environment variable GATK_LOCAL_JAR
Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx6500m -jar /root/gatk.jar FilterMutectCalls -V gs://fc-afa03a31-404c-4a93-9f6a-31b673db5c69/0bbb4e0e-7293-4ce5-b81f-d722fcec561a/Mutect2/223610c8-ec63-4439-b339-9503ceb80828/call-MergeVCFs/Abrams_cell-unfiltered.vcf -R gs://fc-0b0cb3ce-e2cb-4aef-a8b2-08e60d78e87c/Canis_lupus_familiaris_assembly3.fasta -O Abrams_cell-filtered.vcf --contamination-table /cromwell_root/fc-afa03a31-404c-4a93-9f6a-31b673db5c69/0bbb4e0e-7293-4ce5-b81f-d722fcec561a/Mutect2/223610c8-ec63-4439-b339-9503ceb80828/call-CalculateContamination/contamination.table --tumor-segmentation /cromwell_root/fc-afa03a31-404c-4a93-9f6a-31b673db5c69/0bbb4e0e-7293-4ce5-b81f-d722fcec561a/Mutect2/223610c8-ec63-4439-b339-9503ceb80828/call-CalculateContamination/segments.table --ob-priors /cromwell_root/fc-afa03a31-404c-4a93-9f6a-31b673db5c69/0bbb4e0e-7293-4ce5-b81f-d722fcec561a/Mutect2/223610c8-ec63-4439-b339-9503ceb80828/call-LearnReadOrientationModel/artifact-priors.tar.gz -stats /cromwell_root/fc-afa03a31-404c-4a93-9f6a-31b673db5c69/0bbb4e0e-7293-4ce5-b81f-d722fcec561a/Mutect2/223610c8-ec63-4439-b339-9503ceb80828/call-MergeStats/merged.stats --filtering-stats filtering.stats --min-median-read-position 10

Both of these tests were run on an interval that included a single chromosome (approximately 24Mb).

Thank you for your help!

Best,
Kate

This Issue was generated from your [forums]
[forums]: https://gatkforums.broadinstitute.org/gatk/discussion/24345/m2-error-with-canine-germline-resource-and-variants-for-contamination-files/p1


Return to top