Problem with BaseRecalibrator: java.lang.IllegalArgumentException: fromIndex(64) > toIndex(62)
open | Created 2019-03-18 | Last updated 2019-03-19 | Posted by bhanugandham


Vanilla bug tools


Hi everyone,

I'm facing a similar issue with GATK v4.1.0.0 (HTSJDK v2.18.2 and Picard v2.18.25). I'm using GATK Docker image broadinstitute/gatk:4.1.0.0.

Following what I read here, I checked the bam file and everything seems fine:
gatk ValidateSamFile --INPUT sorted.bam --MODE SUMMARY

Using GATK jar /gatk/gatk-package-4.1.0.0-local.jar
Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /gatk/gatk-package-4.1.0.0-local.jar ValidateSamFile --INPUT CQ-NEQAS-2018.ILLUMINA.library.000000000-BCFDC.1.1.sorted.bam --MODE SUMMARY
16:08:17.382 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gatk/gatk-package-4.1.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
[Thu Mar 07 16:08:17 UTC 2019] ValidateSamFile  --INPUT CQ-NEQAS-2018.ILLUMINA.library.000000000-BCFDC.1.1.sorted.bam --MODE SUMMARY  --MAX_OUTPUT 100 --IGNORE_WARNINGS false --VALIDATE_INDEX true --INDEX_VALIDATION_STRINGENCY EXHAUSTIVE --IS_BISULFITE_SEQUENCED false --MAX_OPEN_TEMP_FILES 8000 --SKIP_MATE_VALIDATION false --VERBOSITY INFO --QUIET false --VALIDATION_STRINGENCY STRICT --COMPRESSION_LEVEL 2 --MAX_RECORDS_IN_RAM 500000 --CREATE_INDEX false --CREATE_MD5_FILE false --GA4GH_CLIENT_SECRETS client_secrets.json --help false --version false --showHidden false --USE_JDK_DEFLATER false --USE_JDK_INFLATER false
[Thu Mar 07 16:08:24 UTC 2019] Executing as mpmachado@lx-bioinfo02 on Linux 2.6.32-696.23.1.el6.x86_64 amd64; OpenJDK 64-Bit Server VM 1.8.0_191-8u191-b12-0ubuntu0.16.04.1-b12; Deflater: Intel; Inflater: Intel; Provider GCS is available; Picard version: Version:4.1.0.0
WARNING 2019-03-07 16:08:24     ValidateSamFile NM validation cannot be performed without the reference. All other validations will still occur.
INFO    2019-03-07 16:10:25     SamFileValidator        Validated Read    10,000,000 records.  Elapsed time: 00:02:00s.  Time for last 10,000,000:  120s.  Last read position: chr9:32,633,613
INFO    2019-03-07 16:12:22     SamFileValidator        Validated Read    20,000,000 records.  Elapsed time: 00:03:58s.  Time for last 10,000,000:  117s.  Last read position: chrM:11,340
No errors found
[Thu Mar 07 16:13:05 UTC 2019] picard.sam.ValidateSamFile done. Elapsed time: 4.79 minutes.
Runtime.totalMemory()=2602041344
Tool returned:
0

But when I ran BaseRecalibrator, I got the fromIndex/toIndex error:
gatk BaseRecalibrator --input sorted.bam --output sorted.baserecalibrator_report.txt --reference GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.bowtie_index.fasta --use-original-qualities true --known-sites snp151common_tablebrowser.bed.bgz --known-sites snp151flagged_tablebrowser.bed.bgz

ERROR: return code 3
STDERR:
15:46:35.795 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gatk/gatk-package-4.1.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
15:46:42.808 INFO  BaseRecalibrator - ------------------------------------------------------------
15:46:42.810 INFO  BaseRecalibrator - The Genome Analysis Toolkit (GATK) v4.1.0.0
15:46:42.810 INFO  BaseRecalibrator - For support and documentation go to https://software.broadinstitute.org/gatk/
15:46:42.813 INFO  BaseRecalibrator - Executing as mpmachado@lx-bioinfo02 on Linux v2.6.32-696.23.1.el6.x86_64 amd64
15:46:42.814 INFO  BaseRecalibrator - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_191-8u191-b12-0ubuntu0.16.04.1-b12
15:46:42.814 INFO  BaseRecalibrator - Start Date/Time: March 7, 2019 3:46:35 PM UTC
15:46:42.815 INFO  BaseRecalibrator - ------------------------------------------------------------
15:46:42.815 INFO  BaseRecalibrator - ------------------------------------------------------------
15:46:42.817 INFO  BaseRecalibrator - HTSJDK Version: 2.18.2
15:46:42.817 INFO  BaseRecalibrator - Picard Version: 2.18.25
15:46:42.817 INFO  BaseRecalibrator - HTSJDK Defaults.COMPRESSION_LEVEL : 2
15:46:42.818 INFO  BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
15:46:42.818 INFO  BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
15:46:42.818 INFO  BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
15:46:42.819 INFO  BaseRecalibrator - Deflater: IntelDeflater
15:46:42.819 INFO  BaseRecalibrator - Inflater: IntelInflater
15:46:42.819 INFO  BaseRecalibrator - GCS max retries/reopens: 20
15:46:42.819 INFO  BaseRecalibrator - Requester pays: disabled
15:46:42.820 INFO  BaseRecalibrator - Initializing engine
15:46:43.760 INFO  FeatureManager - Using codec BEDCodec to read file file:///snp151common_tablebrowser.bed.bgz
15:46:44.016 INFO  FeatureManager - Using codec BEDCodec to read file file:///snp151flagged_tablebrowser.bed.bgz
15:46:44.076 WARN  IndexUtils - Feature file "snp151common_tablebrowser.bed.bgz" appears to contain no sequence dictionary. Attempting to retrieve a sequence dictionary from the associated index file
15:46:44.500 WARN  IndexUtils - Feature file "snp151flagged_tablebrowser.bed.bgz" appears to contain no sequence dictionary. Attempting to retrieve a sequence dictionary from the associated index file
15:46:44.798 INFO  BaseRecalibrator - Done initializing engine
15:46:44.936 INFO  BaseRecalibrationEngine - The covariates being used here:
15:46:44.936 INFO  BaseRecalibrationEngine - 	ReadGroupCovariate
15:46:44.937 INFO  BaseRecalibrationEngine - 	QualityScoreCovariate
15:46:44.937 INFO  BaseRecalibrationEngine - 	ContextCovariate
15:46:44.937 INFO  BaseRecalibrationEngine - 	CycleCovariate
15:46:44.953 INFO  ProgressMeter - Starting traversal
15:46:44.953 INFO  ProgressMeter -        Current Locus  Elapsed Minutes       Reads Processed     Reads/Minute
15:46:45.866 INFO  BaseRecalibrator - Shutting down engine
[March 7, 2019 3:46:45 PM UTC] org.broadinstitute.hellbender.tools.walkers.bqsr.BaseRecalibrator done. Elapsed time: 0.17 minutes.
Runtime.totalMemory()=731381760
java.lang.IllegalArgumentException: fromIndex(64) > toIndex(62)
	at java.util.Arrays.rangeCheck(Arrays.java:113)
	at java.util.Arrays.fill(Arrays.java:3044)
	at org.broadinstitute.hellbender.utils.recalibration.BaseRecalibrationEngine.calculateKnownSites(BaseRecalibrationEngine.java:354)
	at org.broadinstitute.hellbender.utils.recalibration.BaseRecalibrationEngine.calculateSkipArray(BaseRecalibrationEngine.java:322)
	at org.broadinstitute.hellbender.utils.recalibration.BaseRecalibrationEngine.processRead(BaseRecalibrationEngine.java:137)
	at org.broadinstitute.hellbender.tools.walkers.bqsr.BaseRecalibrator.apply(BaseRecalibrator.java:185)
	at org.broadinstitute.hellbender.engine.ReadWalker.lambda$traverse$0(ReadWalker.java:91)
	at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
	at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
	at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
	at java.util.Iterator.forEachRemaining(Iterator.java:116)
	at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
	at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
	at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
	at org.broadinstitute.hellbender.engine.ReadWalker.traverse(ReadWalker.java:89)
	at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:966)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:138)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
	at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:162)
	at org.broadinstitute.hellbender.Main.mainEntry(Main.java:205)
	at org.broadinstitute.hellbender.Main.main(Main.java:291)
Using GATK jar /gatk/gatk-package-4.1.0.0-local.jar
Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /gatk/gatk-package-4.1.0.0-local.jar BaseRecalibrator --input sorted.bam --output sorted.baserecalibrator_report.txt --reference GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.bowtie_index.fasta --use-original-qualities true --known-sites snp151common_tablebrowser.bed.bgz --known-sites snp151flagged_tablebrowser.bed.bgz

I downsampled the fastq files and got similar results.
However, when I supplied only the reduced known-sites file (--known-sites snp151flagged_tablebrowser.bed.bgz) and restricted the run to two intervals (--intervals chr22 --intervals chrY), it worked.
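
Since the stack trace ends in Arrays.fill inside calculateKnownSites, one thing that may be worth checking is whether the known-sites BED files contain records whose start is not smaller than their end. This is only an assumption on my side and not a confirmed cause. BED coordinates are 0-based and half-open, so a record with chromStart equal to chromEnd describes a zero-length feature. Below is a minimal Python sketch for listing such records; the function names find_suspect_bed_records and scan_bed_file are made up for illustration and are not part of GATK.

```python
import gzip


def find_suspect_bed_records(lines):
    """Yield (line_number, chrom, start, end) for BED records with start >= end."""
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        # Skip blank lines and the header lines a BED file may carry.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        if len(fields) < 3:
            continue
        try:
            start, end = int(fields[1]), int(fields[2])
        except ValueError:
            # Not a coordinate line (for example, an unexpected header).
            continue
        # Assumption: zero-length or inverted intervals are the records to inspect.
        if start >= end:
            yield line_number, fields[0], start, end


def scan_bed_file(path):
    """Scan a plain or gzip/bgzip-compressed BED file and return suspect records."""
    opener = gzip.open if path.endswith((".gz", ".bgz")) else open
    with opener(path, "rt") as handle:
        return list(find_suspect_bed_records(handle))


# Small in-memory demonstration with made-up records.
sample = [
    "chr22\t100\t101\trs_example_1\n",
    "chr22\t200\t200\trs_example_2\n",
    "chrY\t300\t299\trs_example_3\n",
]
for record in find_suspect_bed_records(sample):
    print(*record, sep="\t")
```

To scan a real file, something like scan_bed_file("snp151common_tablebrowser.bed.bgz") should return the suspect records as a list. If it reports any, filtering them out and re-indexing the file before rerunning BaseRecalibrator would be one way to test the assumption.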

I attached the downsampled bam file and the reduced known-sites file here, and the reference file can be found here.

I hope you can help me understand what is going on and how to fix it.

Thank you in advance.

Best regards,

Miguel

This Issue was generated from your [forums]
[forums]: https://gatkforums.broadinstitute.org/gatk/discussion/comment/57049#Comment_57049

