Bugs & Feature Requests
A list of known bugs, limitations, and requested enhancements.


These are all the issues tagged "bug" in the GATK4 development repository. If you have reported a bug and cannot find it in this list, let us know in the thread where you reported it. Note that we are no longer listing or processing issues in older versions of GATK (3.x and prior) unless there is substantial negative impact.





Created 2017-11-15 | Updated 2017-11-20 | bug PRIORITY_HIGH Spark


PrintVariantsSpark crashes on Dataproc with a Kryo serialization error.

Ex:

Running:
    gcloud dataproc jobs submit spark --cluster gatk-test-8875b999-b609-4a3f-86ea-973b929fe662 --properties spark.driver.userClassPathFirst=true,spark.io.compression.codec=lzf,spark.driver.maxResultSize=0,spark.executor.extraJavaOptions=-DGATK_STACKTRACE_ON_USER_EXCEPTION=true -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=false -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=1 ,spark.driver.extraJavaOptions=-DGATK_STACKTRACE_ON_USER_EXCEPTION=true -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=false -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=1 ,spark.kryoserializer.buffer.max=512m,spark.yarn.executor.memoryOverhead=600 --jar gs://hellbender-test-logs/test/staging/lb_staging/gatk-package-4.beta.6-37-g0a135f8-SNAPSHOT-spark_7002d0551e84ddef0d74adf95dfee104.jar -- PrintVariantsSpark --V gs://hellbender/test/resources/large/gvcfs/gatk3.7_30_ga4f720357.24_sample.21.expected.vcf --output gs://hellbender-test-logs/test/staging/lb_staging/756f43e6-4663-49ce-8a8c-bf717b07a8c7.vcf --sparkMaster yarn
Job [dfac787d-19aa-4296-8078-c033cd9f440d] submitted.
Waiting for job output...
19:43:09.678 WARN  SparkContextFactory - Environment variables HELLBENDER_TEST_PROJECT and HELLBENDER_JSON_SERVICE_ACCOUNT_KEY must be set or the GCS hadoop connector will not be configured properly
19:43:09.837 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/tmp/dfac787d-19aa-4296-8078-c033cd9f440d/gatk-package-4.beta.6-37-g0a135f8-SNAPSHOT-spark_7002d0551e84ddef0d74adf95dfee104.jar!/com/intel/gkl/native/libgkl_compression.so
[November 15, 2017 7:43:09 PM UTC] PrintVariantsSpark  --output gs://hellbender-test-logs/test/staging/lb_staging/756f43e6-4663-49ce-8a8c-bf717b07a8c7.vcf --variant gs://hellbender/test/resources/large/gvcfs/gatk3.7_30_ga4f720357.24_sample.21.expected.vcf --sparkMaster yarn  --variantShardSize 10000 --variantShardPadding 1000 --shuffle false --readValidationStringency SILENT --interval_set_rule UNION --interval_padding 0 --interval_exclusion_padding 0 --interval_merging_rule ALL --bamPartitionSize 0 --disableSequenceDictionaryValidation false --shardedOutput false --numReducers 0 --help false --version false --showHidden false --verbosity INFO --QUIET false --use_jdk_deflater false --use_jdk_inflater false --gcs_max_retries 20 --disableToolDefaultReadFilters false
[November 15, 2017 7:43:09 PM UTC] Executing as root@gatk-test-8875b999-b609-4a3f-86ea-973b929fe662-m on Linux 3.16.0-4-amd64 amd64; OpenJDK 64-Bit Server VM 1.8.0_131-8u131-b11-1~bpo8+1-b11; Version: 4.beta.6-37-g0a135f8-SNAPSHOT
19:43:09.992 INFO  PrintVariantsSpark - HTSJDK Defaults.COMPRESSION_LEVEL : 1
19:43:09.992 INFO  PrintVariantsSpark - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
19:43:09.993 INFO  PrintVariantsSpark - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : false
19:43:09.993 INFO  PrintVariantsSpark - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
19:43:09.993 INFO  PrintVariantsSpark - Deflater: IntelDeflater
19:43:09.993 INFO  PrintVariantsSpark - Inflater: IntelInflater
19:43:09.993 INFO  PrintVariantsSpark - GCS max retries/reopens: 20
19:43:09.993 INFO  PrintVariantsSpark - Using google-cloud-java patch c035098b5e62cb4fe9155eff07ce88449a361f5d from https://github.com/droazen/google-cloud-java/tree/dr_all_nio_fixes
19:43:09.993 INFO  PrintVariantsSpark - Initializing engine
19:43:09.993 INFO  PrintVariantsSpark - Done initializing engine
17/11/15 19:43:11 INFO org.spark_project.jetty.util.log: Logging initialized @4976ms
17/11/15 19:43:11 INFO org.spark_project.jetty.server.Server: jetty-9.3.z-SNAPSHOT
17/11/15 19:43:11 INFO org.spark_project.jetty.server.Server: Started @5092ms
17/11/15 19:43:11 INFO org.spark_project.jetty.server.AbstractConnector: Started ServerConnector@5917b44d{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
17/11/15 19:43:12 INFO com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase: GHFS version: 1.6.1-hadoop2
17/11/15 19:43:13 INFO org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at gatk-test-8875b999-b609-4a3f-86ea-973b929fe662-m/10.240.0.18:8032
17/11/15 19:43:17 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl: Submitted application application_1510774921124_0001
17/11/15 19:43:28 INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat: Total input files to process : 1
17/11/15 19:43:35 ERROR org.apache.spark.scheduler.TaskResultGetter: Exception while getting task result
com.esotericsoftware.kryo.KryoException: Error during Java deserialization.
Serialization trace:
genotypes (org.seqdoop.hadoop_bam.VariantContextWithHeader)
interval (org.broadinstitute.hellbender.engine.spark.SparkSharder$PartitionLocatable)
	at com.esotericsoftware.kryo.serializers.JavaSerializer.read(JavaSerializer.java:65)
	at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
	at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
	at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
	at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
	at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
	at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
	at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
	at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:396)
	at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:307)
	at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
	at org.apache.spark.serializer.KryoSerializerInstance.deserialize(KryoSerializer.scala:330)
	at org.apache.spark.scheduler.DirectTaskResult.value(TaskResult.scala:88)
	at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$1.apply$mcV$sp(TaskResultGetter.scala:72)
	at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$1.apply(TaskResultGetter.scala:63)
	at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$1.apply(TaskResultGetter.scala:63)
	at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1954)
	at org.apache.spark.scheduler.TaskResultGetter$$anon$3.run(TaskResultGetter.scala:62)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: htsjdk.variant.variantcontext.LazyGenotypesContext
	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:348)
	at java.io.ObjectInputStream.resolveClass(ObjectInputStream.java:677)
	at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1826)
	at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1713)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2000)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)
	at com.esotericsoftware.kryo.serializers.JavaSerializer.read(JavaSerializer.java:63)
	... 20 more
17/11/15 19:43:35 INFO org.spark_project.jetty.server.AbstractConnector: Stopped Spark@5917b44d{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
17/11/15 19:43:35 WARN org.apache.spark.ExecutorAllocationManager: No stages are running, but numRunningTasks != 0
19:43:35.858 INFO  PrintVariantsSpark - Shutting down engine
[November 15, 2017 7:43:35 PM UTC] org.broadinstitute.hellbender.tools.spark.pipelines.PrintVariantsSpark done. Elapsed time: 0.43 minutes.
Runtime.totalMemory()=823132160
org.apache.spark.SparkException: Job aborted due to stage failure: Exception while getting task result: com.esotericsoftware.kryo.KryoException: Error during Java deserialization.
Serialization trace:
genotypes (org.seqdoop.hadoop_bam.VariantContextWithHeader)
interval (org.broadinstitute.hellbender.engine.spark.SparkSharder$PartitionLocatable)
	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1499)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1487)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1486)
	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
	at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1486)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814)
	at scala.Option.foreach(Option.scala:257)
	at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:814)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1714)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1669)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1658)
	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
	at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:630)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2022)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2043)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2062)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2087)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:936)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
	at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
	at org.apache.spark.rdd.RDD.collect(RDD.scala:935)
	at org.apache.spark.api.java.JavaRDDLike$class.collect(JavaRDDLike.scala:361)
	at org.apache.spark.api.java.AbstractJavaRDDLike.collect(JavaRDDLike.scala:45)
	at org.broadinstitute.hellbender.engine.spark.SparkSharder.computePartitionReadExtents(SparkSharder.java:274)
	at org.broadinstitute.hellbender.engine.spark.SparkSharder.joinOverlapping(SparkSharder.java:163)
	at org.broadinstitute.hellbender.engine.spark.SparkSharder.joinOverlapping(SparkSharder.java:128)
	at org.broadinstitute.hellbender.engine.spark.SparkSharder.shard(SparkSharder.java:101)
	at org.broadinstitute.hellbender.engine.spark.VariantWalkerSpark.getVariants(VariantWalkerSpark.java:129)
	at org.broadinstitute.hellbender.engine.spark.VariantWalkerSpark.runTool(VariantWalkerSpark.java:160)
	at org.broadinstitute.hellbender.engine.spark.GATKSparkTool.runPipeline(GATKSparkTool.java:362)
	at org.broadinstitute.hellbender.engine.spark.VariantWalkerSpark.runPipeline(VariantWalkerSpark.java:57)
	at org.broadinstitute.hellbender.engine.spark.SparkCommandLineProgram.doWork(SparkCommandLineProgram.java:38)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:119)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:176)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:195)
	at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:137)
	at org.broadinstitute.hellbender.Main.mainEntry(Main.java:158)
	at org.broadinstitute.hellbender.Main.main(Main.java:239)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
ERROR: (gcloud.dataproc.jobs.submit.spark) Job [dfac787d-19aa-4296-8078-c033cd9f440d] entered state [ERROR] while waiting for [DONE].


Created 2017-11-13 | Updated 2017-11-14 | bug Copy Number tools Spark


On branch ll_CollectAllelicCountsSpark, I have created a CLI tool called CollectAllelicCountsSpark ... This tool should have exactly the same functionality as CollectAllelicCounts, to the point where I can re-use its integration tests.

However, the integration tests fail. When I dig deeper into CollectAllelicCountsSpark, I see that only 8 records (the correct count is 11) are present in the RDD passed to processAlignments... Consider the following code:

@Override
protected void processAlignments(JavaRDD<LocusWalkerContext> rdd, JavaSparkContext ctx) {
    final String sampleName = SampleNameUtils.readSampleName(getHeaderForReads());
    final SampleMetadata sampleMetadata = new SimpleSampleMetadata(sampleName);
    final Broadcast<SampleMetadata> sampleMetadataBroadcast = ctx.broadcast(sampleMetadata);

    final AllelicCountCollector finalAllelicCountCollector =
            rdd.mapPartitions(distributedCount(sampleMetadataBroadcast.getValue(), minimumBaseQuality))
               .reduce((a1, a2) -> combineAllelicCountCollectors(a1, a2, sampleMetadataBroadcast.getValue()));
    final List<LocusWalkerContext> tmp = rdd.collect();
    // ...snip...

In this case tmp has a size of 8, but the integration test indicates that 11 is correct, since 11 intervals are being passed in. Note that emitEmptyLoci() returns true, so 11 is the correct number, as seen in CollectAllelicCountsSparkIntegrationTest.

Additionally, in (at least) one result, the counts are wrong.

CollectAllelicCounts (non-spark) passes the integration test.

I have tried a couple of tests to gather more information:

  • Is emitEmptyLoci() causing the issue?
    It does not appear to be: when it is set to false, I get (essentially) the same error.
  • The code uses mapPartitions rather than map; does this cause the issue, and why use it?
    It does not: I refactored the code to use map and got exactly the same failure. I use mapPartitions in order to instantiate only one AllelicCountCollector per partition, instead of one per locus.
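The mapPartitions-then-reduce pattern above can be simulated without Spark. The sketch below (plain Python, with partitions as lists and a stand-in Collector class; all names are illustrative, not GATK's) shows that the per-partition and per-record strategies should agree, which is consistent with the observation that switching to map did not change the failure:

```python
# Toy, Spark-free simulation of the mapPartitions-then-reduce pattern.
# Partitions are plain lists; Collector stands in for AllelicCountCollector.
from functools import reduce

class Collector:
    def __init__(self):
        self.counts = {}
    def add(self, locus):
        self.counts[locus] = self.counts.get(locus, 0) + 1
    @staticmethod
    def combine(a, b):
        merged = Collector()
        merged.counts = dict(a.counts)
        for k, v in b.counts.items():
            merged.counts[k] = merged.counts.get(k, 0) + v
        return merged

def map_partitions_count(partitions):
    # One Collector per partition (the reason mapPartitions is used),
    # then a reduce over the per-partition collectors.
    per_partition = []
    for part in partitions:
        c = Collector()  # instantiated once per partition
        for locus in part:
            c.add(locus)
        per_partition.append(c)
    return reduce(Collector.combine, per_partition)

def map_count(records):
    # Equivalent per-record version: one Collector per locus.
    collectors = []
    for locus in records:
        c = Collector()
        c.add(locus)
        collectors.append(c)
    return reduce(Collector.combine, collectors)

partitions = [["chr1:10", "chr1:20"], ["chr1:30"], ["chr1:10"]]
flat = [locus for part in partitions for locus in part]
assert map_partitions_count(partitions).counts == map_count(flat).counts
```

Since the two strategies are equivalent here (and reportedly fail identically in the real tool), the suspect is the upstream RDD contents rather than the aggregation step.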

Assigning to @tomwhite at the request of @droazen ...



Created 2017-11-09 | Updated 2017-11-09 | bug GenomicsDB PRIORITY_HIGH


@lbergelson reports that, by default, GenomicsDBImport silently continues when the target GenomicsDB workspace already exists instead of throwing, resulting in a corrupt GenomicsDB. The default should be to throw.
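A minimal sketch of the requested fail-fast behavior, in Python for illustration (GenomicsDBImport itself is Java, and its real interface differs; the function and flag names here are made up):

```python
import os
import tempfile

def create_workspace(path, overwrite=False):
    """Fail fast if the workspace already exists.

    Illustrative only: the point is that an existing workspace should be an
    error by default, not a silent (and corrupting) continue.
    """
    if os.path.exists(path) and not overwrite:
        raise FileExistsError(
            f"GenomicsDB workspace '{path}' already exists; refusing to "
            "import into it (pass overwrite=True to replace it)")
    os.makedirs(path, exist_ok=True)
    return path

workspace = os.path.join(tempfile.mkdtemp(), "my_genomicsdb")
create_workspace(workspace)      # first import: fine
try:
    create_workspace(workspace)  # second import: must throw
    raised = False
except FileExistsError:
    raised = True
assert raised
```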



Created 2017-11-09 | Updated 2017-11-14 | bug GenomicsDB PRIORITY_HIGH


It looks like we've found a bug in GenomicsDB.

We had a project with 26 replicates (the same sample present twice) among roughly 1,000 other samples. We therefore uniquified the names of those 26 samples in the sample map that is input to TileDB (i.e. converted them to project.sample instead of just sample); obviously, the names inside the GVCFs themselves remained unaltered.

When we look at the output VCF from GenomicsDB, there's definitely a problem. These 52 samples are the first ones in the list and here's what we see:

  • The first 26 samples (the first occurrence of each replicate) are fine.
  • The next 24 samples (second occurrences of replicates) are all “.:0,0” (i.e. empty) for every column in the VCF.
  • The next 2 samples (also second occurrences of replicates) are fine.

Given that our batch size was 50 when importing into GenomicsDB, this looks suspiciously like an error in the batching: within a batch, it seems the renaming is somehow not respected.
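The 26 / 24 / 2 split lines up exactly with the batch size. A purely arithmetic sketch (assuming, as described above, that the 52 replicate samples come first in import order):

```python
# Sample positions 1-26 are the first occurrences, 27-52 the renamed second
# occurrences; with a batch size of 50, the batch boundary falls at
# position 50, exactly where the broken/fine split is observed.
BATCH_SIZE = 50

def batch_of(position, batch_size=BATCH_SIZE):
    """1-based sample position -> 0-based import batch index."""
    return (position - 1) // batch_size

first_occurrences = range(1, 27)    # samples 1..26
second_occurrences = range(27, 53)  # renamed duplicates, samples 27..52

# All first occurrences land in batch 0...
assert {batch_of(p) for p in first_occurrences} == {0}
# ...as do the 24 broken second occurrences (positions 27..50),
broken = [p for p in second_occurrences if batch_of(p) == 0]
# while the 2 second occurrences that import correctly start batch 1.
ok = [p for p in second_occurrences if batch_of(p) == 1]
assert len(broken) == 24 and len(ok) == 2
```

This is consistent with the suspicion that the renaming is only honored for the first occurrence of a name within each batch.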

@kgururaj



Created 2017-11-08 | Updated 2017-11-13 | bug HaplotypeCaller


We have a problematic read that goes in with cigar 14S45M92S (or something similar) and becomes 137H14S when clipped by ReadClipper.hardClipLowQualEnds (AssemblyBasedCallerUtils, line 73). This is because the first 137 bases have base quality 2. Mutect2 (and HaplotypeCaller) then fails because the start position is miscalculated as -4. This only occurs when a specific interval file is used; when run with -L chrM, for example, we don't get the error.
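A toy model (not GATK's actual ClippingOp logic) of how hard-clipping read bases from the left should move the alignment start makes the failure mode concrete: clipping the 137 low-quality bases of a 14S45M92S read consumes every aligned base, so no valid start exists, and code that computes one anyway can produce a nonsensical value like -4:

```python
import re

# Illustrative sketch: which CIGAR operations consume reference vs. read
# bases, per the SAM spec (simplified; deletions adjacent to the clip point
# are not handled specially here).
REF_CONSUMING = set("MDN=X")
READ_CONSUMING = set("MIS=X")

def parse_cigar(cigar):
    return [(int(n), op) for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)]

def start_after_left_clip(cigar, start, n_clip):
    """New alignment start after removing the first n_clip read bases,
    or None if no reference-aligned bases remain."""
    remaining = n_clip
    has_aligned = False
    for length, op in parse_cigar(cigar):
        if remaining > 0 and op in READ_CONSUMING:
            take = min(length, remaining)
            remaining -= take
            if op in REF_CONSUMING:
                start += take
            length -= take
        if length > 0 and op in REF_CONSUMING:
            has_aligned = True
    return start if has_aligned else None

# Clipping past the soft clip into the match block moves the start...
assert start_after_left_clip("14S45M92S", 147, 20) == 153
# ...but clipping all 137 low-quality bases leaves only soft clip (137H14S):
assert start_after_left_clip("14S45M92S", 147, 137) is None
```

The fully-clipped case is exactly the one where a valid start position cannot be computed, which the real code appears not to guard against.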

Problematic bam: /seq/picard_aggregation/RP-1490/WGS/OCB006937/v2/OCB006937.bam
Read Name: H3FWGCCXY170920
Position: chrM:147-146 (this is what I wrote down but I'm not sure if it's right)
Reference: /seq/references/Mus_musculus_assembly10/v0/Mus_musculus_assembly10.fasta (this is a mouse reference)

java.lang.IllegalArgumentException: contig must be non-null and not equal to *, and start must be >= 1
contig = chrM
start = -4
	at org.broadinstitute.hellbender.utils.read.SAMRecordToGATKReadAdapter.setPosition(SAMRecordToGATKReadAdapter.java:92)
	at org.broadinstitute.hellbender.utils.clipping.ClippingOp.applyHARDCLIP_BASES(ClippingOp.java:381)
	at org.broadinstitute.hellbender.utils.clipping.ClippingOp.apply(ClippingOp.java:73)
	at org.broadinstitute.hellbender.utils.clipping.ReadClipper.clipRead(ReadClipper.java:145)
	at org.broadinstitute.hellbender.utils.clipping.ReadClipper.clipRead(ReadClipper.java:126)
	at org.broadinstitute.hellbender.utils.clipping.ReadClipper.hardClipSoftClippedBases(ReadClipper.java:330)
	at org.broadinstitute.hellbender.utils.clipping.ReadClipper.hardClipSoftClippedBases(ReadClipper.java:333)
	at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.AssemblyBasedCallerUtils.finalizeRegion(AssemblyBasedCallerUtils.java:82)
	at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.AssemblyBasedCallerUtils.assembleReads(AssemblyBasedCallerUtils.java:242)
	at org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2Engine.callRegion(Mutect2Engine.java:211)
	at org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2.apply(Mutect2.java:245)
	at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.processReadShard(AssemblyRegionWalker.java:317)
	at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.traverse(AssemblyRegionWalker.java:293)
	at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:838)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:119)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:176)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:195)
	at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:137)
	at org.broadinstitute.hellbender.Main.mainEntry(Main.java:158)
	at org.broadinstitute.hellbender.Main.main(Main.java:239)


Created 2017-11-06 | Updated 2017-11-08 | Barclay bug


If you specify an input argument that is a file auto-expanded by Barclay, the entire contents of the expanded list are included wherever the command line is displayed, such as in console output and in output files (PG records, etc.).



Created 2017-11-02 | Updated 2017-11-07 | bug Engine performance


I'm running SelectVariants with an interval list L of 600,000 loci and an excluded interval list XL of 160 loci. By varying the sizes of L and XL, the runtime appears to scale as |L|×|XL|, which here comes out to over an hour. The runtime does not depend at all on the size of the VCF in the -V argument: once traversal begins it is essentially instantaneous, but processing the intervals takes forever.

Naively, one could make this scale as |L| + |XL| by querying each variant in the input VCF against L and then against XL. For reasons @cmnbroad points out this is not a great solution, but since (i) L and XL are sorted and (ii) each interval in L overlaps only a sparse subset of XL and vice versa, this quadratic scaling should not be necessary.

For my case I can work around this easily by preprocessing V with -XL, then running SelectVariants with -L.
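The |L| + |XL| sweep described above can be sketched on sorted, non-overlapping half-open intervals over a single contig (a simplification of real interval lists; the function name is illustrative):

```python
# Sketch of a merge-style sweep over two sorted interval lists instead of
# an all-pairs comparison. Intervals are half-open (start, end) tuples on
# one contig, sorted and non-overlapping within each list.

def subtract_intervals(included, excluded):
    """Return included minus excluded via a single coordinated sweep."""
    result = []
    j = 0
    for start, end in included:
        cur = start
        # Permanently skip excluded intervals that end before this one begins.
        while j < len(excluded) and excluded[j][1] <= cur:
            j += 1
        # Walk the excluded intervals that overlap this included interval.
        k = j  # don't advance j past intervals that may overlap the next L
        while k < len(excluded) and excluded[k][0] < end:
            xs, xe = excluded[k]
            if xs > cur:
                result.append((cur, xs))
            cur = max(cur, xe)
            k += 1
        if cur < end:
            result.append((cur, end))
    return result

L = [(0, 10), (20, 30), (40, 50)]
XL = [(5, 8), (25, 45)]
assert subtract_intervals(L, XL) == [(0, 5), (8, 10), (20, 25), (45, 50)]
```

Because j only advances past excluded intervals that can no longer overlap anything, the sweep stays near-linear as long as overlap is sparse, matching assumption (ii) above.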

@lbergelson was also in this conversation.



Created 2017-10-31 | Updated 2017-11-01 | bug


Results in a warning message.



Created 2017-10-23 | Updated 2017-10-23 | bug GenomicsDB


Hi all,
I'm running into a segfault with GATK4 beta 6 when running GenomicsDBImport on some batches. Here is a small, self-contained test case that demonstrates the problem:

https://s3.amazonaws.com/chapmanb/testcases/gatk/gatk4_genomicsdb_segfault.tar.gz

When running:

gatk-launch --javaOptions '-Xms1g -Xmx2g' GenomicsDBImport --genomicsDBWorkspace fails_genomicsdb -L chr6:130365070-146544250 --variant NA12878.vcf.gz --variant NA24631.vcf.gz --variant NA24385.vcf.gz

It appears to segfault in jniImportBatch:

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j  com.intel.genomicsdb.GenomicsDBImporter.jniImportBatch(J[J)Z+0
j  com.intel.genomicsdb.GenomicsDBImporter.importBatch()Z+160
j  org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport.traverse()V+301
j  org.broadinstitute.hellbender.engine.GATKTool.doWork()Ljava/lang/Object;+12
j  org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool()Ljava/lang/Object;+27
j  org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs()Ljava/lang/Object;+431
j  org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain([Ljava/lang/String;)Ljava/lang/Object;+14
j  org.broadinstitute.hellbender.Main.runCommandLineProgram(Lorg/broadinstitute/hellbender/cmdline/CommandLineProgram;[Ljava/lang/String;)Ljava/lang/Object;+20
j  org.broadinstitute.hellbender.Main.mainEntry([Ljava/lang/String;)V+19
j  org.broadinstitute.hellbender.Main.main([Ljava/lang/String;)V+8
v  ~StubRoutines::call_stub

The same command works without the NA24385.vcf.gz sample, but it isn't clear from the inputs what causes the issue. I'm also seeing similar behavior in a few other regions and suspect they all share the same underlying cause.

Thanks much for any pointers or ideas and please let me know if any other information would be useful.



Created 2017-10-16 | Updated 2017-10-25 | bug HaplotypeCaller forum


Bug Report

Affected tool(s)

HaplotypeCaller

Affected version(s)

GATK4.beta5

Description

HaplotypeCaller does not make some calls depending on the padding size around the interval of interest. The variant calls should not depend on the interval size. For example, with -ip 50 I get 7 variant calls, but with -ip 150 I get only 2. It seems to be an issue with the graph assembly (perhaps due to repeat regions), but adding --allowNonUniqueKmersInRef does not help. In the IGV screenshots below, the top track is the original BAM file; the second is the bamout with -ip 50; the third with -ip 100; the fourth with -ip 150; the fifth with -ip 200.

[IGV screenshot: original BAM and bamouts with -ip 50, 100, 150, and 200]

Notice the difference in calls between -ip 50 and -ip 150. The call should be made regardless of -ip.

Steps to reproduce

Files are here:
/humgen/gsa-scr1/schandra/SkyWarrior_HCMissingCalls/GATK_Bugsubmit_10448_haplotypecaller-missing-snp-calls

Commands:
gatk-4.beta.5/gatk-launch HaplotypeCaller -R reference/hg19_ref-ym.fa -I MLC1_Exome_Depth208.bam -L region.bed -O Sheila.HaplotypeCaller.vcf

gatk-4.beta.5/gatk-launch HaplotypeCaller -R reference/hg19_ref-ym.fa -I MLC1_Exome_Depth208.bam -L region.bed -O Sheila.HaplotypeCaller.50.vcf -ip 50

gatk-4.beta.5/gatk-launch HaplotypeCaller -R reference/hg19_ref-ym.fa -I MLC1_Exome_Depth208.bam -L region.bed -O Sheila.HaplotypeCaller.100.vcf -ip 100

gatk-4.beta.5/gatk-launch HaplotypeCaller -R reference/hg19_ref-ym.fa -I MLC1_Exome_Depth208.bam -L region.bed -O Sheila.HaplotypeCaller.150.vcf -ip 150

gatk-4.beta.5/gatk-launch HaplotypeCaller -R reference/hg19_ref-ym.fa -I MLC1_Exome_Depth208.bam -L region.bed -O Sheila.HaplotypeCaller.200.vcf -ip 200

This issue was generated from the forum thread: https://gatkforums.broadinstitute.org/gatk/discussion/10448/haplotypecaller-missing-snp-calls/p1



Created 2017-10-11 | Updated 2017-10-11 | bug Engine enhancement


Although Spark supports equals signs in configuration values, the current code in this class does not: it splits the name=value string on every "=" present, resulting in a bad-argument value exception.
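The failure mode in miniature, illustrated in Python (the GATK class itself is Java, where the equivalent fix is split("=", 2)): splitting a conf entry on every "=" mangles values that themselves contain equals signs, while splitting only on the first "=" preserves them.

```python
# A Spark conf entry whose value legitimately contains "=".
entry = "spark.executor.extraJavaOptions=-Dfoo=bar"

def parse_conf_broken(e):
    parts = e.split("=")       # splits on EVERY "="
    return parts[0], parts[1]  # value silently truncated to "-Dfoo"

def parse_conf_fixed(e):
    name, _, value = e.partition("=")  # split on the first "=" only
    return name, value

assert parse_conf_broken(entry)[1] == "-Dfoo"  # wrong: "=bar" is lost
assert parse_conf_fixed(entry) == ("spark.executor.extraJavaOptions", "-Dfoo=bar")
```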



Created 2017-10-06 | Updated 2017-10-17 | bug GenomicsDB


Currently, GenomicsDB's GenomicsDBImporter.generateVidMapFromMergedHeader only includes specific subclasses of VCFHeaderLine; it is missing support for VCFHeaderLine itself and SimpleVCFHeaderLine. These header lines should be handled and propagated to the output.
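A toy Python model of the dispatch problem (the class hierarchy here only mimics htsjdk's in shape; it is not the real API): handling only a fixed set of subclasses silently drops generic header lines, while falling through to the base class propagates everything.

```python
# Stand-in header-line hierarchy, shaped like htsjdk's but purely illustrative.
class VCFHeaderLine:
    def __init__(self, key, value):
        self.key, self.value = key, value

class VCFInfoHeaderLine(VCFHeaderLine): pass
class VCFFormatHeaderLine(VCFHeaderLine): pass
class VCFFilterHeaderLine(VCFHeaderLine): pass

def merge_header_lines_broken(lines):
    # Only the "known" subclasses survive; generic lines are dropped.
    keep = (VCFInfoHeaderLine, VCFFormatHeaderLine, VCFFilterHeaderLine)
    return [l for l in lines if isinstance(l, keep)]

def merge_header_lines_fixed(lines):
    # Fall through to the base class so every header line propagates.
    return [l for l in lines if isinstance(l, VCFHeaderLine)]

lines = [VCFInfoHeaderLine("DP", "Total Depth"),
         VCFHeaderLine("source", "HaplotypeCaller"),  # dropped by the bug
         VCFFilterHeaderLine("LowQual", "Low quality")]
assert len(merge_header_lines_broken(lines)) == 2
assert len(merge_header_lines_fixed(lines)) == 3
```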



Created 2017-10-04 | Updated 2017-11-01 | bug Copy Number tools HaplotypeCaller Spark


I'm running HaplotypeCallerSpark with GATK4 beta 5 and hitting a failure during reference-name lookup, which reports that a strange reference name cannot be found:

java.lang.IllegalArgumentException: Reference name for '1145119298' not found in sequence dictionary.

This happens with HaplotypeCallerSpark independent of the number of cores or memory requested, but HaplotypeCaller works fine with the same BED and BAM inputs. The BED file has a number of regions on chr15, chr16, and chr17, and I'm happy to share anything that would be useful. I'm still working on a minimal example that demonstrates the issue.

The command we're running is:

gatk-launch --javaOptions '-Xms1000m -Xmx46965m -Djava.io.tmpdir=/mnt/work/cwl/bcbio_validation_workflows/giab-joint/bunny_work/main-giab-joint-2017-10-03-104521.457/root/variantcall/8/variantcall_batch_region/16/bcbiotx/tmp7exzBH' HaplotypeCallerSpark --reference /mnt/work/cwl/bcbio_validation_workflows/giab-joint/biodata/collections/hg38/ucsc/hg38.2bit --annotation FisherStrand --annotation MappingQualityRankSumTest --annotation MappingQualityZero --annotation QualByDepth --annotation ReadPosRankSumTest --annotation RMSMappingQuality --annotation BaseQualityRankSumTest --annotation MappingQuality --annotation DepthPerAlleleBySample --annotation Coverage -I /mnt/work/cwl/bcbio_validation_workflows/giab-joint/bunny_work/main-giab-joint-2017-10-02-100534.907/root/postprocess_alignment/2/align/NA24631/NA24631-sort-recal.bam -L /mnt/work/cwl/bcbio_validation_workflows/giab-joint/bunny_work/main-giab-joint-2017-10-03-104521.457/root/variantcall/8/variantcall_batch_region/16/gatk-haplotype/chr15/NA24631-chr15_68578892_84670250-block-regions.bed --interval_set_rule INTERSECTION --sparkMaster local[16] --conf spark.local.dir=/mnt/work/cwl/bcbio_validation_workflows/giab-joint/bunny_work/main-giab-joint-2017-10-03-104521.457/root/variantcall/8/variantcall_batch_region/16/bcbiotx/tmp7exzBH --annotation ClippingRankSumTest --annotation DepthPerSampleHC --output /mnt/work/cwl/bcbio_validation_workflows/giab-joint/bunny_work/main-giab-joint-2017-10-03-104521.457/root/variantcall/8/variantcall_batch_region/16/bcbiotx/tmp7exzBH/NA24631-chr15_68578892_84670250-block.vcf.gz --emitRefConfidence GVCF -GQB 10 -GQB 20 -GQB 30 -GQB 40 -GQB 60 -GQB 80

and the traceback is:

	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
	at org.apache.spark.scheduler.Task.run(Task.scala:86)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1454)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1442)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1441)
	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
	at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1441)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:811)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:811)
	at scala.Option.foreach(Option.scala:257)
	at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:811)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1667)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1622)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1611)
	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
	at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:632)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:1873)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:1886)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:1899)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:1913)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:912)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
	at org.apache.spark.rdd.RDD.withScope(RDD.scala:358)
	at org.apache.spark.rdd.RDD.collect(RDD.scala:911)
	at org.apache.spark.RangePartitioner$.sketch(Partitioner.scala:264)
	at org.apache.spark.RangePartitioner.<init>(Partitioner.scala:126)
	at org.apache.spark.rdd.OrderedRDDFunctions$$anonfun$sortByKey$1.apply(OrderedRDDFunctions.scala:62)
	at org.apache.spark.rdd.OrderedRDDFunctions$$anonfun$sortByKey$1.apply(OrderedRDDFunctions.scala:61)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
	at org.apache.spark.rdd.RDD.withScope(RDD.scala:358)
	at org.apache.spark.rdd.OrderedRDDFunctions.sortByKey(OrderedRDDFunctions.scala:61)
	at org.apache.spark.api.java.JavaPairRDD.sortByKey(JavaPairRDD.scala:913)
	at org.apache.spark.api.java.JavaPairRDD.sortByKey(JavaPairRDD.scala:903)
	at org.broadinstitute.hellbender.utils.spark.SparkUtils.coordinateSortReads(SparkUtils.java:130)
	at org.broadinstitute.hellbender.tools.HaplotypeCallerSpark.callVariantsWithHaplotypeCallerAndWriteOutput(HaplotypeCallerSpark.java:163)
	at org.broadinstitute.hellbender.tools.HaplotypeCallerSpark.runTool(HaplotypeCallerSpark.java:125)
	at org.broadinstitute.hellbender.engine.spark.GATKSparkTool.runPipeline(GATKSparkTool.java:353)
	at org.broadinstitute.hellbender.engine.spark.SparkCommandLineProgram.doWork(SparkCommandLineProgram.java:38)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:119)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:176)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:195)
	at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:131)
	at org.broadinstitute.hellbender.Main.mainEntry(Main.java:152)
	at org.broadinstitute.hellbender.Main.main(Main.java:233)
Caused by: java.lang.IllegalArgumentException: Reference name for '1145119298' not found in sequence dictionary.
	at htsjdk.samtools.SAMRecord.resolveNameFromIndex(SAMRecord.java:569)
	at htsjdk.samtools.SAMRecord.setReferenceIndex(SAMRecord.java:422)
	at htsjdk.samtools.BAMRecord.<init>(BAMRecord.java:87)
	at htsjdk.samtools.DefaultSAMRecordFactory.createBAMRecord(DefaultSAMRecordFactory.java:42)
	at htsjdk.samtools.BAMRecordCodec.decode(BAMRecordCodec.java:210)
	at htsjdk.samtools.BAMFileReader$BAMFileIterator.getNextRecord(BAMFileReader.java:829)
	at htsjdk.samtools.BAMFileReader$BAMFileIndexIterator.getNextRecord(BAMFileReader.java:981)
	at htsjdk.samtools.BAMFileReader$BAMFileIterator.advance(BAMFileReader.java:803)
	at htsjdk.samtools.BAMFileReader$BAMFileIterator.next(BAMFileReader.java:797)
	at htsjdk.samtools.BAMFileReader$BAMFileIterator.next(BAMFileReader.java:765)
	at htsjdk.samtools.BAMFileReader$BAMQueryFilteringIterator.advance(BAMFileReader.java:1034)
	at htsjdk.samtools.BAMFileReader$BAMQueryFilteringIterator.<init>(BAMFileReader.java:1003)
	at htsjdk.samtools.BAMFileReader.createIndexIterator(BAMFileReader.java:944)
	at org.seqdoop.hadoop_bam.BAMRecordReader.initialize(BAMRecordReader.java:169)
	at org.seqdoop.hadoop_bam.BAMInputFormat.createRecordReader(BAMInputFormat.java:121)
	at org.seqdoop.hadoop_bam.AnySAMInputFormat.createRecordReader(AnySAMInputFormat.java:190)
	at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:170)
	at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:130)
	at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:67)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
	at org.apache.spark.scheduler.Task.run(Task.scala:86)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

Thanks for any suggestions or pointers to debug further.



Created 2017-10-02 | Updated 2017-10-03 | Barclay bug


After broadinstitute/barclay@78b1711, ReadFilters are documented without the arguments they contain, even if they can be instantiated. This is a Barclay bug, but it should be fixed so that the documentation is complete.



Created 2017-10-01 | Updated 2017-10-03 | bug SV


The code that composes the result output folder depends on git branch --contains HASH to pick up a line with a standard branch name (e.g. joe_doe_bugfix).

However, this is not necessarily the case if the current checkout is not attached to a local branch, for example after git fetch; git checkout origin/master. In that case the git-branch line that gets picked up is * (HEAD detached at origin/master), so "origin/master)" rather than "joe_doe_bugfix" becomes part of the result output directory name.

The problematic characters are "/" and ")", which cause failures later (at least when running copy_sv_results.sh) because they are not escaped appropriately.

Obvious ways to address this:

  1. remove that component of the output name, as it is not needed to make the name unique.
  2. change the sub-command to handle that situation.
  3. or fail early (before spinning up the cluster) if the GATK git checkout is detached.


Created 2017-09-24 | Updated 2017-11-17 | bug GKL NativeLibraries


I'm running into a consistent core dump in GATK 4 beta 5 (GKL 0.5.8) related to deflation with the Intel Genomics Kernel Library. This occurs on an AWS m4.4xlarge machine running Ubuntu 16.04; it consistently core dumps with this stack trace:

https://gist.github.com/chapmanb/006c1c9abeb21e9baf244d17d7ae1003

Running ApplyBQSR:

unset JAVA_HOME && export PATH=/mnt/work/bcbio/anaconda/bin:$PATH && gatk-launch --javaOptions '-Xms1000m -Xmx46965m -XX:+UseSerialGC -Djava.io.tmpdir=/mnt/work/cwl/bcbio_validation_workflows/somatic-giab-mix/bunny_work/main-somatic-giab-mix-2017-09-23-094842.494/root/postprocess_alignment/2/bcbiotx/tmpLCoup3' ApplyBQSRSpark --sparkMaster local[16] --input /mnt/work/cwl/bcbio_validation_workflows/somatic-giab-mix/bunny_work/main-somatic-giab-mix-2017-09-22-201054.451/root/alignment/2/merge_split_alignments/align/giab-mix-tumor/giab-mix-tumor-sort.bam --output /mnt/work/cwl/bcbio_validation_workflows/somatic-giab-mix/bunny_work/main-somatic-giab-mix-2017-09-23-094842.494/root/postprocess_alignment/2/bcbiotx/tmpLCoup3/giab-mix-tumor-sort-recal.bam --bqsr_recal_file /mnt/work/cwl/bcbio_validation_workflows/somatic-giab-mix/bunny_work/main-somatic-giab-mix-2017-09-23-094842.494/root/postprocess_alignment/2/align/giab-mix-tumor/giab-mix-tumor-sort-recal.grp

Adding --use_jdk_deflater to the ApplyBQSR command line avoids the issue.

I'm not sure if the Java stack dump and command line provide enough information to be useful, or if a reproducible case is needed. The case above reproduces the issue but involves fairly large BAM files, and I haven't been able to get a more minimal case; I could prepare and share one if it would be helpful. Thanks much for looking at this.



Created 2017-08-28 | Updated 2017-10-17 | bug HaplotypeCaller forum


Bug Report

Affected tool(s)

HaplotypeCaller

Affected version(s)

latest GATK4

Description

HaplotypeCaller misses calls at the ends of reads that UnifiedGenotyper picks up in amplicon data. For example, UnifiedGenotyper calls the indel at the ends of the reads, but HaplotypeCaller does not.
[screenshot, 2017-08-28 4:11:27 PM]
The top is the original BAM file and the bottom is the bamout file.

Steps to reproduce

The files are here: /humgen/gsa-scr1/schandra/ulitskyi_MissedCall

Command for UnifiedGenotyper:
java -jar /humgen/gsa-hpprojects/GATK/bin/current/GenomeAnalysisTK.jar -T UnifiedGenotyper -R hg19.unmasked.fa -I snippet4.bam -o Sheila.UnifiedGenotyper.vcf -glm BOTH

Command for HaplotypeCaller:
/humgen/gsa-scr1/public/gatk-4.beta.3-SNAPSHOT/gatk-launch HaplotypeCaller -R hg19.unmasked.fa -I snippet4.bam -O Sheila.HaplotypeCaller.vcf -dontTrimActiveRegions --disableOptimizations

Expected behavior

The indel should be called.

Actual behavior

Indel is not called, but is called by UnifiedGenotyper.

This Issue was generated from your [forums]
[forums]: https://gatkforums.broadinstitute.org/gatk/discussion/comment/41504#Comment_41504



Created 2017-08-25 | Updated 2017-08-28 | bug SV


Fails with an ArrayIndexOutOfBoundsException.

The fix is trivial.

This might affect other similar methods in that class.



Created 2017-08-23 | Updated 2017-10-17 | bug Engine performance forum


A user has reported longer runtimes with the GATK4 beta 3 release compared to the beta 2 release, which does not sound expected. Her runtimes are below. The first post in the forum thread has her original report.

Tool                              4.beta.2   4.beta.3
BaseRecalibrator                  1m 3s      3m 3s
ApplyBQSR (scattered)             4m 48s     11m 51s
HaplotypeCaller (scattered)       23m 42s    29m 7s
GenotypeGVCFs (scattered)         4m 6s      9m 28s
VariantRecalibrator (for SNPs)    4m 7s      6m 38s
VariantRecalibrator (for INDELs)  2m 7s      4m 8s
ApplyVQSR (for SNPs)              37s        2m 36s
ApplyVQSR (for INDELs)            39s        2m 35s

This Issue was generated from your [forums]
[forums]: https://gatkforums.broadinstitute.org/gatk/discussion/comment/41669#Comment_41669



Created 2017-08-22 | Updated 2017-10-17 | bug NIO PRIORITY_HIGH


I ran 408 invocations of an NIO-using BQSR command with GATK4 and got 2 failures that looked pretty similar. Is there something I might be doing wrong? The two failures were also on different shards.

I can't remember exactly when I built this jar, but it was after commit 4df1d16, less than a week ago. If you need any more info, let me know. Thanks.

Using GATK jar /usr/gitc/gatk4/gatk-package-4.beta.3-local.jar
Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=1 -Dsnappy.disable=true -XX:+PrintFlagsFinal -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+PrintGCDetails -Xloggc:gc_log.log -XX:GCTimeLimit=50 -XX:GCHeapFreeLimit=10 -Xms3000m -jar /usr/gitc/gatk4/gatk-package-4.beta.3-local.jar ApplyBQSR --createOutputBamMD5 --addOutputSAMProgramRecord -R /cromwell_root/broad-references/hg38/v0/Homo_sapiens_assembly38.fasta -I gs://broad-gotc-dev-cromwell-execution/PairedEndSingleSampleWorkflow/4a87f12f-014e-438a-9a10-260c70bf3584/call-SortSampleBam/attempt-4/NA12878.aligned.duplicate_marked.sorted.bam --useOriginalQualities -O NA12878.aligned.duplicates_marked.recalibrated.bam -bqsr /cromwell_root/broad-gotc-dev-cromwell-execution/PairedEndSingleSampleWorkflow/4a87f12f-014e-438a-9a10-260c70bf3584/call-GatherBqsrReports/NA12878.recal_data.csv -SQQ 10 -SQQ 20 -SQQ 30 -L chr5:1+
Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/cromwell_root/tmp.Ni4zSL
[August 22, 2017 2:52:59 PM UTC] ApplyBQSR  --output NA12878.aligned.duplicates_marked.recalibrated.bam --bqsr_recal_file /cromwell_root/broad-gotc-dev-cromwell-execution/PairedEndSingleSampleWorkflow/4a87f12f-014e-438a-9a10-260c70bf3584/call-GatherBqsrReports/NA12878.recal_data.csv --useOriginalQualities true --static_quantized_quals 10 --static_quantized_quals 20 --static_quantized_quals 30 --intervals chr5:1+ --input gs://broad-gotc-dev-cromwell-execution/PairedEndSingleSampleWorkflow/4a87f12f-014e-438a-9a10-260c70bf3584/call-SortSampleBam/attempt-4/NA12878.aligned.duplicate_marked.sorted.bam --reference /cromwell_root/broad-references/hg38/v0/Homo_sapiens_assembly38.fasta --createOutputBamMD5 true --addOutputSAMProgramRecord true  --preserve_qscores_less_than 6 --quantize_quals 0 --round_down_quantized false --emit_original_quals false --globalQScorePrior -1.0 --interval_set_rule UNION --interval_padding 0 --interval_exclusion_padding 0 --interval_merging_rule ALL --readValidationStringency SILENT --secondsBetweenProgressUpdates 10.0 --disableSequenceDictionaryValidation false --createOutputBamIndex true --createOutputVariantIndex true --createOutputVariantMD5 false --lenient false --addOutputVCFCommandLine true --cloudPrefetchBuffer 40 --cloudIndexPrefetchBuffer -1 --disableBamIndexCaching false --help false --version false --showHidden false --verbosity INFO --QUIET false --use_jdk_deflater false --use_jdk_inflater false --gcs_max_retries 20 --disableToolDefaultReadFilters false
[August 22, 2017 2:52:59 PM UTC] Executing as root@fb0704c97258 on Linux 4.9.0-0.bpo.3-amd64 amd64; OpenJDK 64-Bit Server VM 1.8.0_111-8u111-b14-2~bpo8+1-b14; Version: 4.beta.3-41-g9d05dd8-SNAPSHOT
[August 22, 2017 3:06:11 PM UTC] org.broadinstitute.hellbender.tools.walkers.bqsr.ApplyBQSR done. Elapsed time: 13.19 minutes.
Runtime.totalMemory()=3040870400
htsjdk.samtools.FileTruncatedException: Premature end of file: /PairedEndSingleSampleWorkflow/4a87f12f-014e-438a-9a10-260c70bf3584/call-SortSampleBam/attempt-4/NA12878.aligned.duplicate_marked.sorted.bam
	at htsjdk.samtools.util.BlockCompressedInputStream.processNextBlock(BlockCompressedInputStream.java:530)
	at htsjdk.samtools.util.BlockCompressedInputStream.nextBlock(BlockCompressedInputStream.java:468)
	at htsjdk.samtools.util.BlockCompressedInputStream.readBlock(BlockCompressedInputStream.java:458)
	at htsjdk.samtools.util.BlockCompressedInputStream.available(BlockCompressedInputStream.java:196)
	at htsjdk.samtools.util.BlockCompressedInputStream.read(BlockCompressedInputStream.java:331)
	at java.io.DataInputStream.read(DataInputStream.java:149)
	at htsjdk.samtools.util.BinaryCodec.readBytesOrFewer(BinaryCodec.java:404)
	at htsjdk.samtools.util.BinaryCodec.readBytes(BinaryCodec.java:380)
	at htsjdk.samtools.util.BinaryCodec.readBytes(BinaryCodec.java:366)
	at htsjdk.samtools.BAMRecordCodec.decode(BAMRecordCodec.java:209)
	at htsjdk.samtools.BAMFileReader$BAMFileIterator.getNextRecord(BAMFileReader.java:829)
	at htsjdk.samtools.BAMFileReader$BAMFileIndexIterator.getNextRecord(BAMFileReader.java:981)
	at htsjdk.samtools.BAMFileReader$BAMFileIterator.advance(BAMFileReader.java:803)
	at htsjdk.samtools.BAMFileReader$BAMFileIterator.next(BAMFileReader.java:797)
	at htsjdk.samtools.BAMFileReader$BAMFileIterator.next(BAMFileReader.java:765)
	at htsjdk.samtools.BAMFileReader$BAMQueryFilteringIterator.advance(BAMFileReader.java:1034)
	at htsjdk.samtools.BAMFileReader$BAMQueryFilteringIterator.next(BAMFileReader.java:1024)
	at htsjdk.samtools.BAMFileReader$BAMQueryFilteringIterator.next(BAMFileReader.java:988)
	at htsjdk.samtools.SamReader$AssertingIterator.next(SamReader.java:576)
	at htsjdk.samtools.SamReader$AssertingIterator.next(SamReader.java:548)
	at org.broadinstitute.hellbender.utils.iterators.SamReaderQueryingIterator.loadNextRecord(SamReaderQueryingIterator.java:114)
	at org.broadinstitute.hellbender.utils.iterators.SamReaderQueryingIterator.next(SamReaderQueryingIterator.java:151)
	at org.broadinstitute.hellbender.utils.iterators.SamReaderQueryingIterator.next(SamReaderQueryingIterator.java:29)
	at org.broadinstitute.hellbender.utils.iterators.SAMRecordToReadIterator.next(SAMRecordToReadIterator.java:29)
	at org.broadinstitute.hellbender.utils.iterators.SAMRecordToReadIterator.next(SAMRecordToReadIterator.java:15)
	at java.util.Iterator.forEachRemaining(Iterator.java:116)
	at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
	at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
	at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
	at org.broadinstitute.hellbender.engine.ReadWalker.traverse(ReadWalker.java:94)
	at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:838)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:119)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:176)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:195)
	at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:131)
	at org.broadinstitute.hellbender.Main.mainEntry(Main.java:152)
	at org.broadinstitute.hellbender.Main.main(Main.java:233)


Created 2017-08-17 | Updated 2017-09-05 | bug SV


It seems that it fails to hard-clip the bases array when the CIGAR contains hard clips.

java.lang.IllegalStateException: CIGAR covers 4 bases but the sequence is 8 read bases 

	at org.broadinstitute.hellbender.utils.bwa.BwaMemAlignmentUtils.applyAlignment(BwaMemAlignmentUtils.java:92)
	at org.broadinstitute.hellbender.tools.spark.sv.discovery.AlignmentIntervalUnitTest.testConstructionFromGATKRead(AlignmentIntervalUnitTest.java:133)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:108)
	at org.testng.internal.Invoker.invokeMethod(Invoker.java:661)
	at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:869)
	at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1193)
	at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:126)
	at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:109)
	at org.testng.TestRunner.privateRun(TestRunner.java:744)
	at org.testng.TestRunner.run(TestRunner.java:602)
	at org.testng.SuiteRunner.runTest(SuiteRunner.java:380)
	at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:375)
	at org.testng.SuiteRunner.privateRun(SuiteRunner.java:340)
	at org.testng.SuiteRunner.run(SuiteRunner.java:289)
	at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:52)
	at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:86)
	at org.testng.TestNG.runSuitesSequentially(TestNG.java:1301)
	at org.testng.TestNG.runSuitesLocally(TestNG.java:1226)
	at org.testng.TestNG.runSuites(TestNG.java:1144)
	at org.testng.TestNG.run(TestNG.java:1115)



Created 2017-08-14 | Updated 2017-08-18 | bug


The TargetCoverageSexGenotyper can sometimes fail to calculate genotype LLs for low-quality (low-coverage) samples. The exception below can be reproduced trivially: just pass it a raw coverage matrix containing a single sample whose counts column is all zeroes. In practice, I've recently encountered several samples that somehow passed upstream QC but failed to genotype due to extensive dropout. This causes program termination, and no genotypes are written for any samples in the input. Could we instead warn and print an NA genotype for such samples?

Runtime.totalMemory()=869269504
org.apache.commons.math3.exception.NotStrictlyPositiveException: mean (0)
	at org.apache.commons.math3.distribution.PoissonDistribution.<init>(PoissonDistribution.java:126)
	at org.apache.commons.math3.distribution.PoissonDistribution.<init>(PoissonDistribution.java:103)
	at org.apache.commons.math3.distribution.PoissonDistribution.<init>(PoissonDistribution.java:80)
	at org.broadinstitute.hellbender.tools.exome.sexgenotyper.TargetCoverageSexGenotypeCalculator.lambda$calculateSexGenotypeData$20(TargetCoverageSexGenotypeCalculator.java:414)


Created 2017-08-04 | Updated 2017-10-17 | bug GenomicsDB


We've seen 2 instances of this error now:

java.lang.ArrayIndexOutOfBoundsException: 15
	at htsjdk.variant.bcf2.BCF2Utils.decodeType(BCF2Utils.java:122)
	at htsjdk.variant.bcf2.BCF2Decoder.decodeInt(BCF2Decoder.java:220)
	at htsjdk.variant.bcf2.BCF2Decoder.decodeNumberOfElements(BCF2Decoder.java:205)
	at htsjdk.variant.bcf2.BCF2Decoder.decodeTypedValue(BCF2Decoder.java:129)
	at htsjdk.variant.bcf2.BCF2Decoder.decodeTypedValue(BCF2Decoder.java:125)
	at htsjdk.variant.bcf2.BCF2LazyGenotypesDecoder.parse(BCF2LazyGenotypesDecoder.java:75)
	at htsjdk.variant.variantcontext.LazyGenotypesContext.decode(LazyGenotypesContext.java:158)
	at htsjdk.variant.variantcontext.LazyGenotypesContext.getGenotypes(LazyGenotypesContext.java:148)
	at htsjdk.variant.variantcontext.GenotypesContext.getMaxPloidy(GenotypesContext.java:431)
	at htsjdk.variant.variantcontext.VariantContext.getMaxPloidy(VariantContext.java:785)
	at org.broadinstitute.hellbender.tools.walkers.ReferenceConfidenceVariantContextMerger.mergeRefConfidenceGenotypes(ReferenceConfidenceVariantContextMerger.java:405)
	at org.broadinstitute.hellbender.tools.walkers.ReferenceConfidenceVariantContextMerger.merge(ReferenceConfidenceVariantContextMerger.java:92)
	at org.broadinstitute.hellbender.tools.walkers.GenotypeGVCFs.apply(GenotypeGVCFs.java:212)
	at org.broadinstitute.hellbender.engine.VariantWalkerBase.lambda$traverse$0(VariantWalkerBase.java:110)
	at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
	at java.util.Iterator.forEachRemaining(Iterator.java:116)
	at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
	at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
	at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
	at org.broadinstitute.hellbender.engine.VariantWalkerBase.traverse(VariantWalkerBase.java:108)
	at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:838)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:116)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:173)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:192)
	at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:131)
	at org.broadinstitute.hellbender.Main.mainEntry(Main.java:152)
	at org.broadinstitute.hellbender.Main.main(Main.java:233)

It seems to happen in rare cases when reading from a GenomicsDB instance with GenotypeGVCFs. The array that is going out of bounds is one that maps ID values to BCF2 types. The valid values for BCF2 types are 0, 1, 2, 3, 5, and 7, as defined in the BCF spec, section 6.3.3. It seems likely that this is a bug in htsjdk, in htslib, or in GenomicsDB.



Created 2017-07-26 | Updated 2017-07-27 | bug Picard


I think CollectSequencingArtifactMetrics requires a reference, but the documentation lists it as optional. Without a reference, I get an instant crash. I also get NullPointerExceptions when providing --DB_SNP.

Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=1 -Dsnappy.disable=true -jar /home/riestma1/gatk-4.beta.3/gatk-package-4.beta.3-local.jar CollectSequencingArtifactMetrics --input sample1.bam --output sample1_pre_adapter_detail_metrics
 ...
18:06:26.220 INFO  CollectSequencingArtifactMetrics - Shutting down engine
[July 26, 2017 6:06:26 PM EDT] org.broadinstitute.hellbender.tools.picard.analysis.artifacts.CollectSequencingArtifactMetrics done. Elapsed time: 0.02 minutes.
Runtime.totalMemory()=1517289472
java.lang.NullPointerException
	at org.broadinstitute.hellbender.tools.picard.analysis.artifacts.CollectSequencingArtifactMetrics.acceptRead(CollectSequencingArtifactMetrics.java:214)
	at org.broadinstitute.hellbender.tools.picard.analysis.SinglePassSamProgram.makeItSo(SinglePassSamProgram.java:114)
	at org.broadinstitute.hellbender.tools.picard.analysis.SinglePassSamProgram.doWork(SinglePassSamProgram.java:53)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:116)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:173)
	at org.broadinstitute.hellbender.cmdline.PicardCommandLineProgram.instanceMain(PicardCommandLineProgram.java:62)
	at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:131)
	at org.broadinstitute.hellbender.Main.mainEntry(Main.java:152)
	at org.broadinstitute.hellbender.Main.main(Main.java:233)


Created 2017-07-18 | Updated 2017-07-18 | bug Mutect


Using VCFTools to validate:

vcf-validator SAMPLE7T-vs-SAMPLE7N-filtered.vcf

I get a massive amount of messages:

.....snip.......
column SAMPLE7N at 19:49136721 .. Could not validate the float [NaN],FORMAT tag [MPOS] expected different number of values (expected 1, found 2),FORMAT tag [MFRL] expected different number of values (expected 1, found 2),FORMAT tag [MMQ] expected different number of values (expected 1, found 2),FORMAT tag [MCL] expected different number of values (expected 1, found 2),FORMAT tag [MBQ] expected different number of values (expected 1, found 2)
.....snip.......
column SAMPLE7T at 19:45901415 .. FORMAT tag [MBQ] expected different number of values (expected 1, found 2),FORMAT tag [MMQ] expected different number of values (expected 1, found 2),FORMAT tag [MCL] expected different number of values (expected 1, found 2),FORMAT tag [MFRL] expected different number of values (expected 1, found 2),FORMAT tag [MPOS] expected different number of values (expected 1, found 2)
.....snip.......

Sure enough, the header does not match the values for those fields (the header declares Number=A), so the validation errors are correct. I'm not sure what the deal is with FOXOG, but that may not be a big deal.



Created 2017-07-11 | Updated 2017-07-12 | bug


I tried to run VariantRecalibrator using the args echoed from an integration test and found that the resource files weren't listed properly. The command in the test was " --resource known,known=true,prior=10.0:" + getLargeVQSRTestDataDir() + "dbsnp_132_b37.leftAligned.20.1M-10M.vcf" and what came out of the engine was --resource known:/Users/gauthier/workspaces/gatk/src/test/resources/large/VQSR/dbsnp_132_b37.leftAligned.20.1M-10M.vcf, so it lost the known=true and the prior, which makes the echoed command line unrunnable. Probably affects #2269 too.

This behavior can be replicated by running any of the VariantRecalibration integration tests and checking the console output.



Created 2017-07-02 | Updated 2017-10-17 | bug NativeLibraries


We are currently plagued with cryptic intermittent failures coming from the BWA and FML bindings in Travis CI. These usually manifest as a simple "exited with code 137" (i.e., killed by signal 9) error, but sometimes we get an explicit segfault or out-of-memory error. Examples:

FAILURE: Build failed with an exception.
* What went wrong:
Execution failed for task ':test'.
> Process 'Gradle Test Executor 1' finished with non-zero exit value 137
:test[M::bwa_idx_load_from_disk] read 0 ALT contigs
OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x0000000715180000, 719847424, 0) failed; error='Cannot allocate memory' (errno=12)

#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 719847424 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /home/travis/build/broadinstitute/gatk/hs_err_pid11513.log
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f27ebfe7d9a, pid=11455, tid=0x00007f27e87e5700
#
# JRE version: OpenJDK Runtime Environment (8.0_111-b14) (build 1.8.0_111-8u111-b14-3~14.04.1-b14)
# Java VM: OpenJDK 64-Bit Server VM (25.111-b14 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  [libfml.6198146539708364717.jnilib+0xed9a]  rld_itr_init+0x4a
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007fd2680a350c, pid=11685, tid=0x00007fd2b02bf700
#
# JRE version: OpenJDK Runtime Environment (8.0_111-b14) (build 1.8.0_111-8u111-b14-3~14.04.1-b14)
# Java VM: OpenJDK 64-Bit Server VM (25.111-b14 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  [libbwa.5694772191018335324.jnilib+0x850c]  bwa_mem2idx+0xcc

The underlying issue in these cases is likely either "out of memory" or, perhaps in the case of the segfaults, "file not found" or "malformed file", but we could greatly improve our ability to interpret Travis failures if we were more careful about checking return values from system calls. E.g., in the function below from the BWA bindings we could check the return values of the mmap() and calloc() calls, and die with an appropriate error message if they fail:

bwaidx_t* jnibwa_openIndex( int fd ) {
    struct stat statBuf;
    if ( fstat(fd, &statBuf) == -1 ) return 0;
    uint8_t* mem = mmap(0, statBuf.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);
    bwaidx_t* pIdx = calloc(1, sizeof(bwaidx_t));
    bwa_mem2idx(statBuf.st_size, mem, pIdx);
    pIdx->is_shm = 1;
    mem_fmt_fnc = &fmt_BAMish;
    bwa_verbose = 0;
    return pIdx;
}

