GATK HaplotypeCaller reports nonsense GT/AD when --min-base-quality-score parameter changes
open | Created 2019-07-16 | Last updated 2019-09-06| Posted by freeseek | See in Github

HaplotypeCaller assembly bug

I have managed to generate a minimal bam file that reproduces the issue.

First of all, you have to download the mini input.bam file from this dropbox link:

Then the following code will reproduce the issue:



wget -O- | \
  gzip -d > GCA_000001405.15_GRCh38_no_alt_analysis_set.fna

samtools faidx GCA_000001405.15_GRCh38_no_alt_analysis_set.fna

java -jar picard.jar \
  CreateSequenceDictionary \
  R=GCA_000001405.15_GRCh38_no_alt_analysis_set.fna \

(echo "##fileformat=VCFv4.2"; \
echo "##contig=<ID=chrX,length=156040895>"; \
echo -e "chr1\t97329945\t.\tT\tA\t.\t.\t."; \
echo -e "chr1\t97329967\t.\tC\tT\t.\t.\t.") | bgzip > input.vcf.gz && \
tabix -f input.vcf.gz

for score in 11 12; do
  gatk- HaplotypeCaller \
    -R GCA_000001405.15_GRCh38_no_alt_analysis_set.fna \
    -I input.bam \
    -O output.$score.vcf.gz \
    --genotyping-mode GENOTYPE_GIVEN_ALLELES \
    --alleles input.vcf.gz \
    -L chr1:97329945-97329967 \
    --min-base-quality-score $score && \
  bcftools query \
    -f "[%CHROM\t%POS\t%REF\t%ALT\t%GT\t%AD\n]" \
    output.$score.vcf.gz \
    -r chr1:97329945-97329967

When the parameter --min-base-quality-score 11 is used, the GT/AD output is this:

chr1	97329945	T	A	1/1	0,35
chr1	97329967	C	T	1/1	0,33

When the parameter --min-base-quality-score 12 is used, the GT/AD output is this:

chr1	97329945	T	A	0/1	9,10
chr1	97329967	C	T	0/1	6,11

The first output is the output that makes sense. When --min-base-quality-score 12 is used, it is as if HaplotypeCaller invents some reference allele reads and then uses them to genotype the variant as heterozygous. I have seen issues like this all over the genome.

Interestingly, if I run the same code on my own Ubuntu laptop, I get an error instead:

Exception in thread "main" java.lang.IncompatibleClassChangeError: Inconsistent constant pool data in classfile for class org/broadinstitute/hellbender/transformers/ReadTransformer. Method lambda$identity$d67512bf$1(Lorg/broadinstitute/hellbender/utils/read/GATKRead;)Lorg/broadinstitute/hellbender/utils/read/GATKRead; at index 65 is CONSTANT_MethodRef and should be CONSTANT_InterfaceMethodRef
	at org.broadinstitute.hellbender.transformers.ReadTransformer.identity(
	at org.broadinstitute.hellbender.engine.GATKTool.makePreReadFilterTransformer(
	at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.traverse(
	at org.broadinstitute.hellbender.engine.GATKTool.doWork(
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(
	at org.broadinstitute.hellbender.Main.runCommandLineProgram(
	at org.broadinstitute.hellbender.Main.mainEntry(
	at org.broadinstitute.hellbender.Main.main(

Return to top