Base Quality Score Recalibration

  • Multi-threaded support in the BaseRecalibrator tool has been temporarily suspended for performance reasons; we hope to have this fixed for the next release.
  • Implemented support for SOLiD no call strategies other than throwing an exception.
  • Fixed smoothing in the BQSR bins.
  • Fixed plotting R script to be compatible with newer versions of R and ggplot2 library.

Unified Genotyper

  • Renamed the per-sample ML allelic fractions and counts so that they don't have the same name as the per-site INFO fields, and clarified the description in the VCF header.
  • UG now makes use of base insertion and base deletion quality scores if they exist in the reads (output from BaseRecalibrator).
  • Changed the -maxAlleles argument to -maxAltAlleles to make it more accurate.
  • In pooled mode, if haplotypes cannot be created from given alleles when genotyping indels (e.g. too close to contig boundary, etc.) then do not try to genotype.
  • Added improvements to indel calling in pooled mode: we compute per-read likelihoods in reference sample to determine whether a read is informative or not.

Haplotype Caller

  • Added LowQual filter to the output when appropriate.
  • Added some support for calling on Reduced Reads. Note that this is still experimental and may not always work well.
  • Now does a better job of capturing low frequency branches that are inside high frequency haplotypes.
  • Updated VQSR to work with the MNP and symbolic variants that are coming out of the HaplotypeCaller.
  • Made fixes to the likelihood based LD calculation for deciding when to combine consecutive events.
  • Fixed bug where non-standard bases from the reference would cause errors.
  • Better separation of arguments that are relevant to the Unified Genotyper but not the Haplotype Caller.

Reduce Reads

  • Fixed bug where reads were soft-clipped beyond the limits of the contig and the tool was failing with a NoSuchElement exception.
  • Fixed divide by zero bug when downsampler goes over regions where reads are all filtered out.
  • Fixed a bug where downsampled reads were not being excluded from the read window, causing them to trail back and get caught by the sliding window exception.

Variant Eval

  • Fixed support in the AlleleCount stratification when using the MLEAC (it is now capped by the AN).
  • Fixed incorrect allele counting in IndelSummary evaluation.

Combine Variants

  • Now outputs the first non-MISSING QUAL, instead of the maximum.
  • Now supports multi-threaded running (with the -nt argument).

Select Variants

  • Fixed behavior of the --regenotype argument to do proper selecting (without losing any of the alternate alleles).
  • No longer adds the DP INFO annotation if DP wasn't used in the input VCF.
  • If MLEAC or MLEAF is present in the original VCF and the number of samples decreases, remove those annotations from the output VC (since they are no longer accurate).

Miscellaneous

  • Updated and improved the BadCigar read filter.
  • GATK now generates a proper error when a gzipped FASTA is passed in.
  • Various improvements throughout the BCF2-related code.
  • Removed various parallelism bottlenecks in the GATK.
  • Added support of X and = CIGAR operators to the GATK.
  • Catch NumberFormatExceptions when parsing the VCF POS field.
  • Fixed bug in FastaAlternateReferenceMaker when input VCF has overlapping deletions.
  • Fixed AlignmentUtils bug for handling Ns in the CIGAR string.
  • We now allow lower-case bases in the REF/ALT alleles of a VCF and upper-case them.
  • Added support for handling complex events in ValidateVariants.
  • Picard jar remains at version 1.67.1197.
  • Tribble jar remains at version 110.

Comment on this article in the forum



At a glance



Follow us on Twitter

GATK Dev Team

@gatk_dev

@BrianPardy Great, thanks for the feedback!
28 Sep 16
@BrianPardy Thank you! Does anything in particular stand out? We'd love to know what people find most useful so we can do more of the same.
28 Sep 16
#GATK workshop crew is in Basel, ready to roll! @ISBSIB https://t.co/m56JzpC1bN
25 Sep 16
@thatdnaguy That's right, we've retired it, see https://t.co/epbvwOQVTt
23 Sep 16
@geoffjentry @BroadGenomics Ah, you should ask @WDL_dev on the WDL forum then :)
21 Sep 16

Our favorite tweets from others

I've easily written my first custom ReadFilter for GATK. The @gatk_dev 's toolkit is a great example of programming.
21 Sep 16
@gatk_dev "make it so"
8 Sep 16
it's the nightly build owl for GATK :D https://t.co/OwTRrk6KHA https://t.co/rfmAbdIIQp
11 Aug 16
We're going to make an hg38 version of ExAC. And we'll make @dgmacarthur pay for it. #BioinformaticsCampaignPromises
2 Aug 16
You’re right @gatk_dev honesty is key! About variants manual filtering: “In any case you're probably in for a world of pain.” Ha now I know!
11 Jul 16
See more of our favorite tweets...
Search blog by tag

ad appistry ashg benchmarks best-practices bug bug-fixed cancer catvariants cloud cluster commandline commandlinegatk community compute conferences cram cromwell denovo depthofcoverage diagnosetargets error fix forum gatk3 gatk4 genotype genotype-refinement genotypegvcfs google grch38 gvcf haploid haplotypecaller hg38 holiday hts htsjdk ibm java8 job job-offer jobs license meetings mendelianviolations multithreading mutect mutect2 ngs nt outreach pairhmm parallelism patch performance phone-home picard pipeline plans ploidy polyploid poster presentations printreads profile promote reference-model release release-notes rnaseq runtime saas script search selectvariants sequencing service slides snow speed status sting support syntax talks team terminology third-party-tools topstory trivia troll tutorial unifiedgenotyper variantannotator variantrecalibrator vcf-gz version-highlights versions vqsr wdl webinar workflow workshop