## Version highlights for GATK version 3.2

### Posted by Geraldine_VdAuwera on 30 Jul 2014

Better late than never (right?), here are the version highlights for GATK 3.2. Overall, this release is essentially a collection of bug fixes and incremental improvements that we wanted to push out so as not to keep folks waiting while we work on the next big features. Most of the bug fixes are related to the HaplotypeCaller and its "reference confidence model" mode (which you may know as -ERC GVCF). But there are also a few noteworthy improvements and changes in other tools, which I'll go over below.

### Working out the kinks in the "reference confidence model" workflow

The "reference confidence model" workflow, which I hope you have heard of by now, is that awesome new workflow we released in March 2014, which was the core feature of the GATK 3.0 version. It solves the N+1 problem and allows you to perform joint variant analysis on ridiculously large cohorts without having to enslave the entire human race and turn people into batteries to power a planet-sized computing cluster. More on that later (omg we're writing a paper on it, finally!).

You can read the full list of improvements we've made to the tools involved in the workflow (mainly HaplotypeCaller and GenotypeGVCFs) in Eric's (unusually detailed) Release Notes for this version. The ones you are most likely to care about are that the "missing PLs" bug is fixed, GenotypeGVCFs now accepts arguments that allow it to emulate the HC's genotyping capabilities more closely (such as --includeNonVariantSites), the AB annotation is fully functional, reference DPs are no longer dropped, and CatVariants now accepts lists of VCFs as input. OK, so that last one is not really specific to the reference model pipeline, but that's where it really comes in handy (imagine generating a command line with thousands of VCF filenames -- it's not pretty).
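To make the CatVariants point concrete, here is a minimal sketch of how you might build such a list instead of a giant command line. It assumes the usual GATK convention of a plain text file with a `.list` extension containing one path per line; the function name `write_vcf_list` and the file names are made up for illustration, so check the CatVariants documentation for the exact invocation.

```python
from pathlib import Path


def write_vcf_list(vcf_dir, list_path, pattern="*.vcf"):
    """Write one VCF path per line to list_path and return the paths written.

    Assumption: the tool accepts a text file of paths (one per line)
    in place of many individual VCF arguments.
    """
    # Sort so that the list is reproducible from run to run.
    paths = sorted(str(p) for p in Path(vcf_dir).glob(pattern))
    Path(list_path).write_text("".join(f"{p}\n" for p in paths))
    return paths
```

You would then pass the resulting list file to the tool in place of the individual filenames. Note that the sort here is purely alphabetical, so if the order of the inputs matters for your use case, you'll want to order them yourself.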

### HaplotypeCaller now emits post-realignment coverage metrics

The coverage metrics (DP and AD) reported by HaplotypeCaller are now those calculated after the HC's reassembly step, based on the reads having been realigned to the most likely haplotypes. So the metrics you see in the variant record should match what you see if you use the -bamout option and visualize the reassembled ActiveRegion in a genome browser such as IGV. Note that if any of this is not making sense to you, say so in the comments and we'll point you to the new HaplotypeCaller documentation! Or, you know, look for it in the Guide.
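If you want to compare those numbers against what you see in the browser, a quick way to pull them out of a variant record might look like the sketch below. It relies only on the standard VCF layout (FORMAT keys in the ninth column, per-sample values after that); the helper name `sample_depths` and the example record are made up for illustration and are not real HaplotypeCaller output.

```python
def sample_depths(vcf_line, sample_index=0):
    """Return (DP, AD) for one sample in a tab-separated VCF record.

    DP is an int, AD is a list with one depth per allele (reference first);
    either is None when the field is absent or set to '.'.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    keys = fields[8].split(":")                   # FORMAT column, e.g. GT:AD:DP:GQ
    values = fields[9 + sample_index].split(":")  # the chosen sample's values
    call = dict(zip(keys, values))

    dp_raw = call.get("DP", ".")
    ad_raw = call.get("AD", ".")
    dp = int(dp_raw) if dp_raw != "." else None
    ad = None
    if ad_raw != ".":
        ad = [int(x) if x != "." else None for x in ad_raw.split(",")]
    return dp, ad


# Hypothetical record, invented for this example.
record = "20\t10001\t.\tA\tG\t50.0\t.\t.\tGT:AD:DP:GQ\t0/1:12,9:21:99"
print(sample_depths(record))  # (21, [12, 9])
```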

### R you up to date on your libraries?

We updated the plotting scripts used by BQSR and VQSR to use the latest version of ggplot2, to get rid of some deprecated function issues. If your Rscripts are suddenly failing, you'll need to update your R libraries.

### A sincere apology to GATK-based tool developers

We're sorry for making you jump through all these hoops recently. As if the switch to Maven wasn't enough, we have now completed a massive reorganization/renaming of the codebase that will probably cause you some headaches when you port your tools to the newest version. But we promise this is the last big wave, and ultimately this will make your life easier once we get the GATK core framework to be a proper Maven artifact.

In a nutshell, the base name of the codebase has changed from sting to gatk (which hopefully makes more sense), and the most common effect is that sting.gatk classpath segments are now gatk.tools. This, by the way, is why we had a bunch of broken documentation links; most of these have been fixed (yay symlinks) but there may be a few broken URLs remaining. If you see something, say something, and we'll fix it.
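For tool developers facing a pile of stale imports, the two renaming rules above can be sketched as a simple text rewrite. This is only a rough first pass: it assumes package names start with `org.broadinstitute`, the helper name `port_package_name` is made up, and not every class necessarily follows these two rules, so let the compiler have the final word.

```python
import re


def port_package_name(text):
    """Apply the two renaming rules described in the post to Java source text.

    Assumption: packages are rooted at org.broadinstitute. Classes that moved
    somewhere else in the reorganization are NOT handled here.
    """
    # More specific rule first: sting.gatk segments become gatk.tools.
    text = re.sub(r"\borg\.broadinstitute\.sting\.gatk\b",
                  "org.broadinstitute.gatk.tools", text)
    # Then the general rule: the remaining sting base name becomes gatk.
    text = re.sub(r"\borg\.broadinstitute\.sting\b",
                  "org.broadinstitute.gatk", text)
    return text


old = "import org.broadinstitute.sting.utils.exceptions.UserException;"
print(port_package_name(old))
# import org.broadinstitute.gatk.utils.exceptions.UserException;
```

The order of the two substitutions matters: running the general rule first would turn `sting.gatk` into `gatk.gatk` and the specific rule would never match.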

#### blueskypy

Hi Geraldine, thanks so much for the new update! Just two questions: 1. If the output VCF file name ends with vcf.gz, can version 3.2 produce correct gz files? 2. Is it recommended to re-run variant calling for results produced by GATK v3.1?

Wed 30 Jul 2014
