## Version highlights for GATK version 3.2

### Posted by Geraldine_VdAuwera on 30 Jul 2014

Better late than never (right?), here are the version highlights for GATK 3.2. Overall, this release is essentially a collection of bug fixes and incremental improvements that we wanted to push out to not keep folks waiting while we're working on the next big features. Most of the bug fixes are related to the HaplotypeCaller and its "reference confidence model" mode (which you may know as -ERC GVCF). But there are also a few noteworthy improvements/changes in other tools which I'll go over below.

### Working out the kinks in the "reference confidence model" workflow

The "reference confidence model" workflow, which I hope you have heard of by now, is that awesome new workflow we released in March 2014, which was the core feature of the GATK 3.0 version. It solves the N+1 problem and allows you to perform joint variant analysis on ridiculously large cohorts without having to enslave the entire human race and turn people into batteries to power a planet-sized computing cluster. More on that later (omg we're writing a paper on it, finally!).

You can read the full list of improvements we've made to the tools involved in the workflow (mainly HaplotypeCaller and GenotypeGVCFs) in Eric's (unusually detailed) Release Notes for this version. The ones you are most likely to care about are that the "missing PLs" bug is fixed, GenotypeGVCFs now accepts arguments that allow it to emulate the HC's genotyping capabilities more closely (such as --includeNonVariantSites), the AB annotation is fully functional, reference DPs are no longer dropped, and CatVariants now accepts lists of VCFs as input. OK, so that last one is not really specific to the reference model pipeline, but that's where it really comes in handy (imagine generating a command line with thousands of VCF filenames -- it's not pretty).
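If you have thousands of VCFs sitting in a directory, building that input list by hand is no fun either. Here is a minimal Python sketch of one way to generate it. It assumes the list is a plain text file with one VCF path per line; the function name, the `.list` filename, and the directory layout are made up for illustration, so check the CatVariants documentation for the exact input it expects.

```python
from pathlib import Path


def write_vcf_list(vcf_dir, list_path, pattern="*.vcf"):
    """Write one VCF path per line to list_path and return the paths written.

    Hypothetical helper: the one-path-per-line format is an assumption.
    """
    # Lexicographic sort; this is not necessarily genomic order.
    vcfs = sorted(str(p) for p in Path(vcf_dir).glob(pattern))
    if not vcfs:
        raise FileNotFoundError(f"no files matching {pattern} in {vcf_dir}")
    Path(list_path).write_text("\n".join(vcfs) + "\n")
    return vcfs
```

Calling `write_vcf_list("calls", "inputs.list")` would collect every `.vcf` file in a (hypothetical) `calls` directory. Note that the paths are sorted by name, which may not match the order you want the records concatenated in.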

### HaplotypeCaller now emits post-realignment coverage metrics

The coverage metrics (DP and AD) reported by HaplotypeCaller are now those calculated after the HC's reassembly step, based on the reads having been realigned to the most likely haplotypes. So the metrics you see in the variant record should match what you see if you use the -bamout option and visualize the reassembled ActiveRegion in a genome browser such as IGV. Note that if any of this is not making sense to you, say so in the comments and we'll point you to the new HaplotypeCaller documentation! Or, you know, look for it in the Guide.
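If you want to compare those numbers against what you see in the browser, it helps to pull DP and AD out of the variant record programmatically. The sketch below reads them from the per-sample columns of a single tab-delimited VCF line, following the standard VCF layout (FORMAT in column 9, samples from column 10). The function name and the example record are invented for illustration and are not real HaplotypeCaller output.

```python
def sample_depths(vcf_line, sample_index=0):
    """Return (DP, AD) for one sample of a VCF data line.

    DP is an int or None; AD is a list of per-allele counts or None.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    format_keys = fields[8].split(":")
    sample_values = fields[9 + sample_index].split(":")
    # zip() tolerates dropped trailing fields in the sample column.
    call = dict(zip(format_keys, sample_values))

    def to_int(value):
        return None if value == "." else int(value)

    dp = to_int(call["DP"]) if "DP" in call else None
    ad = [to_int(v) for v in call["AD"].split(",")] if "AD" in call else None
    return dp, ad


# Made-up record for illustration only.
record = "\t".join(
    ["20", "10001", ".", "A", "G", "50", ".", ".",
     "GT:AD:DP:GQ:PL", "0/1:12,9:21:99:255,0,300"]
)
dp, ad = sample_depths(record)
```

For the made-up record above, `dp` is 21 and `ad` is `[12, 9]` (reference allele first, then the alternate allele).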

### R you up to date on your libraries?

We updated the plotting scripts used by BQSR and VQSR to use the latest version of ggplot2, to get rid of some deprecated function issues. If your Rscripts are suddenly failing, you'll need to update your R libraries.

### A sincere apology to GATK-based tool developers

We're sorry for making you jump through all these hoops recently. As if the switch to Maven wasn't enough, we have now completed a massive reorganization/renaming of the codebase that will probably cause you some headaches when you port your tools to the newest version. But we promise this is the last big wave, and ultimately this will make your life easier once we get the GATK core framework to be a proper Maven artifact.

In a nutshell, the base name of the codebase has changed from sting to gatk (which hopefully makes more sense), and the most common effect is that sting.gatk classpath segments are now gatk.tools. This, by the way, is why we had a bunch of broken documentation links; most of these have been fixed (yay symlinks) but there may be a few broken URLs remaining. If you see something, say something, and we'll fix it.
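For tool developers facing a pile of stale imports, a first pass at the rename can be scripted. The Python sketch below applies only the two rules described above: `sting.gatk` segments become `gatk.tools`, and any remaining `sting` segment becomes `gatk`. The function name and the example import lines are illustrative, and since "most common effect" means there are exceptions, treat the output as a starting point to review and not as a complete migration.

```python
import re


def port_package_names(source):
    """Apply the two rename rules from this post to Java source text.

    Hypothetical first-pass helper; it does not know about exceptions.
    """
    # Rule 1: "sting.gatk" path segments are now "gatk.tools".
    source = re.sub(r"\bsting\.gatk\b", "gatk.tools", source)
    # Rule 2: any remaining "sting" segment between dots becomes "gatk".
    source = re.sub(r"(?<=\.)sting(?=\.)", "gatk", source)
    return source


# Illustrative import lines, not an exhaustive list of affected classes.
old_imports = (
    "import org.broadinstitute.sting.gatk.walkers.Walker;\n"
    "import org.broadinstitute.sting.utils.GenomeLoc;\n"
)
new_imports = port_package_names(old_imports)
```

The order of the two substitutions matters: applying the general `sting` to `gatk` rule first would turn `sting.gatk` into `gatk.gatk` and the more specific rule would never match.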

#### blueskypy

Hi Geraldine, thanks so much for the new update! Just two questions: 1. If the output VCF file name ends with vcf.gz, can version 3.2 produce correct gz files? 2. Is it recommended to re-run variant calling that was done with GATK v3.1?

Wed 30 Jul 2014
