This is going to be a short one, folks. The 2.5 release is pretty much all about bug fixes, with a couple of exceptions that we'll cover below.

Bug fixes

Remember how we said that version 2.4 was going to be the least buggy ever? Well, that might have been a bit optimistic. We had a couple of stumpers in there -- and a flurry of little ones that were probably not novel (i.e. not specific to version 2.4) but finally bubbled up to the surface. We're not going to go over the bug fixes in detail, since the release notes include a comprehensive list. Basically, those are all fixed.

Actual features!

Well, not exactly new features, but noteworthy improvements to existing tools.

- ReduceReads turns the squeeze dial up to eleven

In addition to countless bug fixes, we've made drastic improvements to ReduceReads' compression algorithm, so you can now achieve much better compression rates without compromising on the retention of informative data. Keep in mind of course that, as always, you'll see much bigger gains on certain types of data sets -- the higher the coverage in your original BAM files, the bigger the savings in file size and the bigger the speedup in the downstream tools.

- HaplotypeCaller is faster and more accurate! No, really!

We say this every time, and every time it's true: we've made some more improvements to the HaplotypeCaller that make it faster and more accurate. Well, it's still slower than the UnifiedGenotyper, in case you were going to ask (of course you were). But on the accuracy front, we say this without reservation or caveat: HC is now just as accurate as the UG for calling SNPs, and it is in a league of its own for calling indels. If you are even remotely interested in indels you should absolutely take it out for a spin. Go. Now.

- DiagnoseTargets, all grown up

Say goodbye to the mood swings and the pimples; it looks like this tool's awkward teenager phase is finally over. We've entirely reworked how DiagnoseTargets functions so it now uses a plugin system, which we think is much more convenient. This plugin system will be explained in detail in a forthcoming documentation article.

- Functional annotation recovers some functionality

You may be aware that we had imposed a freeze of sorts on the annotation database version that could be used with the snpEff annotation. Well, we're happy to report that the author of the snpEff software package has made some significant upgrades, including a feature called GATK compatibility mode. As a result there is no longer any version constraint. We'll be updating our documentation on using snpEff with GATK soon(-ish), but in the meantime, feel free to go forth and annotate away. Just make sure to consult the snpEff manual for relevant information on using it with GATK.

Deprecation alerts

Even as the dev team giveth, the dev team taketh away.

A few annotations were removed from the VariantAnnotator stables (as listed in the release notes), mainly because they didn't work properly. With all the caveats about how GATK is research software, we're still committed to providing quality tools that do something close to what they're advertised to do, at the bare minimum. If something doesn't fulfill that requirement, it's out.

We've also disabled the auto-generation of fai/dict files for fasta references. I can hear some of you groaning all the way from here. Yes, it was convenient -- but far too buggy. Come on people, it's a one-liner using Picard. Oh, and we're no longer allowing the use of compressed (.gz) references either -- also too buggy. The space savings were simply not worth the headaches.
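For anyone who wants to catch these problems before launching a job, here is a minimal pre-flight sketch in Python. It is not part of GATK; the function name `check_reference` is made up for illustration. It assumes the usual naming convention, where `ref.fasta` is accompanied by `ref.fasta.fai` and `ref.dict`. The commands suggested in its messages are the forms we believe are typical (the index via `samtools faidx`, the dictionary via Picard's CreateSequenceDictionary), so check them against the versions you have installed.

```python
from pathlib import Path


def check_reference(fasta_path):
    """Return a list of problems found with a FASTA reference (empty if none).

    Hypothetical helper, not a GATK API. Assumes the index is named
    <name>.fai and the dictionary replaces the FASTA extension with .dict.
    """
    fasta = Path(fasta_path)
    problems = []

    # Compressed references are no longer accepted, so stop right there.
    if fasta.suffix == ".gz":
        problems.append(
            f"{fasta.name}: compressed reference; decompress it before use"
        )
        return problems

    if not fasta.exists():
        problems.append(f"{fasta.name}: reference file not found")
        return problems

    # ref.fasta -> ref.fasta.fai and ref.fasta -> ref.dict
    fai = fasta.with_name(fasta.name + ".fai")
    seq_dict = fasta.with_suffix(".dict")

    if not fai.exists():
        problems.append(
            f"missing {fai.name}; try: samtools faidx {fasta.name}"
        )
    if not seq_dict.exists():
        # Assumed Picard invocation; adjust to your Picard version.
        problems.append(
            f"missing {seq_dict.name}; try: java -jar "
            f"CreateSequenceDictionary.jar R={fasta.name} O={seq_dict.name}"
        )
    return problems
```

Calling `check_reference("ref.fasta")` returns an empty list when both companion files are in place; otherwise each entry names what is missing and suggests a command to create it.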


Thanks for the update! Is it possible to see an updated version of these two graphs ( http://cdn.vanillaforums.com/gatk.vanillaforums.com/FileUpload/aa/44374a5788cf46b86ba171fe7d1a1d.png and http://cdn.vanillaforums.com/gatk.vanillaforums.com/FileUpload/93/82013205fa67822a91b44d57b9f645.png ) for the 2.5?

Thu 9 May 2013

GATK Dev Team

