This is becoming a bit of a yearly tradition; next week we're heading over to Bio-IT World Expo in Boston (so a short hop across the Charles River) to announce the majorly rebooted version of GATK which we've affectionately dubbed GATK4. Because it will be version 4.

Look, if you've ever seen the names we give our tools, you know that naming things isn't exactly where we put our creativity to work. It's a precious resource, and anyway we rather like things to be self-explanatory.

Yes, technically we already announced GATK4 at Bio-IT last year, but no, this is not a re-run. Last year was a heads-up that we were working on this significant new reimplementation of the toolkit. We were mostly there to talk about the core features of the new framework, which famously excited the Spark-savvy in the crowd (because it supports Apache Spark). But it was definitely still under heavy development; while we had the CNV tools just about ready for testing, as I recall there wasn't even a glimmer of the HaplotypeCaller in there yet.

This year is very different. We have a toolkit that is in the final stages of polishing up for public consumption. We have multiple Best Practices workflows, because we're not just about the germline SNPs and indels anymore. And we also have numbers. Dates for the beta and full releases, performance estimates...

All of which we'll present during a luncheon event we're holding with our wonderful partners at Intel Life Sciences, who have contributed some of GATK4's key new features. The luncheon will take place Wednesday the 24th at 12:40 PM, at a location TBD (because I can't figure it out from the Bio-IT program, which is not self-explanatory). We'll be in Track 1: Data and Storage Management, which may sound super boring (no offense to other speakers in this track) but come on and join us if you can; I predict you'll be pleasantly surprised.

As a coda, we'll be holding Q&A sessions in the Intel Hospitality Suite, aka Dartmouth room in the WTC, at the following times: Wednesday the 24th from 1:30 PM to 3:15 PM, and Thursday the 25th from 10:30 to 11:30 AM. Swing on by if you have any burning questions about GATK4.

We look forward to seeing you there! And if you can't make it because of trivial considerations like geographical incompatibility (oceans, shmoceans), check out this blog or follow @gatk_dev on Twitter. We'll post a summary of the announcements shortly after the luncheon presentation.

Here's the program abstract:

12:40 Luncheon Presentation I: Broad Institute & Intel GATK 4.0 Optimization Overview

Eric Banks, Senior Director, Data Science and Data Engineering Group, Broad Institute

Geraldine Van der Auwera, Associate Director, Outreach and Communications, GATK, Broad Institute

Mark Bagley, Director, Center for Genomic Data Engineering, Intel

Paolo Narvaez, Senior Director, Engineering, Intel

Genomics research leader the Broad Institute of MIT and Harvard joins Intel to describe their collaboration to enhance the GATK environment and scale researchers’ ability to analyze massive amounts of genomic data from diverse sources worldwide. Topics include performance best practices and the latest on Genomics DB and FireCloud.

Note that we're not actually going to talk about FireCloud at the luncheon event (what can I say, abstracts are immutable descriptors of mutable structures) but we will be doing demos of FireCloud throughout Bio-IT at the Google booth. A more detailed announcement will be posted shortly to that effect on the FireCloud blog.

And look, our GATK4 luncheon made it into the official Bio-IT preview!

Eric Banks and Geraldine Van der Auwera of the Broad Institute along with Mark Bagley and Paolo Narvaez of Intel will co-host a luncheon session to describe their collaboration to enhance the GATK environment and scale researchers’ ability to analyze massive amounts of genomic data from diverse sources worldwide. Wednesday, May 24, 12:40 pm

Geraldine_VdAuwera on 19 May 2017

Here's a map of where we'll be doing the Q&A sessions, since the Bio-IT website is rather impenetrable.
