This is becoming a bit of a yearly tradition: next week we're heading over to Bio-IT World Expo in Boston (so a short hop across the Charles River) to announce the majorly rebooted version of GATK, which we've affectionately dubbed GATK4. Because it will be version 4.

Look, if you've ever seen the names we give our tools, you know that naming things isn't exactly where we put our creativity to work. It's a precious resource, and anyway we rather like things to be self-explanatory.

Yes, technically we already announced GATK4 at Bio-IT last year, but no, this is not a re-run. Last year was a heads-up that we were working on this significant new reimplementation of the toolkit. We were mostly there to talk about the core features of the new framework, which famously excited the Spark-savvy in the crowd (because it supports Apache Spark). But it was definitely still under heavy development; while we had the CNV tools just about ready for testing, as I recall there wasn't even a glimmer of the HaplotypeCaller in there yet.

This year is very different. We have a toolkit that is in the final stages of polishing up for public consumption. We have multiple Best Practices workflows, because we're not just about the germline SNPs and indels anymore. And we also have numbers. Dates for the beta and full releases, performance estimates...

All of which we'll present during a luncheon event we're holding with our wonderful partners at Intel Life Sciences, who have contributed some of GATK4's key new features. The luncheon will take place Wednesday the 24th at 12:40 PM, at a location TBD (because I can't figure it out from the Bio-IT program, which is not self-explanatory). We'll be in Track 1: Data and Storage Management, which may sound super boring (no offense to other speakers in this track) but come on and join us if you can; I predict you'll be pleasantly surprised.

As a coda, we'll be holding Q&A sessions in the Intel Hospitality Suite, aka the Dartmouth room in the WTC, at the following times: Wednesday the 24th from 1:30 PM to 3:15 PM, and Thursday the 25th from 10:30 AM to 11:30 AM. Swing on by if you have any burning questions about GATK4.

We look forward to seeing you there! And if you can't make it because of trivial considerations like geographical incompatibility (oceans, shmoceans), check out this blog or follow @gatk_dev on Twitter. We'll post a summary of the announcements shortly after the luncheon presentation.

Here's the program abstract:

12:40 Luncheon Presentation I: Broad Institute & Intel GATK 4.0 Optimization Overview

Eric Banks, Senior Director, Data Science and Data Engineering Group, Broad Institute
Geraldine Van der Auwera, Associate Director, Outreach and Communications, GATK, Broad Institute
Mark Bagley, Director, Center for Genomic Data Engineering, Intel
Paolo Narvaez, Senior Director, Engineering, Intel

Genomics research leader the Broad Institute of MIT and Harvard joins Intel to describe their collaboration to enhance the GATK environment and scale researchers’ ability to analyze massive amounts of genomic data from diverse sources worldwide. Topics include performance best practices and the latest on Genomics DB and FireCloud.

Note that we're not actually going to talk about FireCloud at the luncheon event (what can I say, abstracts are immutable descriptors of mutable structures), but we will be doing demos of FireCloud throughout Bio-IT at the Google booth. A more detailed announcement to that effect will be posted shortly on the FireCloud blog.

And look, our GATK4 luncheon made it into the official Bio-IT preview!

Eric Banks and Geraldine Van der Auwera of the Broad Institute along with Mark Bagley and Paolo Narvaez of Intel will co-host a luncheon session to describe their collaboration to enhance the GATK environment and scale researchers’ ability to analyze massive amounts of genomic data from diverse sources worldwide. Wednesday, May 24, 12:40 pm


Geraldine_VdAuwera on 19 May 2017

Here's a map of where we'll be doing the Q&A sessions, since the Bio-IT website is rather impenetrable.
