This is becoming a bit of a yearly tradition: next week we're heading over to Bio-IT World Expo in Boston (so a short hop across the Charles River) to announce the majorly rebooted version of GATK, which we've affectionately dubbed GATK4. Because it will be version 4.

Look, if you've ever seen the names we give our tools, you know that naming things isn't exactly where we put our creativity to work. It's a precious resource, and anyway we rather like things to be self-explanatory.

Yes, technically we already announced GATK4 at Bio-IT last year, but no, this is not a re-run. Last year was a heads-up that we were working on this significant new reimplementation of the toolkit. We were mostly there to talk about the core features of the new framework, which famously excited the Spark-savvy in the crowd (because it supports Apache Spark). But it was definitely still under heavy development; while we had the CNV tools just about ready for testing, as I recall there wasn't even a glimmer of the HaplotypeCaller in there yet.

This year is very different. We have a toolkit that is in the final stages of polishing up for public consumption. We have multiple Best Practices workflows, because we're not just about the germline SNPs and indels anymore. And we also have numbers. Dates for the beta and full releases, performance estimates...

All of which we'll present during a luncheon event we're holding with our wonderful partners at Intel Life Sciences, who have contributed some of GATK4's key new features. The luncheon will take place Wednesday the 24th at 12:40 PM, at a location TBD (because I can't figure it out from the Bio-IT program, which is not self-explanatory). We'll be in Track 1: Data and Storage Management, which may sound super boring (no offense to other speakers in this track) but come on and join us if you can; I predict you'll be pleasantly surprised.

As a coda, we'll be holding Q&A sessions in the Intel Hospitality Suite, aka the Dartmouth room in the WTC, at the following times: Wednesday the 24th from 1:30 PM to 3:15 PM, and Thursday the 25th from 10:30 AM to 11:30 AM. Swing on by if you have any burning questions about GATK4.

We look forward to seeing you there! And if you can't make it because of trivial considerations like geographical incompatibility (oceans, shmoceans), check out this blog or follow @gatk_dev on Twitter. We'll post a summary of the announcements shortly after the luncheon presentation.


Here's the program abstract:

12:40 Luncheon Presentation I: Broad Institute & Intel GATK 4.0 Optimization Overview

Eric Banks, Senior Director, Data Science and Data Engineering Group, Broad Institute
Geraldine Van der Auwera, Associate Director, Outreach and Communications, GATK, Broad Institute
Mark Bagley, Director, Center for Genomic Data Engineering, Intel
Paolo Narvaez, Senior Director, Engineering, Intel

Genomics research leader the Broad Institute of MIT and Harvard joins Intel to describe their collaboration to enhance the GATK environment and scale researchers’ ability to analyze massive amounts of genomic data from diverse sources worldwide. Topics include performance best practices and the latest on Genomics DB and FireCloud.

Note that we're not actually going to talk about FireCloud at the luncheon event (what can I say, abstracts are immutable descriptors of mutable structures), but we will be doing demos of FireCloud throughout Bio-IT at the Google booth. A more detailed announcement to that effect will be posted shortly on the FireCloud blog.

And look, our GATK4 luncheon made it into the official Bio-IT preview!

Eric Banks and Geraldine Van der Auwera of the Broad Institute along with Mark Bagley and Paolo Narvaez of Intel will co-host a luncheon session to describe their collaboration to enhance the GATK environment and scale researchers’ ability to analyze massive amounts of genomic data from diverse sources worldwide. Wednesday, May 24, 12:40 pm



Geraldine_VdAuwera on 19 May 2017


Here's a map of where we'll be doing the Q&A sessions, since the Bio-IT website is rather impenetrable.



