Latest posts
 


As many of you know, GATK4 Beta is out, and we are excited for the full GATK4 to be released in the new year. It has been a long time coming, and we hope that many of you have gotten to experiment with its features before the big release. In fact, we’d like to know if you’ve tried it out! We crafted a survey that asks about your experience with GATK, the Beta release, your thoughts on the upcoming GATK4 release, and the infrastructure you run on. Your answers will help improve our team’s communication, support, and product development efforts. If you have not used the Beta or do not know much about GATK4, please tell us so in this survey, as that is very useful for us to know. This survey is for anyone who has ever used any GATK version.

The Survey

The survey is 27 questions long and should take 15-20 minutes to complete. To compensate you for your time, we have secured some reward funding from the Intel Center for Genomic Data Engineering, established at the Broad Institute in 2017.

We will randomly draw 100 survey participants and offer each winner one prize of their choosing:

  • $50 Amazon gift card (85 available)
  • $250 FireCloud credit (10 available)
  • $500 FireCloud credit (5 available)




A long time ago in a galaxy far, far away, we started work on a brand new version of GATK in which the engine framework was to be completely revamped, streamlined and accelerated, with support for cloud technologies and an impressively expanded scope of analysis (copy number! structural variation! somatic and germline versions of everything!). Oh, and it would be fully open-source.

Today that new beginning is tantalizingly close to fruition. We've had a series of beta versions out for preview for about three months, and we've actually had several segments of our genome production pipeline running a subset of fully-vetted GATK4 tools for over a year. Aside from a few remaining technical issues that are actively being addressed, the work left to be done before general release mainly involves clean-up and streamlining of user-facing functionality: what gets logged and how, argument names and syntax, documentation and so on.

So it's time to set a date and put a ring on it! I'm thrilled to announce the happy event will take place on Jan 9, 2018.





Fall is my favorite season -- it combines the best weather in New England and the most active period of the year for GATK events and announcements (although sometimes the latter means we don't get to go out and enjoy the former as much as we'd like). In keeping with that, we have a couple of important announcements that will go out early next week. However, I'm breaking radio silence now to give you a quick update on what's been happening with events and workshops specifically.

Coming soonest: FLOW pipelining workshop showcasing GATK4 pipelines, organized by DNAstack in Orlando, Oct 17

Plus upcoming GATK workshops and links to recent workshop materials.

Details below!





One more 3.x version, for the road! That's right, even as we're ramping up our efforts on GATK4 (we're three beta releases in at this point, and getting down to brass tacks writing the migration guide ahead of the 4.0 general release) we still found it worthwhile to cut one last release of GATK3.

Our main motivation here is to introduce the Intel Genomics Kernel Library, which comes bearing the gift of speed improvements for those of you who won't be able to migrate to GATK4 right away.

As a secondary benefit, this version includes a handful of bug fixes, some usability improvements including better error messages, documentation fixes and logging tweaks, and a few improvements to annotation calculations (especially in allele-specific mode), which you'll find described briefly in the release notes. No big changes though, except perhaps the new default behavior of VariantsToTable with regard to missing annotation values, discussed below. Finally, we've committed a copy of all the peripheral documentation (= the docs that live in the forum and complement the tool documentation) to the now-old GATK codebase.

And thus, the last-ever GATK3 version emerges covered in carbonite.





GATK 3.8 was released on July 28, 2017. Itemized changes are listed below. For more information on what you should care about, see the user-friendly version highlights.





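To our friends in the research community: here we come! Together with leading companies and institutions in China, we are bringing you tips for using GATK efficiently and at scale!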

Today we are reaching out to the Chinese research community with great news: we are partnering with key companies and institutions in China to empower Chinese researchers to use GATK effectively and at scale.





We've been so caught up in the excitement of the imminent beta release of GATK4 (possibly later this week!) that we forgot to announce upcoming workshops! And two of them are coming up fast, just a month from now. Specifically, we'll be in Cambridge, UK, July 12-14 and then in Edinburgh, UK, July 17-19. There's still time to register for both, but the hands-on sessions have limited space, so don't wait around!

GATK is also going to make a brief appearance at BOSC '17 in Prague, CZ, July 21-22. Our team member Kate Voss will give a lightning talk and present a poster about our genomics pipelining stack, which is composed of GATK4+WDL+Cromwell. I'm frankly delighted that our abstract was accepted as a late-breaking submission; it's a pleasure to kick off the new open-source era of GATK at the most open-sourcey meeting of the year!

Going forward, be sure to check out the new Events calendar feature we just added to the website to help you keep track of events more systematically.




This is one of two posts announcing the imminent beta release of GATK4; for a technical description of features, see this other post.

"Wait, what?" Yes, you read that right, we're moving GATK4 to a fully open source license -- specifically, BSD 3-clause. And to be clear, this applies to all of GATK4. Not just the core framework (which, little known fact, has always been open source), but all the tools that were previously "protected", including HaplotypeCaller, the new CNV discovery tools, everything. The whole enchilada.





Unboxing GATK4

Posted by Geraldine_VdAuwera on 24 May 2017 (5)


This is one of two posts announcing the imminent beta release of GATK4; for details about the open-source licensing, see this other post.

You've probably heard it by now: we are on the cusp of releasing GATK4 into beta status (targeting mid-June), and we plan to push out a general release shortly thereafter (targeting midsummer). That's great. So what's in the box?

Over two years of active development have gone into producing GATK4, and I'm happy to say we have plenty to show for it. Specifically, we've pushed the evolution of GATK on three fronts: (1) technical performance, i.e. speed and scalability; (2) new functionality and expanded scope of analysis, e.g. we can do CNVs now; and (3) openness to collaboration, through open-sourcing as well as general developer-friendliness (documented code! consistent APIs! clear contribution guidelines!).

Want more detail? Let me give you a tour of the highlights, using slides from the presentation I gave at Bio-IT earlier today (code reuse: it's not just for code anymore).





This is becoming a bit of a yearly tradition; next week we're heading over to Bio-IT World Expo in Boston (so a short hop across the Charles River) to announce the majorly rebooted version of GATK which we've affectionately dubbed GATK4. Because it will be version 4.

Look, if you've ever seen the names we give our tools, you know that naming things isn't exactly where we put our creativity to work. It's a precious resource, and anyway we rather like things to be self-explanatory.

Yes, technically we already announced GATK4 at Bio-IT last year, but no, this is not a re-run. Last year was a heads-up that we were working on this significant new reimplementation of the toolkit. We were mostly there to talk about the core features of the new framework, which famously excited the Spark-savvy in the crowd (because it supports Apache Spark). But it was definitely still under heavy development; while we had the CNV tools just about ready for testing, as I recall there wasn't even a glimmer of the HaplotypeCaller in there yet.

This year is very different. We have a toolkit that is in the final stages of polishing up for public consumption. We have multiple Best Practices workflows, because we're not just about the germline SNPs and indels anymore. And we also have numbers. Dates for the beta and full releases, performance estimates...

All of which we'll present during a luncheon event we're holding with our wonderful partners at Intel Life Sciences, who have contributed some of GATK4's key new features. The luncheon will take place Wednesday the 24th at 12:40 PM, at a location TBD (because I can't figure it out from the Bio-IT program, which is not self-explanatory). We'll be in Track 1: Data and Storage Management, which may sound super boring (no offense to other speakers in this track) but come on and join us if you can; I predict you'll be pleasantly surprised.

As a coda, we'll be holding Q&A sessions in the Intel Hospitality Suite, aka Dartmouth room in the WTC, at the following times: Wednesday the 24th from 1:30 PM to 3:15 PM, and Thursday the 25th from 10:30 to 11:30 AM. Swing on by if you have any burning questions about GATK4.

We look forward to seeing you there! And if you can't make it because of trivial considerations like geographical incompatibility (oceans, shmoceans), check out this blog or follow @gatk_dev on Twitter. We'll post a summary of the announcements shortly after the luncheon presentation.




