Today in the US we're celebrating the national "Stuffing Your Face" holiday known as Thanksgiving, and we get the day off tomorrow to recover, so the forum is going to be unattended until Monday Nov 27.
So whether you're in the US and looking for an escape from your in-laws, or you're in some other part of the world and waiting for an answer to your pressing GATK question... Why not take a break, slip away for a bit and read the HaplotypeCaller paper, which is finally out in preprint form on bioRxiv here, under the title "Scaling accurate genetic variant discovery to tens of thousands of samples".
Or do both us and yourself a favor by filling in our GATK survey and winning one of the 100 prizes we're giving away! Seriously, we have a whole bunch of $50 Amazon gift cards (which you can get in your local currency if you live outside the USA) and prizes of up to $500 in compute credits to spend in FireCloud, our cloud-based analysis platform. You can read more about the goal of the survey here.
We have identified a major bug in the GenomicsDBImport tool that affects all released beta versions of GATK4 up to 4.beta.5 (inclusive). The bug occurs under specific conditions (detailed below) and causes the output of joint calling to be scrambled across samples, i.e. the sample names will not be associated with the correct sample data. For example, the data for sample1 may be labeled as sample3. The good news is that the results are recoverable as long as you have a record of the exact parameters used in the original command.
So if you have used this tool, please read the detailed description of the bug conditions and recovery procedure below. We apologize for any inconvenience this may cause you.
Going forward, everyone who plans to use GenomicsDBImport should upgrade to GATK4 version 4.beta.6 or later.
You may have noticed that we're currently experiencing a surge in spamming on the forum. We're working with our host, Vanilla, to tighten our spam blockers and flood control measures. As part of that we're going to increase the mandatory delay between postings, so you may find yourself unable to post several times in a row. If that happens, you'll need to wait a few minutes and try again. This should help cut down the spam by reducing how quickly spammers can post, which in turn will give us time to ban them, and generally make it less worthwhile for them to attack us.
You may also find yourself unable to post if you never got around to confirming the email you signed up with. Due to an oversight, we weren't requiring email confirmation, which is one of the ways spammers can gain advantage over our controls. So we've closed that hole, but it does mean that if you haven't done it yet, you will now need to go back and confirm your email with the forum. If you can't find the original forum email, click here to get a new confirmation request emailed to you.
I'm very sorry for the inconvenience this represents for all of you legitimate forum denizens, and I hope we can get past this quickly.
The slide decks and hands-on tutorial materials from the 5-day GATK workshop in Pretoria, South Africa can be downloaded at this Google Drive link.
As many of you know, GATK4 Beta is out and we are excited for the full GATK4 to be released in the new year. It has been a long time coming and we hope that many of you have gotten to experiment with its features before the big release. In fact, we’d like to know if you’ve tried it out! We crafted a survey that asks about your experience with GATK, the Beta release, your thoughts on the upcoming GATK4 release, and the infrastructure you run on. Your answers will help improve our team’s communication, support, and product development efforts. If you have not used the Beta or do not know much about GATK4, please tell us so in the survey, as that is very useful for us to know. This survey is for anyone who has ever used any GATK version.
The survey is 27 questions long and should take about 10 minutes to complete. We want to compensate you for your time, so we have obtained some reward funding from the Intel Center for Genomic Data Engineering, established at the Broad Institute in 2017.
We will randomly draw 100 survey participants and offer each winner one prize of their choosing:
A long time ago in a galaxy far, far away, we started work on a brand new version of GATK in which the engine framework was to be completely revamped, streamlined and accelerated, with support for cloud technologies and an impressively expanded scope of analysis (copy number! structural variation! somatic and germline versions of everything!). Oh, and it would be fully open-source.
Today that new beginning is tantalizingly close to fruition. We've had a series of beta versions out for preview for about three months, and we've actually had several segments of our genome production pipeline running a subset of fully-vetted GATK4 tools for over a year. Aside from a few remaining technical issues that are actively being addressed, the work left to be done before general release mainly involves clean-up and streamlining of user-facing functionality: what gets logged and how, argument names and syntax, documentation and so on.
So it's time to set a date and put a ring on it! I'm thrilled to announce the happy event will take place on Jan 9, 2018.
Fall is my favorite season -- it combines the best weather in New England and the most active period of the year for GATK events and announcements (although sometimes the latter means we don't get to go out and enjoy the former as much as we'd like). In keeping with that, we have a couple of important announcements that will go out early next week. However, I'm breaking radio silence now to give you a quick update on what's been happening with events and workshops specifically.
Plus upcoming GATK workshops and links to recent workshop materials.
One more 3.x version, for the road! That's right, even as we're ramping up our efforts on GATK4 (we're three beta releases in at this point, and getting down to brass tacks writing the migration guide ahead of the 4.0 general release) we still found it worthwhile to cut one last release of GATK3.
Our main motivation here is to introduce the Intel Genomics Kernel Library, which comes bearing the gift of speed improvements for those of you who won't be able to migrate to GATK4 right away.
As a secondary benefit, this version includes a handful of bug fixes, some usability improvements including better error messages, documentation fixes and logging tweaks, and a few improvements to annotation calculations (especially in allele-specific mode), which you'll find described briefly in the release notes. No big changes though, except perhaps the new default behavior of VariantsToTable with regard to missing annotation values, discussed below. Finally, we've committed a copy of all the peripheral documentation (= the docs that live in the forum and complement the tool documentation) to the now-old GATK codebase.
And thus, the last-ever GATK3 version emerges covered in carbonite.
GATK 3.8 was released on July 28, 2017. Itemized changes are listed below. For more information on what you should care about, see the user-friendly version highlights.
Today we are reaching out to the Chinese research community with great news: we are partnering with key companies and institutions in China to empower Chinese researchers to use GATK effectively and at scale.
See Events calendar for full list and dates