Latest posts
 


The presentation slide decks and hands-on tutorial materials can be downloaded at this Google Drive link.




Cross-posted from https://github.com/broadinstitute/picard/issues/647

For many years now we’ve been hearing from users of both GATK and Picard about how they’d love to see the two projects unite into a single "toolkit-to-rule-them-all", for the sake of user convenience, to promote consistency across tools, and to minimize duplication of effort.

With the advent of GATK 4 this suddenly became a real possibility, as the decision was made to start the new GATK codebase from the Picard base classes rather than the old GATK 3.x base classes. This allows free-form Picard-style tools and GATK “walkers” built upon an engine traversal to peacefully co-exist within the same framework. Last year, a Picard engineer successfully ported all Picard tools to the GATK 4 codebase with only minor changes to the tools themselves. More recently, efforts have been made to harmonize the build systems of the two projects, resulting in Picard’s recent move to Gradle.

Importantly, the core GATK 4 codebase at https://github.com/broadinstitute/gatk is released entirely under the BSD 3-clause license, a big improvement over the confusing licensing situation in GATK 3.x, with its mix of open-source and proprietary licenses within the same repository. That is where any Picard tools moved to the GATK 4 codebase would live, remaining fully open source and free for all.

As all of the technical pieces are now in place to allow for a merger of the two projects (with the guarantee that the open-source nature of Picard code will be preserved) we are soliciting feedback from the Picard developer community about the prospect of a union with GATK. Would people here be generally in favor of such a move? Are there any strong objections to this idea? Any concerns that should be addressed before we head any further down this path?





We are starting official support of GRCh38, a reference genome with alternate contigs.

In fact, going forward all of our new projects will use GRCh38. During this transition over the coming year, we will keep supporting GRCh37/hg19. Here are nine takeaways to help you get started in using the latest reference.





Believe it or not, we've done seven workshops so far this year, spread across five countries, spanning three continents -- the furthest ones in Australia (Sydney and Melbourne) and the most recent one in Helsinki, Finland. That's a lot of flying, but on the bright side, I now have Gold status on American Airlines (hello fast track lane).

So after a restful summer hiatus we're gearing up to revisit continental Europe -- specifically, we're heading to Basel, Switzerland, at the invitation of the Swiss Institute of Bioinformatics.

We'll be following our standard formula of one day of lectures focused on the Best Practices for variant discovery, and one day of optional hands-on practical sessions demonstrating key steps of analysis and interpretation. The registration page is now live at this link: http://www.sib.swiss/training/upcoming-training-events/training/gatk-workshop-lecture

One important note: we'll be offering the day of hands-on practicals twice (in order to serve more people), which is why the workshop dates span three days -- but to be clear, each person will only attend two days out of the three (the lectures are on Day 1 for everyone). The practical sessions have limited space and tend to fill up fast, so don't wait too long to register -- especially if you have a strong preference about which of the two optional days would work better for you.

If you can't make it to Basel, the next workshops will be at Broad in Boston/Cambridge (USA) on November 7-8, then at VIB in Leuven (Belgium), dates TBD (probably February). Details will follow in due time.

We look forward to seeing many of you in Basel!




Folks, it really makes my day when I get to announce some good news that has been cooking for a long time. So this is going to be a very happy Humpday indeed.

The good news (which I may have hinted at previously) is that we are making our production pipeline scripts public, starting with the one that implements our Best Practices for data pre-processing and initial variant calling (aka GVCF generation) in whole genomes. Not only that, all GRCh38/hg38 resource files needed to run it, plus test data, are in a Google Cloud bucket. In time the bucket will replace our not-so-reliable FTP server as our bundle-sharing mechanism.

Details below the fold, in FAQ format (sort of).

TL;DR: Take this script and run it, for it is our WGS processing production workflow (uBAMs -> GVCF per-sample).





This morning, we unveiled an interactive Google Map, based on anonymized IP addresses collected from the forum database, that shows how the GATK user community is distributed across the globe. Check out Boston/Cambridge!

For the record, this was originally inspired by the World Map of High-throughput Sequencers by James Hadfield (Cancer Research UK, Cambridge) and Nick Loman (University of Birmingham).

As several people have already expressed interest in how this map was put together, I thought I'd give a brief overview of the technical side below the fold. I'm happy to provide more details and/or code if anyone wants to do something similar.





For largely practical reasons, the GATK website home URL has become http://software.broadinstitute.org/gatk. Don't worry, your bookmarked www links will still work foreveeeer -- at least that's what I'm told by our valiant IT folks. As always, let us know if you run into any trouble, not that we're expecting any.




First, I hope those of you in the USA had a relaxing and/or exciting holiday weekend (happy birthday, 'Murica!). For the rest, we thank you for your patience as we recover from the festivities and work our way through the backlog of forum questions.

Now, I wanted to let you know that over the next few weeks, we're going to push out a variety of improvements to the GATK website and documentation contents. We start today with a main push that involves some structural changes that we think will improve the user experience overall and make it easier for new users in particular. Much of this is based on feedback we've received over the years, so hopefully we're following the will of the people!

We've done our best to avoid causing any disruptions for those of you who have been using our website for a long time, but we did have to move a few things around. Here are the highlights; if you have strong feelings about any of this (good or bad) let us know in the comments. Also let us know if you stumble across anything that looks broken and we'll fix it double quick.





We are streamlining our recommended workflows by removing a preprocessing step.

As announced in the GATK v3.6 highlights, variant calling workflows that use HaplotypeCaller or MuTect2 now omit indel realignment. This change does not apply to workflows that call variants with UnifiedGenotyper or the original MuTect. We still recommend indel realignment for these legacy workflows. Recall that indel realignment uses RealignerTargetCreator and IndelRealigner and comes after duplicate marking and before base quality score recalibration (BQSR).
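
To make the rule concrete, here is a minimal Python sketch of which pre-processing steps apply for a given caller. The function name `preprocessing_steps` and the plain-text step labels are illustrative assumptions, not part of GATK. Only the tool names and their ordering come from the paragraph above.

```python
# Callers named in the post, grouped by whether indel realignment still applies.
CALLERS_WITHOUT_REALIGNMENT = {"HaplotypeCaller", "MuTect2"}
LEGACY_CALLERS = {"UnifiedGenotyper", "MuTect"}


def preprocessing_steps(caller):
    """Return the ordered pre-processing steps for a variant caller.

    Hypothetical helper for illustration only; it does not run any tool.
    """
    if caller not in CALLERS_WITHOUT_REALIGNMENT | LEGACY_CALLERS:
        raise ValueError("caller not covered by this sketch: " + caller)

    steps = ["duplicate marking"]
    if caller in LEGACY_CALLERS:
        # Indel realignment sits after duplicate marking and before BQSR.
        steps += ["RealignerTargetCreator", "IndelRealigner"]
    steps.append("base quality score recalibration (BQSR)")
    return steps


if __name__ == "__main__":
    for name in ("HaplotypeCaller", "UnifiedGenotyper"):
        print(name, "->", ", ".join(preprocessing_steps(name)))
```

The two realignment tools are listed as separate steps because the post names them separately; everything else about how each step is invoked is left out on purpose.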

In light of these changes, let’s take a brisk stroll through the implications for variant detection. In particular, let’s focus on insertion and deletion events (indels).







