The term "workshop" is used all over the place to describe very different things. In the GATK world, a workshop is a multi-day course that includes both lectures and hands-on exercises, interleaved to provide a well-balanced learning experience.
Our standard 4-day bootcamp-style workshop, described below, covers basic genomics, all currently supported Best Practices pipelines, and pipelining with WDL/Cromwell/Terra. Other formats may be available upon request.
The workshop focuses on the core steps involved in calling variants with the Broad’s Genome Analysis Toolkit, using the “Best Practices” developed by the GATK team. Participants will learn why each step is essential to the variant discovery process, what operations are performed on the data at each step, and how to use the GATK tools to get the most accurate and reliable results from their datasets. Over the course of the workshop, we highlight key functionalities such as the GVCF workflow for joint discovery of germline short variants in cohorts, somatic short variant discovery using Mutect2, and somatic copy number variation discovery using GATK-CNV. We also practice using pipelining tools to assemble and execute GATK workflows.
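To give a rough idea of what the GVCF workflow looks like in practice, here is a minimal Python sketch that only builds and prints the command lines for its three stages: per-sample calling in GVCF mode, consolidation of the per-sample GVCFs, and joint genotyping. It assumes GATK4-style tool invocations (`HaplotypeCaller`, `CombineGVCFs`, `GenotypeGVCFs`). All file names, sample names, and helper function names are hypothetical, and the sketch leaves out preprocessing, interval handling, and variant filtering. For larger cohorts the consolidation step is usually done differently (for example with `GenomicsDBImport`), so treat this as an illustration rather than a recommended pipeline.

```python
import shlex
from typing import Dict, List


def haplotype_caller_gvcf(reference: str, bam: str, out_gvcf: str) -> List[str]:
    """Per-sample calling in GVCF mode (-ERC GVCF)."""
    return ["gatk", "HaplotypeCaller",
            "-R", reference, "-I", bam, "-O", out_gvcf,
            "-ERC", "GVCF"]


def combine_gvcfs(reference: str, gvcfs: List[str], out_gvcf: str) -> List[str]:
    """Consolidate per-sample GVCFs into one multi-sample GVCF."""
    cmd = ["gatk", "CombineGVCFs", "-R", reference]
    for gvcf in gvcfs:
        cmd += ["-V", gvcf]
    return cmd + ["-O", out_gvcf]


def genotype_gvcfs(reference: str, combined_gvcf: str, out_vcf: str) -> List[str]:
    """Joint genotyping across the whole cohort."""
    return ["gatk", "GenotypeGVCFs",
            "-R", reference, "-V", combined_gvcf, "-O", out_vcf]


def build_joint_calling_plan(reference: str,
                             sample_bams: Dict[str, str],
                             cohort: str = "cohort") -> List[List[str]]:
    """Return the ordered list of commands for a small cohort."""
    plan: List[List[str]] = []
    gvcfs: List[str] = []
    for sample, bam in sample_bams.items():
        gvcf = f"{sample}.g.vcf.gz"  # hypothetical output name
        gvcfs.append(gvcf)
        plan.append(haplotype_caller_gvcf(reference, bam, gvcf))
    combined = f"{cohort}.g.vcf.gz"
    plan.append(combine_gvcfs(reference, gvcfs, combined))
    plan.append(genotype_gvcfs(reference, combined, f"{cohort}.vcf.gz"))
    return plan


if __name__ == "__main__":
    # Hypothetical inputs; nothing is executed, the commands are only printed.
    commands = build_joint_calling_plan(
        "reference.fasta",
        {"sampleA": "sampleA.bam", "sampleB": "sampleB.bam"},
    )
    for command in commands:
        print(shlex.join(command))
```

Building the commands as lists and printing them, instead of running them, makes it easy to inspect the order of operations before committing any compute time.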
In the hands-on sessions focused on analysis, we walk participants through exercises that teach them how to manipulate the standard data formats involved in variant discovery and how to apply GATK tools appropriately to common use cases and data types. In the course of these exercises, we demonstrate useful tips and tricks for interacting with GATK and Picard tools, dealing with problems, and using third-party tools such as IGV and RStudio.
In the optional hands-on sessions on pipelining, we walk participants through exercises that teach them to write workflow scripts using WDL, the Broad's Workflow Description Language, and to execute these workflows locally with Cromwell as well as through Terra (formerly FireCloud), our publicly available, secure cloud-based analysis service.
This workshop is aimed at a mixed audience: people who are new to variant discovery or to GATK and want an introduction to the tools, as well as existing GATK users who want to improve their understanding of and proficiency with the tools. Participants should already be familiar with the basic terms and concepts of genetics and genomics. Basic familiarity with the command line environment is required.
Participants will be expected to bring their own laptops with software preinstalled (detailed instructions here) unless the workshop host provides a computer lab. Note that we do the majority of the hands-on exercises on our cloud platform, Terra, which allows us to focus on the scientific aspects of the work and eliminate common sources of technical difficulties. However, a few exercises still need to be run locally, so it's very important that everyone follow the installation instructions. This also ensures that if we experience any connectivity issues, we can fall back on local machines. We encourage everyone to use Chrome as their web browser. Supported systems are Mac and Unix/Linux; MS Windows is NOT supported. Chromebooks are also not supported because they cannot run downloadable software.
Please note that this workshop is focused on human data analysis. Most of the material presented applies equally to non-human data, and we will address some questions about the adaptations needed for analysis of non-human data, but we will not go into much detail on those points.