Computing Platforms
Recommended environments and services


We aim to provide the research community with a range of options for running our Best Practices workflows exactly as we run them in-house at the Broad Institute. To that end, we make all our workflow scripts publicly available on GitHub under a dedicated organization called gatk-workflows, and we provide Docker containers for all versions of GATK (since 2018) on Docker Hub. See the Best Practices to browse the pipelines by use case.

GATK's preferred pipelining solution: WDL + Cromwell

Our workflows are written in WDL, a user-friendly scripting language maintained by the OpenWDL community. Cromwell is an open-source workflow execution engine that supports WDL as well as CWL (the Common Workflow Language) and can run on a variety of platforms, both local and cloud-based. We take advantage of Cromwell's flexibility in our own work: we do some of our initial development on the Broad's UGER cluster, then we run at scale on Google Cloud. Because only the Cromwell backend changes, we can run exactly the same scripts regardless of the compute environment.
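As a rough illustration of running the same script in different environments, the sketch below builds a Cromwell command line in Python. It assumes Cromwell is launched from its jar in single-workflow `run` mode, with the backend chosen by a configuration file passed through the `-Dconfig.file` JVM property. The file names (`cromwell.jar`, `workflow.wdl`, `inputs.json`, `cluster.conf`, `cloud.conf`) and the helper `cromwell_run_command` are hypothetical placeholders, not names taken from the GATK workflows.

```python
import shlex
from typing import List, Optional


def cromwell_run_command(
    wdl_path: str,
    inputs_json: str,
    cromwell_jar: str = "cromwell.jar",   # hypothetical jar location
    backend_conf: Optional[str] = None,   # hypothetical backend config file
) -> List[str]:
    """Build the argument list for a single-workflow Cromwell run.

    The WDL script and inputs file stay the same across environments;
    only the optional backend configuration differs.
    """
    cmd = ["java"]
    if backend_conf is not None:
        # JVM system properties have to appear before -jar.
        cmd.append(f"-Dconfig.file={backend_conf}")
    cmd += ["-jar", cromwell_jar, "run", wdl_path, "--inputs", inputs_json]
    return cmd


if __name__ == "__main__":
    # Same workflow and inputs; default local backend, then two assumed configs.
    for conf in (None, "cluster.conf", "cloud.conf"):
        cmd = cromwell_run_command("workflow.wdl", "inputs.json", backend_conf=conf)
        print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run` unchanged. Keeping the backend in a separate configuration file is what lets the workflow script itself stay identical between a cluster and a cloud run.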



Platform options for running GATK

We work with multiple industry partners to provide services for running our pipelines more easily, whether on local computing infrastructures or on public cloud platforms. For some background information on this effort, please see this 2016 announcement.

Terra

GATK on the cloud with Terra

An open platform for accessing and analyzing data securely and at scale. Built on top of Google Cloud in collaboration with Verily Life Sciences and operated by the Broad Institute.


Intel

GATK on HPC with Intel BIGStack

Reference architectures and Cromwell-based appliance solutions for on-premises computing. Developed and serviced by Intel.