GATK 4 will be the next major version of GATK, bringing together well-established tools from the GATK and Picard codebases under a single simplified, streamlined framework, and enabling selected tools to be run in a massively parallel way on local clusters or in the cloud using Apache Spark. This package contains the latest alpha preview release.
This project is in an alpha development stage and is not yet ready for general use. Available documentation can be found in the GATK 4 Alpha forum and in the source code repositories (see further below).
All POSIX operating systems (Unix, Linux, Mac OS X, etc.) are supported. Microsoft Windows is not supported. The current version requires Java 8. Oracle Java is preferred; OpenJDK is not officially supported. A few tools require R 3.1.3 (mainly for plotting). An R script is available here to install any R libraries not already present on your system. This script can be invoked as follows:
Rscript install_R_packages.R
A frontend launching script is provided for convenience; it requires Python version 2.6 or greater.
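As a rough illustration of the requirements above, the following sketch checks whether the expected executables are on the PATH and which major Java version is installed. It is not part of the GATK package. The function names are hypothetical, the executable names (java, Rscript, python) are assumptions about a typical installation, and the sketch itself assumes Python 3.7 or later, which is newer than the frontend script needs.

```python
import re
import shutil
import subprocess


def java_major_version(version_output):
    """Extract the major Java version from `java -version` output.

    Handles the legacy scheme ("1.8.0_121" -> 8) and the newer
    scheme ("11.0.2" -> 11). Returns None if no version is found.
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    second = match.group(2)
    if first == 1 and second is not None:
        return int(second)
    return first


def missing_executables(names=("java", "Rscript", "python")):
    """Return the executables from `names` that are not found on PATH.

    The default names are assumptions; adjust them for your system.
    """
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_executables()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    if "java" not in missing:
        # `java -version` writes its banner to stderr, not stdout.
        result = subprocess.run(
            ["java", "-version"], capture_output=True, text=True
        )
        major = java_major_version(result.stderr)
        print("Detected Java major version:", major)
        if major != 8:
            print("Note: the requirements above call for Java 8.")
```

The check only reports what it finds; it does not distinguish Oracle Java from OpenJDK or verify the installed R version.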
The GATK 4 Alpha package is made available for free to academic researchers under a limited license for non-commercial use. Note that GATK 4 is not currently available for commercial use due to its alpha status.
The source code for the core GATK tools is available in the broadinstitute/gatk-protected repository on GitHub. The source code for the GATK 4 engine, which is open-source, is available separately in the broadinstitute/gatk repository on GitHub. That repository includes the engine, infrastructure libraries and many utility tools. It is intended for developers who wish to develop their own tools, and is freely available to all under a BSD license.