It's finally summer here in New England -- time for cave-dwelling developers to hit the beach and do the lobster dance (those of us who don't tan well anyway). We leave you with a new version of the GATK that includes a new(ish) plotting tool, some more performance improvements to the callers, a lot of feature tweaks and quite a few bug fixes. Be sure to check out the full list in the 2.6 Release Notes.

Highlights are below as usual, enjoy. There's one thing that we need to point out with particular emphasis: we have moved to Java 7, so you may need to update your system's Java version. Full explanation at the end of this document because it's a little long, but be sure to read it.


New(ish) plotting tool for Base Recalibration results

GATK old-timers may remember a tool called AnalyzeCovariates, which was part of the BQSR process in 1.x versions, many moons ago. Well, we've resurrected it to take over the plotting functionality of the BaseRecalibrator, to make it easier and faster to plot and compare the results of base recalibration. This also prevents issues with plot generation in scatter-gather mode. We'll update our docs on the BQSR workflow in the next few days, but in the meantime you can find full details of how to use this tool here.


HaplotypeCaller now so sensitive, it cries at the movies

We know you don't want to miss a single true variant, so for this release, we've put a lot of effort into making the HaplotypeCaller more sensitive. And it's paying off: in our tests, the HaplotypeCaller is now more sensitive than the UnifiedGenotyper for calling both SNPs and indels when run over whole genome datasets.

[graph to illustrate, coming soon]


UnifiedGenotyper: not out of the race yet

You might think all our focus is on improving the HaplotypeCaller these days; you would be wrong. The UnifiedGenotyper is still essential for calling large numbers of samples together, for dealing with exotic ploidies, and for calling pooled samples. So we've given it a turbo boost that makes it go twice as fast for calling indels on multiple samples.

The key change here is the updated Hidden Markov Model used by the UG. You can see on the graph that as the number of exomes being called jointly increases, the new HMM keeps runtimes down significantly compared to the old HMM.


Version tracking in the VCF header

Don’t you hate it when you go back to a VCF you generated some months ago, and you have no idea which version of GATK you used at the time? (And yes, versions matter. Sometimes a lot.) We sure do, so GATK now writes its version number into the header of the VCFs it generates.


Migration to Java 7

Speaking of software versions... As you probably know, the GATK runs on Java -- specifically, until now, version 6 of the Runtime Environment (which translates to version 1.6 if you ask java -version at the command prompt). But the Java language has been evolving under our feet; version 7 has been out and stable for some time now, and version 8 is on the horizon. We were happy as clams with Java 6… but now, newer computers with recent OS versions ship with Java 7, and on MacOS X once you update the system it is difficult to go back to using Java 6. And since Java 7 is not fully backwards compatible, people have been running into version problems.

So, we have made the difficult but necessary decision to follow the tide, and migrate the GATK to Java 7. Starting with this release, GATK requires Java 7 to run. If you try to run it with Java 6, you will probably get an error like this:

Exception in thread "main" java.lang.UnsupportedClassVersionError: org/broadinstitute/sting/gatk/CommandLineGATK : Unsupported major.minor version 51.0

If you're not sure what version of Java you are currently using, you can find out very easily by typing the following command:

java -version

which should return something like this:

java version "1.7.0_17"
Java(TM) SE Runtime Environment (build 1.7.0_17-b02)
Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode)

If the version shown is older than 1.7, you'll need to update your Java version. If you have any difficulty doing this, please don’t ask us in the forum -- you’ll get much better, faster help if you ask your local IT department.
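For readers wondering where the "51.0" in the error message comes from: every compiled Java class file starts with a short header that records a class-file major version, and major version 51 corresponds to Java 7 (50 is Java 6). The sketch below is only an illustration of that mapping and of how the "1.7.0_17" style string reported by java -version can be interpreted. The class name JavaVersionCheck and its method names are hypothetical and are not part of GATK, and the eight header bytes in main are made up for the demo rather than read from the GATK jar.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of GATK: illustrates how the class-file
// version in the error message relates to the installed Java release.
public class JavaVersionCheck {

    // Class-file major versions map to Java releases as major - 44
    // for Java 5 and later (49 -> 5, 50 -> 6, 51 -> 7, 52 -> 8).
    public static int releaseForClassMajor(int major) {
        if (major < 49) {
            throw new IllegalArgumentException("Only majors >= 49 (Java 5+) are handled here");
        }
        return major - 44;
    }

    // A class file starts with the magic number 0xCAFEBABE, followed by
    // a 2-byte minor version and a 2-byte major version (big-endian).
    public static int classMajor(byte[] header) {
        if (header.length < 8) {
            throw new IllegalArgumentException("Need at least 8 header bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(header); // big-endian by default
        if (buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("Not a Java class file");
        }
        buf.getShort();                 // skip the minor version
        return buf.getShort() & 0xFFFF; // major version as an unsigned value
    }

    // Turns a version string such as "1.7.0_17" or "11.0.2" into a release number.
    // Older releases report "1.N...", newer ones report "N...".
    public static int releaseFromVersionString(String version) {
        String[] parts = version.split("[._-]");
        int first = Integer.parseInt(parts[0]);
        if (first == 1 && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        return first;
    }

    public static void main(String[] args) {
        // Made-up header bytes: magic number, minor version 0, major version 51.
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 51};
        int needed = releaseForClassMajor(classMajor(header));
        int running = releaseFromVersionString(System.getProperty("java.version"));
        System.out.println("Class file needs Java " + needed + "; this JVM is Java " + running);
        if (running < needed) {
            System.out.println("Update Java before running this jar.");
        }
    }
}
```

With the versions mentioned in this post, "1.6.0_51" is read as release 6 and "1.7.0_17" as release 7, so a class file with major version 51 loads on the latter but not on the former.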



mike


Hi, it seems that the new release has a bug. If I run the following, I get an error:

java -Xmx2g -jar /opt/nasapps/development/gatk/2.6-4/bin/GenomeAnalysisTK.jar -T RealignerTargetCreator -h

Exception in thread "main" java.lang.UnsupportedClassVersionError: org/broadinstitute/sting/gatk/CommandLineGATK : Unsupported major.minor version 51.0
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:634)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:277)
    at java.net.URLClassLoader.access$000(URLClassLoader.java:73)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:212)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
Could not find the main class: org.broadinstitute.sting.gatk.CommandLineGATK. Program will exit.

But if I run:

java -Xmx2g -cp /opt/nasapps/development/gatk/2.6-4/bin/GenomeAnalysisTK.jar -help

it seems fine. Then I run the following:

java -Xmx2g -cp /opt/nasapps/development/gatk/2.6-4/bin/GenomeAnalysisTK.jar -T RealignerTargetCreator -h

Unrecognized option: -T
Could not create the Java virtual machine.

Either way, I cannot use the new build 2.6-4. Could you check on that?

Thanks,
Mike

Mon 24 Jun 2013

mike


Just some additional info: if I use an older version, I can run the following without any error:

java -Xmx2g -jar /opt/nasapps/development/gatk/2.5-2/bin/GenomeAnalysisTK.jar -T RealignerTargetCreator -h

Thx,
Mike

Mon 24 Jun 2013

Geraldine_VdAuwera


@mike, that's the error you get when you have an older Java version. Please see the part of the announcement above about the Java version upgrade.

Mon 24 Jun 2013

mike


Hi, Geraldine: Thanks a lot for pointing that out. Yes, with Java 7, it works! Thanks again and best Mike

Mon 24 Jun 2013

karthisivaraman


I am in a quandary. I use Mac OS X 10.6.8, with Java 1.6.0_51. Unfortunately, I cannot upgrade to Java 7, since there is some compatibility issue. Also unfortunately, I cannot run GATK 2.6 because it won't run with Java 1.6. Is there a legacy version that I can access to run on my Mac? Even GATK 1.6 should do. Or an archived copy of GATK-lite? Thanks.

Mon 24 Jun 2013

Geraldine_VdAuwera


You can get the 2.5 version from our [github repository](https://github.com/broadgsa/gatk-protected/tree/2.5), but you'll have to build from source. Are you sure you can't upgrade your MacOS and Java? As time goes on you will not be able to use any of the new improvements, and we won't be able to provide support for running an old version.

Mon 24 Jun 2013

chrismit


So, is there anywhere the false negative rate and false positive rates for the HaplotypeCaller are listed? I don't need a graph, but some ballpark number would be appreciated.

Mon 24 Jun 2013

Geraldine_VdAuwera


I don't have the numbers on hand but I'll try to dig them up on Monday. You'll be happy to hear that starting with the next release (2.7) we will systematically issue the numbers for every new version.

Mon 24 Jun 2013




