HaplotypeCallerSpark crashes: 8 parallel GATK jobs (8 samples) using 5 Spark CPUs each on a node with 40 CPUs
open | Created 2019-02-25 | Last updated 2019-02-26 | Posted by bhanugandham


HaplotypeCallerSpark Spark Vanilla bug


User Question: I'm trying to speed up variant calling using Spark. I have access to a Slurm HPC cluster, so I guess it's not that straightforward to run GATK in a proper distributed master-slave architecture (if there is a tutorial on how to set up Slurm jobs to use the GATK Spark tools on multiple nodes, I would appreciate it a lot).
Therefore, I run GATK in local mode with a few Spark threads, and speed things up further by processing several samples simultaneously with GNU parallel. However, some samples crash with Spark errors. Perhaps you could send my logs to the developers? I'm trying to run 8 parallel GATK jobs (8 samples), each using 5 Spark CPUs, on a node with 40 CPUs.

Best,
Pedro
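
For readers trying to picture the setup described above, here is a minimal sketch of it, not the poster's actual script. It only builds and prints the eight command lines and checks the CPU budget; it does not run GATK. The file names (`reference.fasta`, `sampleN.bam`, `sampleN.vcf.gz`) are hypothetical, and the placement of `--spark-runner LOCAL --spark-master 'local[5]'` after `--` is an assumption that should be checked against the installed GATK version.

```python
import shlex

# Numbers taken from the post: one node, 40 CPUs, 8 samples, 5 Spark cores each.
TOTAL_CPUS = 40
CORES_PER_JOB = 5
N_JOBS = 8


def build_command(sample, cores):
    """Return the argv list for one local-mode HaplotypeCallerSpark job."""
    return [
        "gatk", "HaplotypeCallerSpark",
        "-R", "reference.fasta",      # hypothetical reference path
        "-I", f"{sample}.bam",        # hypothetical input path
        "-O", f"{sample}.vcf.gz",     # hypothetical output path
        "--",                         # assumed separator before Spark arguments
        "--spark-runner", "LOCAL",
        "--spark-master", f"local[{cores}]",
    ]


# Hypothetical sample names: sample1 .. sample8.
samples = [f"sample{i}" for i in range(1, N_JOBS + 1)]
commands = [build_command(s, CORES_PER_JOB) for s in samples]

# Total Spark cores requested across all concurrent jobs.
requested = N_JOBS * CORES_PER_JOB
assert requested <= TOTAL_CPUS, "concurrent jobs would oversubscribe the node"

# Print one shell-quoted command per line, e.g. to feed to GNU parallel.
for cmd in commands:
    print(shlex.join(cmd))
```

With the numbers from the post, the eight jobs request exactly the node's 40 CPUs. The sketch covers only the core count; memory per job is not addressed here.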

This Issue was generated from your [forums]
[forums]: https://gatkforums.broadinstitute.org/gatk/discussion/comment/56193#Comment_56193

