Queue Workflow Engine
Genome STRiP uses the Queue workflow engine to implement its executable pipelines, which can take advantage of large compute clusters for parallel processing.
The Queue workflow engine was originally written by the Broad Institute's GSA group as part of the GATK software.
To use Queue, you first write a Queue script (in a stylized version of the Scala language) and then run the script through the Queue command line interface. The Queue script specifies the processing steps and the dependencies between them. The Queue runtime engine schedules each processing step on a workflow manager (such as LSF or SGE, see below) and monitors the progress of the computation. When a processing step completes successfully, the steps that depend on it can be queued for execution. If a processing step fails, the steps that depend on it will not be run, but unrelated parts of the pipeline can continue. Queue tracks which steps have succeeded and which have failed, so you can "retry" an entire pipeline and only the failed steps (and any steps that depend on them) will be rerun.
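The scheduling and retry behavior described above can be illustrated with a small toy model. This is a sketch of the idea only, written in Python rather than as a Queue script, and it is not how Queue is implemented: the function `run_pipeline`, the helper `step`, and the step names are all hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_pipeline(deps, actions, done=()):
    """Toy model of dependency-aware execution (not Queue's real code).

    deps:    {step: set of steps it depends on}
    actions: {step: zero-argument callable returning True on success}
    done:    steps that succeeded in an earlier run; they are not rerun
    Returns (succeeded, failed, skipped) as sets of step names.
    """
    succeeded = set(done)
    failed, skipped = set(), set()
    # static_order() yields every step after all of its dependencies.
    for name in TopologicalSorter(deps).static_order():
        if name in succeeded:
            continue  # completed in a previous run
        if any(d in failed or d in skipped for d in deps.get(name, ())):
            skipped.add(name)  # an upstream step did not complete
        elif actions[name]():
            succeeded.add(name)
        else:
            failed.add(name)
    return succeeded, failed, skipped


# Hypothetical four-step pipeline; the step names are illustrative only.
deps = {"align": set(), "qc": set(), "call": {"align"}, "report": {"call", "qc"}}
ran = []  # records which steps were actually executed


def step(name, ok=True):
    def action():
        ran.append(name)
        return ok
    return action


# First run: "call" fails, so "report" is skipped while "qc" still runs.
actions = {name: step(name, ok=(name != "call")) for name in deps}
succeeded, failed, skipped = run_pipeline(deps, actions)

# Retry: steps that already succeeded are passed in as `done` and not rerun.
ran.clear()
actions = {name: step(name) for name in deps}
retry_succeeded, retry_failed, retry_skipped = run_pipeline(deps, actions, done=succeeded)
```

In this model the retry executes only the failed step and the step that was waiting on it, which mirrors the behavior described for Queue. A real Queue run dispatches the steps to the cluster rather than executing them in a single process.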
Queue does not use a centralized server or database to track the status of a pipeline run. Instead, it uses small sentinel files (".done" and ".fail" files) to track the input and output files of each processing step. The Queue program itself runs on the command line and monitors the status of the pipeline while it is running. In Genome STRiP, we generally write all of our pipelines to use a specific run directory, which contains all of the outputs and intermediate files as well as the files Queue uses to track the pipeline status. Each run directory should be dedicated to exactly one Queue pipeline execution. Running multiple Genome STRiP pipelines in the same run directory (either concurrently or sequentially) will not work, because the runs conflict over the intermediate files and the progress tracking files.
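The idea of tracking status with sentinel files instead of a database can be sketched as follows. This is an illustration in Python, not Queue's implementation: the naming scheme (appending ".done" or ".fail" to the output file name) and the functions `sentinel`, `mark`, and `status` are assumptions made for the example.

```python
import tempfile
from pathlib import Path


def sentinel(output: Path, suffix: str) -> Path:
    # Assumed naming scheme: the marker file sits next to the output it describes.
    return output.with_name(output.name + suffix)


def mark(output: Path, ok: bool) -> None:
    """Record the outcome of the step that produces `output`."""
    done, fail = sentinel(output, ".done"), sentinel(output, ".fail")
    (fail if ok else done).unlink(missing_ok=True)  # clear any stale marker
    (done if ok else fail).touch()


def status(output: Path) -> str:
    """Derive a step's status purely from the files in the run directory."""
    if sentinel(output, ".done").exists():
        return "done"
    if sentinel(output, ".fail").exists():
        return "failed"
    return "pending"


# Demonstration in a throwaway run directory; the file name is hypothetical.
with tempfile.TemporaryDirectory() as run_dir:
    output = Path(run_dir) / "calls.vcf"
    first = status(output)       # no marker yet
    mark(output, ok=False)
    after_failure = status(output)
    mark(output, ok=True)        # a successful retry replaces the ".fail" marker
    after_retry = status(output)
```

Because the markers live in the run directory alongside the outputs, a second pipeline pointed at the same directory would read and overwrite the first pipeline's markers. That is one way to picture why each run directory should belong to a single pipeline execution.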
For detailed documentation on running Queue scripts, see the documentation for QCommandLine.