QCommandLine documentation

QCommandLine

Genome STRiP uses the Queue workflow engine (developed by the Broad Institute GSA group) to implement executable analysis pipelines.

For an overview of Queue, see Queue Workflow Engine.

Queue is invoked using the QCommandLine program. This page documents the arguments to the Queue workflow engine itself, along with related arguments that affect the workflow execution environment and are common to all Genome STRiP pipelines. It does not cover every Queue option or argument, only those most commonly used with Genome STRiP.

Running a Queue script

Queue manages processing steps and dependencies, but generally does not directly execute the processing steps itself. Instead the processing steps are scheduled to run on a compute cluster through a workflow manager, such as LSF (by Platform Computing) or SGE (Sun Grid Engine).

A Queue script invocation for Genome STRiP generally has the following form:

 classpath="${SV_DIR}/lib/SVToolkit.jar:${SV_DIR}/lib/gatk/GenomeAnalysisTK.jar:${SV_DIR}/lib/gatk/Queue.jar"
 java -Xmx4g -cp ${classpath} \
     org.broadinstitute.gatk.queue.QCommandLine \
     -S ${SV_DIR}/qscript/SVQScript.q \
     -S pipeline_script_to_run.q \
     -cp ${classpath} \
     -gatk ${SV_DIR}/lib/gatk/GenomeAnalysisTK.jar \
     -jobProject project_name \
     -jobQueue queue_name \
     -jobLogDir log_directory \
     -jobRunner workflow_manager \
     -gatkJobRunner workflow_manager \
     -jobNative workflow_manager_options \
     -run

Note that -cp must be specified twice: once to the java runtime (before QCommandLine) and once as an argument to QCommandLine. The latter controls the classpath that will be used for the individual processing steps.

The -run flag causes the processing steps to be executed. If you omit the -run flag, Queue performs a "dry run": it prints information about the processing steps that would have been run, but does not actually execute them. This is useful for debugging, and it can also be used to check whether a pipeline has successfully run to completion.
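The dry-run-then-run pattern described above can be sketched as a small wrapper. This is only an illustration, not part of Genome STRiP: the function name build_queue_command, its "dry"/"run" argument, the default SV_DIR path, and the choice of Drmaa as job runner are all assumptions made for the example. The sketch prints the command line instead of executing it.

```shell
# Hypothetical helper (not part of Genome STRiP): assemble a QCommandLine
# invocation and append -run only when a real run is requested.
SV_DIR="${SV_DIR:-/path/to/svtoolkit}"   # placeholder install location
classpath="${SV_DIR}/lib/SVToolkit.jar:${SV_DIR}/lib/gatk/GenomeAnalysisTK.jar:${SV_DIR}/lib/gatk/Queue.jar"

build_queue_command() {
    mode="$1"   # "dry" or "run" (labels invented for this sketch)
    # Rebuild the positional parameters as the command line.
    # Note that -cp appears twice, as required.
    set -- java -Xmx4g -cp "${classpath}" \
        org.broadinstitute.gatk.queue.QCommandLine \
        -S "${SV_DIR}/qscript/SVQScript.q" \
        -S pipeline_script_to_run.q \
        -cp "${classpath}" \
        -gatk "${SV_DIR}/lib/gatk/GenomeAnalysisTK.jar" \
        -jobLogDir log_directory \
        -jobRunner Drmaa \
        -gatkJobRunner Drmaa
    if [ "${mode}" = "run" ]; then
        set -- "$@" -run
    fi
    # Print the command; replace this line with "$@" to execute it.
    printf '%s\n' "$*"
}

build_queue_command dry   # command line without -run
build_queue_command run   # same command line with -run appended
```

Inspecting the dry-run output first and only then repeating the command with -run keeps the two invocations identical apart from the final flag.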

Supported workflow managers

Queue has built-in support for LSF 7. For other workflow managers, you should generally try to use the Drmaa interface (Drmaa is a portable interface for workflow managers in the style of LSF and SGE).

Queue (and the Genome STRiP Queue scripts) have some built-in options that understand how to pass certain arguments to LSF, but it is generally better practice to use the Drmaa job runner and use -jobNative to pass the necessary arguments to your workflow manager.

  • LSF 7.0
      To use LSF 7, specify -jobRunner Lsf706.

  • LSF 9.x
      To use LSF 9, specify -jobRunner Drmaa from an LSF execution host configured to use LSF 9.

  • SGE
      To use SGE, specify -jobRunner Drmaa from an SGE execution host.
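The mapping in the list above can be sketched as a small shell function. The function name select_job_runner and the scheduler labels (lsf7, lsf9, sge) are hypothetical, invented for this example. Only the values Lsf706 and Drmaa come from this page.

```shell
# Hypothetical helper: map a scheduler label to the -jobRunner value
# given in the list above. The labels are assumptions, not Queue options.
select_job_runner() {
    case "$1" in
        lsf7)     echo "Lsf706" ;;
        lsf9|sge) echo "Drmaa" ;;
        *)        echo "unknown scheduler: $1" >&2; return 1 ;;
    esac
}

# Use the same value for -jobRunner and -gatkJobRunner.
runner="$(select_job_runner sge)"
echo "-jobRunner ${runner} -gatkJobRunner ${runner}"
```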


QCommandLine specific arguments

Name                Type          Default value  Summary

Required Parameters
-cp                 String        NA             The java classpath
-gatk               File          NA             The path to the GenomeAnalysisTK.jar file
-S                  List[File]    NA             The queue script(s) to run

Optional Parameters
-gatkJobRunner      String        NA             The workflow manager to use
-jobRunner          String        NA             The workflow manager to use
-jobLogDir          File          NA             Directory for output log files from Queue
-jobNative          List[String]  NA             Options to be passed to the underlying workflow manager with each job
-jobProject         String        NA             The name of the project associated with this workflow
-jobQueue           String        NA             The execution queue to use to run the workflow

Optional Flags
-disabpleJobReport  Flag          NA             Pass this flag to disable job report creation
-run                Flag          NA             Pass this flag to execute the workflow; omitting this flag will do a dry run

Advanced Flags
-bsub               Flag          NA             Set LSF 7.0.6 as the default job runner

Argument details

--bsub / -bsub ( Flag )

Set LSF 7.0.6 as the default job runner.

This flag overrides other settings to set LSF 7.0.6 as the default job runner. It is recommended not to use this flag with Genome STRiP, but instead specify the intended job runner explicitly using -jobRunner.

--classpath / -cp ( required String )

The java classpath.

--disableJobReport / -disabpleJobReport ( Flag )

Pass this flag to disable job report creation.

This argument is often handy to prevent execution problems due to contention over the Queue job report files.

The misspelling in the short form of the argument is not a documentation bug: it is a bug in the Queue code.

--gatkJarFile / -gatk ( required File )

The path to the GenomeAnalysisTK.jar file.

--gatkJobRunner / -gatkJobRunner ( String )

The workflow manager to use.

This argument is used only when a Queue script invokes processing steps that are themselves Queue scripts (to work around an implementation limitation in Queue). You should set this argument to the same value as -jobRunner.

--job_runner / -jobRunner ( String )

The workflow manager to use.

This argument specifies the workflow manager that Queue should use to run the jobs within this workflow. See "Supported workflow managers" above for the supported values.

--jobLogDirectory / -jobLogDir ( File )

Directory for output log files from Queue.

This directory is used to store log files from the parallel jobs run by Queue during execution of the discovery pipeline. The log files contain information that can be helpful for debugging or performance tuning.

If not supplied, the default is to use the current working directory. A typical strategy is to make a log directory underneath the run directory for each SVDiscovery run and direct the log files there.
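The per-run log directory strategy can be sketched as follows. The function name make_log_dir, the "logs" subdirectory name, the RUN_DIR variable, and the default run directory are placeholders chosen for this example.

```shell
# Sketch: create a log directory underneath a run directory and print the
# matching -jobLogDir argument. All names here are hypothetical.
make_log_dir() {
    run_dir="$1"
    log_dir="${run_dir}/logs"
    mkdir -p "${log_dir}" || return 1
    printf '%s\n' "-jobLogDir ${log_dir}"
}

# Placeholder run directory for one SVDiscovery run.
make_log_dir "${RUN_DIR:-./svdiscovery_run1}"
```

Keeping the logs under the run directory means that each run's log files stay separate from those of other runs.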

--jobNative / -jobNative ( List[String] )

Options to be passed to the underlying workflow manager with each job.

This allows workflow-manager-specific arguments to be passed to the underlying workflow manager. For example, -jobNative "-R rusage[mem=8192]" can be used to make a job resource request on LSF.
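The quotes in the example above matter because the option string contains a space and square brackets. A minimal sketch, using a hypothetical count_args helper that only reports how many arguments it received:

```shell
# Hypothetical helper that reports how many arguments it received.
count_args() {
    echo "$#"
}

# Quoted: the flag plus one option string, so two arguments in total.
count_args -jobNative "-R rusage[mem=8192]"

# Unquoted: the shell splits on the space (and may also treat the
# brackets as a filename pattern), so the option string is no longer
# delivered intact.
count_args -jobNative -R rusage[mem=8192]
```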

--jobProject / -jobProject ( String )

The name of the project associated with this workflow.

The project is used in some environments for accounting purposes.

--jobQueue / -jobQueue ( String )

The execution queue to use to run the workflow.

In many environments, there are different execution queues for LSF or SGE. This argument specifies the default execution queue for the jobs in the workflow.

--run_scripts / -run ( Flag )

Pass this flag to execute the workflow; omitting this flag will do a dry run.

If this flag is not specified, Queue will perform a "dry run" of the workflow: it determines which jobs need to be scheduled and prints the commands that would be executed if you were to specify -run.

--script / -S ( required List[File] )

The queue script(s) to run.

Specifies the Q scripts (Scala scripts) to be read and executed by the workflow engine. Multiple input files can be specified, but exactly one of them should define the top-level script for execution.