Feature request: Dirt simple error message
open | Created 2018-01-31 | Last updated 2019-02-21 | Posted by davidbenjamin


Logs Needs Triage User Requested Improvement


If a workflow fails, I would like Cromwell to tell me somewhere really obvious, perhaps as the last thing it prints to stdout:

  • The first thing that failed.
  • Whether the failure occurred while Cromwell was running the WDL or during the execution of a task.
  • If a Cromwell error, which line of which WDL it happened on.
  • If a task error, which task failed and where its stderr file is.

I think a lot of this already exists, but I'm suggesting that it be super simple and in one place.

Here's an example of the desired output:

Lots of output. . .
. . .
Your workflow failed while executing task HelloWorld. See cromwell-executions/Hello/c44566iifgg57/call-HelloWorld/execution/stderr for details.

Or

Cromwell failed while executing line 346 of HelloWorld.wdl. The index 6 is out of bounds for the array popular_salutations.

This would be extremely helpful whenever a non-expert runs a workflow, for example, our Mutect2 WDL. Currently I debug with a mix of home-brewed greps through the Cromwell metadata, fishing through cromwell-executions, and running the darn thing myself. It would be really great just to tell them to look at the last line of stdout.

A few points:

  • The first error is all you need because you can iterate and fix bugs one at a time.
  • It's crucial to tell the user very explicitly whether this is a Cromwell error or a within-task error.
  • Stack traces from running the tool could be useful and acceptable, but Cromwell stack traces with all that stuff about Akka and Spark would be overwhelming.
  • Rigor isn't important here. For example, the order of execution isn't deterministic, so the first task to fail may differ from run to run, but for this purpose that doesn't matter. Just reporting the first error by wall-clock time is sufficient.
  • It doesn't matter which workflow, subworkflow, etc. fails. Only the task matters.
  • Putting this on the last line of stdout has the added virtue of avoiding hassle in a screen session.
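
To make the request concrete, here is a rough Python sketch of the selection logic described in the points above: pick the failed task that finished earliest on the wall clock, and otherwise report a Cromwell-level failure. It is only an illustration, not how Cromwell does or should implement it. The function name `first_failure_message` is made up, and the metadata field names it reads (`calls`, `executionStatus`, `end`, `stderr`, `failures`, `message`) are my assumptions about the layout of the workflow metadata JSON. It also assumes timestamps are ISO-8601 strings that `datetime.fromisoformat` can parse once a trailing `Z` is replaced.

```python
from datetime import datetime


def _parse_time(stamp):
    # Assumption: timestamps look like "2018-01-31T12:00:05.000Z".
    return datetime.fromisoformat(stamp.replace("Z", "+00:00"))


def first_failure_message(metadata):
    """Return a one-line summary of the earliest failure in a metadata dict."""
    failed = []
    # Assumption: "calls" maps a call name to a list of attempts.
    for call_name, attempts in metadata.get("calls", {}).items():
        for attempt in attempts:
            if attempt.get("executionStatus") == "Failed" and "end" in attempt:
                failed.append((_parse_time(attempt["end"]), call_name, attempt))

    if failed:
        # "First" means earliest end time on the wall clock, nothing more rigorous.
        _, call_name, attempt = min(failed, key=lambda item: item[0])
        stderr_path = attempt.get("stderr", "<stderr path unavailable>")
        return ("Your workflow failed while executing task {}. "
                "See {} for details.".format(call_name, stderr_path))

    # No failed task: report the first workflow-level failure instead.
    # Assumption: top-level "failures" is a list of dicts with a "message".
    failures = metadata.get("failures", [])
    if failures:
        message = failures[0].get("message", "<no message>")
        return "Cromwell failed outside of any task: {}".format(message)

    return "No failure found in the metadata."
```

If the metadata were saved to a file, loading it with `json.load` and printing `first_failure_message(...)` as the very last line of output would give the "look at the last line of stdout" behavior. The sketch cannot report the WDL line number for Cromwell-level errors, since I don't know of a metadata field that carries it.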
