One of my favorite features of the Cromwell workflow management system is that it was designed from the start to "support multiple computing platforms in order to maximize portability and reproducibility of analysis workflows". In my role as an advocate for a research community that struggles with standardization and interoperability, just reading those words in a sentence makes me perk up like a kid hearing the ice cream truck song. But then it immediately raises the question -- what does it mean for Cromwell to support a given platform? Let's take a moment today to unpack that, since "support" is kind of a loaded term in the world of workflow management systems.

In principle, Cromwell will happily run on any machine that has its preferred version of Java installed, and… well, that's about it for technical requirements. Under those terms, you could run it on a VM on any cloud platform you like: spin up a bunch of nodes, set up LSF on them, and run, say, Cromwell-on-LSF-on-AWS. But if you want to take advantage of the true power of that platform, you have to manage a whole set of additional layers yourself: access to object storage, containers, authentication and so on. That's a lot of work, and for many of our users it's not a realistic option. It's certainly not my idea of a fun time.

The good news is that when we say Cromwell supports a given platform, we mean it will manage all of that for you. That's where the concept of a backend comes in: it's essentially a plugin adapter that lets Cromwell talk directly to the various components of the platform you want to run on, and orchestrate all those operations to get the job done. You just need to give it the right configuration file -- and naturally we provide templates for all supported platforms. Ultimately, our goal is to provide a seamless computing experience that minimizes setup and maximizes throughput, so you can just get on with the interesting part of your work. This applies to all platforms: whether you choose to use cloud resources, a local HPC cluster, or both, Cromwell's job is to empower you to be more productive by making the pipelining process easy and scalable.
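To make the "plugin adapter" idea a bit more concrete, here is a minimal sketch in Python. To be clear, this is not Cromwell's actual code or API (Cromwell itself is not written in Python); every name in it (`Task`, `Backend`, `LocalBackend`, `ClusterBackend`, `make_backend`, `run_workflow`, and the config keys) is hypothetical and only there to illustrate the pattern. The point is that the workflow logic stays the same, and the configuration alone decides which platform adapter does the work.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Task:
    """One unit of work from a workflow (hypothetical structure)."""
    name: str
    command: str


class Backend(ABC):
    """Hypothetical adapter interface: one implementation per platform."""

    @abstractmethod
    def submit(self, task: Task) -> str:
        """Hand the task to the platform and return a job identifier."""


class LocalBackend(Backend):
    """Pretends to run tasks on the local machine."""

    def __init__(self) -> None:
        self._count = 0

    def submit(self, task: Task) -> str:
        # A real adapter would launch the command; here we only issue an ID.
        self._count += 1
        return f"local-{self._count}"


class ClusterBackend(Backend):
    """Pretends to submit tasks to a named queue on a scheduler."""

    def __init__(self, queue: str) -> None:
        self.queue = queue

    def submit(self, task: Task) -> str:
        # A real adapter would call the scheduler's submit command here.
        return f"{self.queue}:{task.name}"


def make_backend(config: dict) -> Backend:
    """Pick the adapter named in the (hypothetical) configuration."""
    kind = config["backend"]
    if kind == "local":
        return LocalBackend()
    if kind == "cluster":
        return ClusterBackend(queue=config["queue"])
    raise ValueError(f"unknown backend: {kind}")


def run_workflow(tasks: list, backend: Backend) -> list:
    """The engine only talks to the Backend interface, never to a platform."""
    return [backend.submit(task) for task in tasks]


if __name__ == "__main__":
    tasks = [Task("align", "echo align"), Task("call", "echo call")]
    # Same tasks, different platforms: only the configuration changes.
    print(run_workflow(tasks, make_backend({"backend": "local"})))
    print(run_workflow(tasks, make_backend({"backend": "cluster", "queue": "short"})))
```

In this toy version, switching platforms means changing one dictionary; the list of tasks and the `run_workflow` function are untouched, which is the kind of portability a backend is meant to buy you.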

That being said, someone still has to write the backend for each platform we want to support. In the case of cloud platforms, that requires expert-level understanding of things like how the resource allocation system operates, which can be quite challenging to acquire. Thankfully, we don't have to do it all ourselves -- Cromwell is an open-source project, and benefits from many contributions made by external developers. That includes experts who know specific systems inside out, sometimes because they helped build them! Cromwell's cloud backends are a great example of this, having been produced primarily by engineers from their respective cloud companies. We are deeply grateful for their advice, code and collaboration, and we hope others will be inspired to contribute their own backends for other platforms and thus further extend the effortless flexibility of Cromwell to the greater research community.
