UPDATE: The issue described below has been resolved.
Due to an individual user's submission that amounts to a very large number of jobs (~60k), all new workflow submissions are currently being held in the queue (with status QueuedInCromwell). To be clear, as far as we can tell this is NOT a FireCloud malfunction; it seems to be a Google Cloud limitation that we are encountering for the first time. We are working with GCP support and evaluating options to unblock the queue, hopefully without interrupting that one very ambitious and totally legitimate submission. We will strive to resume normal workflow throughput by Monday morning EST.
We understand that this is causing many of you considerable inconvenience, yet we are hopeful that this case will provide an opportunity to push back the current limitations to the next level. Please remember that what we are all doing here, together, is blazing a new trail; building a new model for how we do science at scale, collaboratively. The fact that these scaling problems are arising at all demonstrates that we are on the right path, that the research community needs this level of scalability. And we will do everything in our power to deliver it.
Thank you for your patience and stay tuned for updates.
Broad Institute’s Genomics Platform & Data Science Platform announce the general availability of the FireCloud DataShuttle 0.1.1. The FireCloud DataShuttle lets users browse files, upload and download data directly between FireCloud workspaces or Google buckets and their local drives, and monitor the status of these transfers.
The FireCloud DataShuttle was developed to facilitate the work of researchers and project managers who transfer a high volume of files and desire a more efficient and clearer process.
Exciting update! We are now able to offer all new FireCloud users $300 in free credits thanks to additional funding from the National Cancer Institute (NCI) along with funding from our friends at the Google Cloud Platform.
We offer these free credits so that you can try out FireCloud without any billing setup and at no cost. All you have to do is register for a FireCloud account. Once you register and log into FireCloud, you will see a banner welcoming you to start your free trial. When you are ready to do analyses, click Start trial to kick off a sixty-day window in which you can use your credits. The trial ends after sixty days or when you hit the $300 budget, whichever comes first. Read more information, FAQs, and the Terms and Conditions here.
FireCloud is free to use; however, any activity that uses Google Cloud resources comes with a cost. The free credits will enable you to create a workspace, launch an analysis, upload files to or download files from Google Cloud, and create Jupyter notebooks to interact with your data on the fly. You can clone the GATK4 Featured Workspaces and try them out, or upload your own workflow script (WDL) and run it on your own data.
All the credits are managed in a FireCloud billing project. You can see how much you've spent by navigating to the Summary tab of any of your workspaces and clicking the link below Google Billing detail under the Project Cost section. Our credit managers, called Onix, will also be sending you email notifications when you've spent certain amounts and will help you transition off the billing project.
As always, if you have any questions, please let us know on the forum.
Note: If you signed up for the trial before April 18th, but have not started it, we automatically increased your credit limit for you.
Within every featured workspace description is a subsection providing the estimated time and cost for running a method using the sequence data in the data model. These results are gathered by obtaining the time and cost of a completed workflow using a combination of FireCloud’s built-in monitoring feature and Google’s BigQuery service. This document briefly describes how the time and cost for these results are obtained and provides a link to a walkthrough for users interested in using the same approach.
The simpler of the two approaches is obtaining the duration of an executed workflow, which is retrieved from the monitor tab within the workspace submission page. Each submitted workflow has some brief information, which includes the Submitted, Started, and End times. Often the submitted and start times are identical, but they should not be confused. The submitted time is when the workflow was initiated (the “Launch Workflow” icon was clicked), and the start time is when the workflow has reserved a virtual machine (VM) and begins running tasks. Between the submission and start times there might be a delay because the workflow is waiting in the queue, caused by a high volume of users, network lag, or some other reason. Thus, the duration is calculated using the start and end times listed for each submitted workflow, not the submitted time.
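As an illustration of the calculation described above, here is a minimal Python sketch. The timestamp format and the function names (workflow_duration, queue_delay) are assumptions made for this example, not part of FireCloud; adjust the format string to match whatever the monitor tab shows for your submission.

```python
from datetime import datetime, timedelta

# Assumed timestamp format for this example only; the monitor tab
# may display times differently.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def _parse(timestamp: str) -> datetime:
    """Parse a timestamp string in the assumed format."""
    return datetime.strptime(timestamp, TIME_FORMAT)


def workflow_duration(started: str, ended: str) -> timedelta:
    """Duration of a workflow, measured from start time to end time."""
    start, end = _parse(started), _parse(ended)
    if end < start:
        raise ValueError("end time precedes start time")
    return end - start


def queue_delay(submitted: str, started: str) -> timedelta:
    """Time the workflow spent waiting between submission and start."""
    submit, start = _parse(submitted), _parse(started)
    if start < submit:
        raise ValueError("start time precedes submitted time")
    return start - submit


if __name__ == "__main__":
    # Made-up example times, not taken from a real submission.
    submitted = "2018-04-18 10:00:00"
    started = "2018-04-18 10:05:00"
    ended = "2018-04-18 12:35:30"
    print("Queue delay:", queue_delay(submitted, started))
    print("Duration:", workflow_duration(started, ended))
```

Keeping the queue delay separate from the duration makes it explicit that time spent waiting for a VM is not counted in the reported runtime.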
The cost for running a workflow is obtained using BigQuery. BigQuery is a free Google Cloud service that you can think of as a search tool that provides metadata related to a submitted workflow. Given a workflow ID, we can use BigQuery to search through Google's database for that submitted workflow and show specific details for the job.
Once a workflow is executed, a workflow ID is created for that particular run. This unique ID distinguishes the workflow from the hundreds to thousands of other workflows being executed in FireCloud. This ID, along with other billing-related IDs, is used to perform a search in BigQuery. The search results are exported as a TSV file in which each row is a resource usage (e.g. compute, network, use of preemptible) and the columns describe that resource, including its cost. This TSV file can be opened in Google Sheets or a local Excel sheet, and summing the cost column gives the total cost of the resources used by the workflow.
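The final summation step can also be done with a short script instead of a spreadsheet. The sketch below is only an illustration: the column name "cost" and the sample rows are assumptions made up for this example, so check the header of your own exported file and pass the matching column name.

```python
import csv
import io
from decimal import Decimal


def total_cost(tsv_file, cost_column="cost"):
    """Sum one column of a tab-separated file, skipping empty cells.

    `cost_column` is a hypothetical header name; use the one that
    appears in your exported file.
    """
    reader = csv.DictReader(tsv_file, delimiter="\t")
    total = Decimal("0")
    for row in reader:
        value = (row.get(cost_column) or "").strip()
        if value:
            # Decimal avoids floating-point rounding drift in currency sums.
            total += Decimal(value)
    return total


# Made-up sample data standing in for an exported results file.
SAMPLE_TSV = (
    "resource\tcost\n"
    "compute\t0.125\n"
    "network\t0.010\n"
    "preemptible\t0.040\n"
)

if __name__ == "__main__":
    print("Total cost:", total_cost(io.StringIO(SAMPLE_TSV)))
```

To run it on a real export, open the file with open(path, newline="") and pass the file object in place of the io.StringIO sample.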
From time to time, FireCloud displays a popup indicating that the application is undergoing planned maintenance or is otherwise unavailable.
In this release of FireCloud, we have replaced the popup with a banner at the top of the screen. The banner behaves just like the popup used to: it appears automatically when the FireCloud web application cannot talk to its API server, and disappears as soon as the server is available again.
Additionally, you may notice that a screen you see when FireCloud is unavailable ("Error loading user information. Please try again later.") now displays a brief diagnostic message and status code. This information is designed to help the FireCloud team restore service as quickly as possible.
Error: WDL parsing failure. ERROR: Finished parsing without consuming all tokens <text from line it found error in>
FireCloud has so far focused on making it easy to share data and execute pipelines, for all your batch-style --one might even say boring-- data processing and analysis needs. That covers a lot of territory; certainly most of the upstream work that is done under the umbrella of "genomic analysis". But for many of you, the end of the pipeline is just the start of the really interesting part; and whether you're moving on to GWAS or something else, the point is that you're moving on -- specifically, to a phase of analysis where you need to be able to interrogate your data interactively. And until now, that often meant downloading your pipeline's outputs and getting back to traditional, on-premises computational work and its limitations.
So today, we're excited to release a beta preview of a new FireCloud feature called Notebooks that makes it possible to run interactive analyses on your data in the cloud, with the convenience of a Jupyter notebook environment.
We are planning to release an important update that will change how the system identifies user access permissions to Google buckets in order to support a technique that renders certain workflows substantially more cost-effective.
This update is scheduled for next Tuesday, February 6 around 4:00 pm EST. At that time, all running workflows will fail and will be automatically restarted using call-caching once the update is complete. We recommend you hold off on launching any long-running workflows until after the release if you want to avoid having your workflow stopped and restarted.
This update will break some functionality if you have data stored in private buckets that are NOT FireCloud workspace buckets. It also deprecates the use of private docker images in Docker Hub, which remain functional for now but will be disabled in 30+ days (exact date will be communicated separately). If you think either applies to you (or might apply to you in the future), please read on to learn more about what is going to change, why we are making this change and what you need to do to resolve any problems that arise for you.
If you're in a hurry, feel free to zip straight through to the last section, but I think you'll find the instructions will make more sense if you read the full explanation.