Latest posts
 


UPDATE: The issue described below has been resolved.

Due to an individual user's submission that amounts to a very large number of jobs (~60k), all new workflow submissions are currently being held in the queue (with status QueuedInCromwell). To be clear, as far as we can tell this is NOT a FireCloud malfunction; it seems to be a Google Cloud limitation that we are encountering for the first time. We are working with GCP support and evaluating options to unblock the queue, hopefully without interrupting that one very ambitious and totally legitimate submission. We will strive to resume normal workflow throughput by Monday morning EST.

We understand that this is causing many of you considerable inconvenience, yet we are hopeful that this case will provide an opportunity to push back the current limitations to the next level. Please remember that what we are all doing here, together, is blazing a new trail; building a new model for how we do science at scale, collaboratively. The fact that these scaling problems are arising at all demonstrates that we are on the right path, that the research community needs this level of scalability. And we will do everything in our power to deliver it.

Thank you for your patience and stay tuned for updates.

See comments (4)


FireCloud DataShuttle Release

Posted by KateN on 17 May 2018 (1)


Alpha version 0.1.1

Release Overview

Broad Institute’s Genomics Platform & Data Science Platform announce the general availability of the FireCloud DataShuttle 0.1.1. The FireCloud DataShuttle allows users to easily browse files, download and upload data directly between FireCloud workspaces & Google buckets and your local drives, and monitor the status of these transfers.

The FireCloud DataShuttle was developed to facilitate the work of researchers and project managers who transfer a high volume of files and desire a more efficient and clearer process.


Read the whole post
See comments (1)


Release Notes: May 2018

Posted by KateN on 1 May 2018 (0)


May 22, 2018

  • When viewing a single workflow, FireCloud now allows you to drill down into the details of subworkflows.
  • When viewing a single submission, FireCloud now shows actual cloud costs for that submission and each workflow in the submission, when available. Cost information will be added to additional parts of the UI in upcoming releases.

Read the whole post
See comments (0)



Exciting update! We are now able to offer all new FireCloud users $300 in free credits thanks to additional funding from the National Cancer Institute (NCI) along with funding from our friends at the Google Cloud platform.

We offer these free credits so that you can try out FireCloud without any billing setup and at no cost. All you have to do is register for a FireCloud account. Once you register and log into FireCloud, you will see a banner welcoming you to start your free trial. When you are ready to do analyses, click Start trial to kick off a sixty-day window in which you can use your credits. The trial ends after sixty days or when you hit the $300 budget, whichever comes first. Read more information, FAQs, and the Terms and Conditions, here.

FireCloud is free to use; however, any activity that uses Google Cloud resources comes with a cost. The free credits will enable you to create a workspace, launch an analysis, upload or download files from the Google cloud, and create Jupyter notebooks to interact with your data on the fly. You can clone the GATK4 Featured Workspaces and try them out, or upload your own workflow script (WDL) and run it on your own data.

All the credits are managed in a FireCloud billing project. You can see how much you've spent by navigating to the Summary tab of any of your workspaces and clicking the link below Google Billing detail under the Project Cost section. Our credit managers, called Onix, will also be sending you email notifications when you've spent certain amounts and will help you transition off the billing project.

As always, if you have any questions, please let us know on the forum.

Note: If you signed up for the trial before April 18th, but have not started it, we automatically increased your credit limit for you.

See comments (0)



How do I retrieve the time and cost of my workflow?

Within every featured workspace description is a subsection providing the estimated time and cost for running a method using the sequence data in the data model. These estimates are gathered by obtaining the time and cost of a completed workflow using a combination of FireCloud’s built-in monitoring feature and Google’s BigQuery service. This document briefly describes how the time and cost are obtained and provides a link to a walkthrough for users interested in using the same approach.

Time

The simpler of the two approaches is obtaining the duration of an executed workflow, which is retrieved from the Monitor tab within the workspace submission page. Each submitted workflow lists some brief information, including its Submitted, Started, and End times. The submitted and start times are often identical, but they should not be confused. The submitted time is when the workflow was initiated (the “Launch Workflow” icon was clicked), and the start time is when the workflow has reserved a virtual machine (VM) and begins running tasks. Between submission and start there might be a delay because the workflow is waiting in the queue, whether due to a high volume of users, network lag, or some other reason. Thus, the duration is calculated using the start and end times listed for each submitted workflow.
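As a minimal sketch of that calculation, the Python snippet below subtracts a start time from an end time, and separately reports the queue delay between submission and start. The timestamp layout in `TIME_FORMAT` and the function names are assumptions made for illustration; adjust the format to whatever the Monitor tab actually displays.

```python
from datetime import datetime, timedelta

# Assumed timestamp layout (hypothetical); change it to match the Monitor tab.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def _parse(timestamp: str) -> datetime:
    """Parse one timestamp string using the assumed layout."""
    return datetime.strptime(timestamp, TIME_FORMAT)


def workflow_duration(started: str, ended: str) -> timedelta:
    """Duration of a workflow: End time minus Started time."""
    start, end = _parse(started), _parse(ended)
    if end < start:
        raise ValueError("End time is earlier than start time")
    return end - start


def queue_delay(submitted: str, started: str) -> timedelta:
    """Time spent waiting between submission and the actual start."""
    submit, start = _parse(submitted), _parse(started)
    if start < submit:
        raise ValueError("Start time is earlier than submitted time")
    return start - submit


if __name__ == "__main__":
    # Made-up example values, not taken from a real submission.
    print(workflow_duration("2018-05-01 10:05:00", "2018-05-01 12:35:30"))
    print(queue_delay("2018-05-01 10:00:00", "2018-05-01 10:05:00"))
```

Keeping the queue delay separate from the duration mirrors the distinction drawn above: only the time between start and end counts toward the workflow's run time.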

Cost

The cost of running a workflow is obtained using BigQuery. BigQuery is a Google Cloud service that you can think of as a search tool that provides metadata related to a submitted workflow. Given a workflow ID, we can use BigQuery to search through Google's database for that workflow and show the specific details for that job.

Once a workflow is executed, a workflow ID is created for that particular run. This unique ID distinguishes the workflow from the hundreds to thousands of other workflows being executed in FireCloud. This ID, along with other billing-related IDs, is used to perform a search in BigQuery. The search results are returned as a TSV file in which each row is a resource usage (e.g. compute, network, use of preemptible) and the columns describe that resource, including its cost. This TSV file can be opened in Google Sheets or a local Excel sheet, and summing the cost column gives the total cost of the resources used by the workflow.
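The final summing step can also be done in a few lines of Python instead of a spreadsheet. This is only a sketch: the column name `cost` and the sample rows are hypothetical, so check the header of your own TSV export and pass the real column name.

```python
import csv
import io
from decimal import Decimal


def total_cost(tsv_text: str, cost_column: str = "cost") -> Decimal:
    """Sum the cost column of a tab-separated export.

    `cost_column` is an assumed header name; use the one in your file.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    if cost_column not in (reader.fieldnames or []):
        raise KeyError(f"No column named {cost_column!r} in the TSV header")
    total = Decimal("0")
    for row in reader:
        value = (row[cost_column] or "").strip()
        if value:  # skip rows with an empty cost cell
            total += Decimal(value)
    return total


if __name__ == "__main__":
    # Made-up rows for illustration only.
    sample = (
        "resource\tcost\n"
        "compute\t0.125\n"
        "network\t0.010\n"
        "preemptible\t0.040\n"
    )
    print(total_cost(sample))
```

`Decimal` is used rather than `float` so that small per-resource amounts add up without binary rounding noise.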

Important Notes:

  • Workflow data on BigQuery is not available until some hours after the workflow has completed. It's best to allow a day to pass before querying the database.
  • The TSV includes columns for start and end time, but BigQuery has been unreliable thus far in terms of time, so you are better off using the times in the workspace Monitor tab.
  • The workflow execution metadata used to make the TSV is not automatically exported from your billing project to BigQuery; this feature has to be enabled in order to use the service.

Read the whole post
See comments (0)


Release Notes: April 2018

Posted by KateN on 4 Apr 2018 (0)


April 25, 2018

  • We fixed a bug preventing users from creating clusters.
  • You can now use R version 3.4 in your Notebooks.
  • You can now see FireCloud's status in the application by clicking the "FireCloud Status" link in the footer of any page.

April 24, 2018

  • When creating or cloning a workspace, you will now see that the dropdown for billing projects is in alphabetical order. (Previously, the projects were listed in arbitrary order.)
  • Added a new endpoint to Sam to get the access token for a user's pet service account.
  • The Swagger-UI interface here no longer requests OAuth scopes for Google Drive or Google Sheets.
  • You can now localize data URLs in addition to Google bucket paths in Jupyter Notebooks.
  • The Notebooks Dockerfile was unable to run; this has been fixed.

April 18, 2018

  • Free trial: A newly registered FireCloud user is automatically enrolled in the free credit program and will see the banner to start the trial as soon as they sign in. New users now also get $300 instead of $250 in credits. Read more about this on the blog.
  • Notebooks: You can now pause and resume clusters, which will help reduce your costs. The only cost is for storing the persistent disk in Google Cloud Storage, which is on the order of a single-digit $/month. The time taken to resume a cluster is a couple of seconds versus deleting it and spinning up a new cluster, which can take a couple of minutes.
  • Profile: When you fill out your profile in FireCloud, your contact email can now include a plus character, + , without throwing any errors.
  • Bug: Fixed a text rendering bug that could cause links stored in the Data tab or workspace attributes section to display their components out-of-order.
  • Bug: Fixed a UI bug in the workspace attribute section that occurs when an attribute has embedded quotes in it (Ex. "\"hello\"").
  • Bug: Reworked a piece of UI code that was causing undue load on the entire system.

April 10, 2018

  • On certain datasets, submitting an analysis could trigger large SQL queries, resulting in other users experiencing "Lock wait timeout exceeded" errors on their own submissions. These SQL queries are now optimized and this known cause of database locks should be gone. We continue to look for any other causes of locks.
  • From time to time, FireCloud displays a popup indicating that the application is undergoing planned maintenance or is otherwise unavailable.

    In this release of FireCloud, we have replaced the popup with a banner at the top of the screen. The banner behaves just like the popup used to - it appears automatically when the FireCloud web application cannot talk to its API server, and immediately disappears when the server is available again.

    Additionally, you may notice that a screen you see when FireCloud is unavailable ("Error loading user information. Please try again later.") now displays a brief diagnostic message and status code. This information is designed to help the FireCloud team restore service as quickly as possible.

  • The automatic WDL parser in the Create New Method dialog gave nondescript error messages in some cases. This made it confusing as to whether the error was something in the WDL or something to do with FireCloud. Now, we've updated the error message to better indicate WDL as the source: e.g. Error: WDL parsing failure. ERROR: Finished parsing without consuming all tokens <text from line it found error in>
  • Consent has a concept of empty votes, where a DAC member has not yet voted "Yes" or "No". These empty votes could cause Consent to encounter an error when displaying a summary of vote outcomes. This error is now resolved.
  • When logged in, you can now click the DUOS logo in the top-left corner of the screen to go back to the "Home" screen.
  • Resolved an issue where certain less-common line endings prevented FireCloud from correctly previewing TSVs when importing data into a workspace.
  • We now display an error message to you when there is an error during the consent match process. We have also improved the performance of the match process.
  • We have implemented a bug fix in DUOS to disallow invalid updates of a consent's data use letter GCS location.
  • In the last release we enabled subworkflow metadata. However, this has caused out-of-memory issues when expanding all subworkflows all the time. We are disabling this for now with a more permanent fix coming soon.

Read the whole post
See comments (0)



FireCloud has so far focused on making it easy to share data and execute pipelines, for all your batch-style --one might even say boring-- data processing and analysis needs. That covers a lot of territory; certainly most of the upstream work that is done under the umbrella of "genomic analysis". But for many of you, the end of the pipeline is just the start of the really interesting part; and whether you're moving on to GWAS or something else, the point is that you're moving on -- specifically, to a phase of analysis where you need to be able to interrogate your data interactively. And until now, that often meant downloading your pipeline's outputs and getting back to traditional, on-premises computational work and its limitations.

So today, we're excited to release a beta preview of a new FireCloud feature called Notebooks that makes it possible to run interactive analyses on your data in the cloud, with the convenience of a Jupyter notebook environment.


Read the whole post
See comments (0)


Release Notes: March 2018

Posted by KateN on 13 Mar 2018 (0)


March 28, 2018

  • FireCloud has updated to Cromwell version 31. Lately, you may have noticed FireCloud going down for frequent but short periods of time due to a large number of analyses being launched at once. This upgrade improves the stability and reliability of the platform when large submissions are launched.
  • Method Configurations can now reference workflows in Dockstore.
  • The workspace API now supports Method Configurations for method repositories other than Agora. All changes are backwards-compatible with code that works with Agora Method Configurations; see the Swagger documentation of /api/workspaces/{workspaceNamespace}/{workspaceName}/methodconfigs (GET and POST methods) for details.

Read the whole post
See comments (0)


Release Notes: February 2018

Posted by KateN on 6 Feb 2018 (0)


February 26, 2018

  • Added a new API endpoint at /api/profile/importstatus. This API is a convenience for the FireCloud UI that allows the UI to consolidate multiple ajax calls into a single call. It indicates whether or not the current user has at least one writable workspace, and at least one working billing project.
  • The api.firecloud.org/status API now returns a 200 OK even when subsystems are down. This allows more accurate machine-readable inspection of the API's health. To inspect health of subsystems, read the response payload, which has not changed.
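Because the status endpoint now answers 200 OK regardless of subsystem health, a client has to look inside the response body. The sketch below parses a payload and lists the subsystems that report a problem. The payload shape used here (a top-level `systems` object whose entries each carry an `ok` flag) is an assumption for illustration and is not documented in these notes; inspect a real response before relying on it.

```python
import json
from typing import List


def failing_subsystems(payload_text: str) -> List[str]:
    """Return the names of subsystems whose `ok` flag is not true.

    Assumes a hypothetical payload such as:
    {"ok": false, "systems": {"Agora": {"ok": true}, "Sam": {"ok": false}}}
    """
    payload = json.loads(payload_text)
    systems = payload.get("systems", {})
    return sorted(
        name for name, info in systems.items() if not info.get("ok", False)
    )


if __name__ == "__main__":
    # Made-up payload; the subsystem names are just labels for the example.
    sample = '{"ok": false, "systems": {"Agora": {"ok": true}, "Sam": {"ok": false}}}'
    print(failing_subsystems(sample))
```

A monitoring script would fetch the body from the status URL with its HTTP client of choice and pass the text to this function, rather than relying on the HTTP status code alone.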

Read the whole post
See comments (0)



We are planning to release an important update that will change how the system identifies user access permissions to Google buckets in order to support a technique that renders certain workflows substantially more cost-effective.

This update is scheduled for next Tuesday, February 6 around 4:00 pm EST. At that time, all running workflows will fail and will be automatically restarted using call-caching once the update is complete. We recommend you hold off on launching any long-running workflows until after the release if you want to avoid having your workflow stopped and restarted.

This update will break some functionality if you have data stored in private buckets that are NOT FireCloud workspace buckets. It also deprecates the use of private docker images in Docker Hub, which remain functional for now but will be disabled in 30+ days (exact date will be communicated separately). If you think either applies to you (or might apply to you in the future), please read on to learn more about what is going to change, why we are making this change and what you need to do to resolve any problems that arise for you.

If you're in a hurry, feel free to zip straight through to the last section, but I think you'll find the instructions will make more sense if you read the full explanation.


Read the whole post
See comments (3)


